Code Ramblings Of A Madman C++ and other adventures...

C++17 and Memory Mapped IO

In some ways this is Part One on this subject, largely because my IO subsystem isn't in any way finished; I have literally just put something together so that I can get things loaded in to my test framework. The basic idea, however, is one I'll probably base a few things on, so it is worth quickly writing about.

Loading Data

As we know, unless you are going fully hard coded procedural with your project, at some point you are going to need to load things. It is a pretty fundamental operation, but one with an array of solutions in the C++ world.

The two I have been toying with, trying to make up my mind between, have been Async IO (likely via IOCP on Windows) or Memory Mapped IO.

I've done some experiments in the past with the former, hooking up IOCP callbacks to a task system based around Intel's Threading Building Blocks and it certainly works well but I'm not sure it is the right fit; while I'm interested in being able to stream things in an async manner other solutions could well exist for the async part of the problem when coupled with another IO solution.

Which brings us to memory mapped IO, which in some ways is the fundamental IO system for Windows, being built upon (and a part of) the virtual memory subsystem. While not async, and risking stalling a thread due to page faults, it does bring with it the useful ability to be able to open views in to an already open file, perfect for directly reading from an archive for example.

A Mapping we will go

Memory mapped IO on Windows is also pretty simple;

  1. Open target file
  2. Create a file mapping
  3. Create a view in to that file mapping
  4. Use the returned pointer

Then, when you are done, you unmap the view and close the two handles referencing the file mapping and the file itself. (If you were doing archive-like access then you might not do the latter two steps until program end however.)

The code itself is pretty simple, certainly if we want to open a full file for mapping;

char * OpenFile(const std::wstring &filename)
{
    HANDLE fileHandle = ::CreateFile(filename.c_str(), GENERIC_READ,
                    FILE_SHARE_READ | FILE_SHARE_WRITE, 0,
                    OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, 0);

    if (fileHandle == INVALID_HANDLE_VALUE) 
        return nullptr;

    int fileSize = query_file_size(fileHandle);
    if (fileSize <= 0) 
        return nullptr;

    // CreateFileMapping signals failure with NULL, not INVALID_HANDLE_VALUE
    HANDLE fileMappingHandle = ::CreateFileMapping(fileHandle, 0, PAGE_READONLY, 0, 0, 0);

    if (fileMappingHandle == nullptr)
        return nullptr;

    char* data = static_cast<char*>(::MapViewOfFile(fileMappingHandle, FILE_MAP_READ, 0, 0, fileSize));

    if (!data) 
        return nullptr;

    return data;
}

It gets a bit more complicated if you want offsets and the like, however for our initial purposes it will do.
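
As a rough sketch of what the offset case involves: MapViewOfFile only accepts a file offset that is a multiple of the system allocation granularity (reported by GetSystemInfo, commonly 64 KiB), so an arbitrary offset has to be rounded down and the difference added back on to the returned pointer. The helper below is hypothetical (alignViewOffset and ViewRange are names made up for illustration, not part of the code above) and only does the arithmetic, so it stays portable.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: describes how to map `length` bytes starting at an
// arbitrary `offset` when the OS only accepts granularity-aligned offsets.
struct ViewRange
{
    std::uint64_t mapOffset;  // aligned offset to hand to the mapping call
    std::uint64_t mapLength;  // bytes to map, including the leading slack
    std::uint64_t delta;      // add this to the returned pointer to reach `offset`
};

inline ViewRange alignViewOffset(std::uint64_t offset, std::uint64_t length,
                                 std::uint64_t granularity)
{
    // Assumption: granularity is a non-zero power of two.
    assert(granularity != 0 && (granularity & (granularity - 1)) == 0);

    const std::uint64_t mapOffset = offset & ~(granularity - 1);  // round down
    const std::uint64_t delta = offset - mapOffset;
    return { mapOffset, length + delta, delta };
}
```

On Windows the 64-bit mapOffset would then be split across the high and low offset arguments of MapViewOfFile, and the caller would read from the returned base pointer plus delta.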

Enter C++

There is, of course, an obvious problem with the above; no clean up code and all you get back is a char * which doesn't really help; at best you can undo the mapping of the file but those handles are lost to the ether.

So what can we do?

One approach would be to wrap the data in an object and have it automatically clean up for us; so our function returns an instance of that class with the associated destructor and access functions.

class DirectFileHandle
{
    public:
        DirectFileHandle(char * data, HANDLE file, HANDLE mapping) : data(data), file(file), mapping(mapping) {}
        ~DirectFileHandle() 
        {
            ::UnmapViewOfFile(data);
            ::CloseHandle(mapping);
            ::CloseHandle(file);
        }
        char * getData() { return data; }
        // move functions (copying would have to be disabled, otherwise two
        // objects would try to close the same handles)
        // plus declarations for holding the data and two handle pointers
};

Not a complete class, but you get the idea I'm sure.

However that's a lot of work, plus the introduction of a type, in order to just track data and clean up.

Is there something easier we can do?

std::unique_ptr to the rescue!

As mentioned all we are really doing is holding a pointer and, when it dies, needing to clean up some state which we don't really need access to any more.

Fortunately in std::unique_ptr we have a class designed to do just that; clean up state when it goes out of scope. We can even provide it with a custom deletion function to do the clean up for us.

So what does our new type look like?
std::unique_ptr<char, std::function<void(char*)>>;

As before the primary payload is the char* but we directly associate that with a clean up function which will be called when the unique_ptr goes out of scope.

From there it is a simple matter of changing our function's signature to return that type and update our final return statement with my new favourite C++ syntax;

return{ data, [=](char* handle) 
              {
                  ::UnmapViewOfFile(handle);
                  ::CloseHandle(fileMappingHandle);
                  ::CloseHandle(fileHandle);
                  return;
              }
      };

As with the code from the previous entry we don't need to state the type here as the compiler already knows it.

The capture-by-copy default of the lambda ensures we have a copy of the handle objects come clean up time and the address of the data is supplied via the call back.
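
To see the clean up behaviour without touching the Win32 API, here is a small portable sketch of the same pattern. Everything in it is a stand-in: openFake and cleanupLog are hypothetical names, and the two handles are plain ints playing the role of the HANDLEs captured by the real lambda.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <vector>

using FileDataPointer = std::unique_ptr<char, std::function<void(char*)>>;

// Stand-in for the Win32 calls: records what would have been released.
inline std::vector<std::string>& cleanupLog()
{
    static std::vector<std::string> log;
    return log;
}

// Hypothetical opener; `data` is owned elsewhere, so the deleter only logs.
inline FileDataPointer openFake(char* data, int mappingHandle, int fileHandle)
{
    if (!data)
        return { nullptr, [](char*) {} };  // never invoked: the pointer is null

    // Capture by copy so the handle values outlive this function's locals.
    return { data, [=](char*)
             {
                 cleanupLog().push_back("unmap");
                 cleanupLog().push_back("close mapping " + std::to_string(mappingHandle));
                 cleanupLog().push_back("close file " + std::to_string(fileHandle));
             } };
}
```

One detail worth knowing: std::unique_ptr only calls its deleter when the held pointer is non-null, so the deleter supplied on an error path is never actually run.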

But what about those error cases? In those cases we change the returns to be return{nullptr, [](char*) { return; }}; effectively returning a null pointer.

The usage so far

A quick example of this in usage can be taken from my test program, which I'm using to test and build up functionality as I go;

int APIENTRY wWinMain(_In_ HINSTANCE hInstance,
                 _In_opt_ HINSTANCE hPrevInstance,
                 _In_ LPWSTR    lpCmdLine,
                 _In_ int       nCmdShow)
{

    sol::state luaState;
    luaState.open_libraries(sol::lib::base, sol::lib::package);
    luaState.create_named_table("Bonsai");        // table for all the Bonsai stuff
    Bonsai::Windowing::Initialise(luaState);

    using DirectFileHandle = Bonsai::File::DirectFile::DirectFileHandle;

    DirectFileHandle data = Bonsai::File::DirectFile::OpenFile(L"startup.lua");
    luaState.script(data.get());

    std::function<bool ()> updateFunc = luaState["Update"];

    while (updateFunc())
    {
        Sleep(0);
    }

    return 0;
}

A few custom libraries in there, however the key element is dead centre with the file open function and the usage of the returned value on the next line to feed the Lua script wrapper.

The Problem

There is, however, a slight issue with the interface; we have no idea of the size of data being returned.

Now, in this case it isn't a problem; one of the nice properties of memory mapped files on Windows (at least) is that, for security reasons, memory pages handed to user space by the kernel are zero initialised. We can see that by catching things in a debugger and then looking at the memory pointed at by data.get() which is, as expected, the file content followed by a bunch of nulls filling the rest of the memory page. (That padding only exists when the file doesn't end exactly on a page boundary, so it isn't something to rely on in general.)

Given that setup we are good when we are loading in string based data but what if we need something else or simply want the size?

At this point it is tempting to throw in the towel and head back towards a class, but a simpler option does exist; our old friend std::pair which in this case will let us pair a size_t with the pointer to the data handler.

The Solution

So, first of all we need to perform some type changes;

using FileDataPointer = std::unique_ptr<char, std::function<void(char*)>>;
using DirectFileHandle = std::pair<size_t, FileDataPointer>;

What was our DirectFileHandle before now becomes FileDataPointer and DirectFileHandle is now a pair with the data we require. Right now I've decided to order it as 'size' and 'pointer' but it could just as easily be the reverse of that.

After that we need to make some changes to our function;

FILEIO_API DirectFileHandle OpenFile(const std::wstring &filename)
{
    // as before
    if (fileHandle == INVALID_HANDLE_VALUE)
        return{ 0, FileDataPointer{ nullptr, [](char*) {return; } } };

The function signature itself doesn't need to change thanks to our redefining of our alias, however the return types declared in the code do.

Previously we could just directly construct the std::unique_ptr and the compiler would just figure it out, however if we try that with the new type it seems to change deduction rules and we get errors;

return{ 0, {nullptr, [](char*) {return; } } };
// The above results in the error below from MSVC in VS17

error C2440: 'return': cannot convert from 'initializer list' to 'std::pair<::size_t,Bonsai::File::DirectFile::FileDataPointer>'
note: No constructor could take the source type, or constructor overload resolution was ambiguous

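The inner braces have no type of their own, so the compiler treats what we supplied as an initialiser list and goes looking for a std::pair constructor that can take it. As far as I can tell none fits: the forwarding constructor can't deduce a type from a bare brace list, and the one taking const references would need to copy the std::unique_ptr.
(I believe this is a legitimate problem and not a case of 'early compiler syndrome'.)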

So, we have to supply the type of the std::unique_ptr in order to sort the types out. This change is repeated all down the function at the various return points, including the final one, where the main difference is that we return the real file size and not the 0 placeholder.

After that we need to make a change to the usage site as now we have a pair being returned and not a wrapped pointer to use;

auto data = Bonsai::File::DirectFile::OpenFile(L"startup.lua"); 
luaState.script(data.second.get());

In this case nothing much changes but we now have the size around if we need it.
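
As an example of what having the size buys us, the sketch below builds a bounded std::string_view from the pair rather than trusting a trailing null. It is only an illustration: openFromMemory is a hypothetical stand-in for OpenFile that copies text in to a heap buffer which is deliberately not null terminated, and contents is a made-up helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <memory>
#include <string_view>
#include <utility>

using FileDataPointer = std::unique_ptr<char, std::function<void(char*)>>;
using DirectFileHandle = std::pair<std::size_t, FileDataPointer>;

// Hypothetical stand-in for OpenFile; the buffer has no terminating null,
// mimicking a file that exactly fills its memory pages.
inline DirectFileHandle openFromMemory(std::string_view text)
{
    if (text.empty())
        return { 0, FileDataPointer{ nullptr, [](char*) {} } };

    char* buffer = new char[text.size()];
    std::memcpy(buffer, text.data(), text.size());
    return { text.size(), FileDataPointer{ buffer, [](char* p) { delete[] p; } } };
}

// Build a view limited to the recorded size instead of scanning for a null.
inline std::string_view contents(const DirectFileHandle& file)
{
    return file.second ? std::string_view{ file.second.get(), file.first }
                       : std::string_view{};
}
```

Whether the consumer can take a length rather than a null terminated string depends on its API, so treat this as one option rather than a drop-in change to the call above.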

But we aren't quite done...

Now, if we had a full C++17 compiler to use we could make use of one final thing: structured bindings.

Structured bindings give us syntax to unpack return values in to separate variables; we can do something like this already with std::tie but structured bindings allow us to both declare and assign at the same time.

// C++14 way
using FileDataPointer = Bonsai::File::DirectFile::FileDataPointer;
// Declare the variables up front
FileDataPointer ptr;
size_t size;
// Now unpack the return value
std::tie(size, ptr) = Bonsai::File::DirectFile::OpenFile(L"startup.lua");
luaState.script(ptr.get());

// C++17 way
// Declare and define at the same time
auto [size, ptr] = Bonsai::File::DirectFile::OpenFile(L"startup.lua");
luaState.script(ptr.get());

That, however, is for a future compiler update; for now we can stick to the std::pair method which at least allows us to switch to the C++17 syntax as and when the compilers can handle it.

Summing up

In a real setup you would check for nulls before using however this code demonstrates the principle nicely I feel.
(I also know it works as I made a slight error on the first run where my lambda captured by reference, meaning at callback time I got a nice crash during shutdown as the handle it was trying to reference was no longer valid.)

So there we have it, a simple C++17 based Memory Mapped File IO solution - I'll be building on this over time in order to build something a bit more complex, but as a proof of concept it works well.


Window Message Dispatch - C++17 refactor

I'm basically really bad at working on my own projects, but with the recent release of Visual Studio 2017 RC and its improved C++17 support I figured it was time to crack on again...

Simple Changes First

To that end I've spent a bit of time today updating my own basic windowing library to use C++17 features. Some of the things have been simple transforms such as converting typedef to using, others have been more OCD satisfying;

// This ...
namespace winprops
{
    enum winprops_enum
    {
    fullscreen = 0,
    windowed
    };
}
typedef winprops::winprops_enum WindowProperties;

// ... becomes this ...
enum class WindowProperties
{
    fullscreen = 0,
    windowed
};

How Things Were

The biggest change however, and the one which makes me pretty happy, was in the core message handler which hasn't been really updated since I wrote it back in 2003 or so.

The old loop looked like this;

LRESULT CALLBACK WindowMessageRouter::MsgRouter(HWND hwnd, UINT message, WPARAM wparam, LPARAM lparam)
{
    // attempt to retrieve internal Window handle
    WinHnd wnd = ::GetWindowLongPtr(hwnd, GWLP_USERDATA);

    WindowMap::iterator it = s_WindowMap->find(wnd);
    if (it != s_WindowMap->end())
    {
    // First see if we have a user message handler for this message
        UserMessageHandler userhandler;
        WindowMessageData msgdata;
        bool hasHandler = false;

        switch (message)
        {
        case WM_CLOSE:
            hasHandler = it->second->GetUserMessageHandler(winmsgs::closemsg, userhandler);
            msgdata.msg = winmsgs::closemsg;
            break;
        case WM_DESTROY:
            hasHandler = it->second->GetUserMessageHandler(winmsgs::destorymsg, userhandler);
            msgdata.msg = winmsgs::destorymsg;
            break;
        case WM_SIZE:
            hasHandler = it->second->GetUserMessageHandler(winmsgs::sizemsg, userhandler);
            msgdata.msg = winmsgs::sizemsg;
            msgdata.param1 = LOWORD(lparam);    // width
            msgdata.param2 = HIWORD(lparam);    // height
            break;
        case WM_ACTIVATE:
            hasHandler = it->second->GetUserMessageHandler(winmsgs::activemsg, userhandler);
            msgdata.msg = winmsgs::activemsg;
            msgdata.param1 = !HIWORD(wparam) ? true : false;
            break;
        case WM_MOVE:
            hasHandler = it->second->GetUserMessageHandler(winmsgs::movemsg, userhandler);
            msgdata.msg = winmsgs::movemsg;
            msgdata.param1 = LOWORD(lparam);
            msgdata.param2 = HIWORD(lparam);
            break;
        default:
            break;
        }

        if (hasHandler)
        {
            if (userhandler(wnd, msgdata))
            {
                return TRUE;
            }
        }

        MessageHandler handler;
        hasHandler = it->second->GetMessageHandler(message, handler);
        if (hasHandler)
        {
            return handler(*(it->second), wparam, lparam);
        }
    }
    else if (message == WM_NCCREATE)
    {
        // attempt to store internal Window handle
        wnd = (WinHnd)((LPCREATESTRUCT)lparam)->lpCreateParams;
        ::SetWindowLongPtr(hwnd, GWLP_USERDATA, wnd);
        return TRUE;
    }
    return DefWindowProc(hwnd, message, wparam, lparam);
}

The code is pretty simple;

  • See if we know how to handle a window we've got a message for (previous setup)
  • If so then go and look for a user handler and translate message data across
  • If we have a handler then execute it
  • If we didn't have user handler then try a system one

The final 'else if' section deals with newly created windows and setting up the map.

So this works, and works well; the pattern is pretty common in C++ code from back in the early 2000s but it is a bit... repeaty.

The problem comes from C++ support and general 'good practice' back in the day; but life moves on so let's make some changes.

The Message Handler Look up

The first problem is the query setup, for which the function which performs the "do you have a handler?" look up was like this;

bool Window::GetMessageHandler(oswinmsg message, MessageHandler &handler)
{
    MessageIterator it = messagemap.find(message);
    bool found = it != messagemap.end();
    if(found)
    {
        handler = it->second;
    }
    return found;
}

As code goes this isn't hard;

  • We check to see if we have a message handler
  • If we do then we store it in the supplied reference
  • Then we return if we found it or not

Not bad, but it is taking us 5 lines of code (7 if you include the braces) and if you think about it we should be able to test for the existence of the handler by querying the handler object itself rather than storing, in the calling function, what is going on. Along with that the handler gets default constructed on the calling side, which might be a waste too.

So what can C++17 do to help us?
Enter std::optional<T>.

std::optional<T> lets us return an object which is either empty or contains an instance of the given type. Later we can look to see if it holds a value (via operator bool()) before trying to use it - doesn't that sound somewhat like what was described just now?

So, with a quick refactor the message handler lookup function becomes;

std::optional<MessageHandler>  Window::GetMessageHandler(oswinmsg message)
{
    MessageIterator it = messagemap.find(message);
    return it != messagemap.end() ? it->second : std::optional<MessageHandler>{};
}

Isn't that much better?

Instead of having to pass in a thing and then return effectively two things (via the ref and the bool return) we now return one thing which either contains the handler object or a null constructed object.
(Written as an 'if...else' statement the 'else' path could simply have been return {}; a brace list isn't an expression though, so it can't be used as an operand of the ternary operator, which is why the type has to be spelled out here.)
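
For comparison, here is a minimal sketch of the same kind of lookup written with an if statement, where the empty case becomes a plain return {};. The Handler and HandlerMap types are illustrative only, since the real MessageHandler signature isn't shown here, and findHandler is a made-up free function rather than the Window member.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <optional>

// Illustrative types only.
using Handler = std::function<int(int)>;
using HandlerMap = std::map<unsigned, Handler>;

inline std::optional<Handler> findHandler(const HandlerMap& handlers, unsigned message)
{
    // C++17 if-with-initialiser keeps the iterator scoped to the lookup.
    if (auto it = handlers.find(message); it != handlers.end())
        return it->second;

    return {};  // an empty optional, same as returning std::nullopt
}
```

The caller then tests the result directly, for example if (auto handler = findHandler(handlers, message)) { ... }.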

So, with that transform in place our handling code can now change a bit too; the simple transform at this point would be to replace that bool with a direct assignment to the handler object;

UserMessageHandler userhandler;
WindowMessageData msgdata;
switch(message)
{
case WM_CLOSE:
    userhandler = it->second->GetUserMessageHandler(winmsgs::closemsg);
    msgdata.msg = winmsgs::closemsg;
    break;
// ... blah blah ..

But we still have a default constructed object kicking about, not to mention the second data structure for the message data (ok, so it is basically 3 ints, but still...) - so can we change this?

The answer is yes, changes can be made with the introduction of a lambda and a std::pair.

Enter The Lambda And The Pair

The std::pair is the easy one to explain. When you look at the message handling code what you get is an implied coupling between the message handler and the data that goes with it; a transformed version of the original message handler data.

So, instead of having the two separate we can couple them properly;

// so this...
UserMessageHandler userhandler;
WindowMessageData msgdata;

// becomes this...
using UserMessageHandlerData = std::pair<UserMessageHandler, WindowMessageData>;

OK, so how does that help us?

Well, on its own it doesn't really, however this is where the lambda enters the equation; one of the things you can do with a lambda is declare it and execute it at the same time, effectively becoming an anonymous initialisation function at local scope. It is something which, I admit, didn't occur to me until I watched a talk from CppCon 2016.

So, with that in mind how do we make the change?
Well, the (near) final code looks like this;

auto userMessageData = [window = it->second, message, wparam, lparam]()
{
    WindowMessageData msgdata;
    switch (message)
    {
    case WM_CLOSE:
        msgdata.msg = winmsg::closemsg;
        return std::make_pair(window->GetUserMessageHandler(winmsg::closemsg), msgdata);
        break;
    case WM_DESTROY:
        msgdata.msg = winmsg::destroymsg;
        return std::make_pair(window->GetUserMessageHandler(winmsg::destroymsg), msgdata);
        break;
    case WM_SIZE:
        msgdata.msg = winmsg::sizemsg;
        msgdata.param1 = LOWORD(lparam);    // width
        msgdata.param2 = HIWORD(lparam);    // height
        return std::make_pair(window->GetUserMessageHandler(winmsg::sizemsg), msgdata);
        break;
        // a couple of cases missing...
    default:
        break;
    }
    return std::make_pair(std::optional<UserMessageHandler>{}, msgdata);
}();

if (userMessageData.first)
{
    if (userMessageData.first.value()(wnd, userMessageData.second))
    {
        return TRUE;
    }
}

So a bit of a change; the overall function this lives in is now also a bit shorter.

Basically we define a lambda which returns a std::pair along the lines of the one defined before, using std::make_pair to construct the pair to return - if we don't understand the message then we simply construct a pair holding an empty optional and the default constructed message data and return that instead.

Note the end of the lambda where, after the closing brace you'll find a pair of parentheses which invokes the lambda there and then, assigning the values to userMessageData.
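
Stripped of the Windows specifics, the declare-and-invoke trick looks like the sketch below. The Mode enum and describe function are made up for illustration; the point is only the trailing () and the fact that the result can be const even though choosing it needs a switch.

```cpp
#include <cassert>
#include <string>

enum class Mode { fullscreen, windowed };

inline std::string describe(Mode mode, int width, int height)
{
    // The trailing () runs the lambda immediately, so `label` is initialised
    // once and can be const.
    const std::string label = [mode]() -> std::string
    {
        switch (mode)
        {
        case Mode::fullscreen: return "fullscreen";
        case Mode::windowed:   return "windowed";
        }
        return "unknown";
    }();

    return label + " " + std::to_string(width) + "x" + std::to_string(height);
}
```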

After that we simply check the first item in the pair and dispatch if needs be.
So we are done right?

Well, as noted this is nearly the final solution; it suffers from a few problems;

  1. Lots and lots of repeating - we have make pair all over the place and we have to specify the types in the default return statement
  2. We are still default constructing that WindowMessageData type and assigning values after trivial transforms.
  3. That ugly call syntax... ugh...

So let's fix that!

You have a definitive type

The first has a pretty easy fix; tell the lambda what it will return so the compiler can just sort that out for you;

auto userMessageData = [window = it->second, message, wparam, lparam]() -> std::pair<std::optional<UserMessageHandler>, WindowMessageData>
{
    switch (message)
    {
    case WM_CLOSE:
        return{ window->GetUserMessageHandler(winmsg::closemsg), { winmsg::closemsg, 0, 0 } };
        break;
    case WM_DESTROY:
        return{ window->GetUserMessageHandler(winmsg::destroymsg), { winmsg::destroymsg, 0, 0 } };
        break;
    case WM_SIZE: 
        return{ window->GetUserMessageHandler(winmsg::sizemsg), { winmsg::sizemsg, LOWORD(lparam), HIWORD(lparam) } };
        break;
    case WM_ACTIVATE:
        return{ window->GetUserMessageHandler(winmsg::activemsg), { winmsg::activemsg, !HIWORD(wparam) ? true : false } };
        break;
    case WM_MOVE:
        return{ window->GetUserMessageHandler(winmsg::movemsg), { winmsg::movemsg, LOWORD(lparam), HIWORD(lparam) } };
        break;
    default:
        break;
    }
    return{ {}, {} };
}();

How much shorter is that?

So, as noted the first change happens at the top; we now tell the lambda what it will be returning - the compiler can now use that information to reason about the rest of the code.

Now, because we know the type, and we are using C++17, we can kiss goodbye to std::make_pair; instead we use the brace construction syntax to directly create the pair, and the data for the second object, at the return point. Because the compiler knows what to return it knows what to construct, and that goes directly in to our userMessageData variable, which gets the correct type via auto.

One of the fun side effects of this is that last line of the lambda; return { {}, {} }
Once again, because the compiler knows the type we can just tell it "construct me a pair of two default constructed objects - you know the types, don't bother me with the details".

And just like that all our duplication goes away and we get a nice compact message handler.
Points 1 and 2 handled.
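
Here is a self-contained sketch of the same shape that compiles away from Windows. All the names are stand-ins: MessageData, Handler, lookup and translate are hypothetical, as the real UserMessageHandler and WindowMessageData definitions aren't shown here.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <utility>

// Illustrative stand-ins for the real handler and message data types.
struct MessageData
{
    int msg;
    int param1;
    int param2;
};
using Handler = std::function<bool(const MessageData&)>;
using HandlerAndData = std::pair<std::optional<Handler>, MessageData>;

// Pretend handler table: only message id 1 has a handler registered.
inline std::optional<Handler> lookup(int id)
{
    if (id == 1)
        return Handler{ [](const MessageData& d) { return d.param1 > 0; } };
    return std::nullopt;
}

inline HandlerAndData translate(unsigned message, int low, int high)
{
    // The trailing return type tells the compiler what each brace list builds.
    return [&]() -> HandlerAndData
    {
        switch (message)
        {
        case 1:
            return { lookup(1), { 1, low, high } };
        default:
            break;
        }
        return { {}, {} };  // empty optional plus zeroed MessageData
    }();
}
```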

So what about point 3?

Hiding The Call Away

In this case we can take advantage of Variadic Templates, std::invoke and parameter packs to create an invoking function to wrap things away;

template<typename T, typename... Args>
bool invokeOptional(T callable, Args&&... args)
{
    // forward the arguments so rvalues stay rvalues on the way down
    return std::invoke(callable.value(), std::forward<Args>(args)...);
}

This simple wrapper just takes the optional type (it could probably do with some form of protection to make sure it is an optional which can be invoked), extracts the value and passes it down to std::invoke to do the calling.

The variadic templates and parameter pack allows us to pass any combination of parameters down and, as long as the type held by optional can be called with it, invoke the function as we need - this means one function for both the user and system call backs;

if (userMessageData.first)
{
    if (invokeOptional(userMessageData.first, wnd, userMessageData.second))
    {
        return TRUE;
    }
}

auto handler = it->second->GetMessageHandler(message);
if (handler)
{
    return invokeOptional(handler, (*(it->second)), wparam, lparam);
}
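
One possible way to add the protection mentioned earlier is to take a std::optional in the signature, so anything else simply fails to match. The sketch below is a variation rather than the version used in this code: invokeIfPresent is a hypothetical name, it assumes the held callable returns something convertible to bool, and it reports an empty optional as "not handled" instead of throwing.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <type_traits>
#include <utility>

// Only accepts std::optional<Callable>; the static_assert gives a readable
// error when the arguments don't fit the held callable.
template<typename Callable, typename... Args>
bool invokeIfPresent(const std::optional<Callable>& callable, Args&&... args)
{
    static_assert(std::is_invocable_r_v<bool, const Callable&, Args...>,
                  "held callable must accept these arguments and return a bool-like value");
    if (!callable)
        return false;  // nothing to call, so the message was not handled
    return std::invoke(*callable, std::forward<Args>(args)...);
}
```

With this shape the if (userMessageData.first) check at the call site becomes optional, at the cost of not being able to tell "no handler" apart from "handler returned false".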

And there we have it: much refactoring later, something more C++17 than C++03.

For reference, here is the message router code in its current form.

namespace Bonsai::Windowing  // an underrated new feature...
{
    template<typename T, typename... Args>
    bool invokeOptional(T callable, Args&&... args)
    {
        // 'typename' is required here because value_type is a dependent name
        static_assert(std::is_convertible<T, std::optional<typename T::value_type> >::value);
        return std::invoke(callable.value(), std::forward<Args>(args)...);
    }

    WindowMap *WindowMessageRouter::s_WindowMap;

    WindowMessageRouter::WindowMessageRouter(WindowMap &windowmap)
    {
        s_WindowMap = &windowmap;
    }
    WindowMessageRouter::~WindowMessageRouter()
    {
    }

    bool WindowMessageRouter::Dispatch(void)
    {
        static MSG msg;
        int gmsg = 0;

        if (::PeekMessage(&msg, 0, 0, 0, PM_REMOVE))
        {
            ::TranslateMessage(&msg);
            ::DispatchMessage(&msg);
        }

        if (msg.message == WM_QUIT)
            return false;

        return true;
    }

    LRESULT CALLBACK WindowMessageRouter::MsgRouter(HWND hwnd, UINT message, WPARAM wparam, LPARAM lparam)
    {
        // attempt to retrieve internal Window handle
        WinHnd wnd = ::GetWindowLongPtr(hwnd, GWLP_USERDATA);

        WindowMap::iterator it = s_WindowMap->find(wnd);
        if (it != s_WindowMap->end())
        {
            // First see if we have a user message handler for this message    
            auto userMessageData = [window = it->second, message, wparam, lparam]() -> std::pair<std::optional<UserMessageHandler>, WindowMessageData>
            {
                switch (message)
                {
                case WM_CLOSE:
                    return{ window->GetUserMessageHandler(winmsg::closemsg), { winmsg::closemsg, 0, 0 } };
                    break;
                case WM_DESTROY:
                    return{ window->GetUserMessageHandler(winmsg::destroymsg), { winmsg::destroymsg, 0, 0 } };
                    break;
                case WM_SIZE: 
                    return{ window->GetUserMessageHandler(winmsg::sizemsg), { winmsg::sizemsg, LOWORD(lparam), HIWORD(lparam) } };
                    break;
                case WM_ACTIVATE:
                    return{ window->GetUserMessageHandler(winmsg::activemsg), { winmsg::activemsg, !HIWORD(wparam) ? true : false } };
                    break;
                case WM_MOVE:
                    return{ window->GetUserMessageHandler(winmsg::movemsg), { winmsg::movemsg, LOWORD(lparam), HIWORD(lparam) } };
                    break;
                default:
                    break;
                }
                return{ {}, {} };
            }();

            if (userMessageData.first)
            {
                if (invokeOptional(userMessageData.first, wnd, userMessageData.second))
                {
                    return TRUE;
                }
            }

            auto handler = it->second->GetMessageHandler(message);
            if (handler)
            {
                return invokeOptional(handler, (*(it->second)), wparam, lparam);
            }
        }
        else if (message == WM_NCCREATE)
        {
            // attempt to store internal Window handle
            wnd = (WinHnd)((LPCREATESTRUCT)lparam)->lpCreateParams;
            ::SetWindowLongPtr(hwnd, GWLP_USERDATA, wnd);
            return TRUE;
        }
        return DefWindowProc(hwnd, message, wparam, lparam);
    }
}

The Beginning... again

Welcome to my latest attempt at running some kind of code related blog.

It has been a few years but hopefully this recent flurry of activity will resolve in to something productive and coherent.

Largely I'm planning to blog about C++ and graphics related things (likely D3D12 and/or Vulkan, maybe with some Metal thrown in for kicks) while I slowly develop my own framework and game.

I might also revisit some older projects from a time long before this blog and update and waffle about them.

Let's see how this pans out...