Immediate Mode GUI as State Monad

June 29, 2024

I have been toying with building GUI applications for some time, and recently stumbled on the concept of immediate mode GUIs (IMGUI for short), as opposed to the more standard retained mode GUIs. Since I am also interested in functional programming and in how user interaction works there, I wondered what an IMGUI would look like in a purer world.

ImGUI, retained mode, State monad, what are you talking about?

For all of these there are much better resources on the internet, but the short version is as follows.

Retained mode GUIs are the classical GUIs we are all familiar with. An example in Qt might look like:

#include <QApplication>
#include <QPushButton>

int main(int argc, char **argv) {
    QApplication a(argc, argv);

    // Build the interface first...
    QPushButton hello("Hello world!");
    hello.resize(100, 30);
    hello.show();

    // ...then hand control over to Qt's event loop.
    return a.exec();
}

As can be seen, you first build the interface and then run it; the only way to interact with it is via callbacks (or signals and slots in the case of Qt). If you want a dynamic interface, you need to hold on to a reference or have another way to find the element again (e.g. HTML's id attribute). This means that you as a programmer have to hold on to the GUI state as well as your application state, and keep the two in sync.

In contrast, ImGUI code looks more like this (example based on Dear ImGui, shortened because like most ImGUI implementations it is not standalone and needs the support of a platform library like GLFW or SDL2):

#include "imgui.h"

// skipping platform includes

int main(int, char**)
{
	//skipping platform setup

	bool checkbox = false;

	while(!platform_quit())
	{
		//platform interaction e.g. get user input

		ImGui::Begin("Hello!");
		ImGui::Checkbox("Hello World", &checkbox);
		if(checkbox)
			ImGui::Text("Hi there!");
		ImGui::End();

		//render ImGui via platform
	}

	//skipping platform breakdown

	return 0;
}

Even though this example is longer, a large part of that is due to the missing platform support that is included in something like Qt or GTK. This example also already does more: the text appears or disappears depending on the state of the checkbox, while the Qt example does literally nothing. Another difference is that in the immediate mode example we are responsible for the event loop, while it is built into the Qt example. The biggest drawback of immediate mode GUIs is that we redraw the full GUI every frame. This is no problem on modern GPUs but might be on older hardware (though with some clever programming it might be possible to make it perform well enough).

And finally the State monad. In this document I am not going to go into what a monad is exactly (whole term papers have been written about that); for now it is enough to know that it is a way to do certain imperative things (like ordering things or dealing with IO) in a functionally pure way. In particular, the State monad allows us to deal with, as the name says, state (albeit encapsulated and only available in the right context). The basic data type definition in Haskell looks something like this:

newtype State s a = State {runState :: (s -> (a,s)) }

In essence a State is nothing but a function that takes an initial state (s) and returns a result (a) together with a new state. Because it is a monad, we can use do notation to clean up a lot of the tedious threading of the state through different function calls. It also tends to come with a couple of convenience functions to get and put the state.
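To make the "tedious threading" concrete, here is a minimal sketch that hands out increasing numbers, once with the state passed by hand and once in the State monad. It assumes the State type from the mtl package (Control.Monad.State), which is defined slightly differently from the newtype above but behaves the same for this purpose; the names fresh, twoIds and their manual counterparts are made up for illustration.

```haskell
import Control.Monad.State (State, get, put, runState)

-- Hand-threaded version: every call takes the counter and returns the new one.
freshManual :: Int -> (Int, Int)
freshManual n = (n, n + 1)

twoIdsManual :: Int -> ((Int, Int), Int)
twoIdsManual s0 =
  let (a, s1) = freshManual s0
      (b, s2) = freshManual s1
  in ((a, b), s2)

-- The same logic in the State monad: do notation does the threading for us.
fresh :: State Int Int
fresh = do
  n <- get        -- read the current counter
  put (n + 1)     -- store the incremented counter
  return n        -- hand out the old value

twoIds :: State Int (Int, Int)
twoIds = do
  a <- fresh
  b <- fresh
  return (a, b)
```

Evaluating runState twoIds 10 gives the same result as twoIdsManual 10, namely ((10,11),12): the pair of handed-out numbers plus the final counter.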

Going deeper into ImGUI

From the above it might not seem like it, but there is actually state behind the ImGui calls. This becomes a bit clearer when we compare Dear ImGui's declaration of a button with that of another IMGUI library (in this case the C-based Nuklear). In Dear ImGui it is:

bool ImGui::Button(char* label){
	//[...]
}

While in Nuklear it is:

int nk_button_label(struct nk_context* ctx, char* label){
	//[...]
}

The difference is clear: in Dear ImGui the state is just hidden (but still there). So what is this state used for? In the most basic case it holds which element was active in the previous frame and any user interaction (e.g. whether a mouse button was pressed and where the mouse pointer is). In more complex cases it can also hold layout, drawing command buffers, and more (as it does in both Dear ImGui and Nuklear). But unless you are making your own widgets, it is not necessary to deal with this state in any way.

Note that in both cases the label is necessary and needs to be unique, because it is used to track the widget between frames.

It is interesting that the Nuklear example can already be written in a purer form, as long as we are allowed to return more than one thing at a time (luckily most if not all functional languages allow this). In Haskell it might look something like this:

nk_button :: String -> Context -> (Bool, Context)
nk_button label context = -- ...
  -- something that uses context to determine if it should return true or false and also update context with drawing commands

Now that does look familiar doesn’t it?
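The resemblance can be made literal. The sketch below assumes mtl's Control.Monad.State, whose state function wraps any s -> (a, s) function into a State action. The Context here is a deliberately tiny, hypothetical stand-in (one click flag and a list of drawn labels), and this toy button reports a click whenever the mouse was clicked at all, without any hit testing.

```haskell
import Control.Monad.State (State, runState, state)

-- Hypothetical, deliberately tiny context.
data Context = Context
  { mouseClicked :: Bool      -- was the mouse clicked this frame?
  , drawn        :: [String]  -- labels of the widgets drawn so far
  } deriving (Show, Eq)

-- Explicit-context version, shaped like the Nuklear call.
nkButton :: String -> Context -> (Bool, Context)
nkButton label ctx =
  (mouseClicked ctx, ctx { drawn = drawn ctx ++ [label] })

-- The same function as a State action: the context argument is now hidden,
-- much like it is in Dear ImGui.
button :: String -> State Context Bool
button label = state (nkButton label)
```

Running runState (button "OK") on a context is then exactly the same as calling nkButton "OK" on it.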

Pure ImGui

So now that we have the basics, let's put it all together. First let's create some definitions:

data Context = Context
	{ quit :: Bool -- Should we quit?
	, draw :: [Drawable] -- Drawable is an enum of things like Rect (x, y, width, height), Circle (x, y, radius), etc.
	-- There is probably a better data structure for draw but for this discussion a list will do
	-- ... plus whatever else our context needs
	}

newtype Label = Label String
type ImGui = State Context -- so ImGui a is just State Context a

We added a Label type to show that this is something special and not “just” a string. With this we can rewrite our button as follows:

button :: Label -> ImGui Bool
button label = do
	ctx <- get
	-- ...
	-- Do something with the context like determining if we are clicked and adding to the draw list
	put newCtx
	return clicked
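To make the elided part a little more tangible, here is one possible sketch of such a button, again assuming mtl's Control.Monad.State. Everything about this Context is hypothetical: the mouse fields, the cursorY layout cursor, the fixed 100 by 30 button size and the Rect carrying a caption are all invented for illustration, and a real implementation would track much more.

```haskell
import Control.Monad.State (State, get, put, runState)

-- All of these types are hypothetical stand-ins for the elided Context.
data Drawable = Rect Int Int Int Int String  -- x, y, width, height, caption
  deriving (Show, Eq)

data Context = Context
  { mousePos  :: (Int, Int)  -- pointer position this frame
  , mouseDown :: Bool        -- was the mouse button pressed this frame?
  , cursorY   :: Int         -- where the next widget gets laid out
  , draw      :: [Drawable]  -- draw commands collected this frame
  } deriving (Show, Eq)

newtype Label = Label String
type ImGui = State Context

button :: Label -> ImGui Bool
button (Label caption) = do
  ctx <- get
  let (w, h)   = (100, 30)           -- fixed size, widgets stack vertically
      y        = cursorY ctx
      (mx, my) = mousePos ctx
      hovered  = mx >= 0 && mx < w && my >= y && my < y + h
      clicked  = hovered && mouseDown ctx
  -- Advance the layout cursor and record the draw command.
  put ctx { cursorY = y + h, draw = draw ctx ++ [Rect 0 y w h caption] }
  return clicked
```

With the pointer at (10, 40) and the mouse button down, two buttons run one after the other report False and True respectively, because the second one occupies the rows from 30 to 59.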

Interestingly, this button looks a lot like the Dear ImGui declaration, though the exact insides are probably quite different. So how do we use it? Well, first we need to define some helper functions, some of which will run in IO!

-- initCtx builds the initial context
initCtx :: Context
initCtx = Context
	{ quit = False
	, draw = []
	-- ... plus initial values for the rest of the context
	}

-- input gets input from the user and updates an old context with the new state
input :: Context -> IO Context
input oldCtx = do
	-- ... interact with the system to get user input and build newCtx from oldCtx
	return newCtx

-- render renders out the draw commands stored in the draw list
render :: Context -> IO ()
render ctx = do
	-- ... render, render
	return ()

-- clear clears the draw list and potentially removes stale elements from the context
clear :: Context -> Context
clear ctx = ctx { draw = [] } -- ... at the very least

-- Combine clearing and rendering in one function
renderClear :: Context -> IO Context
renderClear ctx = do
	render ctx
	return (clear ctx)

-- close determines from the context whether the application needs to close
close :: Context -> Bool
close ctx = quit ctx

So we have a function to generate an initial context, a function to get input, as well as one to render the drawables, a way to clean up after ourselves, and finally a check to see if we need to quit. So putting this all together we could write something like this:

runImGui :: model -> Context -> (model -> model) -> (model -> ImGui model) -> IO ()
runImGui model ctx update gui = do
	iCtx <- input ctx
	let (guiModel, guiCtx) = runState (gui model) iCtx
	newCtx <- renderClear guiCtx
	let newModel = update guiModel
	if close newCtx then
		return ()
	else
		runImGui newModel newCtx update gui

This will thread our application data (called model here) through our rendering function and make it loop until we want to quit. So now we can make a program that does the same as our ImGui C++ example.

ourGui :: Bool -> ImGui Bool
ourGui checked = do
	beginWindow (Label "Hello!") -- starts a window, stored in our context of course
	newChecked <- checkbox (Label "Hello World!") checked -- Draws the checkbox; takes a boolean to know whether to draw it checked or unchecked
	if newChecked
		then guiText "Hi there!"
		else return ()
	endWindow -- ends window content
	return newChecked

main = runImGui False initCtx id ourGui
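Because the GUI function itself is pure, the loop can be tried out without any platform code at all. The sketch below is a hypothetical, IO-free stand-in for runImGui: instead of polling the system it takes a scripted list of inputs (one Bool per frame saying whether the user clicked) and returns the draw list of every frame. It assumes mtl's Control.Monad.State; the Context is reduced to a click flag and a list of textual draw commands, windows are left out, and emit and runFrames are made-up names.

```haskell
import Control.Monad (when)
import Control.Monad.State (State, gets, modify, runState)

-- Hypothetical minimal context: one input flag and a textual draw list.
data Context = Context
  { clicked :: Bool      -- did the user click the (only) checkbox this frame?
  , draw    :: [String]  -- draw commands, here just descriptions
  }

type ImGui = State Context

-- Append one draw command to the context.
emit :: String -> ImGui ()
emit cmd = modify (\ctx -> ctx { draw = draw ctx ++ [cmd] })

-- Toggles on a click, draws itself in its new state, returns the new state.
checkbox :: String -> Bool -> ImGui Bool
checkbox label checked = do
  hit <- gets clicked
  let checked' = if hit then not checked else checked
  emit ((if checked' then "[x] " else "[ ] ") ++ label)
  return checked'

guiText :: String -> ImGui ()
guiText s = emit ("text: " ++ s)

ourGui :: Bool -> ImGui Bool
ourGui checked = do
  newChecked <- checkbox "Hello World!" checked
  when newChecked (guiText "Hi there!")
  return newChecked

-- One frame per scripted input, each starting from an empty draw list.
runFrames :: (model -> ImGui model) -> model -> [Bool] -> [[String]]
runFrames _   _     []             = []
runFrames gui model (click : rest) =
  let (model', ctx) = runState (gui model) (Context click [])
  in draw ctx : runFrames gui model' rest
```

For example, runFrames ourGui False [False, True, False] yields one draw list with only the unchecked box, followed by two draw lists that contain the checked box and the text.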

In practice there would be some extra boilerplate to set up a platform window with something to draw on, and we probably also want to interface with the system somehow. There are a couple of ways to do that. The simplest is to turn the update function passed to runImGui from a model -> model into a model -> IO model. This requires some other minimal changes (like changing id in main to return) but is otherwise pretty simple and powerful, since it gives us full access to the IO monad. Another option is to turn the update function into Maybe a -> model -> (model, Cmd a). This requires some more modifications to both runImGui and main, but it allows constraining the allowed operations to a subset of the full IO space (including, of course, some kind of ‘no IO currently needed’). The second form also allows us to implement something very much like the Elm Architecture.

The Elm architecture?

If you have ever looked at web development with pure languages, you have probably stumbled upon Elm, a pure language for front-end development that compiles to JavaScript. One of its distinguishing features is a particular way of drawing the HTML and interacting with the user. It does so through the following functions (shown here only as their type declarations):

type Model = -- ... Application state
type Msg = -- ... Messages that can be generated

init : () -> (Model, Cmd Msg)
update : Msg -> Model -> (Model, Cmd Msg)
view : Model -> Html Msg

Model and Msg are types created by the user. init generates an initial Model and optionally (due to Cmd.none being available) an initial command in the form of a Cmd Msg. update takes a Msg and a Model and generates a new Model, optionally with a new command. Finally, the view function turns the model into HTML. When the user interacts with the page, the generated code sends a Msg via the runtime to the update function. The same happens for commands, although these are sent directly to the runtime, where they in turn generate a new Msg.

As can be seen, Elm's update function is already quite close to our proposed update function above, but our ourGui function needs to look a bit different. To make things easier, it helps to change the widget functions a bit. Note that instead of returning only when a user interacts, we are running every frame, so instead of returning a Msg we are going to return a Maybe Msg. (If it is really important, we can wrap this up so it is mostly hidden from our users, e.g. with type TeaImGui a = ImGui (Maybe a) or something like that.)

-- Since button is only True if the user interacted, the changes are pretty small
teaButton :: Label -> msg -> ImGui (Maybe msg)
teaButton label msg = do
	clicked <- button label
	if clicked
		then return (Just msg)
		else return Nothing

-- This needs a full reimplementation to check whether the user actually interacted with the checkbox since the last frame
teaCheckbox :: Label -> Bool -> (Bool -> msg) -> ImGui (Maybe msg)
teaCheckbox label model msg = do
	ctx <- get
	-- ... check if interacted and if so set newModel to the inverse of model
	-- ... also don't forget to draw!
	put newCtx -- the context updated with the draw commands
	if interacted
		then return (Just (msg newModel))
		else return Nothing

We can also write some convenience functions so that a list of these ImGui (Maybe msg) values can be chained, in the end returning either Nothing or a Just msg depending on whether there was any user interaction.

teaWindow :: Label -> [ImGui (Maybe msg)] -> ImGui (Maybe msg)
teaWindow label list = let
		-- select :: ImGui (Maybe msg) -> ImGui (Maybe msg) -> ImGui (Maybe msg)
		-- Runs both widgets and keeps the first message that is not Nothing
		select a b = do
			x <- a
			y <- b
			case x of
				Nothing -> return y
				Just _ -> return x
	in do
		beginWindow label
		final <- foldl select (return Nothing) list -- run the widgets between begin and end
		endWindow
		return final
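The "run everything, keep the first message" part can also be written with standard library functions. The following is only a sketch of that idea, not a replacement for teaWindow: it assumes mtl's Control.Monad.State and asum from Data.Foldable, uses a plain list of strings as a hypothetical context, and the names firstMsg and fakeWidget are invented for the example.

```haskell
import Control.Monad.State (State, modify, runState)
import Data.Foldable (asum)

type ImGui = State [String]  -- hypothetical context: just a draw log

-- Run every widget in order (so all of them get drawn), then keep the
-- first result that is not Nothing.
firstMsg :: [ImGui (Maybe msg)] -> ImGui (Maybe msg)
firstMsg widgets = asum <$> sequence widgets

-- Hypothetical widget that logs its name and reports a fixed result.
fakeWidget :: String -> Maybe msg -> ImGui (Maybe msg)
fakeWidget name result = do
  modify (++ [name])
  return result
```

Here sequence runs all the widgets, so every one of them still ends up in the log even when an earlier widget already produced a message, and asum then picks the first Just from the collected results.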

It is now possible to rewrite our runImGui function as follows (to simplify things we are going to ignore the Cmd msg for now and leave that as an exercise for the reader):

runTeaImGui :: model -> Context -> (msg -> model -> model) -> (model -> ImGui (Maybe msg)) -> IO ()
runTeaImGui model ctx update view = do
	iCtx <- input ctx
	let (guiMsg, guiCtx) = runState (view model) iCtx
	newCtx <- renderClear guiCtx
	let maybeModel = fmap (\msg -> update msg model) guiMsg
	if close newCtx
		then return ()
		else case maybeModel of
			Just x -> runTeaImGui x newCtx update view
			Nothing -> runTeaImGui model newCtx update view

As we can see above, we only call update if there is actually a message. If there isn’t, fmap will short-circuit, and thus our update function doesn’t have to care about getting a Maybe message. Something similar can be done with a Cmd msg, provided there is a function that turns a Cmd msg into an IO (Maybe msg).¹ Finally we can rewrite our example as follows:

type Model = Bool
data Msg = Msg Bool

ourTeaGui :: Model -> ImGui (Maybe Msg)
ourTeaGui checked = let
		item = teaCheckbox (Label "Hello World!") checked Msg
		content = if checked
			then [item, teaText "Hello there!"]
			else [item]
	in
		teaWindow (Label "Hello!") content


update :: Msg -> Model -> Model
update (Msg x) _ = x -- Since we want to update the model to the new state of the checkbox we can ignore the old state

main = runTeaImGui False initCtx update ourTeaGui
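As a rough check of how the messages flow, here is a self-contained sketch of a single frame of this loop without any IO. It assumes mtl's Control.Monad.State and the same kind of hypothetical toy context as before (a click flag plus textual draw commands); the message constructor is called Toggled here, windows are omitted, and stepFrame, emit and view are made-up names.

```haskell
import Control.Monad.State (State, gets, modify, runState)
import Data.Foldable (asum)

-- Hypothetical toy context: one input flag and a textual draw list.
data Context = Context { clicked :: Bool, draw :: [String] }
type ImGui = State Context

type Model = Bool
newtype Msg = Toggled Bool deriving (Show, Eq)

emit :: String -> ImGui ()
emit cmd = modify (\ctx -> ctx { draw = draw ctx ++ [cmd] })

-- Draws the checkbox as the model says and only reports a click as a message.
teaCheckbox :: String -> Bool -> (Bool -> msg) -> ImGui (Maybe msg)
teaCheckbox label checked toMsg = do
  hit <- gets clicked
  emit ((if checked then "[x] " else "[ ] ") ++ label)
  return (if hit then Just (toMsg (not checked)) else Nothing)

teaText :: String -> ImGui (Maybe msg)
teaText s = emit ("text: " ++ s) >> return Nothing

view :: Model -> ImGui (Maybe Msg)
view checked = asum <$> sequence
  ( teaCheckbox "Hello World!" checked Toggled
  : [teaText "Hello there!" | checked] )

update :: Msg -> Model -> Model
update (Toggled x) _ = x

-- One pure frame: run the view, then apply update only if a message came back.
stepFrame :: Model -> Bool -> (Model, [String])
stepFrame model click =
  let (msg, ctx) = runState (view model) (Context click [])
  in (maybe model (\m -> update m model) msg, draw ctx)
```

One consequence of this design shows up immediately: the view always draws the model it was given, so in the frame where the click happens the box is still drawn unchecked, and the text only appears in the following frame. That is the same behaviour the Elm runtime has, where a message first goes through update before the view is rendered again.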

As can be seen, this can start to look fairly declarative even though underneath it is still the State monad. Users only really need to deal with that if they want to write their own widgets; otherwise it has become entirely possible to ignore this fact.

Conclusion

There is a fair bit that can still be added, either for convenience, to make it perform better, or at least to make it easier to work with. But I hope to have shown that an IMGUI can be implemented as a State monad in a functional language. On top of that, we have seen that this abstraction makes it possible to implement something very much like the Elm Architecture.

It is not impossible to do something like this with retained mode GUIs (as shown by the existence of gi-gtk-declarative). It would effectively mean building an internal state that is pretty similar to our draw state, and repainting the screen every time we update (something that can be overcome by a clever diff algorithm, but the same can be done in IMGUI). This has some advantages: the runtime takes care of notifying us when something interesting happens and keeps the screen updated with the current state in the meantime. The biggest disadvantage is that such an implementation is much harder for a retained mode GUI than for an immediate mode GUI, since you need to hook into the usual callbacks in the background, while for immediate mode it is just a natural consequence of the fact that each widget is just a function.

I hope this article has offered some insight into why I think more people should look into immediate mode GUIs, especially outside of where they are currently mainly used (debug screens in games).

¹ The difficulty here lies in that Cmd msg is probably used (among other things) for longish-running background processes, in which case multiple messages could be returned at the same time, in combination with a msg from the main view. All of these need to be serialized in some way and then processed by the update function. There are probably clever ways to do this, but I am not an expert in Haskell (I just know enough to shoot myself in the foot, as they say).