Implementing an Undo/Redo System in a Complex Visual Application

Undo/redo systems in creative software are often invisible heroes—until they fail. As vital as they are though, they’re expected to “just work,” and building one that does—especially in a complex, visual app—is far from simple.

A while ago, I deemed it necessary to design and implement a robust yet intuitive undo/redo system for Alkemion Studio, a visual brainstorming and writing tool tailored to TTRPGs. The project was full of challenges but ultimately very rewarding. This post dives into how the system works, what made it tricky, and some of the thought processes that emerged during development.

So, what makes undo/redo in Alkemion Studio so complex? Unlike in a linear text editor, users here interact across multiple contexts: moving tokens on a board, editing rich text in an editor window, tweaking metadata, all in different UI layers. A context-blind undo/redo system risks not just confusion but serious, sometimes destructive, bugs.

To raise the stakes, every undoable action also doubles as a structured autosave event—making reliability even more critical.

The guiding principle from the beginning was this:

Undo/redo must be intuitive and context-aware. Users should not be allowed to undo something they can’t see.

With that in mind, let’s take a look at how the framework works!

Context-Awareness

Before anything else, we must address the main hurdle of creating such a system: context, which we’ll define simply as where the user is in the application and which actions they can perform there.

In many apps, there’s only a single context, so undo/redo systems can be simple and the codebase could do with a lighter framework.

Alkemion Studio, however, allows users to act on their data from multiple locations, with different actions available—and not all effects are visible everywhere. That last part is crucial.

Without context-awareness, users might undo something unrelated to what they’re working on—potentially offscreen—without realizing it.

That’s why our philosophy has been and remains:

Users shouldn't be able to undo something they can't see.

While this may sound restrictive, it’s actually very protective. It prevents confusing situations where an undo operation seems to have “done nothing” while having actually changed something offscreen without warning.

This could lead to data loss if users unknowingly continue working after such a change. This is exactly what we aim to avoid with a context-aware system.

Let’s look at two example cases to clarify different situations:

Changing a Node’s featured image
This can be done from the Board, Editor, or Node Table—the main contexts. Since the change is visible across all three, undoing it in any of them makes sense and gives the user clear visual feedback.
Editing a Token on the Board
This action is Board-specific—only done and seen there. Undoing it from the Editor or Node Table would give no feedback, risking confusion or accidental data loss, especially with editable Tokens like Text Blocks. Even minor data loss can frustrate users over time.

To prevent this, we need a context-aware undo/redo system, one that is granular, robust, and, most importantly, intuitive.

Traditional systems might rely on a single stack, but in a multi-context app like ours, that breaks down fast. As soon as you switch context, undoing might affect invisible parts of the app, leading to confusion or worse.

So, without further ado, let’s now talk action classes, action stacks, contexts and containers—and look into how we faced this challenge head-on!

System Design and Core Concepts

Before diving into the technical details, note that we'll approach concepts from a language- and environment-agnostic, software architecture perspective. While some examples—using TypeScript—may include constructs like "classes" that don’t exist in all languages, the underlying principles should remain applicable regardless of your chosen stack. Alright, let’s get started!

Action Classes

These are our main building blocks. Actions are the most straightforward structure of this whole framework. Every time the user does something that can be undone or redone, an Action is instantiated via an Action class.

Every Action has both an undo and a redo function, which are called every time the user decides to undo or redo that specific Action. This is the base idea behind the whole system.

Let's look at an example of what the Action class might look like:
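
A minimal sketch of such a base class, under stated assumptions: `MoveTokenAction`, `TokenPosition` and the `globalIndex` default are hypothetical stand-ins chosen for illustration, not code taken from Alkemion Studio.

```typescript
// Hypothetical sketch: class and field names are illustrative assumptions,
// not the actual Alkemion Studio code.
abstract class Action {
  // Position in the global chronology, assigned when the Action is stored.
  globalIndex = -1;

  abstract undo(): void;
  abstract redo(): void;
}

interface TokenPosition {
  x: number;
  y: number;
}

// A child class stores only the localized data it needs to reverse itself.
class MoveTokenAction extends Action {
  private token: TokenPosition;
  private from: TokenPosition;
  private to: TokenPosition;

  constructor(token: TokenPosition, from: TokenPosition, to: TokenPosition) {
    super();
    this.token = token;
    this.from = from;
    this.to = to;
  }

  undo(): void {
    this.token.x = this.from.x;
    this.token.y = this.from.y;
  }

  redo(): void {
    this.token.x = this.to.x;
    this.token.y = this.to.y;
  }
}
```

Note that the Action keeps only the two positions and a reference to the token, not a snapshot of the whole board.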

As you can see, the base abstract Action class has both an undo and a redo method, whose implementations are left to child classes. From that point on, when we want to create a new undoable action in the application, we define a new child class, give it specific arguments and implement its own undo/redo methods.

Implementation Detail: When we want to implement new Actions in Alkemion Studio, we actually make them inherit from either of two subclasses, which themselves inherit from the base Action class: ActionSingle and ActionGroup.

ActionSingle is the base class used for the majority of actions; it simply links the Action to the global auto-save.

ActionGroup groups multiple ActionSingle instances into one Action, allowing a single undo to reverse a set of related actions. For example, deleting a Node also deletes its Tokens—two separate actions that can be done individually, but from a UX standpoint, should be undone together.
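
The two subclasses could be sketched roughly as follows. Everything beyond the names `Action`, `ActionSingle` and `ActionGroup` is an assumption: the `autosaveLog` array stands in for the real auto-save, `applyChange`/`revertChange` are hypothetical hooks, and undoing children in reverse order is a design choice of this sketch rather than something the post specifies.

```typescript
// Hypothetical sketch: the autosave hook and method names are assumptions.
abstract class Action {
  abstract undo(): void;
  abstract redo(): void;
}

// Stand-in for the global auto-save: it only records which Actions ran.
const autosaveLog: string[] = [];

abstract class ActionSingle extends Action {
  label: string;

  constructor(label: string) {
    super();
    this.label = label;
  }

  // Child classes implement the actual state change.
  protected abstract applyChange(): void;
  protected abstract revertChange(): void;

  undo(): void {
    this.revertChange();
    autosaveLog.push(`undo ${this.label}`);
  }

  redo(): void {
    this.applyChange();
    autosaveLog.push(`redo ${this.label}`);
  }
}

class ActionGroup extends Action {
  children: ActionSingle[];

  constructor(children: ActionSingle[]) {
    super();
    this.children = children;
  }

  // Undo in reverse order so that the latest change is reverted first.
  undo(): void {
    for (const child of [...this.children].reverse()) child.undo();
  }

  redo(): void {
    for (const child of this.children) child.redo();
  }
}

// Toy Action: deletes an entity id from a shared set and can restore it.
const entityIds = new Set<string>(["node-1", "token-1", "token-2"]);

class DeleteEntityAction extends ActionSingle {
  id: string;

  constructor(id: string) {
    super(`delete ${id}`);
    this.id = id;
  }

  protected applyChange(): void {
    entityIds.delete(this.id);
  }

  protected revertChange(): void {
    entityIds.add(this.id);
  }
}
```

Wrapping one `DeleteEntityAction` for the Node and one per Token in a single `ActionGroup` lets one undo restore all of them together.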

This Action architecture is extremely flexible: instead of storing global application states, we only store very localized and specific data, and we can easily handle side effects and communication with other parts of the application when those Actions come into play. This encapsulation enables fine-grained undo/redo control, clear separation of concerns, and easier testing.

Now that we’ve got our main building blocks, let’s integrate them into the system!

Action Instantiation and Storage

Whenever the user performs an Action in the app that supports undo/redo, an instance of that Action is created. But we can’t just have Actions floating around in random variables scattered throughout the codebase; we need a central hub to manage them.

For this, we are using a dedicated singleton responsible for global management of the undo/redo system across the application. This singleton—referred to as ActionStore in the Alkemion Studio codebase due to its role as a state management store—handles global operations such as context management, Action registration and storage, undo/redo triggers, cleanup tasks, and more.

The ActionStore organizes Actions into Action Volumes (a term related to the notion of Action Containers, which we’ll cover below). An Action Volume is an object keyed by Action class names, with each key holding an array of instances of that class. Instead of a single, unwieldy list, this structure allows efficient lookups and manipulation. Two Action Volumes are maintained at all times: one for done Actions and one for undone Actions.

Here’s a graph:

[Figure: graph showing the Action Volume structure]
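
In code, the Action Volume structure described above might look like the sketch below. The `StoredAction` shape and the helpers `addToVolume` and `getActions` are hypothetical names introduced for illustration.

```typescript
// Hypothetical sketch: type and helper names are assumptions.
interface StoredAction {
  type: string; // Action class name, e.g. "MOVE_TOKEN"
  globalIndex: number; // position in the global chronology
}

// An Action Volume maps each Action class name to its instances.
type ActionVolume = Record<string, StoredAction[]>;

const doneActions: ActionVolume = {};
const undoneActions: ActionVolume = {};

function addToVolume(volume: ActionVolume, action: StoredAction): void {
  const bucket = volume[action.type] ?? [];
  bucket.push(action);
  volume[action.type] = bucket;
}

// Lookups go straight to one class name instead of scanning one long list.
function getActions(volume: ActionVolume, type: string): readonly StoredAction[] {
  return volume[type] ?? [];
}

addToVolume(doneActions, { type: "MOVE_TOKEN", globalIndex: 0 });
addToVolume(doneActions, { type: "SET_FEATURED_IMAGE", globalIndex: 1 });
addToVolume(doneActions, { type: "MOVE_TOKEN", globalIndex: 2 });
```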

This is how we store Actions that we’ve instantiated and that we’ll look up when we need to undo or redo one. How do we know what to look up, and how? That’s where context comes in.

Handling Context

Earlier, we discussed the philosophy behind the undo/redo system, why having a single Action stack wouldn’t cut it for this situation, and the necessity for flexibility and separation of concerns.

The solution: a global Action Context that determines which actions are currently “valid” and authorized to be undone or redone.

The implementation itself is pretty basic and very application-dependent: it’s a simple getter that returns a string literal based on certain application-wide conditions. Sure, it doesn’t look very pretty, but it gets the job done:
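
A getter of this kind could be sketched as follows. Only the "CONTAINER" context and the Board, Editor and Node Table contexts come from the post; the `AppState` flags and the "host" check are assumptions, since the real conditions are application-specific.

```typescript
// Hypothetical sketch: the state flags are assumptions; real conditions
// are application-specific.
type ActionContext = "BOARD" | "EDITOR" | "NODE_TABLE" | "CONTAINER";

interface AppState {
  activeContainerId: string; // "host" when no isolated container is loaded
  isEditorOpen: boolean;
  isNodeTableOpen: boolean;
}

class ActionStore {
  private app: AppState;

  constructor(app: AppState) {
    this.app = app;
  }

  // Derives the current Action Context from application-wide conditions.
  get context(): ActionContext {
    if (this.app.activeContainerId !== "host") return "CONTAINER";
    if (this.app.isEditorOpen) return "EDITOR";
    if (this.app.isNodeTableOpen) return "NODE_TABLE";
    return "BOARD";
  }
}
```

Because the value is derived on every access, the context can never drift out of sync with the UI state it is computed from.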

You might also notice in the code above a specific context called “CONTAINER”. This context is a bit different from the others, and we’ll look at what it does and allows us to do a bit further in the post.

Now that we have an Action Context defined at all times, we can use it to determine what actions can be undone/redone and create an actual stack to look up whenever we hit the undo/redo button!

First, to determine which actions can be undone, the simplest solution is to just use a configuration file.
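
Such a configuration could be as simple as a map from context to allowed Action names. In this sketch, `ACTION_CONTEXT_CONFIG`, `isActionAllowed` and every Action name except "MOVE_TOKEN" are hypothetical; the entries mirror the featured image and Token examples from earlier.

```typescript
// Hypothetical sketch: config and helper names are assumptions.
type ActionContext = "BOARD" | "EDITOR" | "NODE_TABLE";

// Each context lists the Action names it is allowed to undo/redo.
const ACTION_CONTEXT_CONFIG: Record<ActionContext, readonly string[]> = {
  BOARD: ["SET_FEATURED_IMAGE", "MOVE_TOKEN", "EDIT_TOKEN"],
  EDITOR: ["SET_FEATURED_IMAGE", "EDIT_NODE_TEXT"],
  NODE_TABLE: ["SET_FEATURED_IMAGE"],
};

function isActionAllowed(actionName: string, context: ActionContext): boolean {
  return ACTION_CONTEXT_CONFIG[context].includes(actionName);
}
```

With this layout, a featured image change is undoable from all three contexts, while a Token edit is only undoable from the Board.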

With this configuration file, we can easily determine which actions are undoable or redoable based on the current context. As a result, we can maintain an undo stack and a redo stack, each containing actions fetched from our Action Volumes and sorted by their globalIndex, assigned at the time of instantiation (more on that in a bit—this property pulls a lot of weight).
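
The two stack getters could be sketched like this. `undoableActions` and `globalIndex` are named in the post; `redoableActions`, `collect`, `ALLOWED_ACTIONS` and the plain `StoredAction` shape are assumptions, as is the choice to treat the last array element as the top of each stack.

```typescript
// Hypothetical sketch: names other than undoableActions and globalIndex
// are assumptions.
type ActionContext = "BOARD" | "EDITOR";

interface StoredAction {
  type: string;
  globalIndex: number;
}

type ActionVolume = Record<string, StoredAction[]>;

const ALLOWED_ACTIONS: Record<ActionContext, readonly string[]> = {
  BOARD: ["SET_FEATURED_IMAGE", "MOVE_TOKEN"],
  EDITOR: ["SET_FEATURED_IMAGE"],
};

class ActionStore {
  context: ActionContext = "BOARD";
  doneActions: ActionVolume = {};
  undoneActions: ActionVolume = {};

  // Collects the Actions of a volume that the current context allows.
  private collect(volume: ActionVolume): StoredAction[] {
    const allowed = ALLOWED_ACTIONS[this.context];
    const result: StoredAction[] = [];
    for (const [type, actions] of Object.entries(volume)) {
      if (allowed.includes(type)) result.push(...actions);
    }
    return result;
  }

  // Top of the stack (last element) is the most recently done Action.
  get undoableActions(): StoredAction[] {
    return this.collect(this.doneActions).sort((a, b) => a.globalIndex - b.globalIndex);
  }

  // Top of the stack (last element) is the most recently undone Action,
  // assumed here to be the one with the lowest globalIndex.
  get redoableActions(): StoredAction[] {
    return this.collect(this.undoneActions).sort((a, b) => b.globalIndex - a.globalIndex);
  }
}
```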

What the above getters do is parse the Action Volumes and return two stacks, one for undo and one for redo, that contain every Action that can be undone or redone given the current context. The Actions in these two stacks are also sorted by their globalIndex to make sure they are undone or redone in the correct order. This is very important: messing up the order can have drastic consequences. With these two stacks, we can now very quickly and easily fetch the next relevant action when the user presses the undo or redo button.

Why stacks? You could very well use arrays, lists, vectors or whatever similar structure your language provides, so long as you add safety measures to prevent mutations that would alter your structure’s order. A stack does just that: it is linear and only allows interactions with its topmost element (last in, first out), which just so happens to be the natural sequence that emerges from an undo/redo system.

Also, the getters in the snippet above might not necessarily be ideal as is, given that they would re-compute everything every time they’re accessed. Depending on your needs and tech stack, you might want to memoize the result or use some kind of cache to avoid re-computing everything whenever the getters are used.

Alright, now that we have easily accessible Actions, it’s time to use them!

Triggering Undo/Redo

Let’s go through the process of actually undoing an Action.

Let’s say the user has moved a Token on the Board. When they do so, the "MOVE_TOKEN" Action is instantiated and stored in the “done” Action Volume in the ActionStore singleton for later use.

Now, the user realizes that they actually didn’t want to move that Token, so they hit CTRL+Z (or a dedicated button). What happens then?

Well, the ActionStore has two public methods called undoLastAction and redoNextAction that oversee the global process of undoing/redoing when the user triggers those operations.

So when the user hits “undo”, the undoLastAction method is called, and it first checks the current context, and makes sure that there isn’t anything else globally in the application preventing an undo operation.

When the operation has been cleared, the method then peeks at the last authorized action in the undoableActions stack and calls its undo method.

Once the lower-level undo method has returned the result of its process, the undoLastAction method checks that everything went okay and, if so, moves the action from the “done” Action Volume to the “undone” Action Volume. Later in the post, we’ll look at the challenges this involves, because it requires keeping track of order and indices to fit the multi-context architecture.

And just like that, we’ve undone an action! The process for “redo” works the same, simply in the opposite direction.

Containers and Isolation

There is an additional layer of abstraction that we have yet to talk about that actually encapsulates everything that we’ve looked at, and that is containers.

Containers—inspired by Docker containers—are self-contained action environments within the app.

This level of abstraction is extremely useful and allows us to have separate, isolated environments when the user enters a certain context, such as a modal. Whenever the user enters one of these “enclosed” UI states, a new container is created, and all Actions within that container are isolated from the global undo/redo stacks and stored locally in the container’s Action Volumes instead.

In fact, even the global undo/redo state is managed inside a container, albeit a special one with an id of “host” which can never be downed or destroyed. Doing this gives us the ability to have as many Action environments as we want, all while keeping the exact same architecture and inner-workings as the main environment.

Only one container can be loaded at a time—this is considered the active container. However, multiple containers can be cached in the background and are identified by their IDs.

Containers are also the Supreme Court of action authorization as they have the last word on which actions are authorized inside them. Containers can either be passed a list of allowed actions, a specific context (as an alias for a list of actions already defined in the config), or the string “context” to specify that the containers should allow the actions of the current global context.

Once the user exits a container, Actions that were instantiated inside can either be discarded (e.g. the user clicks “cancel” in the modal) or be merged with those of the host with proper indexing.

This containerized architecture lets us treat Actions like transactions—atomic, rollback-able, and local until committed.

So, in the ActionStore, instead of having global Action Volumes like in the previous code snippet, we instead use containers like this:
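
A rough sketch of that arrangement, under stated assumptions: `ActionContainer`, `ContainerPolicy`, `allowedActions` and `CONTEXT_ACTIONS` are hypothetical names, and the three policy forms follow the options listed above (an explicit list, a context used as an alias, or the string "context").

```typescript
// Hypothetical sketch: class shapes and policy names are assumptions.
type ActionContext = "BOARD" | "EDITOR";

interface StoredAction {
  type: string;
  globalIndex: number;
}

type ActionVolume = Record<string, StoredAction[]>;

const CONTEXT_ACTIONS: Record<ActionContext, readonly string[]> = {
  BOARD: ["SET_FEATURED_IMAGE", "MOVE_TOKEN"],
  EDITOR: ["SET_FEATURED_IMAGE"],
};

// An explicit list of Action names, a context used as an alias, or
// "context" to follow whatever the current global context allows.
type ContainerPolicy = readonly string[] | ActionContext | "context";

class ActionContainer {
  id: string;
  policy: ContainerPolicy;
  doneActions: ActionVolume = {};
  undoneActions: ActionVolume = {};

  constructor(id: string, policy: ContainerPolicy) {
    this.id = id;
    this.policy = policy;
  }

  // The container has the last word on which Actions are authorized.
  allowedActions(globalContext: ActionContext): readonly string[] {
    const policy = this.policy;
    if (typeof policy !== "string") return policy;
    if (policy === "context") return CONTEXT_ACTIONS[globalContext];
    return CONTEXT_ACTIONS[policy];
  }
}

class ActionStore {
  readonly host = new ActionContainer("host", "context");
  activeContainer: ActionContainer = this.host;

  // Volumes are no longer global: they are read from the active container.
  get doneActions(): ActionVolume {
    return this.activeContainer.doneActions;
  }

  get undoneActions(): ActionVolume {
    return this.activeContainer.undoneActions;
  }
}
```

Since the store always reads from the active container, the rest of the undo/redo logic does not need to know whether it is operating on the host or on a modal's container.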

Multi-Stack Architecture: Ordering and Chronology

Now that we have a broader idea of how the system is structured, we can take a look at some of the pitfalls and hurdles that come with it, the biggest one being chronology, because order between actions matters.

In a linear architecture, chronology is easy to manage. Undo/redo stacks maintain action order: undoing an action simply moves it from the top of the undo stack to the top of the redo stack. But this system is different.

Here, actions move between volumes, which have no inherent order. It might seem like assigning an index to each action at creation would solve this—but it doesn’t. In this multi-context setup, certain conditions can force index changes, and there lies the real pitfall.

Let’s look at an example:

Say the user performs action A in context 1, the only context where it’s allowed. Then they undo A.

Later, in context 2, they perform action B, which can be undone anywhere.

Now, back in context 1, they decide to redo A.

Here’s the catch, though: if we were to use the original indices to sort the undo stack, A would come before B, but that’s not the behavior the user would expect. When you redo an action in an application, you expect that action to go back to the top of the undo stack, not to the bottom or somewhere in the middle. We need to compensate for that and reassign indices accordingly.

Handling indices involves two main steps: instantiation and undo/redo.

Instantiation:
When creating a new action, check for undone actions outside the current context. If found, increment their indices by 1 to make room for the new action before them. Also increment any "done" actions with indices greater than the insert point. The new action is placed between these adjusted actions and the rest. If no such undone actions exist, assign the next highest index.
Undo:
When undoing, check for external undone actions. If any have indices greater than the target undo index, increment them by 1.
Redo:
When redoing, check for external "done" actions. If their indices are greater than the target redo index, decrement them by 1.

This may sound a bit cryptic, but the goal is simple: to keep the action order intuitive for the user.

When a new action is created, it should become the next one the user can undo.
When an action is undone, it should become the next one the user can redo.
When an action is redone, it should once again become the next one the user can undo.

Here’s the implementation of globalIndex assignment when instantiating a new Action (again, this can very well be optimized with caching depending on your needs and implementation):
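
One way to read the instantiation step is sketched below. This is an interpretation of the description rather than the original code: `assignGlobalIndex` and its parameters are hypothetical, flat arrays stand in for the Action Volumes, and undone Actions that are valid in the current context are not handled.

```typescript
// Hypothetical sketch of the instantiation step described above.
// Assumption: undone Actions that are valid in the current context are
// ignored here; only "external" undone Actions are shifted.
interface StoredAction {
  type: string;
  globalIndex: number;
}

function assignGlobalIndex(
  done: StoredAction[],
  undone: StoredAction[],
  allowedInContext: readonly string[],
): number {
  const externalUndone = undone.filter((a) => !allowedInContext.includes(a.type));

  if (externalUndone.length === 0) {
    // No external undone Actions: take the next highest index.
    const all = [...done, ...undone];
    return all.reduce((max, a) => Math.max(max, a.globalIndex), -1) + 1;
  }

  // Insert the new Action just before the earliest external undone Action.
  const insertAt = Math.min(...externalUndone.map((a) => a.globalIndex));
  for (const action of externalUndone) action.globalIndex += 1;
  for (const action of done) {
    if (action.globalIndex > insertAt) action.globalIndex += 1;
  }
  return insertAt;
}
```

Applied to the earlier example, creating B while A sits undone in another context gives B the index 0 and shifts A to 1, so redoing A later puts it back on top of the undo stack.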

This is all very important to keep the user experience enjoyable, even when dealing with multiple contexts handling the same data.

Weaknesses and Future Improvements

It’s always important to look at potential weaknesses in a system and what can be improved. In our case, there is one evident pitfall, which is action order and chronology. While we’ve already addressed some issues related to action ordering—particularly when switching contexts with cached actions—there are still edge cases we need to consider.

Specifically, some actions depend on the side effects of others, and this dependency isn't always preserved when users move between contexts.

Imagine Action B relies on the side effects of Action A to function correctly.

In context 1, Action A is undone.
In context 2, Action B is still present.
If the user then undoes Action B in context 2, we've got a problem—because B is now being undone while A, its prerequisite, has already been removed.

We haven’t had to face such edge cases yet in Alkemion Studio, as we’ve relied on strict guidelines that ensure actions in the same context are always properly ordered and dependent actions follow their prerequisites.

However, there’s no guarantee we won’t face that situation in the future, so how do we fix that?

Well, the plan is to use a pre-configured dependency graph, where every action will be able to look up whether its dependencies are fulfilled before instantiation/execution. Such a graph should theoretically solve these issues and allow us to stray from our guidelines if the situation requires it.
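
Since this graph is still a plan, the following is only one possible shape for it. `ACTION_DEPENDENCIES`, `dependenciesFulfilled` and the Action names are all hypothetical, and the check works on Action names rather than on individual instances.

```typescript
// Hypothetical sketch of a pre-configured dependency graph; the Action
// names and the shape of the check are assumptions.
const ACTION_DEPENDENCIES: Record<string, readonly string[]> = {
  CREATE_TOKEN: [],
  EDIT_TOKEN: ["CREATE_TOKEN"],
};

// Returns true when every prerequisite of `actionName` is currently "done".
function dependenciesFulfilled(
  actionName: string,
  doneActionNames: readonly string[],
): boolean {
  const dependencies = ACTION_DEPENDENCIES[actionName] ?? [];
  return dependencies.every((dependency) => doneActionNames.includes(dependency));
}
```

A check like this could run before an Action is instantiated, undone or redone, and block the operation when a prerequisite has been undone elsewhere.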

Snippets

Here are some additional full snippets of code now that we have a broader understanding of how the system works.

The full createAction method to register a new action that the user has done:
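
A condensed sketch of what such a method might do. Only the name `createAction` comes from the post; the constructor arguments, the `autosave` callback and the simplified indexing (next highest index, without the context-aware adjustment) are assumptions.

```typescript
// Hypothetical sketch: signatures and the autosave callback are assumptions.
abstract class Action {
  readonly type: string;
  globalIndex = -1;

  constructor(type: string) {
    this.type = type;
  }

  abstract undo(): boolean;
  abstract redo(): boolean;
}

type ActionVolume = Record<string, Action[]>;

class ActionStore {
  doneActions: ActionVolume = {};
  undoneActions: ActionVolume = {};
  private allowedTypes: readonly string[];
  private autosave: (action: Action) => void;

  constructor(allowedTypes: readonly string[], autosave: (action: Action) => void) {
    this.allowedTypes = allowedTypes;
    this.autosave = autosave;
  }

  // Registers an Action the user has just performed.
  createAction(action: Action): boolean {
    // Refuse Actions that the current context does not authorize.
    if (!this.allowedTypes.includes(action.type)) return false;

    // Simplified indexing: the context-aware adjustment is left out here.
    action.globalIndex = this.highestIndex() + 1;

    const bucket = this.doneActions[action.type] ?? [];
    bucket.push(action);
    this.doneActions[action.type] = bucket;

    // Every undoable Action doubles as an autosave event.
    this.autosave(action);
    return true;
  }

  private highestIndex(): number {
    let highest = -1;
    for (const volume of [this.doneActions, this.undoneActions]) {
      for (const actions of Object.values(volume)) {
        for (const action of actions) highest = Math.max(highest, action.globalIndex);
      }
    }
    return highest;
  }
}
```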

The undoLastAction and redoNextAction methods:
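
The flow described in the Triggering Undo/Redo section could be sketched as follows. `undoLastAction`, `redoNextAction` and the done/undone volumes are named in the post; `isLocked`, `allowedTypes`, `sorted` and `move` are hypothetical stand-ins, and the multi-context index adjustments are left out.

```typescript
// Hypothetical sketch: only undoLastAction, redoNextAction and the
// done/undone volumes come from the post; the rest is assumed.
abstract class Action {
  readonly type: string;
  globalIndex: number;

  constructor(type: string, globalIndex: number) {
    this.type = type;
    this.globalIndex = globalIndex;
  }

  // Both methods return true when the operation succeeded.
  abstract undo(): boolean;
  abstract redo(): boolean;
}

type ActionVolume = Record<string, Action[]>;

class ActionStore {
  doneActions: ActionVolume = {};
  undoneActions: ActionVolume = {};
  isLocked = false; // stand-in for any global condition blocking undo/redo
  private allowedTypes: readonly string[]; // stand-in for the context check

  constructor(allowedTypes: readonly string[]) {
    this.allowedTypes = allowedTypes;
  }

  undoLastAction(): boolean {
    if (this.isLocked) return false;
    // Highest globalIndex among authorized done Actions = last one done.
    const action = this.sorted(this.doneActions).pop();
    if (action === undefined || !action.undo()) return false;
    this.move(action, this.doneActions, this.undoneActions);
    return true;
  }

  redoNextAction(): boolean {
    if (this.isLocked) return false;
    // Lowest globalIndex among authorized undone Actions = last one undone.
    const action = this.sorted(this.undoneActions).shift();
    if (action === undefined || !action.redo()) return false;
    this.move(action, this.undoneActions, this.doneActions);
    return true;
  }

  // Authorized Actions of a volume, sorted by ascending globalIndex.
  private sorted(volume: ActionVolume): Action[] {
    const authorized: Action[] = [];
    for (const [type, actions] of Object.entries(volume)) {
      if (this.allowedTypes.includes(type)) authorized.push(...actions);
    }
    return authorized.sort((a, b) => a.globalIndex - b.globalIndex);
  }

  // Index adjustments for multi-context ordering are omitted for brevity.
  private move(action: Action, from: ActionVolume, to: ActionVolume): void {
    from[action.type] = (from[action.type] ?? []).filter((a) => a !== action);
    const target = to[action.type] ?? [];
    target.push(action);
    to[action.type] = target;
  }
}
```

The Action only changes volumes after its own undo or redo method reports success, so a failed operation leaves both volumes untouched.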

Some Docker-inspired methods to make me feel like a hacker:
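
A loose sketch of what such lifecycle methods might look like. The method names `up`, `down` and `destroy`, the `ContainerManager` class and the re-indexing on merge are all assumptions, and each container holds a flat list of done Actions instead of Action Volumes to keep the example short.

```typescript
// Hypothetical sketch: method names and merge behavior are assumptions.
interface StoredAction {
  type: string;
  globalIndex: number;
}

class ActionContainer {
  id: string;
  done: StoredAction[] = [];

  constructor(id: string) {
    this.id = id;
  }
}

const HOST_ID = "host";

class ContainerManager {
  private containers = new Map<string, ActionContainer>();
  private activeId = HOST_ID;

  constructor() {
    this.containers.set(HOST_ID, new ActionContainer(HOST_ID));
  }

  get active(): ActionContainer {
    return this.mustGet(this.activeId);
  }

  get host(): ActionContainer {
    return this.mustGet(HOST_ID);
  }

  // Creates the container if needed and makes it the active one.
  up(id: string): ActionContainer {
    if (!this.containers.has(id)) this.containers.set(id, new ActionContainer(id));
    this.activeId = id;
    return this.active;
  }

  // Unloads the active container but keeps it cached.
  down(): void {
    if (this.activeId === HOST_ID) throw new Error("The host container cannot be downed");
    this.activeId = HOST_ID;
  }

  // Removes a container, optionally merging its done Actions into the host.
  destroy(id: string, merge: boolean): void {
    if (id === HOST_ID) throw new Error("The host container cannot be destroyed");
    const container = this.mustGet(id);
    if (merge) {
      // Re-index merged Actions so they follow the host's latest Action.
      let next = this.host.done.reduce((max, a) => Math.max(max, a.globalIndex), -1) + 1;
      const ordered = [...container.done].sort((a, b) => a.globalIndex - b.globalIndex);
      for (const action of ordered) {
        this.host.done.push({ ...action, globalIndex: next });
        next += 1;
      }
    }
    if (this.activeId === id) this.activeId = HOST_ID;
    this.containers.delete(id);
  }

  private mustGet(id: string): ActionContainer {
    const container = this.containers.get(id);
    if (container === undefined) throw new Error(`Unknown container: ${id}`);
    return container;
  }
}
```

Calling `destroy` with `merge` set to false corresponds to the "cancel" case, where everything done inside the container is simply dropped.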

Conclusion

Designing and implementing this undo/redo system has been one of my favorite experiences working on Alkemion Studio. It came with its fair share of unknowns and challenges, but I learned so much and I had a lot of fun along the way.

Seeing it in action, helping real users, and becoming an integral part of a live product is an incredibly gratifying experience—one that’s hard to put into words.

I hope you found this post enjoyable and maybe even helpful if you’re working on something similar.

Thank you so much for reading!

mlacast
