Originally posted in Hackernoon
Back in 2013 I set out to build a minimalist set of tools for developing web applications. Perhaps the best thing that came out of that process was gotoB, a client-side, pure JS frontend framework written in 2k lines of code.
I was motivated to write this article after going down a rabbit hole of reading interesting articles by authors of very successful frontend frameworks:
- First it was Rich Harris writing about introducing runes to Svelte, which are based on signals.
- That led me to two articles from Ryan Carniato, one explaining the evolution of signals and another one that describes signals, particularly in the context of SolidJS.
- Ryan’s article led me to two articles by Michel Weststrate, where he describes the principles behind MobX and how they are implemented.
What got me excited about these articles is that they talk about the evolution of the ideas behind what they build; the implementation is just a way to make them real, and the only features discussed are those that are so essential as to represent ideas themselves.
By far, the most interesting aspect of what came out of gotoB is the ideas that developed as a result of facing the challenges of building it. That’s what I want to cover here.
Because I built the framework from scratch, and I was trying to achieve both minimalism and internal consistency, I solved four problems in a way that I think is different to the way in which most frameworks solve the same problems.
These four ideas are what I want to share with you now. I do this not to convince you to use my tools (though you’re welcome to!), but rather, hoping that you might be interested in the ideas themselves.
Idea 1: object literals to solve templating
Any web application needs to create markup (HTML) on the fly, based on the state of the application.
This is best explained with an example: in an ultra-simple todo list application, the state could be a list of todos: ['Item 1', 'Item 2']. Because you’re writing an application (as opposed to a static page), the list of todos must be able to change.
Because state changes, the HTML that makes up the UI of your application has to change with the state. For example, to display your todos, you could use the following HTML:
<ul>
  <li>Item 1</li>
  <li>Item 2</li>
</ul>
If the state changes and a third item is added, your state will now look like this: ['Item 1', 'Item 2', 'Item 3']; then, your HTML should look like this:
<ul>
  <li>Item 1</li>
  <li>Item 2</li>
  <li>Item 3</li>
</ul>
The problem of generating HTML based on the state of the application is usually solved with a templating language, which inserts programming language constructs (variables, conditionals and loops) into pseudo-HTML that gets expanded into actual HTML.
For example, here are two ways in which this can be done in different templating tools:
// Assume that `todos` is defined and equal to ['Item 1', 'Item 2', 'Item 3']
// Mustache
<ul>
  {{#todos}}
  <li>{{.}}</li>
  {{/todos}}
</ul>
// JSX
<ul>
  {todos.map((item, index) => (
    <li key={index}>{item}</li>
  ))}
</ul>
I was never fond of these syntaxes that brought logic to HTML. Realizing that templating required programming, and wanting to avoid having a separate syntax for it, I decided to instead bring HTML to JS, using object literals. So, I could simply model my HTML as object literals:
['ul', [
  ['li', 'Item 1'],
  ['li', 'Item 2'],
  ['li', 'Item 3'],
]]
If I wanted to then use iteration to generate the list, I could simply write:
['ul', items.map ((item) => ['li', item])]
And then use a function that would convert this object literal into HTML. In this way, all the templating can be done in JS, without any templating language or transpilation. I use the name liths to describe these arrays that represent HTML.
To my knowledge, no other JS framework approaches templating quite this way. I did some digging and found JSONML, which uses almost the same structure to represent HTML in JSON objects (which are almost the same as JS object literals), but found no framework built around it.
Mithril and Hyperapp get quite close to the approach I used, but they still use function calls for each element.
// Mithril
m("ul", [
  m("li", "Item 1"),
  m("li", "Item 2")
])
// hyperapp
h("ul", [
  h("li", "Item 1"),
  h("li", "Item 2")
])
The approach of using object literals worked well for HTML, so I extended it to CSS and now generate all my CSS through object literals as well.
If for some reason you are in an environment where you cannot transpile JSX or use a templating language, and you don’t want to concatenate strings, you can use this approach instead.
I’m not sure whether the Mithril/Hyperapp approach is better than mine; I do find that when writing long object literals representing liths, I sometimes forget a comma somewhere and that can sometimes be tricky to find. Other than that, no complaints really. And I love the fact that the representation for HTML is both 1) data and 2) in JS. This representation can actually function as a virtual DOM, as we’ll see when we get to Idea #4.
Bonus detail: if you want to generate HTML from object literals, you only have to solve the following two problems:
- Entityify strings (i.e., escape special characters).
- Know which tags to close and which ones not to.
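To make those two problems concrete, here is a minimal sketch of a function that turns a lith into an HTML string. The names `toHTML`, `escapeHTML` and `VOID_TAGS` are hypothetical (this is not gotoB's actual code), attributes are left out to keep it short, and it assumes that a list of liths is an array whose first element is itself an array.

```javascript
// Hypothetical helpers for illustration; not gotoB's actual implementation.
// HTML void elements: tags that must not receive a closing tag.
const VOID_TAGS = new Set([
  'area', 'base', 'br', 'col', 'embed', 'hr', 'img',
  'input', 'link', 'meta', 'source', 'track', 'wbr'
]);

// Problem 1: escape the characters that have special meaning in HTML.
function escapeHTML(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// A lith is a literal (string or number), a [tag, contents] pair,
// or an array of liths.
function toHTML(lith) {
  if (!Array.isArray(lith)) return escapeHTML(lith);
  // Assumption: an array that does not start with a tag name is a list of liths.
  if (typeof lith[0] !== 'string') return lith.map(toHTML).join('');
  const [tag, contents] = lith;
  // Problem 2: know which tags to close and which ones not to.
  if (VOID_TAGS.has(tag)) return '<' + tag + '>';
  const inner = contents === undefined ? '' : toHTML(contents);
  return '<' + tag + '>' + inner + '</' + tag + '>';
}

console.log(toHTML(['ul', [['li', 'Item 1'], ['li', 'Item 2']]]));
```

Because the input is plain data, the same lith can be built with `map`, `filter` or any other JS construct before it is handed to `toHTML`.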
Idea 2: a global store addressable through paths to hold all application state
I was never fond of components. Structuring an application around components requires placing the data belonging to the component inside the component itself. This makes it hard or even impossible to share that data with other parts of the application.
In every project I worked on, I found that I always needed some parts of the application state to be shared between components that are quite far from each other. A typical example is the username: you might need this in the account section, and also in the header. So where does the username belong?
Therefore, I decided early on to create a simple data object ({}) and stuff all my state there. I called it the store. The store holds the state for all parts of the app, and can therefore be used by any component.
This approach was somewhat heretical back in 2013-2015, but has since gained prevalence and even dominance.
What I think is still quite novel is that I use paths to access any value inside the store. For example, if the store is:
{
  user: {
    firstName: 'foo',
    lastName: 'bar'
  }
}
I can use a path to access (say) the lastName, by writing B.get ('user', 'lastName'). As you can see, ['user', 'lastName'] is the path to 'bar'. B.get is a function that accesses the store and returns a specific part of it, indicated by the path you pass to the function.
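The core of a path-based getter is small. The following is only a sketch of the idea, with a hypothetical `get` function that takes the path as an array; it is not gotoB's B.get, which has its own signature and behavior.

```javascript
// Hypothetical, simplified stand-in for a path-based getter.
const store = {
  user: {
    firstName: 'foo',
    lastName: 'bar'
  }
};

// Walk the store one path element at a time.
// Returns undefined as soon as a step of the path is missing.
function get(path) {
  let value = store;
  for (const key of path) {
    if (value === null || typeof value !== 'object') return undefined;
    value = value[key];
  }
  return value;
}

console.log(get(['user', 'lastName']));
console.log(get(['user', 'address', 'city']));
```

Returning `undefined` for a missing path, rather than throwing, is a design choice made here for the sketch: it lets callers ask for deeply nested values without checking every intermediate level first.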
In contrast to the above, the standard way to access reactive properties is to reference them through a JS variable. For example:
// Svelte
let { firstName, lastName } = $props();
firstName = 'foo';
lastName = 'bar';
// Knockout
const firstName = ko.observable('foo');
const lastName = ko.observable('bar');
// mobx
class UserStore {
  firstName = 'foo';
  lastName = 'bar';
  constructor() {
    makeAutoObservable(this);
  }
}
const userStore = new UserStore();
// SolidJS
const [firstName, setFirstName] = createSignal('foo');
const [lastName, setLastName] = createSignal('bar');
This, however, requires you to keep a reference to firstName and lastName (or userStore) anywhere that you need that value. The approach that I use only requires you to have access to the store (which is global and available everywhere) and allows you to have fine-grained access to it without defining JS variables for them.
Immutable.js and the Firebase Realtime Database do something much closer to what I did, albeit they are working on separate objects. But you could potentially use them to store everything in a single place that could be granularly addressable.
// Immutable.js
let store = Map({
  user: Map({
    firstName: 'foo',
    lastName: 'bar'
  })
});
const firstName = store.getIn(['user', 'firstName']); // 'foo'
// Firebase
const db = firebase.database();
db.ref('user').set({
  firstName: 'foo',
  lastName: 'bar'
});
db.ref('user/firstName').once('value').then(snapshot => {
  const firstName = snapshot.val(); // 'foo'
});
Having my data in a globally accessible store that can be granularly accessed through paths is a pattern that I’ve found extremely useful. Whenever I write const [count, setCount] = ... or something like that, it feels redundant. I know I could just do B.get ('count') whenever I need to access that, without having to declare and pass around count or setCount.
Idea 3: every single change is expressed through events
If Idea #2 (a global store accessible through paths) liberated data from components, Idea #3 is how I liberated code from components. To me, this is the most interesting idea in this article. Here it goes!
Our state is data that, by definition, is mutable (for those using immutability, the argument still stands: you still want the latest version of the state to change, even if you keep snapshots of older versions of the state). How do we change the state?
I decided to go with events. I already had paths to the store, so an event could be simply the combination of a verb (like set, add or rem) and a path. So, if I wanted to update user.firstName, I could write something like this:
B.call ('set', ['user', 'firstName'], 'Foo')
This is definitely more verbose than writing:
user.firstName = 'Foo';
But it allowed me to write code that would respond to a change in user.firstName. And this is the crucial idea: in a UI, there are different parts that are dependent on different parts of the state. For example, you could have these dependencies:
- Header: depends on user and currentView
- Account section: depends on user
- Todo list: depends on items
The big question I faced was: how do I update the header and the account section when user changes, but not when items changes? And how do I manage these dependencies without having to make specific calls like updateHeader or updateAccountSection? These types of specific calls represent “jQuery programming” at its most unmaintainable.
What looked like a better idea to me was to do something like this:
B.respond ('set', [['user'], ['currentView']], function (user, currentView) {
  // Update the header
});
B.respond ('set', ['user'], function (user) {
  // Update the account section
});
B.respond ('set', ['items'], function (items) {
  // Update the todo list
});
So, if a set event is called for user, the event system will notify all the views that are interested in that change (header & account section), while leaving the other views (todo list) undisturbed. B.respond is the function I use to register responders (which are usually called “event listeners” or “reactions”). Note that the responders are global and not bound to any components; they are, however, listening only to set events on certain paths.
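A toy version of this mechanism fits in a few lines. The sketch below uses hypothetical `respond` and `call` functions (not gotoB's B.respond and B.call), takes a single path per responder, and assumes one particular matching rule: a responder matches an event when the verbs are equal and one path is a prefix of the other.

```javascript
// Hypothetical mini event system; the names and the matching rule are assumptions.
const responders = [];

// Register a responder for a verb and a path.
function respond(verb, path, fn) {
  responders.push({verb, path, fn});
}

// Assumed rule: two paths match when one is a prefix of the other,
// so a change to ['user', 'firstName'] reaches a responder on ['user'].
function pathsMatch(a, b) {
  const length = Math.min(a.length, b.length);
  for (let i = 0; i < length; i++) {
    if (a[i] !== b[i]) return false;
  }
  return true;
}

// Run every matching responder, in registration order.
// Returns how many responders ran (possibly zero).
function call(verb, path, ...args) {
  const matched = responders.filter(
    (r) => r.verb === verb && pathsMatch(r.path, path)
  );
  matched.forEach((r) => r.fn(path, ...args));
  return matched.length;
}

// Usage: record which views would have been redrawn.
const updated = [];
respond('set', ['user'], () => updated.push('header'));
respond('set', ['user'], () => updated.push('account'));
respond('set', ['items'], () => updated.push('todos'));

call('set', ['user', 'firstName'], 'Foo');
console.log(updated); // the header and account responders ran; the todo list one did not
```

The caller of `call` never names the header or the account section; the dependency lives entirely in the paths that the responders registered.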
Now, how does a change event get called in the first place? This is how I did it:
B.respond ('set', '*', function () {
  // Assume that `path` is the path on which set was called
  B.call ('change', path);
});
I’m simplifying a bit, but that’s essentially how it works in gotoB.
What makes an event system more powerful than mere function calls is that an event call can execute 0, 1 or multiple pieces of code, whereas a function call always calls exactly one function. In the above example, if you call B.call ('set', ['user', 'firstName'], 'Foo');, two pieces of code are executed: that which changes the header and that which changes the account view. Note that the call to update firstName doesn’t “care” who is listening to this. It just does its thing and lets the responder pick up the changes.
Events are so powerful that, in my experience, they can replace computed values, as well as reactions. In other words, they can be used to express any change that needs to happen in an application.
A computed value can be expressed with an event responder. For example, if you want to compute a fullName and you don’t want to keep it in the store, you can do the following:
B.respond ('set', 'user', function () {
  var user = B.get ('user');
  var fullName = user.firstName + ' ' + user.lastName;
  // Do something with `fullName` here.
});
Similarly, reactions can be expressed with a responder. Consider this:
B.respond ('set', 'user', function () {
  var user = B.get ('user');
  var fullName = user.firstName + ' ' + user.lastName;
  document.getElementById ('header').innerHTML = '<h1>Hello, ' + fullName + '</h1>';
});
If you disregard for a minute the cringe-inducing concatenation of strings to generate HTML, what you see above is a responder executing a “side-effect” (in this case, updating the DOM).
(Side note: what would be a good definition of a side-effect, in the context of a web application? To me, it boils down to three things: 1) an update to the state of the application; 2) a change to the DOM; 3) sending an AJAX call).
I found that there’s really no need for a separate lifecycle that updates the DOM. In gotoB, there are some responder functions that update the DOM with the help of some helper functions. So, when user changes, any responder (or more precisely, view function, since that’s the name I give to responders that are tasked with updating a part of the DOM) that depends on it will execute, generating a side effect that ends up updating the DOM.
I made the event system predictable by making it run the responder functions in the same order, and one at a time. Asynchronous responders are still run as if they were synchronous: the responders that come “after” them will wait for them to finish.
More sophisticated patterns, where you need to update the state without updating the DOM (usually for performance purposes), can be supported by adding mute verbs, like mset, which modify the store but don’t trigger any responders. Also, if you need to do something on the DOM after a redraw happens, you can simply make sure that that responder has a low priority and runs after all other responders:
B.respond ('set', 'date', {priority: -1000}, function () {
  var datePicker = document.getElementById ('datepicker');
  // Do something with the date picker
});
The approach above, of having an event system using verbs and paths and a set of global responders that get matched (executed) by certain event calls, has another advantage: every event call can be placed in a list. You can then analyze this list when you’re debugging your application, and track changes to the state.
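As a rough illustration of that last point, the sketch below wraps an event-calling function so that every call is appended to a list. The names `withLog`, `log` and `call` are hypothetical, and the wrapped function is an empty stand-in rather than a real event system.

```javascript
// Hypothetical sketch: record every event call in a list for later inspection.
const log = [];

// Wrap an event-calling function so that each call is logged before it runs.
function withLog(callFn) {
  return function (verb, path, ...args) {
    log.push({time: Date.now(), verb, path, args});
    return callFn(verb, path, ...args);
  };
}

// Stand-in for a real event call; here it does nothing.
const call = withLog(function (verb, path, ...args) {});

call('set', ['user', 'firstName'], 'Foo');
call('rem', ['items'], 0);

// When debugging: which events touched `user`?
const userEvents = log.filter((entry) => entry.path[0] === 'user');
console.log(userEvents.map((entry) => entry.verb));
```

Since each entry carries a verb, a path and the arguments, the log reads as a history of state changes that can be filtered by path.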
In the context of a frontend, here’s what events and responders allow:
- To update parts of the store with very little code (just a bit more verbose than mere variable assignment).
- To have parts of the DOM auto-update when there is a change on the parts of the store on which that part of the DOM depends.
- To not have any part of the DOM auto-update when it is not needed.
- To be able to have computed values and reactions that are not concerned with updating the DOM, expressed as responders.
This is what they allow me (in my experience) to do without:
- Lifecycle methods or hooks.
- Observables.
- Immutability.
- Memoization.
It’s all really just event calls and responders: some responders are concerned with views, and others are concerned with other operations. All the internals of the framework are just using user space.
If you’re curious about how this works in gotoB, you can check this detailed explanation.
Idea 4: a text diff algorithm to update the DOM
Two-way data binding now sounds quite dated. But if you take a time machine back to 2013 and you tackle from first principles the problem of redrawing the DOM when the state changes, what would sound more reasonable?
1. If the HTML changes, update your state in JS. If the state in JS changes, update the HTML.
2. Every time the state in JS changes, update the HTML. If the HTML changes, update the state in JS and then re-update the HTML to match the state in JS.
Indeed, option 2, which is the unidirectional data flow from state to DOM, sounds more convoluted, as well as inefficient.
Let’s now make this very concrete: in the case of an interactive <input> or <textarea> that is focused, you need to recreate parts of the DOM with every keystroke of the user! If you’re using unidirectional data flows, every change in the input triggers a change in the state, which then redraws the <input> to make it match exactly what it should be.
This sets a very high bar for DOM updates: they should be quick and not hamper user interaction with interactive elements. This is not an easy problem to tackle.
Now, why did unidirectional data flow from state to DOM (JS to HTML) win? Because it’s easier to reason about. If the state changes, it doesn’t matter where this change came from (could be an AJAX callback bringing data from the server, could be a user interaction, could be a timer). The state always changes (or rather, is mutated) in the same way. And the changes in the state always flow into the DOM.
So, how does one go about performing DOM updates in an efficient way that doesn’t hamper user interaction? This usually boils down to performing the minimum amount of DOM updates that will get the job done. This is usually called “diffing”, because you’re making a list of differences that you need to take an old structure (the existing DOM) and convert it to a new one (the new DOM after the state is updated).
When I started working on this problem around 2016, I cheated by taking a look at what React was doing. They gave me the crucial insight that there was no generalized, linear-performance algorithm for diffing two trees (the DOM is a tree). But, stubborn if anything, I still wanted a general purpose algorithm to perform the diffing. What I particularly disliked about React’s approach (or almost any framework’s, for that matter) is the insistence that you need to use keys for contiguous elements:
function MyList() {
  const items = ['Item 1', 'Item 2', 'Item 3'];
  return (
    <ul>
      {items.map((item, index) => (
        <li key={index}>{item}</li>
      ))}
    </ul>
  );
}
To me, the key directive was superfluous, because it didn’t have anything to do with the DOM; it was just a hint to the framework.
Then I thought about trying a textual diff algorithm on flattened versions of a tree. What if I flattened both trees (the old piece of DOM that I had and the new piece of DOM that I wanted to replace it with) and computed a diff on them (a minimum set of edits), so that I could go from the old one to the new one in the smallest number of steps?
So I took Myers’ algorithm, the one that you use every time you run git diff, and put it to work on my flattened trees. Let’s illustrate with an example:
var oldList = ['ul', [
  ['li', 'Item 1'],
  ['li', 'Item 2'],
]];
var newList = ['ul', [
  ['li', 'Item 1'],
  ['li', 'Item 2'],
  ['li', 'Item 3'],
]];
As you can see, I’m not working with the DOM, but with the object literal representation we saw in Idea 1. Now, you’ll notice that we need to add a new <li> to the end of the list.
The flattened trees look like this:
var oldFlattened = ['O ul', 'O li', 'L Item 1', 'C li', 'O li', 'L Item 2', 'C li', 'C ul'];
var newFlattened = ['O ul', 'O li', 'L Item 1', 'C li', 'O li', 'L Item 2', 'C li', 'O li', 'L Item 3', 'C li', 'C ul'];
The O stands for “open tag”, the L for “literal” (in this case, some text) and the C stands for “close tag”. Note that each tree is now a list of strings, and there are no longer any nested arrays. This is what I mean by flattening.
When I run a diff on each of these elements (treating each item in the array like it is a unit), I get:
var diff = [
  ['keep', 'O ul'],
  ['keep', 'O li'],
  ['keep', 'L Item 1'],
  ['keep', 'C li'],
  ['keep', 'O li'],
  ['keep', 'L Item 2'],
  ['keep', 'C li'],
  ['add', 'O li'],
  ['add', 'L Item 3'],
  ['add', 'C li'],
  ['keep', 'C ul']
];
As you probably inferred, we’re keeping most of the list, and adding a <li> towards the end of it. Those are the add entries you see.
If we now changed the text of the third <li> from Item 3 to Item 4 and ran a diff on it, we’d obtain:
var diff = [
  ['keep', 'O ul'],
  ['keep', 'O li'],
  ['keep', 'L Item 1'],
  ['keep', 'C li'],
  ['keep', 'O li'],
  ['keep', 'L Item 2'],
  ['keep', 'C li'],
  ['keep', 'O li'],
  ['rem', 'L Item 3'],
  ['add', 'L Item 4'],
  ['keep', 'C li'],
  ['keep', 'C ul']
];
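If you want to experiment with diffs like the ones above, here is a sketch that produces the same kind of keep/add/rem list. To stay short it does not implement Myers’ algorithm: it uses a plain dynamic-programming longest-common-subsequence table, which takes quadratic time and memory. The function name `diffLists` and the tie-breaking rule (removals before additions) are choices made for this example.

```javascript
// Hypothetical sketch: an LCS-based diff over two flat lists of strings.
// This is NOT Myers' algorithm; it is a simpler quadratic alternative.
function diffLists(oldItems, newItems) {
  const n = oldItems.length;
  const m = newItems.length;
  // lcs[i][j] = length of the longest common subsequence
  // of oldItems.slice(i) and newItems.slice(j).
  const lcs = Array.from({length: n + 1}, () => new Array(m + 1).fill(0));
  for (let i = n - 1; i >= 0; i--) {
    for (let j = m - 1; j >= 0; j--) {
      lcs[i][j] = oldItems[i] === newItems[j]
        ? lcs[i + 1][j + 1] + 1
        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }
  // Walk the table from the start, emitting one edit per step.
  const edits = [];
  let i = 0;
  let j = 0;
  while (i < n && j < m) {
    if (oldItems[i] === newItems[j]) {
      edits.push(['keep', oldItems[i]]);
      i++;
      j++;
    } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
      // On a tie, remove first and add later.
      edits.push(['rem', oldItems[i]]);
      i++;
    } else {
      edits.push(['add', newItems[j]]);
      j++;
    }
  }
  while (i < n) edits.push(['rem', oldItems[i++]]);
  while (j < m) edits.push(['add', newItems[j++]]);
  return edits;
}

const before = ['O ul', 'O li', 'L Item 3', 'C li', 'C ul'];
const after = ['O ul', 'O li', 'L Item 4', 'C li', 'C ul'];
console.log(diffLists(before, after));
```

In the usage example, the only entries that are not `keep` are a `rem` for 'L Item 3' followed by an `add` for 'L Item 4'.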
I don’t know how mathematically inefficient this approach is, but in practice it has worked quite well. It only performs poorly when diffing large trees that have a lot of differences between them; when that occasionally happens, I resort to a 200ms timeout to interrupt the diffing and simply replace the offending part of the DOM entirely. If I didn’t use a timeout, the entire application would stall for some time until the diff completed.
A lucky advantage of using the Myers diff is that it prioritizes deletions over insertions: this means that if there’s an equally efficient choice between removing an item and adding an item, the algorithm will remove an item first. Practically, this allows me to grab all the eliminated DOM elements and be able to recycle them if I need them later in the diff. In the last example, the last <li> is recycled by changing its contents from Item 3 to Item 4. By recycling elements (rather than creating new DOM elements) we improve performance to a degree where the user doesn’t realize that the DOM is being constantly redrawn.
If you’re wondering how complex it is to implement this flattening & diffing mechanism that applies changes to the DOM, I managed to do it in 500 lines of ES5 JavaScript, and it even runs in Internet Explorer 6. But, admittedly, it was perhaps the hardest piece of code I ever wrote. Being stubborn comes at a cost.
UPDATE: I just found out that Elm does something similar to this, but using Wu’s diff algorithm (which is an improvement over Myers’). However, it still requires keys to be passed to list elements, so I’m not sure whether Elm is using a diff algorithm in such a general (or better said, lazy) way as I do.
Conclusion
Those are the four ideas I wanted to present! They’re not fully original but I hope they will be both novel and interesting to some. Thanks for reading!