Jake Lazaroff

Building a Collaborative Pixel Art Editor with CRDTs

Jake Lazaroff October 10, 2023

Welcome back! In An Interactive Intro to CRDTs, we learned what CRDTs are, and implemented two: a Last Write Wins Register and a Last Write Wins Map. We now have everything we need to build a collaborative pixel art editor, and in this post we'll do just that. This post will be heavier on JavaScript and graphics programming, because I want to show how CRDTs can be used in an actual app.

As a reminder, this is what we're building:

This post assumes no prior knowledge about CRDTs other than the previous post (so if you haven't read it yet, go back and do that now!) and only a rudimentary knowledge of TypeScript.

Building the CRDT

Before we start on the CRDT, we need just one more type. We'll store colors as tuples of three integers, representing red, green and blue values.

With that out of the way, let's build the CRDT! It'll be a class called PixelData:

This is only a thin wrapper over a LWW Map. Almost every method just calls the corresponding LWW Map method!

The biggest change involves the static method key. When we're interacting with the pixel art editor, it's most natural to think in terms of (x,y) coordinates. But our LWW Map needs string keys! The key static method serializes coordinate pairs to strings: for example, (15,29) becomes "15,29". Since the LWW Map values are colors in the form of RGB tuples, we can think of this data structure as mapping from pixel coordinates to colors, with each key representing a single pixel.

The get method is also slightly different. We want our pixels to default to white. So if no value has been set, we return a default value of [255, 255, 255].

Let's peek under the hood and see what will happen to each register in the map when we draw.

Here we can really see how the keys and values interact. When painting the top left square, the key "0,0" is set to the RGB color you have selected ([0, 0, 0] if you didn't change from the default). We can also see how pixels that haven't yet been set default to white.
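To make the key serialization and the white default concrete, here's a minimal sketch of a PixelData-like class. It is not the code from this post: instead of wrapping the LWW Map from the previous post, it inlines a simplified stand-in. The register layout ([peer id, timestamp, value]), the tie-breaking rule in merge and the names RegisterState, MapState and registers are all assumptions made for illustration.

```typescript
// Colors are RGB tuples, as described above.
type RGB = [number, number, number];

// Assumed register layout: [peer id, timestamp, value].
type RegisterState = [string, number, RGB];
type MapState = Record<string, RegisterState>;

class PixelData {
  readonly id: string;
  // Simplified stand-in for the LWW Map from the previous post.
  private registers: MapState = {};

  constructor(id: string) {
    this.id = id;
  }

  // Serialize a coordinate pair to a string key: (15, 29) becomes "15,29".
  static key(x: number, y: number): string {
    return `${x},${y}`;
  }

  get state(): MapState {
    return this.registers;
  }

  // Overwrite the pixel and increment its timestamp by one.
  set(x: number, y: number, value: RGB): void {
    const key = PixelData.key(x, y);
    const current = this.registers[key];
    const timestamp = current === undefined ? 1 : current[1] + 1;
    this.registers[key] = [this.id, timestamp, value];
  }

  // Pixels that were never set default to white.
  get(x: number, y: number): RGB {
    const register = this.registers[PixelData.key(x, y)];
    return register === undefined ? [255, 255, 255] : register[2];
  }

  // Last write wins: the higher timestamp is kept.
  // Assumption: ties are broken in favor of the greater peer id.
  merge(remote: MapState): void {
    for (const key of Object.keys(remote)) {
      const incoming = remote[key];
      const local = this.registers[key];
      if (local !== undefined) {
        if (local[1] > incoming[1]) continue;
        if (local[1] === incoming[1] && local[0] > incoming[0]) continue;
      }
      this.registers[key] = incoming;
    }
  }
}
```

With this sketch, painting the same pixel twice on one peer leaves its register at timestamp 2, so it beats a single paint of that pixel on another peer once the two states are merged.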
Painting over a pixel overwrites the value and increments the timestamp by one. If you turn the network off and paint the same pixels on each canvas, the ones with the higher timestamps will win out when you turn the network back on.

This visualization takes a lot of space for just a few pixels. Instead, let's overlay each pixel with the timestamp of its register:

Now we can see how each pixel should interact in the context of the picture.

That's it! That's the whole CRDT! As you read through the rest of the post, you'll see that the app mostly doesn't even realize that it's using a CRDT under the hood.

Scaffolding the UI

Now that we have our CRDT, we need to set up the UI. Here's the HTML and CSS:

Then a little JavaScript to instantiate our two editors:

Let's break this down a bit:

1. Query the DOM for the two <canvas> elements and the color input.
2. Store the artboard size. These are the drawable dimensions, and they might be different from the size of the <canvas> element. For example, the canvas might be 400×400, but we might want our picture to only be 40×40, where each "pixel" the user sees takes up 10×10 actual pixels on the canvas. For clarity, "artboard" will always refer to what the user perceives and interacts with, while "canvas" will refer to the underlying <canvas> element.
3. Instantiate a PixelEditor class (which we'll write shortly) with a <canvas> element and the artboard size.
4. When a change happens in either editor, merge its state into the other editor.
5. Set the editor color whenever the color input changes. HTML color inputs return their color as hex code strings (for example, #845ef7), so we need to do a little work to convert it to RGB format. This code just removes the #, splits the string into two-character chunks and parses each chunk as a base-16 integer, which is exactly what our RGB type expects.

As you can see, we're only simulating the network. Actually writing network code is a separate problem from designing and using the CRDT data structure.
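The hex-to-RGB conversion described above can be sketched as a small standalone function. This is an illustration rather than the exact code from the post, and the name hexToRgb is hypothetical. It assumes the "#rrggbb" format that HTML color inputs produce and throws on anything else.

```typescript
type RGB = [number, number, number];

// Convert a hex color string such as "#845ef7" into an RGB tuple.
// hexToRgb is a hypothetical helper name, not part of the post's code.
function hexToRgb(hex: string): RGB {
  if (!/^#[0-9a-fA-F]{6}$/.test(hex)) {
    throw new Error(`Expected a color like "#rrggbb", got "${hex}"`);
  }

  // Drop the leading "#", then parse each two-character chunk as base 16.
  const digits = hex.slice(1);
  return [
    parseInt(digits.slice(0, 2), 16),
    parseInt(digits.slice(2, 4), 16),
    parseInt(digits.slice(4, 6), 16),
  ];
}
```

For example, hexToRgb("#845ef7") returns [132, 94, 247], since 0x84 is 132, 0x5e is 94 and 0xf7 is 247.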
Starting the Editor

Let's define the PixelEditor class now. Here's the skeleton:

The methods are empty (for now), but hopefully this gives a decent idea of what the shape of this program will be. At a high level, when the user draws on the canvas with their mouse, the PixelEditor…

1. Receives DOM events (handleEvent) and sets the selected color (#paint) in its PixelData CRDT.
2. Draws to the canvas (#draw) based on its PixelData CRDT value.
3. Notifies any listeners (#notify) that its data has changed.

On the other end, when a PixelEditor receives state from a peer (receive) it updates its own PixelData CRDT and then draws to the canvas (#draw).

Cool, so let's start filling in those missing PixelEditor methods. First up, the constructor:

1. Store the <canvas> element and get the 2D rendering context.
2. Store the artboard size. We'll use this later to convert between the artboard resolution and the canvas resolution.
3. Listen for pointerdown, pointermove and pointerup events. These will be triggered when the user interacts with the canvas.
4. Resize the canvas to match the dimensions of the <canvas> element.

Now onto the instance methods. First up, color:

This is just a setter that takes an RGB tuple and sets the drawing color. You might remember that when we set up our HTML, we called this setter from outside the class in response to input events on the color input.

Next, handleEvent:

This handles all three types of pointer events. Let's go through them one by one:

- pointerdown is triggered when the user depresses the mouse button or touches their finger to the screen. Calling setPointerCapture on the <canvas> element "captures" the pointer, which lets us figure out whether discrete events are part of one continuous drag. We also want to draw a pixel, which uses the same logic as the pointermove event, so we fall through to the next switch case.
- pointermove is triggered when the pointer, uh, moves. At the top, we check whether the pointer is captured, so we can ignore mouse events if the user isn't holding down the mouse button. Then, we convert from canvas pixels to artboard pixels and call the #paint method to draw the pixel on the canvas.
- pointerup is triggered when the user releases the mouse button or removes their finger from the screen. We clean up by calling releasePointerCapture.

Since we've just called the #paint method, let's see what it looks like:

Simple enough: if the given coordinates are inside the artboard, it sets the coordinates to the active color in #data (an instance of the PixelData class we defined before) and then draws to the canvas. Like I said, we don't really care at this point that the PixelData class is actually a CRDT; as far as the PixelEditor class is concerned, it's just setting a color in its data.

Now, #draw. The basic idea is that we'll allocate a buffer (a contiguous chunk of memory, like an array) and then write the raw pixel data there. Once we've done that, the canvas API lets us draw the raw pixel data onto the canvas. Here's what it looks like:

Slight tangent into how colors are represented in memory. RGB colors consist of three channels (red, green and blue), each of which is a single number between 0 and 255, or eight bits (one byte). That's 24-bit color. The canvas API uses 32-bit color, which adds one extra channel (alpha, or transparency), which is also a single number between 0 and 255. So each pixel takes up four bytes.

First, we need to allocate a buffer to hold the pixel data. Since we know the pixel dimensions of the artboard, we can calculate how big of a buffer we need: four bytes per pixel times the artboard width times the artboard height.

From there, we iterate over the rows and columns of the artboard, calculating each pixel's byte offset into the buffer. Then we write the pixel color values into the next four bytes of the buffer following that offset.
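The buffer layout and offset arithmetic described above can be sketched as follows. This is a standalone illustration, not the #draw method from the post: the function name toPixelBuffer and the getColor callback are hypothetical, and the step that hands the buffer to the canvas is left out so the sketch runs outside a browser.

```typescript
type RGB = [number, number, number];

// Fill a 32-bit RGBA buffer for an artboard of the given size.
// toPixelBuffer and getColor are hypothetical names used for illustration.
function toPixelBuffer(
  width: number,
  height: number,
  getColor: (x: number, y: number) => RGB
): Uint8ClampedArray {
  const channels = 4; // red, green, blue, alpha: one byte each
  const buffer = new Uint8ClampedArray(width * height * channels);

  for (let row = 0; row < height; row++) {
    for (let col = 0; col < width; col++) {
      // Pixels are stored row by row, four bytes per pixel.
      const offset = (row * width + col) * channels;
      const [r, g, b] = getColor(col, row);
      buffer[offset] = r;
      buffer[offset + 1] = g;
      buffer[offset + 2] = b;
      buffer[offset + 3] = 255; // fully opaque
    }
  }

  return buffer;
}
```

On a 2×2 artboard, the pixel at (1, 0) starts at byte 4 and the pixel at (0, 1) starts at byte 8. In a browser, a buffer like this can be wrapped in an ImageData object and drawn with the 2D context's putImageData method.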
If you're not familiar with how to calculate the offset, try the playground below. Hover over different pixels in the "artboard" at the top or the "buffer" at the bottom to see how they correspond to each other.

The first four bytes (0–3) are the red, green, blue and alpha channels of the top left pixel. The next four bytes (4–7) are the pixel second from the left on the top row, and so on, until we hit the top right pixel. Then, we wrap around (the next four bytes of the buffer are the leftmost pixel on the second row) and continue going, until finally we get to the bottom right pixel in the last four bytes of the buffer.

Finally, we draw that buffer to the canvas. Phew! At this point, we have a fully functional pixel art editor without the peer-to-peer parts:

Before we can connect the two editors, we need to fix a big issue with the drawing. You've probably noticed it already: if you move quickly, there are gaps between the pixels.

Drawing Lines

The problem is that events don't necessarily get triggered as fast as the user can move their cursor, which means that the coordinates for each call to #paint might not be next to each other. We can fix this by storing the pixel coordinates of the pointer during the previous event, and drawing a line between them and the current coordinates.

Buckle up, because we're about to make a bunch of changes to our PixelEditor class. First, we'll add a private #prev property that holds either an (x,y) coordinate pair, or undefined:

Then, we'll modify our handleEvent method. We need to store the cursor's coordinates on the artboard in #prev as the very last step in the pointermove case, right before the break. Then, in the pointerup case, we need to reset #prev to undefined:

The biggest changes are in our #paint method, where we need to implement the line drawing. There are a bunch of algorithms for doing this; we'll use one called Digital Differential Analyzer.
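As a rough illustration of the Digital Differential Analyzer idea, here's a minimal standalone sketch. It is not necessarily how the #paint method in this post is written: the function name ddaLine is hypothetical, and it assumes integer artboard coordinates. The idea is to take as many steps as the longer axis is long, advance each coordinate by a constant fraction per step, and round to the nearest pixel.

```typescript
// Return every artboard pixel on the line from (x0, y0) to (x1, y1).
// ddaLine is a hypothetical name; inputs are assumed to be integers.
function ddaLine(
  x0: number,
  y0: number,
  x1: number,
  y1: number
): Array<[number, number]> {
  const dx = x1 - x0;
  const dy = y1 - y0;

  // One step per pixel along whichever axis covers more distance.
  const steps = Math.max(Math.abs(dx), Math.abs(dy));
  if (steps === 0) return [[x0, y0]];

  const xIncrement = dx / steps;
  const yIncrement = dy / steps;

  const points: Array<[number, number]> = [];
  for (let i = 0; i <= steps; i++) {
    points.push([
      Math.round(x0 + xIncrement * i),
      Math.round(y0 + yIncrement * i),
    ]);
  }
  return points;
}
```

Because each step moves at most one pixel along either axis, consecutive points always touch, which is what closes the gaps left by fast pointer movement. An editor could paint each returned point in turn.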
The linked article explains the steps in detail, so I'll just skip to the implementation:

We're now drawing smooth, connected lines! Check it out:

Syncing state

Finally, we're ready to connect these two canvases together. Whenever one peer makes a change, we'll send its state to the other. After that peer's PixelData CRDT merges the incoming state into its own, both canvases will have converged upon the same state.

You might remember the PixelEditor property #listeners from way back, when we wrote the skeleton of the class. We're about to put it to use. First, we'll fill out the onchange setter:

This takes a callback function and adds it to #listeners. Next, we need a way to notify the listeners that the data changed. That method is called #notify:

It grabs the current state from the PixelData CRDT stored in #data. Then, it iterates through each listener and calls it with that state.

Finally, we need to actually call our #notify method. We'll nestle that right at the end of #paint, so that any time we change the state we also notify all the listeners:

That takes care of the sending. On the other end, we need to merge the data into our local state when we receive it from another peer. This last method is called receive:

We're familiar with this pattern by now: when a CRDT has to merge some state, it sends parts of it to the appropriate CRDTs. In this case, PixelEditor isn't a CRDT, so we're just sending the whole thing off to #data to be merged.

And that's it! Take a look at our two connected pixel art editors.

Fixing Timestamps

We're almost done, but there's one more optimization I want to make. We know that under the hood, each pixel is a LWW Register, which means it has a timestamp. Peers will compare those timestamps when merging their state. But right now, if you click and drag around, the timestamps get weirdly high, especially if you go slowly. Here's a playground with a lower resolution that shows each pixel's timestamp.
The problem is pointer events can fire a lot: often, multiple times on the same artboard pixel. To solve this, we'll keep a set of each pixel we've painted during a single drag operation, and ensure we don't change any pixels already in the set.

First, let's add a set of all the keys we've painted to our PixelEditor class:

Next, we'll add a #checkPainted method. It will take an (x,y) coordinate pair and return whether it's in the set of painted pixels. It also adds the coordinates to the set, so any successive calls to #checkPainted with the same coordinates return true.

The order of these statements is important: we need to first check whether the pixel was already painted, then add it to the set, and finally return whether it was in the set before we added it.

We'll use #checkPainted in the #paint method before we set a color for any pixel:

Finally, on the pointerup event, just as we reset #prev, we also need to reset #painted:

Now, each pixel's timestamp will be incremented by at most 1 during each drag:

The End!

We made it! We have a completed collaborative pixel art editor, built with CRDTs. Take a look at what we've built and give yourself a pat on the back.

If you'd like to play with this on your own, I've made a CodeSandbox with everything we've written here.

Next Steps

We've learned about CRDTs and built an actual collaborative app with them, but we can still improve our design. Check out the surprise part three: Making CRDTs 98% More Efficient

Reading List

Hopefully these posts have made you interested in learning even more about CRDTs! Here's a list of articles I leaned on heavily to write this one:

- Designing Data Structures for Collaborative Apps introduces some primitive CRDTs and shows how to combine them. Reading this article convinced me that I don't need a math PhD to engage with this topic.
- An introduction to Conflict-Free Replicated Data Types is a tutorial series with interactive code samples that gives a good practical overview of CRDTs.
- A CRDT Primer Part I: Defanging Order Theory and A CRDT Primer Part II: Convergent CRDTs are good layman's overviews of the math behind state-based CRDTs.
