Wishful Coding

Didn't you ever wish your
computer understood you?

Hipflask: collaborative apps with CouchDB and React

I’m currently developing Mosaic, a schematic editor built on web technology. In the process of adding online collaboration, I wrote a library to help me manage and synchronize state. The library is called Hipflask.

In my previous post I wrote about different strategies for dealing with conflicts when multiple users are editing the same object. In this post I want to focus more on the library I wrote to manage state.

Hipflask is basically a ClojureScript/Reagent atom backed by PouchDB.

Mosaic is written in Reagent, a ClojureScript React wrapper. The fundamental building block of Reagent is the Ratom (Reagent atom), a data structure that allows atomic updates by applying functions to its state. It then monitors state updates, and redraws components that depend on the Ratom. It’s quite elegant.

The semantics of a Clojure atom (after which ClojureScript and Reagent atoms are modelled) is that there is a compare-and-set! function that atomically updates the state if the expected old state matches the current state. swap! is a wrapper over that which repeatedly applies a function until it succeeds.

The neat thing is that CouchDB update semantics are exactly like compare-and-set!. You update a document, and if the revision matches it succeeds; if there is a conflict, the update is rejected. You can layer swap! behavior on top of that, applying a function to a document repeatedly until there is no conflict.
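To make the compare-and-set and swap! analogy concrete, here is a minimal Python sketch. It is not Hipflask or CouchDB code: `TinyStore`, `Doc`, `Conflict` and `swap` are hypothetical names, and the integer revision is a simplification of CouchDB's revision strings.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


class Conflict(Exception):
    """Raised when the caller's revision is stale."""


@dataclass(frozen=True)
class Doc:
    rev: int
    value: Any


class TinyStore:
    """Hypothetical in-memory stand-in for a revisioned document database."""

    def __init__(self) -> None:
        self._docs: Dict[str, Doc] = {}

    def get(self, key: str) -> Doc:
        # A missing document is reported as revision 0 with no value.
        return self._docs.get(key, Doc(rev=0, value=None))

    def put(self, key: str, value: Any, expected_rev: int) -> Doc:
        # Compare-and-set: accept the write only if the revision still matches.
        current = self.get(key)
        if current.rev != expected_rev:
            raise Conflict(key)
        new = Doc(rev=current.rev + 1, value=value)
        self._docs[key] = new
        return new


def swap(store: TinyStore, key: str, f: Callable[[Any], Any]) -> Any:
    """Apply f to the document until the write goes through without conflict."""
    while True:
        doc = store.get(key)
        try:
            return store.put(key, f(doc.value), doc.rev).value
        except Conflict:
            continue  # someone else wrote first; re-read and try again
```

Calling `swap(store, "cookies", lambda v: (v or 0) + 1)` plays the role of swap! here: the function may run more than once, so it should be free of side effects.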

So what you get if you combine PouchDB with a Ratom is that you can apply functions to your state, which will resolve database conflicts and rerender your app. The PouchDB atom even listens for database changes, so that remote changes will update the cache and components in your app that depend on it.

Basically it transparently applies functions to the database and updates your UI on database changes.

One thing that sets Hipflask apart from my previous CouchDB atom is that it manages a prefix of keys. This plays well with sharding, so you can have a PouchDB atom containing all foo:* documents. Worth noting is that it’s only atomic per document, and assumes the first argument to your update function is a key or collection of keys that will be updated.

You can use Hipflask in two main ways. The first way is offline first. The atom is backed by a local PouchDB which can be replicated to a central database. This way the app remains fully functional offline, but replication conflicts are not handled automatically. You can also make it talk directly to a central CouchDB, in which case it does not work offline but conflicts in the central database are resolved automatically.

As a demo of the library, I made a Global Cookie Clicker app. It’s an offline-first collaborative cookie clicker. You can click cookies offline, and they are then synchronized to a central CouchDB. The entire code is less than 70 lines.

It basically defines a PouchDB atom:

(def db (hf/pouchdb "cookies"))
(def rginger (r/atom {}))
(def ginger (hf/pouch-atom db "gingerbread" rginger))

And then when you click the button it calls this function that increments your cookie document:

(defn make-cookie [a n _]
  (swap! a update-in [(str (.-group a) hf/sep me) :cookies] #(+ % n)))

This library has been very useful in building Mosaic, and I hope it is useful to others as well.

Keep track of how much you drink with Lego

It’s important to stay hydrated, but hard to know how much you are actually drinking. Some number of liters or cups does not easily translate to absentmindedly sipping from an oddly sized mug. There are health apps that let you manually track how much you eat and drink, but that is such a hassle. When I’m in the zone, it’s already hard to refill my cup, let alone enter it into some app.

So I made a smart cup holder that weighs my cup and keeps track of how much I drink.

This setup consists of the following parts:

The basic idea is quite simple. Every time the cup is placed down, its weight is recorded. If the weight is less than the last recorded weight, the difference is assumed to be consumed by me, and added to the total. The Arduino code is indeed quite simple.
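The bookkeeping described above can be sketched in a few lines. This is a Python illustration of the idea, not the actual Arduino code; `DrinkTracker`, `min_cup_grams` and the 50 g threshold are assumptions, and it pretends the reading has already settled by the time the cup is detected.

```python
from typing import Optional


class DrinkTracker:
    """Hypothetical tracker: feed it one weight reading (in grams) at a time."""

    def __init__(self, min_cup_grams: float = 50.0) -> None:
        self.min_cup_grams = min_cup_grams  # below this, assume no cup is present
        self.last_weight: Optional[float] = None
        self.total_consumed = 0.0
        self.cup_present = False

    def update(self, weight: float) -> None:
        present = weight >= self.min_cup_grams
        if present and not self.cup_present:
            # The cup was just placed down: compare with the last recorded weight.
            if self.last_weight is not None and weight < self.last_weight:
                self.total_consumed += self.last_weight - weight
            # A heavier cup means a refill; either way this is the new baseline.
            self.last_weight = weight
        self.cup_present = present
```

Feeding it the readings 0, 400, 0, 350, 0, 500, 0, 420 counts the 50 g and 80 g sips and treats the jump to 500 g as a refill.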

The only noteworthy part of the code was calibrating the load cell. I did this simply by taking a measuring cup and pouring 100ml of water at a time into a cup. Repeat a few times with a few different cups and determine the scale factor.
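As a rough sketch of that calibration arithmetic: since 100 ml of water weighs about 100 g, every pour should raise the raw reading by the same amount, and the scale factor is the average raw increase per gram. The function name `scale_factor` and the shape of the input are assumptions for illustration, not the code I actually used.

```python
from typing import List


def scale_factor(runs: List[List[float]], grams_per_pour: float = 100.0) -> float:
    """Average raw load-cell counts per gram over all pours in all runs.

    Each run holds the raw readings for one cup: the empty cup first,
    then one reading after every pour. Using differences between
    consecutive readings cancels out the weight of the cup itself.
    """
    deltas = [after - before for run in runs for before, after in zip(run, run[1:])]
    if not deltas:
        raise ValueError("need at least one pour")
    return sum(deltas) / (len(deltas) * grams_per_pour)
```

Dividing a raw difference by this factor then gives grams, which for water is close enough to milliliters.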

The cup holder had to go through several design iterations. The first ones had either too much friction, inconsistent weight measurement, or just weren’t structurally stable enough. The load cell is not a Lego part, so the tricky part was making the cup rest only on the load cell. I used a Technic Flex-System Hose inserted into Technic Plates, which just about align with the load cell screw holes.

Fun fact: a 1 x 2 Grille is slightly thicker than a 1 x 2 Tile. Maybe I’m imagining things, but it seems to be the difference between a tight fit or not. With normal tiles I had to insert some pieces of paper to keep it from flexing.

With the mechanical parts out of the way, I went on to displaying the information. Maybe a 7-segment display would have been the obvious choice, but I only had one digit. I do have a full colour LCD panel, but that seems overkill and distracting.

If it’s going to sit on my desk and I’m going to power it from my computer’s USB ports, I might as well display the information on my computer. A little icon should be less obtrusive than a blinky LED thing. I may have underestimated how hard it is to get USB serial data into my taskbar, but in the end it worked out.

I forked a Gnome extension that displays the status of wireless earbuds, which it reads from the log file of some daemon. All I had to do was change the icons a bit and make it read from the TTY… right? Well, it turns out reading a TTY from Gnome JavaScript is a bit painful because there don’t seem to be any functions to configure it. It would also randomly close and reset for some reason. In the end I shelled out to stty to set up the TTY to not block and then just get the whole contents of it. In the process I learned that a JS extension can definitely hang and crash your entire Gnome shell. Great design that.

So that’s it, right? Throw the code on GitHub, job done. Well, except I decided that in the name of reproducibility, I should make building instructions for the Lego cup holder. I used to sell Lego Mindstorms building instructions, so I’ve done it before… many years ago. So I installed LeoCAD and started making the model.

LDraw model

At first it went pretty well, but then I ran into two problems. First of all, I split the model into a base and a cupholder submodel. But when generating building instructions, LeoCAD just inserted the submodel as a part. The second challenge was making the flexible hoses.

I tried to open the model with LPub3D, which did render the submodel instructions, but did not show the flexible hose correctly because LeoCAD uses a non-standard format for those. Then I installed Bricklink Studio under Wine, and redid the flexible hose there. For some reason Bricklink Studio did not render the parts list in the building instructions, so I ended up going back to LPub3D to render the final building instructions. Phew!

Update: I have now hooked up my smart cupholder to InfluxDB through the tail input, using the following Telegraf configuration.

InfluxDB dashboard

[[inputs.tail]]
  files = ["/dev/ttyACM0"]

  # this avoids seeks on the tty
  from_beginning = true
  pipe = true

  # parse csv
  data_format = "csv"
  csv_column_names = ["cup present", "current weight", "last weight", "weight difference", "total consumed"]
  csv_delimiter = ";"

The limits of conflict-free replicated data types

Imagine you’re writing a collaborative application where multiple users are editing a document at the same time. How do you resolve conflicting edits?

The YOLO solution is last-write-wins, resulting in data loss. The git/CouchDB solution is explicit conflict resolution, by asking the user or using domain-specific logic. The cool kids solution is to use a conflict-free replicated data type (CRDT), promising to never create conflicts in the first place!

An easy-to-understand CRDT is the add-only set. If two people add different things to a set, you simply add all the things. As long as you never remove things, you can always resolve concurrent edits.
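In code, the add-only set is about as small as a CRDT gets: the merge is just set union. Here is a minimal Python sketch, where `merge` and the replica names are made up for illustration. The order in which replicas sync does not matter, and merging the same state twice changes nothing.

```python
from typing import FrozenSet


def merge(a: FrozenSet[str], b: FrozenSet[str]) -> FrozenSet[str]:
    """Merge two replicas of an add-only set: keep everything either side added."""
    return a | b


# Two replicas start from the same state and diverge while disconnected.
shared = frozenset({"apple"})
alice = shared | {"banana"}
bob = shared | {"cherry"}

# Union is commutative and idempotent, so every replica ends up identical.
merged = merge(alice, bob)
```

Both `merge(alice, bob)` and `merge(bob, alice)` contain apple, banana and cherry, so there is never anything to resolve.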

Computer science has given us a whole set of these CRDTs, and people have built nice libraries out of them that promise that as long as you use these data structures you can have a collaborative app that never has conflicts or data loss. Perfect!

Except, the fact that they are conflict free does not mean that the resolution is what the user expected.

Imagine a collaborative drawing program. Let’s keep it simple and say the drawing is an add-only set of lines. Perfect, anyone can add lines and there will never be a conflict!

So now Alice and Bob decide to draw a landscape together. Alice starts drawing happy little trees, but Bob suffers an internet glitch and goes offline for a minute. While offline he draws some big snowy mountains. When Bob comes back online, the add-only set neatly merges all their lines without conflict, resulting in a hot mess.

This is not an imaginary scenario. I was showing Mosaic to a friend, but their ad blocker blocked synchronization. He drew a nice circuit, unaware of what was already there, and when he disabled his ad blocker, this was the result. Not a single conflict, but not a functional circuit either. (It’s a band-pass filter and a differential pair, in case you’re wondering.)

a band pass filter and differential pair schematic smushed together

Mosaic is not in fact using a true CRDT, rather it is using CouchDB in a way that avoids conflicts. Each component is its own document, so there aren’t conflicts unless two people try to drag the same component at the same time. There is no way to resolve that situation and let both people get what they want. CouchDB has a pretty good section on designing an application to work with replication by the way.

In short, no matter if you use a CRDT or something else, there are situations that cannot be automatically resolved in a way that is generic and does what the user expects. You cannot wish this problem away. To the user, last-write-wins data loss and seamlessly-smushed-together data loss are indistinguishable. So how do you handle it?

I think CRDTs are a useful tool, but ultimately suffer from the same “wishing the problem away” attitude as last-write-wins. CouchDB has the right mentality of designing for conflict avoidance, and requiring domain-specific conflict resolution, but it is not a full solution either. As you saw with Mosaic, you can get too good at avoiding conflicts, leading to a result that is unexpected to the end user. You need to think of domain-specific solutions.

In the case of Mosaic the plan is that device-level changes are an explicit conflict that requires user review. If you change the transistor width, and I change the length, that is not something that should be resolved automatically, even though it technically could be. We have both changed the very important W/L ratio, and combining these edits would likely have the wrong result.
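A sketch of what such a rule could look like, in Python for brevity. This is not Mosaic code, just an illustration of the plan under assumptions: `merge_device`, `changed_keys`, `DeviceConflict` and the property names are all hypothetical. A field-by-field merge would happily combine the two edits; this version refuses as soon as both sides touched the same device.

```python
from typing import Any, Dict, Set

Props = Dict[str, Any]


class DeviceConflict(Exception):
    """Both sides edited the same device; a human should review the result."""


def changed_keys(base: Props, edited: Props) -> Set[str]:
    """Return the properties whose value differs between base and edited."""
    keys = base.keys() | edited.keys()
    return {k for k in keys if base.get(k) != edited.get(k)}


def merge_device(base: Props, mine: Props, theirs: Props) -> Props:
    """Three-way merge that treats any concurrent edit of a device as a conflict."""
    mine_changed = changed_keys(base, mine)
    theirs_changed = changed_keys(base, theirs)
    if mine_changed and theirs_changed:
        # Technically mergeable when the keys differ, but W and L interact,
        # so flag it instead of silently combining the edits.
        raise DeviceConflict(sorted(mine_changed | theirs_changed))
    return mine if mine_changed else theirs
```

With a base of `{"W": 1.0, "L": 0.18}`, one side changing W and the other changing L raises `DeviceConflict`, while a change on only one side goes through untouched.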

At the schematic level, it’s hard to tell if some changes are intentional. I don’t think you can or should capture overlapping components at the database level, and the best thing to do is offer Electrical Rule Checks (ERC) for likely mistakes, and good edit history to recover from them.
