Interleaving Anomalies in Collaborative Text Editors (2019)


Collaborative text editors allow two or more users to concurrently
edit a shared document without merge conflicts. Such systems
require an algorithm to provide convergence, ensuring all clients
that have seen the same set of document edits are in the same
state. Unfortunately convergence alone does not guarantee that
a collaborative text editor is usable. Several published algorithms
for collaborative text editing exhibit an undesirable anomaly in
which concurrently inserted portions of text with a well-defined
order may be randomly interleaved on a character-by-character
basis, resulting in an unreadable jumble of letters. Although this
anomaly appears to be known informally by some researchers in
the field, we are not aware of any published work that fully explains
or addresses it. We show that several algorithms suffer from this
problem, explain its cause, and also identify a lesser variant of the
anomaly that occurs in another algorithm. Moreover, we propose a
specification of collaborative text editing that rules out the anomaly,
and show how to prevent the lesser anomaly from occurring in one
particular algorithm.

interleaving-papoc19.pdf (564.7 KB)


Mirroring @expede’s notes from Roam:

  • TL;DR
    - Many (not all) collaborative text CRDTs end up with interleaved characters
    - RGA has a problem too, with multiple words
    - Our naive Dialog version has yet another anomaly edge case (branching text)
    - Alice = Hi!, Bob = Hey ~> Hi!ey (Shared H, but rest grouped)
    - Their solution is to essentially track the insertion timestamp, pointer of parent element, plus the set of roots (cursor moves) that it knew about when the insertion was performed