Papers with Friends: Katara - Synthesizing CRDTs with Verified Lifting

At Fission, we really like reading papers, and, sometimes, implementing them (check out Fission Reactor Community Call - Zoom). Though we already run the ongoing, monthly distributed systems reading group, we end up running into papers that just need to be read and discussed “in the now,” which may involve other topics like programming languages, cryptography, networking, or a concoction of many subjects. This is that kind of series.

​For this rendition, we’ll be reading and talking about the paper Katara: Synthesizing CRDTs with Verified Lifting by Laddad, et. al, which was accepted to OOPSLA 2022. We’ll also have authors Shadaj Laddad & Mae Milano introduce it and participate in the discussion!

​The abstract is as follows:

​> Conflict-free replicated data types (CRDTs) are a promising tool for designing scalable, coordination-free distributed systems. However, constructing correct CRDTs is difficult, posing a challenge for even seasoned developers. As a result, CRDT development is still largely the domain of academics, with new designs often awaiting peer review and a manual proof of correctness. In this paper, we present Katara, a program synthesis-based system that takes sequential data type implementations and automatically synthesizes verified CRDT designs from them. Key to this process is a new formal definition of CRDT correctness that combines a reference sequential type with a lightweight ordering constraint that resolves conflicts between non-commutative operations. Our process follows the tradition of work in verified lifting, including an encoding of correctness into SMT logic using synthesized inductive invariants and hand-crafted grammars for the CRDT state and runtime. Katara is able to automatically synthesize CRDTs for a wide variety of scenarios, from reproducing classic CRDTs to synthesizing novel designs based on specifications in existing literature. Crucially, our synthesized CRDTs are fully, automatically verified, eliminating entire classes of common errors and reducing the process of producing a new CRDT from a painstaking paper proof of correctness to a lightweight specification.

​You can also find an open source implementation of the work on Github.

A big thank you to our special guests and everyone who attended! Here’s the video and chat transcript:

Video

Chat Log

12:01:22 From Philipp Krüger to Everyone:
	Heyo! Someone DMd me asking whether I could make sure that this gets recorded
12:01:35 From quinn to Everyone:
	Hey all! Will be on video in a sec 🙂
12:02:32 From Philipp Krüger to Everyone:
	and Brian!
12:03:09 From Sodium  to Everyone:
	\o
12:03:09 From Hanli Theron (she/her) to Everyone:
	Hey everyone, hope you don’t mind me just being a fly on the wall here
12:03:29 From Brooklyn Zelenka (@expede) to Everyone:
	Totally fine Hanli!
12:03:58 From Philipp Krüger to Everyone:
	Oh I bet 60 people signed up for the SMT solver mentions :P
12:04:05 From Hanli Theron (she/her) to Everyone:
	Thanks!
12:04:08 From Sodium  to Everyone:
	porting the linux kernel to be entirely crdt-based
12:04:39 From quinn to Everyone:
	futex(2): 👎
12:04:45 From quinn to Everyone:
	crdt: 👍
12:05:44 From Marc-Antoine Parent to Everyone:
	Just got the first benchmark running!
12:06:15 From quinn to Everyone:
	Cliff notes: https://twitter.com/wilton_quinn/status/1581018653603946496
12:07:31 From Abhishek to Everyone:
	i somehow got here. can someone please send a link around a list of papers read so far ? maybe a website (or anything else) works.
12:08:10 From quinn to Everyone:
	@abhishek: https://lu.ma/distributed-systems
12:08:15 From Brooklyn Zelenka (@expede) to Everyone:
	Here's the whole description with a few links: https://lu.ma/bvu1cg8r
12:08:44 From Philipp Krüger to Everyone:
	This is the first "papers with friends" event, but we've previously run the distributed systems reading group, the link for that Quinn posted above
12:08:50 From Brooklyn Zelenka (@expede) to Everyone:
	This is technically the first of this series
12:09:15 From quinn to Everyone:
	I __loved__ the PACT paper!
12:09:18 From Brooklyn Zelenka (@expede) to Everyone:
	SAME
12:10:02 From Brendan O'Brien (@b5) to Everyone:
	someone have a link to the PACT paper?
12:10:02 From Zeeshan Lakhani to Everyone:
	https://www.youtube.com/watch?v=lY-_IKGvITU (vid) and new directions: https://arxiv.org/abs/2101.01159
12:10:13 From Brendan O'Brien (@b5) to Everyone:
	_thank you_
12:11:10 From Abhishek to Everyone:
	thanks! Quinn and brooklyn
12:13:03 From Philipp Krüger to Everyone:
	having connection issues. I'll try rejoining
12:13:57 From Mae Milano (she/her) to Everyone:
	the work also has a lot of motivations from the Gallifrey line: http://www.languagesforsyste.ms/publication/gallifrey/
12:14:19 From Eleanor to Everyone:
	love that reason
12:14:24 From Mae Milano (she/her) to Everyone:
	:)
12:18:53 From Russell Brown to Everyone:
	ahahaha
12:20:35 From Marc-Antoine Parent to Everyone:
	If it’s even done at all…
12:26:14 From Zeeshan Lakhani to Everyone:
	https://github.com/metalift/metalift
12:27:18 From quinn to Everyone:
	The Dark Souls of research projects
12:28:25 From quinn to Everyone:
	I've got endless questions, but I don't want to monopolize the discussion, so I'll let other people go first. If there's a lull, I'm here though 😛
12:29:14 From Mae Milano (she/her) to Everyone:
	Ok!!! you can also poke us on Twitter more after if you want :)
12:29:22 From quinn to Everyone:
	Sounds good!
12:29:35 From Eleanor to Everyone:
	i would love to hear about the convergence problems of CRDTs but I don't have a specific question yet
12:30:06 From Brooklyn Zelenka (@expede) to Everyone:
	I sadly have to run (got double booked). I'll watch the video after. Super exciting work! Thanks all 👋
12:31:06 From Brendan O'Brien (@b5) to Everyone:
	I too have a meeting, but this is very exciting! Thanks to both Mae & Shadaj for coming to speak, and the fission team for organizing!
12:34:44 From Marc-Antoine Parent to Everyone:
	Curious how those deltas would map to op-based credit?
12:41:10 From Sodium  to Everyone:
	i love "just give *everything* a unique ID" as a way to solve the "what if you have new nodes" problem
12:42:41 From Marc-Antoine Parent to Everyone:
	I agree we don’t want to wait for full history, but would the same reasoning apply to events pointing to partial (locally known) history?
12:50:10 From Mae Milano (she/her) to Everyone:
	for other approaches to the "Detect idempotence" question, Check our Indigo from Marc Shapiro's group and Datafun by Mike Arntzenius!
12:51:19 From Zeeshan Lakhani to Everyone:
	Good call out to datafun as well!
12:53:59 From Russell Brown to Everyone:
	and yet we do…in ad hoc meshes
12:54:25 From quinn to Everyone:
	I've never heard it described that way, but is the issue you're talking about essentially why CRDTs like RGA end up being so complicated here?
12:58:31 From quinn to Everyone:
	I think maps are mentioned
13:00:22 From quinn to Everyone:
	I had a very fun time tracing through the code. Metalift is extremely cool
13:00:22 From Sodium  to Everyone:
	The Computer Wrangling portion of the event :P
13:00:27 From Melanie Riise to Everyone:
	ooo that sounds very nice
13:01:23 From Mae Milano (she/her) to Everyone:
	summary to Russel's question: our technique works for maps! but the underlying solver doesn't.  This is a relatively orthogonal issue to the Katara core results :)
13:02:14 From quinn to Everyone:
	Shaddaj — no pressure if you need to go, but do you mind if I ask a few questions about HydroLogic?
13:03:10 From Sodium  to Everyone:
	thanks for coming! <3
13:03:44 From quinn to Everyone:
	Thank you so much for joining, and really amazing work with all of this!
13:04:53 From quinn to Everyone:
	https://hydro-project.github.io/hydroflow/design_docs/2022-02_time_strata_design_doc.html
13:05:07 From Russell Brown to Everyone:
	thanks for the meetup. I'll be using katara to generate _models_ and compare implementations at first I think. Great work!
13:09:06 From Sodium  to Everyone:
	https://github.com/vmware/database-stream-processor
13:11:25 From Scott Fritchie to Everyone:
	I really enjoyed working with Leonid & Mihai, fun to see they're big contributors to that dbsp repo
13:11:47 From Zeeshan Lakhani to Everyone:
	Mihai has done like everything (networking programming, etc)
2 Likes

@Quinn’s fabulous thread on the paper: https://twitter.com/wilton_quinn/status/1581018653603946496?s=20&t=nV74pfGpmfxDurcsTIYIWg

1 Like