Project Cambria and Building the new OpenDoc

Geoffrey Litt (PhD student at MIT) @geoffreylitt and Peter van Hardenberg (researcher at Ink and Switch) @pvh will be presenting about Cambria.

Cambria is an experimental TypeScript library for managing schema change in distributed systems. It aims to let developers express relationships between schemas using bidirectional lenses, and to avoid mixing compatibility code into application logic.
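To make the idea of a bidirectional lens concrete, here is a minimal sketch in TypeScript. It is not Cambria's actual API: the `Lens` interface, `renameProperty`, `compose`, and the example field names are all hypothetical, chosen only to show how one declaration can translate documents in both directions.

```typescript
// Hypothetical types for illustration; this is NOT Cambria's actual API.
type Doc = Record<string, unknown>;

interface Lens {
  forward(doc: Doc): Doc;  // old schema -> new schema
  backward(doc: Doc): Doc; // new schema -> old schema
}

// A lens that renames one property and can be run in either direction.
function renameProperty(from: string, to: string): Lens {
  const move = (doc: Doc, source: string, destination: string): Doc => {
    if (!(source in doc)) return { ...doc };
    const { [source]: value, ...rest } = doc;
    return { ...rest, [destination]: value };
  };
  return {
    forward: (doc) => move(doc, from, to),
    backward: (doc) => move(doc, to, from),
  };
}

// Compose lenses: forward applies them in order, backward in reverse order.
function compose(...lenses: Lens[]): Lens {
  return {
    forward: (doc) => lenses.reduce((d, lens) => lens.forward(d), doc),
    backward: (doc) => lenses.reduceRight((d, lens) => lens.backward(d), doc),
  };
}

// Assumed example schemas: v1 uses `name`/`done`, v2 uses `title`/`complete`.
const v1ToV2 = compose(
  renameProperty("name", "title"),
  renameProperty("done", "complete"),
);

const oldDoc: Doc = { name: "Write lenses", done: false };
const newDoc = v1ToV2.forward(oldDoc);      // document as a v2 client sees it
const roundTrip = v1ToV2.backward(newDoc);  // back to the v1 shape
```

The application code on either side only ever sees its own schema; the compatibility logic lives entirely in the lens.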

The Project Cambria article https://inkandswitch.github.io/cambria/ is already here in the forum and is important background reading.

Plus other #ink-and-switch work.

Video

Slides on Pitch:

Resources

Github: https://github.com/inkandswitch/cambria

Local links to #crdt related content

Via @maparent: Marc Shapiro, Nuno Preguiça, Carlos Baquero, Marek Zawirski. Conflict-free Replicated Data Types. [Research Report] RR-7687, 2011, pp.18. ⟨inria-00609399v1⟩ https://hal.inria.fr/inria-00609399v1/document

Chat Log
Time User Message
00:35:53 Marc-Antoine Parent: asymptomatically typeless :wink:
00:36:20 Marc-Antoine Parent: asymptotically, sorry
00:37:56 Jess Martin: reminds me of this article on software longevity called “100 year computer”: The 100 Year Computer - Tales From The Dork Web
00:38:29 Jess Martin: without schema migration support, software “ossifies” over time, which makes it really hard to write software that can run for 100 years without needing to be completely rewritten from scratch
00:38:36 Jess Martin: makes it hard to build “evolutionary” systems
00:39:11 Boris Mann: I see breaking API changes as being evolutionary pressure
00:39:31 Peter: the name “cambria” is no accident
00:39:35 Boris Mann: SHOULD we break APIs more often? If no one screams, no one cares. If a client breaks, it means it doesn’t have sustainable maintenance.
00:39:59 Boris Mann: Mainly this sucks for users who aren’t devs.
00:41:37 Brent Anderson: Having options to lift this above the storage layer is so important for stateless systems. Is that one of the key insights to Cambria?
00:42:24 Brian Cloutier: Maybe we should break APIs more often but it’s tantamount to abandoning maintainerless software
00:42:41 Peter: shout out to jim pick on the call here who helped us explore an early version of this
00:42:53 Jim Pick: /me waves
00:43:34 Jess Martin: Mann: feels like using a blowtorch to light a candle. :slight_smile: I agree that schema migrations are a signal that evolution is happening! But if preserving forwards/backwards compatibility is possible and easy, why shouldn’t we?
00:43:51 Benjamin Bollen: are there constraints on this graph of lenses? eg. a global directional flow / foliation
00:44:32 Brooklyn Zelenka: They’re bidirectional, so the arrows always point both ways
00:44:54 Boris Mann: @J Martin: yes, for sure. I’m thinking a lot about people supporting maintainers in a more wholistic way these days. I am all for using approaches like this to contain breakage.
00:44:58 Benjamin Bollen: right, but it could be circular / contain loops
00:45:01 Brooklyn Zelenka: It’s also a graph, so it’s not “just” server, you can get very forked schemata
00:45:05 Peter: loops are fine
00:45:07 Brooklyn Zelenka: Yeah, totally
00:45:12 Peter: we do a Dijkstra search to find a path
00:45:22 Benjamin Bollen: nice
00:45:39 Peter: Geoffrey will, I’m sure return to this, but not every migration is necessarily reversible. You will still need an escape hatch now and again.
00:45:42 Brent Anderson: Somewhat out of scope with schema migration: What about CRDT compaction? If we’re accumulating a pile of CRDT deltas and sharing them between peers, what options exist for preventing the massive accumulation of past events?
00:45:52 Peter: Let’s save that for the end, Brent
00:46:00 Jess Martin: “YAML. we don’t love it, either.” Heard that one before. :wink:
00:46:07 James Conkling: anyone have a good CRDT for Dummies resource?
00:46:30 Peter: happy to have a CRDT sidebar at the end
00:46:38 Boris Mann: @James: that’s a good thing to look for — I think we have some things at the “papers” level in the Fission forum, but would be good to have more resources and samples.
00:46:39 Marc-Antoine Parent: I got a lot out of this: https://hal.inria.fr/inria-00609399v1/document
00:47:43 James Conkling: I’m going to have to drop early, unfortunately, so won’t be able to join a sidebar at the end. But will definitely follow up on the above resources. Thanks all!
00:47:55 Jess Martin: hmmm… this really goes beyond compatibility. this makes possible different representations of core data.
00:48:20 Peter: check out automerge (our crdt), y-js (the other best crdt out there). also crdt.tech by martin kleppmann
00:48:23 Jess Martin: can represent the data in the user-facing application as it most naturally should be represented as appropriate to context :thinking:
00:48:25 Boris Mann: @James if you registered, I’ll send out email with a link to the video and any resources we gather
00:48:37 karlicoss: Maybe a bit of an overkill at this point, but perhaps dependent types or some other kind of formal verification could be used to verify that the lenses are reversible (I guess the most important properly?). This might allow imposing less structure and assumptions on lenses
00:49:45 karlicoss: s/properly/property
00:49:55 Jess Martin: doesn’t go through the “migration stack” like in Stripe impl
00:50:10 Jim Pick: might be a fun tool for mapping schemas between proprietary silos for cross-silo sync tooling
00:50:12 Steven: Amazing :clap:
00:50:31 Brian Cloutier: For the push/pull distinction, it feels possible to extend this to work over more than JSON objects. You might write a lens which transforms one sequence of RPC calls into another sequence.
00:51:08 Marc-Antoine Parent: it had schemas but not a good schema migration story
00:51:24 Jess Martin: someone should capture a quote
00:51:25 James Conkling: @Boris Thanks! Pretty sure I registered. If not:
00:52:58 Dylan Steck: Peter, that was key: a key to semantic web technologies is using different schemas to relate to other people’s schemas. It’s a system like this that is going to help “un-silo” data and build out some of the core linked data concepts without cloning the Internet (like Blockstack)
00:54:41 Idan Gazit: Yes, Kafka, the datastore known for its administrative ergonomics
00:55:23 Idan Gazit: This sounds an awful lot like “Store UTC, convert at display time”
00:55:30 Peter: absolutely!
00:55:33 Idan Gazit: (In a good way!)
01:03:39 Boris Mann: “Nice squiggles”
01:03:45 Boris Mann: Well, that’s going to be a t-shirt
01:06:34 Brent Anderson: So are lenses always associative, or is it possible for a lens to be commutative?
01:06:58 James Conkling: how do you get loops in the schema graph?
01:07:08 Idan Gazit: “PRs not mergeable without tests, docs, lenses”
01:07:15 Peter: the lenses are not directional
01:07:18 Helder S Ribeiro: So how does the schema for representing the lens graph evolve?
01:07:31 Boris Mann: Idan: yaaas
01:07:33 Peter: haha, so we tried writing lenses for that because we changed the format at one point
01:07:49 Peter: in brief, writing lenses for your lenses is possible but Lovecraftian to try and read and think about
01:08:11 Peter: impossible geometries. unthinkable complexities. horrors that cause the human mind to shrink in terror.
01:08:15 Helder S Ribeiro: “who lenses the lensmen”
01:08:35 Peter: also, more practically, we were missing a couple of lenses needed to do it and didn’t get to them during project implementation
01:08:35 Benjamin Bollen: have you considered “versions” or “names of nodes” to be derived from the type
01:08:51 Marc-Antoine Parent: morphism->functor->natural transformation… same idea, easy, right?
01:08:57 Antranig Basman: Sadly, the world requires us to think the unthinkable
01:09:23 Peter: fun fact, Haskell lenses were inspired by the academic research that we built on
01:10:01 Marc-Antoine Parent: Yes, had met Pierce’s lens through Unison ages ago!
01:10:23 Dillon Kearns: Relevant talk: “Evergreen Elm” by Mario Rogic (YouTube). Uses the same concept of migrating data. And it’s being used in production in https://lamdera.app/ (announcement talk: “Elm as a Service” by Mario Rogic, YouTube)
01:13:56 James Conkling: I’ve got to drop. Thanks Geoffrey and Peter and all. This is giving me a lot to chew on.
01:14:40 Jess Martin: also, knowledge representation.
01:14:45 Jess Martin: we suck at that, too
01:15:35 Jess Martin: mapping across applications is the more exciting application! or mapping across use cases/views. we can represent knowledge in a stronger form, then allow for manipulating and interacting with the knowledge using appropriate “lenses” on the data
01:15:38 Idan Gazit: Apologies! I also have to drop, right in the middle of a proper diatribe that I’m 100% into
01:15:41 Jess Martin: separate the tools from the data
01:16:06 Marc-Antoine Parent: Let’s face it, interop is opposed by business models also. But I agree, some of us want interop.
01:16:29 Jess Martin: here’s that PVH quote as a tweet: https://twitter.com/jessmartin/status/1359556952322408451
01:16:33 Jess Martin: :wink:
01:17:12 karlicoss: Hopefully with Cambria it would be possible to adversarily interop with such businesses :slight_smile:
01:17:37 Helder S Ribeiro: amen to that
01:17:59 Marc-Antoine Parent: indeed
01:18:32 Helder S Ribeiro: advop all the things
01:20:57 Brooklyn Zelenka: FWIW we’ve been exploring lenses + datalog (early work, but still)
01:21:08 David K: Nice
01:23:02 Jess Martin: On Prolog: “rummage through that corpse in the alley and steal anything we can use”
01:23:32 Boris Mann: J Martin: welcome to the harvest PVH quotes channel!
01:24:11 Gyuri Lajos: https://twitter.com/TrailHub1/status/1359499797007327232
01:26:04 David K: :wave:
01:27:12 Tanishq Kancharla: Do people want to drop their twitters? Mine is @moonriseTK.
01:27:36 Jess Martin: something like general knowledge vs context-specific knowledge
01:28:12 Boris Mann: Good idea! Please do — I’m @bmann, @FissionCodes is the company account
01:28:23 Marc-Antoine Parent: @maparent
01:28:29 Brent Anderson: @brentjanderson or @bja@mastodon.social
01:28:29 Marc-Antoine Parent: sorry @ma_parent
01:28:45 Antranig Basman: I’m sure I saw a “raise hand” control in other Zoom meetings but I don’t seem to see one here : P
01:28:45 Boris Mann: @bmann@social.coop on Masto, good call Brent!
01:28:48 Gyuri Lajos: @TrailHub1
01:28:58 Brian Cloutier: @bmc_
01:29:08 Ian: @ianopolous is me
01:29:36 Boris Mann: @Antranig — under Reactions at bottom of your screen
01:29:36 karlicoss: @karlicoss
01:30:08 Brian Cloutier: I really hope the pvh rant was recorded, that was some inspirational stuff
01:31:04 Boris Mann: Who can do me a dubstep remix of Peter Rant?
01:31:19 Boris Mann: I’m also down with a Sea Shanty version
01:31:27 James Walker: I was gonna say… shouldn’t it be a sea shanty ?
01:32:15 Helder S Ribeiro: (hands waving, doing buzzword arithmetic) would lens graphs and ipld relate in some useful way?
01:32:18 Brian Cloutier: The Cambriaman
01:33:20 Jess Martin: digging Antranig’s office lighting vibe. very film noir.
01:35:06 Antranig Basman: It’s just the evening : P
01:35:27 Peter: @helder: possibly. We’ve talked to them a bit – I think they compose well
01:35:35 Marc-Antoine Parent: Q: am I right that you built the schema transformation language while assuming the schema language (JsonSchema, I guess…)
01:37:09 Benjamin Bollen: right, so now it’s a DAG of bidirectional lenses (which still have primary direction ‘source - target’)
01:37:16 Benjamin Bollen: so there is a foliation
01:37:23 Antranig Basman: It is actually an edit transformation language rather than a schema transformation language
01:37:45 Boris Mann: Topics tagged crdt
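
Peter's remark in the chat that "we do a Dijkstra search to find a path" through the lens graph, and that loops are fine, can be illustrated with a small sketch. This is an assumption-laden illustration, not Cambria's code: the `LensEdge` shape, the per-lens `cost`, the schema names, and `findLensPath` are all hypothetical. Each lens is treated as an undirected edge because it can be run in both directions.

```typescript
// Hypothetical schema names and costs; NOT Cambria's actual data structures.
interface LensEdge {
  a: string;    // schema on one side of the lens
  b: string;    // schema on the other side
  cost: number; // assumed non-negative cost of applying the lens
}

// Dijkstra's algorithm (simple O(V^2) form) over an undirected lens graph.
function findLensPath(edges: LensEdge[], from: string, to: string): string[] | null {
  // Build an undirected adjacency list, since each lens runs both ways.
  const neighbours = new Map<string, { node: string; cost: number }[]>();
  const link = (x: string, y: string, cost: number): void => {
    if (!neighbours.has(x)) neighbours.set(x, []);
    neighbours.get(x)!.push({ node: y, cost });
  };
  for (const { a, b, cost } of edges) {
    link(a, b, cost);
    link(b, a, cost);
  }

  const dist = new Map<string, number>([[from, 0]]);
  const prev = new Map<string, string>();
  const done = new Set<string>();

  while (true) {
    // Pick the unfinished schema with the smallest known distance.
    let current: string | null = null;
    for (const [node, d] of dist) {
      if (!done.has(node) && (current === null || d < dist.get(current)!)) current = node;
    }
    if (current === null) return null; // target is unreachable
    if (current === to) break;
    done.add(current);

    // Relax every lens that touches the current schema.
    for (const { node, cost } of neighbours.get(current) ?? []) {
      const candidate = dist.get(current)! + cost;
      if (candidate < (dist.get(node) ?? Infinity)) {
        dist.set(node, candidate);
        prev.set(node, current);
      }
    }
  }

  // Walk back from the target to recover the sequence of schemas.
  const path = [to];
  while (path[0] !== from) path.unshift(prev.get(path[0])!);
  return path;
}

// A graph with a loop (v1-v2-v3-v1): the search still terminates.
const lensGraph: LensEdge[] = [
  { a: "v1", b: "v2", cost: 1 },
  { a: "v2", b: "v3", cost: 1 },
  { a: "v1", b: "v3", cost: 5 },
  { a: "v3", b: "v4", cost: 1 },
];
const lensPath = findLensPath(lensGraph, "v1", "v4"); // ["v1", "v2", "v3", "v4"]
```

With uniform costs this reduces to a breadth-first search; the weights are included only to show where a preference for cheaper or less lossy lenses could plug in.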