
Yes - The BFT problem only matters when you have Byzantine actors. But I think users deserve and expect the system to be reasonably well behaved and predictable in all situations. Anything publicly writable, for example, needs BFT resilience. Or any video game.

As for the prosemirror problem, I assume you’re talking about weird merges from users putting markdown in a text crdt? You’re totally right - this is a problem. Text CRDTs treat documents as a simple sequence of characters. And that confuses a lot of structured formats. For example, if two users concurrently bold the same word, the system should see that users agree that it should be bolded. But if that “bold” intent is translated into “insert double asterisks here and here”, you end up with 4 asterisks before and after the text, and that will confuse markdown parsers. The problem is that a text crdt doesn’t understand markdown. It only understands inserting items, and bolding something shouldn’t be treated as an insert.
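A rough sketch of the double-bold case, assuming a toy merge that places every concurrent insert at its original index and orders ties by site id. The helper `merge_text_inserts` and the site names are made up for illustration; real text CRDTs use per-character ids rather than raw indexes.

```python
def merge_text_inserts(base, inserts):
    """Toy plain-text merge: inserts are (index, site_id, text) against `base`.

    Inserts that target the same index are all kept, ordered by site_id.
    """
    by_index = {}
    for index, site_id, text in inserts:
        by_index.setdefault(index, []).append((site_id, text))

    out = []
    for i in range(len(base) + 1):
        for _, text in sorted(by_index.get(i, [])):
            out.append(text)
        if i < len(base):
            out.append(base[i])
    return "".join(out)


base = "say hello now"

# "Bold the word hello" translated into two markdown inserts.
alice_ops = [(4, "alice", "**"), (9, "alice", "**")]
bob_ops = [(4, "bob", "**"), (9, "bob", "**")]

one_user = merge_text_inserts(base, alice_ops)
both_users = merge_text_inserts(base, alice_ops + bob_ops)

print(one_user)    # say **hello** now
print(both_users)  # say ****hello**** now
```

Both users wanted the same thing, but the merge only sees four independent inserts, so it keeps all of them.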

JSON editing has similar problems. I’ve heard of plenty of people over the years putting json text into a text crdt, only to find that when concurrent edits happen, the json grows parse errors. Eg, if two users concurrently insert “a” and “b” into an empty list, the result is [“a””b”], which can’t be parsed.
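Here is the empty-list case as a small sketch. It assumes a toy character-level merge (the hypothetical `merge_text_inserts` below, not any real library's algorithm) and compares it with merging at the list level instead.

```python
import json


def merge_text_inserts(base, inserts):
    """Toy plain-text merge: inserts are (index, site_id, text) against `base`."""
    by_index = {}
    for index, site_id, text in inserts:
        by_index.setdefault(index, []).append((site_id, text))
    out = []
    for i in range(len(base) + 1):
        out.extend(text for _, text in sorted(by_index.get(i, [])))
        if i < len(base):
            out.append(base[i])
    return "".join(out)


def parses(text):
    """Return True if `text` is valid JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# Both users type an element between the brackets of "[]".
merged_text = merge_text_inserts("[]", [(1, "alice", '"a"'), (1, "bob", '"b"')])
print(merged_text, parses(merged_text))  # ["a""b"] False

# Merging at the list level instead: each user inserts a *value*, not characters.
concurrent_values = [("alice", "a"), ("bob", "b")]
merged_list = [value for _, value in sorted(concurrent_values)]
print(json.dumps(merged_list))  # ["a", "b"]
```

The text merge has no idea a comma is needed; the list-level merge never deals with commas at all.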

The answer to both of these problems is to use CRDTs which understand the shape of your data structure. Eg, use a json OT/crdt system for json data (like sharedb or automerge). Likewise, if the user is editing rich text in prosemirror then you want a rich text crdt like peritext. Rich text CRDTs add the concept of annotations - so if two users bold overlapping regions of text, the crdt understands that the result should be that the entire region is bolded. And that can be translated back to markdown if you want.
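To make the annotation idea concrete, here is a simplified sketch where bold is stored as ranges next to the text rather than as characters in it. It uses plain index ranges for brevity; that is an assumption of this example, not how Peritext actually anchors its marks. `merge_bold_ranges` and `to_markdown` are illustrative names.

```python
def merge_bold_ranges(ranges):
    """Union overlapping or touching (start, end) ranges; end is exclusive."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def to_markdown(text, bold_ranges):
    """Render the text with each merged bold range wrapped in ** once."""
    out = []
    pos = 0
    for start, end in merge_bold_ranges(bold_ranges):
        out.append(text[pos:start])
        out.append("**" + text[start:end] + "**")
        pos = end
    out.append(text[pos:])
    return "".join(out)


text = "the quick brown fox"
alice_bold = (4, 15)   # "quick brown"
bob_bold = (10, 19)    # "brown fox"

print(to_markdown(text, [alice_bold, bob_bold]))  # the **quick brown fox**
print(to_markdown(text, [(4, 9), (4, 9)]))        # the **quick** brown fox
```

Two users bolding the same word produces one bold span, because the merge works on the intent (a range is bold) rather than on asterisk characters.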

Luckily, in over a decade of work in the collaborative editing space, I haven’t found any examples of this problem other than rich text and json. I think if a collaborative editing system supports json data, rich text and plain text then it’ll be enough to add collaborative editing support to 99% of applications.

The ink & switch people did a great write up of how rich text CRDTs work here: https://www.inkandswitch.com/peritext/



> Or any video game

Agreed, BFT is clearly needed for multiplayer CRDT backed video games.

> I assume you’re talking about weird merges from users putting markdown in a text crdt?

Nope, although that is an issue. In that case the document shouldn't be markdown, it should be a rich text CRDT that's converted to markdown as output.

As an example of the conflicts I mentioned, say you are building a to-do list app. First let's do it with Prosemirror and Yjs, but for some reason we have decided to limit the length of a to-do list to 10 items. Prosemirror will let you do that when defining a schema, by setting a maximum number of child nodes on the parent node type. With the current Yjs/Prosemirror system, if you have 9 items in the list and two people concurrently add a 10th, one of them will be dropped by Prosemirror (deterministically). The document schema enforces that rule outside of the CRDT implementation. Yjs xmlFragments do not have the concept of these sorts of rules.

Now say you want to do this with the JSON-like Map and Array types. Again, the array type does not have the concept of a length limit, so it will merge the two concurrent edits and create a document with 11 entries. In this case your application needs to detect that the document no longer complies and correct it.

The issue comes if you are naively merging the documents on the server and dumping the JSON: that will not take into account your application's own conflict resolution. My suggestion is that a CRDT schema could do this. It would be a bit like a JSON Schema, but with rules about how to correct misshapen structures.
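A rough sketch of what such a repair rule could look like for the 10-item list. Everything here is hypothetical (`merge_appends`, `repair_max_length`, the site ids); it only shows the shape of the idea: merge first, then apply a deterministic correction that every peer and the server would run identically.

```python
MAX_ITEMS = 10


def merge_appends(base, concurrent_appends):
    """Toy array merge: keep every concurrent append, ordered by site id."""
    return base + [item for _, item in sorted(concurrent_appends)]


def repair_max_length(items, limit=MAX_ITEMS):
    """Hypothetical schema rule: keep the first `limit` items, report the rest."""
    return items[:limit], items[limit:]


base = [f"task {n}" for n in range(1, 10)]  # 9 existing items

# Two users concurrently add a 10th item.
merged = merge_appends(base, [("bob", "bob's task"), ("alice", "alice's task")])
print(len(merged))  # 11, the generic array type has no length limit

kept, dropped = repair_max_length(merged)
print(len(kept), dropped)  # 10 ["bob's task"]
```

Because the rule is deterministic, all replicas converge on the same 10 items. The cost is that one user's insert is discarded after the fact, and this sketch does nothing to tell them about it.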

So yes, I agree these generic rich text plus JSON types cover what 99% of applications need, but they also need to enforce a shape on the data structure that isn't built into the generic types. Having a way to do that as part of the merging layer, rather than the application layer, would help to ensure correctness.


Yeah I hear you. I've still never found a good way to implement schema validation rules like "this list cannot have more than 10 items" on top of a CRDT. Eventually consistent collaborative editing systems (like CRDTs and OT based systems) are optimistic. They allow any edits, and then later merge things if necessary. But if you only find out the list has 11 elements at the merging step, what do you do? It's too late to reject either of the inserts which put you in that state.

My best answer at the moment is to tell people to rethink validation rules like that. I can't think of a lot of good use cases for collaborative editing where a "length <= 10" rule is something you want.

Unfortunately, validation rules are really important for referential integrity. If you add a reference to some item in a data set and I concurrently delete that item, what should happen? Does the delete get undone? Does the reference get removed? Is the reference just invalid now? Should references only be to an item at some point in time (eg like a git SHA rather than a path)? Maybe optimistic systems just can't have referential integrity? It's an uncomfortable problem.
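To make the first three options concrete, here is a small sketch of each as a merge policy. The function `resolve` and the policy names are invented for this example, and the point-in-time (git SHA style) option is left out. None of these is presented as the right answer.

```python
def resolve(items, refs, deleted_ids, policy):
    """Merge a concurrent 'add reference' and 'delete item' under a given policy.

    items: dict of id -> value before the delete
    refs: list of item ids that are referenced after the merge
    deleted_ids: set of item ids deleted concurrently
    """
    if policy == "undo_delete":
        # A delete only sticks if nothing references the item.
        still_deleted = deleted_ids - set(refs)
        return {k: v for k, v in items.items() if k not in still_deleted}, list(refs)

    remaining = {k: v for k, v in items.items() if k not in deleted_ids}
    if policy == "drop_reference":
        # The delete wins and takes the reference with it.
        return remaining, [r for r in refs if r not in deleted_ids]
    if policy == "allow_dangling":
        # Both edits are kept; readers must cope with a broken reference.
        return remaining, list(refs)
    raise ValueError(f"unknown policy: {policy}")


items = {"t1": "buy milk", "t2": "call bank"}
refs = ["t1"]      # one user adds a reference to t1
deleted = {"t1"}   # another user concurrently deletes t1

for policy in ("undo_delete", "drop_reference", "allow_dangling"):
    merged_items, merged_refs = resolve(items, refs, deleted, policy)
    dangling = [r for r in merged_refs if r not in merged_items]
    print(policy, sorted(merged_items), merged_refs, dangling)
```

Each policy converges, but each one silently overrides somebody's edit or leaves the data in a state the application has to handle.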



