I don't need to be convinced. As I said, I'm very, very interested in CRDTs. (And to guide my interest, of course I'm heavily relying on Will and other folks who are far more experienced in this domain than myself.)
If we introduce reusable conflict-free mergeable data structures, then ideally I'd like them to adopt as many of the conventions established by collection types in the stdlib as practical, while also making them as space/time efficient as possible. (For example, I have this silly idea lodged very firmly in my head that diffing/merging algorithms for these things ought to be ~linear in the size of the diff, not the size of the data structure.) Achieving all this while also allowing for robust serialization is an interesting challenge. (I have some promising ideas, but at this stage they are far too delicate to withstand the heat of this forum.)
Re: the impossibility of automated merging: I believe that fully reliable merging simply isn't possible without a deep understanding of not only the meaning of the data, but also the motivation behind the changes being merged. E.g., given this situation:
- Original: Bob tagged a new release.
- Change A: Alice tagged a new release.
- Change B: Bob tagged a new release, then he went on vacation.
I expect a conflict-free mergeable string type would merge these two changes as "Alice tagged a new release, then he went on vacation" -- which is a reasonable outcome, but it's almost certainly not going to be the "correct" one. No algorithm (save some sort of strong AI) will ever be able to understand our documents well enough to merge parallel changes without supervision. (The correct outcome might be "Alice tagged a new release, then she continued working on the next one", or something completely different. It probably involves asking Alice about it.)
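To make the example above concrete, here is a toy sketch in Swift. It is not a real CRDT and not how any particular library works: the `Edit` type, the `merge` function, and the hand-picked tokenization are all hypothetical, chosen only to show how two non-overlapping edits against the same original combine mechanically.

```swift
// Toy model: every edit is expressed against token positions in the ORIGINAL text.
// Hypothetical types for illustration only; a real CRDT tracks far more metadata.
enum Edit {
    case replace(index: Int, with: String)        // swap the original token at `index`
    case insert(after: Int, tokens: [String])     // add tokens after the original token at `index`
}

func merge(original: [String], edits: [Edit]) -> [String] {
    // One slot per original token; a slot holds the token plus anything inserted after it.
    var slots = original.map { [$0] }
    for edit in edits {
        switch edit {
        case let .replace(index, token):
            slots[index][0] = token
        case let .insert(index, tokens):
            slots[index].append(contentsOf: tokens)
        }
    }
    return slots.flatMap { $0 }
}

// Tokenization is chosen by hand so that each change touches a different token.
let original = ["Bob", " tagged a new release", "."]
let changeA = Edit.replace(index: 0, with: "Alice")
let changeB = Edit.insert(after: 1, tokens: [", then he went on vacation"])

let merged = merge(original: original, edits: [changeA, changeB]).joined()
print(merged) // Alice tagged a new release, then he went on vacation.
```

The merge succeeds without any conflict, and the stale "he" survives untouched, because nothing in the edits themselves records that the pronoun depended on the name.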
CRDTs err on the side of (apparent) simplicity and predictability, which I think is a good idea. It's a technology that obviously works well enough for a great many use cases. But there is still plenty of room for other approaches! For example, I am not holding my breath for CRDTs to replace version control systems in software engineering. (Although they are encroaching on some of their territory.)
I think so! We're extremely lucky to have Michael also working on their Swift implementation.
(I'm hoping to land his PR soon and then apply some stdlib engineering tricks to make it really scream.)
Indeed. What I meant is that I believe that persistent data structures make a far better building block for any "mergeable" data structure than the stdlib's current contiguous collections. (The existing collections optimize for read-only access and in-place mutations of uniquely held storage. This makes perfect sense, but this particular use case is all about mutations of shared copies.)
CRDTs tend to accumulate oodles of data over their lifetime, and I expect that alone will cause plenty of trouble to deal with. All-or-nothing copy-on-write implementations would just add a needless exponent on top of these troubles.
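As a rough illustration of the difference, here is a minimal sketch of structural sharing in Swift. The `Node` class and `inserting(_:into:)` function are hypothetical, and the tree is a plain unbalanced binary search tree picked for brevity; a production design would presumably use something balanced or trie-based. The point is only that a mutation of a shared copy re-creates the nodes along one path, whereas an all-or-nothing copy-on-write collection has to copy its entire storage.

```swift
// A minimal persistent (immutable) binary search tree. Hypothetical sketch only.
final class Node {
    let key: Int
    let left: Node?
    let right: Node?

    init(key: Int, left: Node? = nil, right: Node? = nil) {
        self.key = key
        self.left = left
        self.right = right
    }
}

// Returns a new root. Only the nodes on the search path are re-created;
// every subtree off that path is shared with the old version.
func inserting(_ key: Int, into node: Node?) -> Node {
    guard let node = node else { return Node(key: key) }
    if key < node.key {
        return Node(key: node.key, left: inserting(key, into: node.left), right: node.right)
    } else if key > node.key {
        return Node(key: node.key, left: node.left, right: inserting(key, into: node.right))
    }
    return node // key already present: reuse the whole subtree
}

var v1: Node? = nil
for key in [50, 30, 70, 20, 40, 60, 80] {
    v1 = inserting(key, into: v1)
}

// "Mutate" a shared copy: v1 stays valid and unchanged, v2 is the new version.
let v2 = inserting(65, into: v1)

print(v1!.left === v2.left)                 // true: the left subtree is shared as-is
print(v1!.right === v2.right)               // false: this node lies on the modified path
print(v1!.right!.right === v2.right!.right) // true: the untouched sibling is shared
```

Both versions remain usable after the insertion, and the work done is proportional to the depth of the tree rather than to the number of elements stored.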
We are very interested in CRDTs. I'm hopeful our explorations in this space will bear fruit, but it's too early to tell if it will taste sweet enough!