
RethinkDB was good from a distributed systems perspective, but a nightmare to maintain in production. Backups and restores would take 12+ hours on ten gigabytes or so, and slow queries would grind the whole system to a halt, which isn't possible in Erlang-based systems like Riak thanks to the preemptive scheduling of the BEAM.


Riak builds on the theoretical foundation laid out in Amazon's Dynamo paper. The other NoSQL databases didn't have this theoretical underpinning, so they could only offer a "best effort".


Sure, but that said, no other DB Jepsen had tested up to that point necessitated the kind of gymnastics he had to do to get it to fail [0]. It's pretty solid CS, and it's a shame the project ended the way it did.

[0] https://aphyr.com/posts/330-jepsen-rethinkdb-2-2-3-reconfigu...


Dynamo is not exactly a performant or efficient model. It's the equivalent of pulling all the distributed systems guts out and handing them to the user to deal with. And the resulting toll is quantifiable: http://damienkatz.net/2013/05/dynamo_sure_works_hard.html
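
For anyone who hasn't lived with that: in a Dynamo-style store, concurrent writes can leave multiple "siblings" for the same key, and merging them is the application's job. A rough, hypothetical Python sketch of what that client-side burden looks like (the merge function and value shapes are made up for illustration, not Riak's actual API):

    # Hypothetical sketch of client-side sibling resolution in a
    # Dynamo-style store. Nothing here is Riak's real API.
    def merge_cart_siblings(siblings):
        """Each sibling is a set of item ids written by a different
        concurrent writer; the classic Dynamo resolution is a union,
        which can resurrect deleted items."""
        merged = set()
        for sibling in siblings:
            merged |= sibling
        return merged

    # Two replicas accepted writes during a partition:
    sibling_a = {"book", "kettle"}
    sibling_b = {"book", "socks"}   # "kettle" was deleted on this side
    print(merge_cart_siblings([sibling_a, sibling_b]))
    # {'book', 'kettle', 'socks'} -- the delete is lost; that's the kind
    # of decision Dynamo hands back to the application.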


Damien's a very smart guy, but I don't think I agree with him here:

> Within a datacenter, the Mean Time To Failure (MTTF) for a network switch is one to two orders of magnitude higher than servers, depending on the quality of the switch.

Switches are highly unlikely to fail; they seem to be bulletproof. But having worked in a datacenter (on the engineering team of an early AWS competitor), switch _misconfiguration_ was all too common. Maybe a tech accidentally plugs in the wrong Ethernet cable and forms a switching loop. Maybe someone fat-fingers a tag and a broken VLAN gets automatically deployed to 10,000 nodes. Either way, the _switch_ is alive, well, and pushing packets, but they're the _wrong_ packets, and to the end user the result is indistinguishable from hardware failure.

At datacenter scales, these things happen... not infrequently. If you engineer your database to expect that netsplits are rare, you're going to have a bad time.


VLANs were the bane of my existence when I had to figure out how to deal with them. I don't envy anyone whose job is to manage them on switches for a lot of servers.


Good points. Weren't the last couple of AWS outages partly due to misconfigured networking? Depending on your problem, replicated reads make sense given those kinds of outages. Though I'm new-ish to Riak's "core" design: when you get the active preference list (APL) of servers, isn't it feasible to create a design similar to what Damien's proposing, using a preferred master for a given vnode? A rough sketch of the idea is below.
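
Not a Riak expert either, but the idea seems workable in principle. Here's a very rough Python sketch of what "preferred master per key/vnode" might look like on top of a consistent-hash ring; the names and ring mechanics are simplified and hypothetical, not riak_core's actual implementation:

    # Hypothetical sketch: pick a preferred primary from a preference
    # list on a consistent-hash ring. Not riak_core's real API.
    import hashlib
    from bisect import bisect_right

    NODES = ["node1", "node2", "node3", "node4"]
    N_REPLICAS = 3

    def ring_position(value):
        return int(hashlib.sha1(value.encode()).hexdigest(), 16)

    # One vnode per physical node here, just to keep the sketch small.
    RING = sorted((ring_position(n), n) for n in NODES)

    def preference_list(key, n=N_REPLICAS):
        """The n nodes that follow the key's hash around the ring."""
        start = bisect_right(RING, (ring_position(key), ""))
        return [RING[(start + i) % len(RING)][1] for i in range(n)]

    def preferred_master(key, up_nodes):
        """First reachable node in the preference list acts as the
        'preferred master'; reads/writes fall back to the rest."""
        for node in preference_list(key):
            if node in up_nodes:
                return node
        return None

    print(preference_list("user:42"))
    print(preferred_master("user:42", up_nodes={"node2", "node3"}))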


Cassandra was based on the same thing, no? In my experience Cassandra has been what's beaten Riak.


If I remember correctly, Cassandra is actually an ideological Frankenstein, stitched together from pieces of Bigtable and Dynamo.

EDIT: I don't mean to disparage it, just that it doesn't come from a single lineage as purely as Riak does. It certainly appears to have won.


I prefer to think of it as the mullet of the database world: Bigtable in the front, Dynamo in the back.


Vice versa if the network is the front and the disk is the back (as it is in the code): Dynamo handles the gossip and distribution, Bigtable the on-disk storage.


Are those times for real? You could snapshot the underlying disks and copy them byte for byte orders of magnitude faster than that.
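
The back-of-envelope math makes the gap pretty stark. Assuming ~100 MB/s of sustained sequential throughput (my assumption, not a measured number):

    # Rough back-of-envelope, assuming ~100 MB/s sustained throughput.
    data_gb = 10
    throughput_mb_s = 100                      # assumed, not measured
    copy_seconds = data_gb * 1024 / throughput_mb_s
    print(copy_seconds)                        # ~102 seconds
    backup_seconds = 12 * 3600                 # the reported 12 hours
    print(backup_seconds / copy_seconds)       # ~420x slower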


The problem was replication across the cluster and getting the nodes to coordinate their values. We eventually did what you suggested in our dev environments so that we didn't lose our sanity.



