Because distributing a database is sensitive to CAP semantics (AP or CP) and data dependencies (graph partitioning is hard *), and storage engine choices are driven by use cases, so the general technical solution is highly complex **. Spanner uses atomic clocks, for example. Running CockroachDB yourself is [very likely] not the same thing as using a SaaS variant, either. Sight unseen, it cannot be 'trivial'. Same for Spanner. The general solution seems to require paying someone to provide the service. In sum, it is not a clear-cut yes/no situation.

btw, iirc [distributed] Postgres was never as stellar as the single-node deployment (the stuff we sing praises of). I'm sure it has improved significantly.

> "manual operations required to reshard or rebalance from time to time. When it's part of the database itself, all those problems just... disappear."

Not really correct.

* "Choosing the right keys can help Spanner evenly distribute data and processing to avoid hotspots"

https://cloud.google.com/spanner/docs/schema-design

https://cloud.google.com/blog/topics/developers-practitioner...
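To make the hotspot point concrete, here is a toy sketch. It is not Spanner's actual split logic; NUM_SPLITS, KEY_SPACE, split_for and hashed are names made up for illustration, and the only assumption is plain range partitioning, where each split owns one contiguous slice of the key space. Monotonically increasing keys all land in one split, while hashing the key first spreads them out:

```python
import hashlib
from collections import Counter

NUM_SPLITS = 4        # hypothetical number of key-range splits
KEY_SPACE = 2 ** 32   # hypothetical size of the key space


def split_for(key: int) -> int:
    """Range partitioning: each split owns one contiguous slice of the key space."""
    return key * NUM_SPLITS // KEY_SPACE


def hashed(key: int) -> int:
    """Map a key to a pseudo-random position in the key space via SHA-256."""
    digest = hashlib.sha256(str(key).encode()).digest()
    return int.from_bytes(digest[:4], "big")


# Monotonically increasing keys, e.g. an auto-increment id or a timestamp.
sequential_keys = range(1_000_000, 1_001_000)

# Neighbouring keys fall into the same slice, so one split takes every write.
sequential_load = Counter(split_for(k) for k in sequential_keys)

# Hashing first scatters neighbouring keys across the whole key space.
hashed_load = Counter(split_for(hashed(k)) for k in sequential_keys)

print("sequential:", dict(sequential_load))
print("hashed:    ", dict(hashed_load))
```

The trade-off is that hashed keys give up cheap range scans over the original ordering, which is why key choice stays a schema design decision rather than something the database can make disappear.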

** https://static.googleusercontent.com/media/research.google.c...

[Spanner certainly did -not- start off as a distributed RDBMS, because that project would never have been given a green light: it is understood just how complex that system would need to be. It started off as a distributed k/v store. That's it.]

"[I]n many ways today’s Spanner is very different from what was described [in original Spanner whitepaper]"

...

"The initial focus of Spanner was on scalability and fault-tolerance, much as was the case for many systems at Google. In the last few years Spanner has increasingly become a fully-fledged database system. Production deployment with internal customers taught us a great deal about the requirements of web scale database applications, and shaped the techniques we presented in this paper. Aggressive focus on horizontal scalability enabled widespread deployment without investing too heavily in single machine performance. The latter is one of the areas where we are making continuous improvements, such as upgrading lower level storage to Ressi."

"The original API of Spanner provided NoSQL methods for point lookups and range scans of individual and interleaved tables. While NoSQL methods provided a simple path to launching Spanner, and continue to be useful in simple retrieval scenarios, SQL has provided significant additional value in expressing more complex data access patterns and pushing computation to the data."


