> Pick a stable, guaranteed-to-exist, shard key (composite or atomic properties...

AdieuToLogic · on April 10, 2023

> This is a pretty risky approach since it's almost certainly the case that you won't end up evenly distributing your data across shards using this method.

Distributing data across shards is a function of the properties selected to use for partitioning. So I do not understand how "a stable, guaranteed-to-exist, shard key (composite or atomic properties)" is "a pretty risky approach."

Nathanba · on April 7, 2023

More recent customers/users/accounts probably do more actions than very old accounts, though. How is that not also eventually creating uneven shards?

AdieuToLogic · on April 10, 2023

While high volume multi-tenant "customers/users/accounts" systems are common, they are not the only ones which benefit from sharded persistent stores.

For example, consider a system which monitors farm equipment for Caterpillar and John Deere. Let's say each company has 100k devices which send one message per day to the system.

While it is easy to envision sharding device messages based on "DeviceId / Company" in this hypothetical system, there would be no value in sharding by the two customers.
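
To make the farm-equipment example concrete, here is a minimal sketch of the two choices, assuming a hash-based router. The shard count, the function names, and the use of SHA-256 are all assumptions for illustration and are not taken from any particular system.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 8  # hypothetical shard count


def _bucket(key: str, num_shards: int) -> int:
    """Hash a string key to a shard index in [0, num_shards)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def shard_for_device(company: str, device_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Route on the composite "DeviceId / Company" key."""
    return _bucket(f"{device_id}/{company}", num_shards)


def shard_for_company(company: str, num_shards: int = NUM_SHARDS) -> int:
    """Route on the company alone."""
    return _bucket(company, num_shards)


companies = ["Caterpillar", "John Deere"]
devices_per_company = 100_000  # one message per device per day, as in the example

# Composite key: 200k distinct keys spread over every shard.
by_device = Counter(
    shard_for_device(company, device_id)
    for company in companies
    for device_id in range(devices_per_company)
)

# Company key: only two distinct keys, so at most two shards ever receive data.
by_company = Counter()
for company in companies:
    by_company[shard_for_company(company)] += devices_per_company

print("composite key:", sorted(by_device.items()))
print("company key:  ", sorted(by_company.items()))
```

With only two distinct company values, the second router can never use more than two shards no matter how many exist, which is the sense in which sharding the two customers adds no value here.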

preseinger · on April 9, 2023

you're right, uneven shards are an inevitable outcome of this approach

but shard "even-ness" is in direct tension with the concern of the GP, which is execution atomicity

frequently, it's better to have uneven shards (that you can e.g. scale independently when necessary) that give you atomic execution, than even shards that require distributed transactions
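
A rough sketch of the atomicity side of that trade-off, assuming each shard is an independent SQLite database and every row for an account is routed by the account id. The table layout, the `move_funds` helper, and the CRC32 router are hypothetical. Because both rows live on the same shard, the two updates commit or roll back together in one local transaction, with no cross-shard coordination.

```python
import sqlite3
import zlib

NUM_SHARDS = 4  # hypothetical shard count

# Each in-memory database stands in for one independent shard.
shards = [sqlite3.connect(":memory:") for _ in range(NUM_SHARDS)]
for shard in shards:
    shard.execute(
        "CREATE TABLE balances ("
        " account TEXT, wallet TEXT, amount INTEGER,"
        " PRIMARY KEY (account, wallet))"
    )


def shard_for(account: str) -> sqlite3.Connection:
    """Route every row belonging to an account to the same shard."""
    return shards[zlib.crc32(account.encode("utf-8")) % NUM_SHARDS]


def move_funds(account: str, src: str, dst: str, amount: int) -> None:
    """Move funds between two wallets of one account in a single local transaction."""
    db = shard_for(account)
    with db:  # commits on success, rolls back if an exception escapes
        db.execute(
            "UPDATE balances SET amount = amount - ? WHERE account = ? AND wallet = ?",
            (amount, account, src),
        )
        (remaining,) = db.execute(
            "SELECT amount FROM balances WHERE account = ? AND wallet = ?",
            (account, src),
        ).fetchone()
        if remaining < 0:
            raise ValueError("insufficient funds")
        db.execute(
            "UPDATE balances SET amount = amount + ? WHERE account = ? AND wallet = ?",
            (amount, account, dst),
        )


def balance(account: str, wallet: str) -> int:
    """Read one wallet's balance from the account's shard."""
    row = shard_for(account).execute(
        "SELECT amount FROM balances WHERE account = ? AND wallet = ?",
        (account, wallet),
    ).fetchone()
    return row[0]


# Seed one account with two wallets, then move funds between them.
with shard_for("acct-1") as db:
    db.executemany(
        "INSERT INTO balances VALUES (?, ?, ?)",
        [("acct-1", "a", 100), ("acct-1", "b", 0)],
    )
move_funds("acct-1", "a", "b", 30)
```

If the two wallets were on different shards, the same operation would need a distributed transaction or some compensating logic, which is the cost the comment above is weighing against uneven shard sizes.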