Goal: spread data and query load evenly across nodes (avoid skew and hot spots)
If you assigned records to nodes randomly, data would be evenly distributed, but you'd have to query all nodes in parallel for every read, since there'd be no deterministic way to know which partition holds a given key
Partitioning by Key Range
Partitioning by Hash of Key
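A minimal sketch contrasting the two schemes; the boundary values, partition count, and use of md5 here are illustrative assumptions, not any particular database's implementation:

```python
import bisect
import hashlib

# --- Key-range partitioning: each partition owns a contiguous range of keys.
# Boundaries are illustrative; real systems choose them to balance data volume.
RANGE_BOUNDARIES = ["g", "n", "t"]  # partitions: [..g), [g..n), [n..t), [t..]

def range_partition(key: str) -> int:
    """Return the index of the partition whose key range contains `key`."""
    return bisect.bisect_right(RANGE_BOUNDARIES, key)

# --- Hash partitioning: a hash of the key picks the partition,
# destroying key ordering but spreading skewed keys evenly.
NUM_PARTITIONS = 4

def hash_partition(key: str) -> int:
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % NUM_PARTITIONS

for key in ["apple", "banana", "melon", "zebra"]:
    print(key, "-> range:", range_partition(key), "hash:", hash_partition(key))
```

Key-range keeps adjacent keys together (efficient range scans, but skewed access patterns create hot spots); hashing spreads load evenly but loses ordering, so range queries must hit all partitions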
Secondary indexes: used not for finding a particular record by key (key → value), but for finding occurrences of a value (e.g. all actions by user 123)
They're super important for relational & document DBs, but don't map cleanly to partitioning
Partitioning Secondary Indexes by Document
Partitioning Secondary Indexes by Term
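A rough sketch of the document-partitioned (local) approach, with the term-partitioned (global) alternative noted at the end; the `Partition` class and all names are hypothetical:

```python
from collections import defaultdict

class Partition:
    def __init__(self):
        self.docs = {}                       # primary key -> document
        self.local_index = defaultdict(set)  # (field, value) -> doc keys

    def put(self, key, doc):
        self.docs[key] = doc
        for field, value in doc.items():
            self.local_index[(field, value)].add(key)

partitions = [Partition() for _ in range(4)]

def put(key, doc):
    # Documents are placed by hash of primary key; with a *document-partitioned*
    # (local) index, each partition indexes only its own documents.
    partitions[hash(key) % 4].put(key, doc)

def query_by_document_index(field, value):
    # Local indexes force scatter/gather: ask every partition, merge results.
    results = []
    for p in partitions:
        results.extend(p.local_index.get((field, value), set()))
    return results

# A *term-partitioned* (global) index instead assigns each indexed term to one
# partition, so a read hits a single partition -- but a write may have to
# update index entries on other partitions (often asynchronously).
def index_partition_for_term(field, value):
    return hash((field, value)) % 4

put("u1", {"color": "red"})
put("u2", {"color": "red"})
print(query_by_document_index("color", "red"))  # gathered from all partitions
```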
Do not do hash mod N - if the number of nodes (N) changes, most keys' hash-mod-N assignments change, so nearly the whole dataset has to move between nodes! Makes rebalancing unnecessarily expensive (demonstrated below)
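A quick demonstration of why, assuming a uniform md5 hash:

```python
import hashlib

def node_for(key: str, n: int) -> int:
    # Naive hash mod N placement: the assignment depends directly on N.
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % n

keys = [f"user:{i}" for i in range(10_000)]
moved = sum(1 for k in keys if node_for(k, 10) != node_for(k, 11))
print(f"{moved / len(keys):.0%} of keys move when going from 10 to 11 nodes")
# Prints roughly 91% -- almost the entire dataset has to be reshuffled
# just to add one node.
```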
Fixed # of partitions - create many more partitions than nodes and assign several partitions to each node; as nodes are added, they take over whole partitions from existing nodes (see the sketch after this list)
Dynamic partitioning - split a partition when it grows past a configured size, merge neighbors when it shrinks (e.g. HBase)
Partitioning proportional to nodes - keep a fixed # of partitions per node, so partition size grows with the dataset until new nodes are added (e.g. Cassandra)
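A sketch of the fixed-partitions approach, assuming a naive round-robin assignment (real systems move as few partitions as possible when membership changes):

```python
import hashlib

NUM_PARTITIONS = 1024  # fixed for the life of the cluster; chosen generously

def partition_for(key: str) -> int:
    # The key -> partition mapping never changes, even as nodes come and go.
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % NUM_PARTITIONS

def assign_partitions(num_nodes: int) -> dict[int, int]:
    # Illustrative round-robin partition -> node assignment.
    return {p: p % num_nodes for p in range(NUM_PARTITIONS)}

assignment = assign_partitions(3)
key = "user:123"
p = partition_for(key)
print(f"{key} lives in partition {p} on node {assignment[p]}")

# Adding a node changes only the partition -> node assignment, never the
# key -> partition mapping, so rebalancing moves whole partitions.
assignment = assign_partitions(4)
print(f"after adding a node, partition {p} is on node {assignment[p]}")
```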
General problem is "service discovery" - in this case: if I want to read "key:foo", which node/partition do I go to?
Approaches: (1) clients contact any node, which forwards the request if needed; (2) a dedicated routing tier; (3) partition-aware clients that connect to the right node directly
Key issue: how does the component knowledgeable of node → partition assignments (the nodes themselves, the routing tier, or the clients) learn about changes?
ZooKeeper is one system that can keep an authoritative record of the partitions → nodes mapping
Can also use a "gossip protocol" to disseminate changes in cluster state among the nodes themselves (e.g. Cassandra)
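A minimal sketch of a routing tier consulting an in-process assignment map; in a real deployment the map would live in a coordination service like ZooKeeper, and `on_assignment_change` would be triggered by its change notifications (both names here are hypothetical):

```python
import hashlib

NUM_PARTITIONS = 16

# Stand-in for the authoritative partition -> node mapping a coordination
# service would hold; routers keep a subscribed copy.
partition_to_node = {p: f"node-{p % 3}" for p in range(NUM_PARTITIONS)}

def route(key: str) -> str:
    """Routing tier: map key -> partition -> currently responsible node."""
    p = int(hashlib.md5(key.encode()).hexdigest(), 16) % NUM_PARTITIONS
    return partition_to_node[p]

def on_assignment_change(partition: int, new_node: str):
    # Invoked when the coordination service reports a rebalance;
    # subsequent requests are routed to the new owner immediately.
    partition_to_node[partition] = new_node

p = int(hashlib.md5(b"key:foo").hexdigest(), 16) % NUM_PARTITIONS
print(route("key:foo"))            # routed to the current owner of partition p
on_assignment_change(p, "node-9")  # simulate a rebalance notification
print(route("key:foo"))            # now routed to node-9
```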