The Simple Magic of Consistent Hashing

submitted by

Style Pass

2024-09-22 11:30:04

The simplicity of consistent hashing is pretty mind-blowing. Here you have a number of nodes in a cluster of databases, or in a cluster of web caches. How do you figure out where the data for a particular key goes in that cluster?

You apply a hash function to the key. That’s it? Yeah, that’s the whole deal of consistent hashing. It’s in the name, isn’t it?

The same key will always produce the same hash code (hopefully), so once you’ve figured out how to spread a range of keys across the available nodes, you can always find the right node by looking at the hash code for a key.
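To make that a bit more concrete, here’s a minimal sketch of the lookup in Python. It’s my own illustration, not anybody’s production implementation: the `HashRing` class, the `hash_code` helper, the node names, and the choice of MD5 are all assumptions picked for the example. Nodes are placed on a ring by hashing their names, and a key belongs to the first node found at or after the key’s own position.

```python
import bisect
import hashlib


def hash_code(value: str) -> int:
    """Stable integer hash. MD5 is an arbitrary choice for this sketch."""
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Hypothetical minimal ring: one position per node, no virtual nodes."""

    def __init__(self, nodes):
        # Each node sits on the ring at the position given by its own hash.
        self._ring = sorted((hash_code(node), node) for node in nodes)
        self._positions = [position for position, _ in self._ring]

    def node_for(self, key: str) -> str:
        # Find the first node at or after the key's position, wrapping
        # around to the start of the ring if the key hashes past the end.
        index = bisect.bisect_left(self._positions, hash_code(key))
        return self._ring[index % len(self._ring)][1]


ring = HashRing(["node-a", "node-b", "node-c"])
print(ring.node_for("user:42"))  # always the same node for the same key
```

The nice property shows up when a node goes away: only the keys that lived on that node get a new home, everything else stays put. Real implementations usually give each node many positions on the ring to even out the distribution, which this sketch leaves out.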

It’s pretty ingenious, if you ask me. It was cooked up in the late nineties by researchers at MIT, some of whom went on to found Akamai. You should go and read the original paper right after we’re done here.

Consistent hashing solves, nicely and elegantly, the problem people desperately tried to solve with sharding. I’m not going to bore you with the details of how exactly consistent hashing works. Mike Perham does a pretty good job of that already, and there are many more blog posts explaining the implementations and the theory behind it. Also, that little upcoming book of mine has a full-length explanation too. Here’s a graphic showing the basic idea of consistent hashing, courtesy of Basho.