Distributed databases generally fall into two camps when it comes to architectures for maintaining high availability (HA) [1]. Both architectures are

Categorizing How Distributed Databases Utilize Consensus Algorithms

submited by
Style Pass
2024-10-15 14:00:03

Distributed databases generally fall into two camps when it comes to architectures for maintaining high availability (HA) [1]. Both architectures are used by many successful production systems, but each has very different trade-offs. I couldn’t find anyone who had studied these trade-offs, which is what motivated this article. I don’t know of standard/well-known names for the two HA approaches, so I’m going to refer to them as “Consensus for Metadata” (CfM) and “Consensus for Data” (CfD). As their names suggest, both designs use a consensus algorithm (Raft, Paxos, etc.) to handle failures and store data, but they differ in the type of data that is managed by consensus. To start, I’ll give an overview of the two architectures ignoring most implementation details since those differ in each database and don’t impact the architecture’s trade-offs for the most part. I’ll assume only that we are dealing with data that is sharded and has several copies of each shard spread across hosts in a distributed database.

Full disclosure, I’ve spent many years working on a CfM database (MemSQL/SingleStore), so I’m more familiar with this design. I’ve studied open source CfD databases along with their publicly available papers and documentation, but have much less personal experience with them. This may skew my opinion on this topic a bit.

Leave a Comment