WarpStream is an Apache Kafka® protocol-compatible data streaming system built directly on top of object storage. That means WarpStream is designed to “look and feel” exactly like a “real” Kafka cluster, but under the hood it has a completely different implementation and makes a completely different set of architectural decisions.
The most important difference between WarpStream and Kafka is that WarpStream offloads all storage to object storage and requires no local disks. This makes WarpStream 5-10x cheaper for high-volume workloads (because it eliminates inter-zone networking costs entirely), and also makes it trivial to operate since WarpStream’s single binary (called an Agent) is completely stateless.
This dichotomy presents a conundrum: WarpStream Agents are stateless, but the Kafka protocol assumes an incredibly stateful system. Somehow we have to square that circle. The rest of this post will cover exactly how we did that, going so far as to make WarpStream work with off-the-shelf Kafka clients even when it’s deployed behind a traditional network load balancer like Nginx, something that’s impossible to do with traditional Apache Kafka.
Let’s start by discussing how Kafka’s service discovery mechanism works from a client’s perspective. The client establishes a TCP connection to one of the provided “bootstrap” broker URLs and sends an ApiVersions request to negotiate which features of the Kafka protocol are supported by the broker.
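As a rough sketch of what that first exchange looks like on the wire, the snippet below hand-encodes an ApiVersions request frame (API key 18 in the Kafka protocol). The helper name is hypothetical, and a real application would use an off-the-shelf Kafka client library rather than raw struct packing; this just illustrates the length-prefixed framing the client sends after opening the TCP connection.

```python
import struct

# ApiVersions is API key 18 in the Kafka wire protocol.
API_VERSIONS_KEY = 18

def build_api_versions_request(correlation_id: int, client_id: str) -> bytes:
    """Encode an ApiVersions v0 request frame (illustrative helper)."""
    cid = client_id.encode("utf-8")
    # Request header: api_key (int16), api_version (int16),
    # correlation_id (int32), client_id (int16 length + bytes).
    body = struct.pack(">hhih", API_VERSIONS_KEY, 0, correlation_id, len(cid)) + cid
    # Every Kafka frame is prefixed with an int32 size of the remaining bytes.
    return struct.pack(">i", len(body)) + body

frame = build_api_versions_request(correlation_id=1, client_id="demo")
# The 4-byte size prefix covers everything after itself.
assert struct.unpack(">i", frame[:4])[0] == len(frame) - 4
```

The broker’s response lists, for each API key, the version range it supports, which is how the client decides which protocol features it can safely use.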