TL;DR: Concurrency and pre-fetching give zio-kafka a higher consumer throughput than the default Java Kafka client for most workloads.
Zio-kafka is an asynchronous Kafka client built on the ZIO runtime. It lets you consume from Kafka with a ZStream per partition. ZStreams are streams on steroids; they offer all the stream operators known from e.g. RxJava and Monix, and then some, and they are backed by ZIO's highly efficient runtime.
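To make the "ZStream per partition" idea concrete, here is a minimal sketch of a zio-kafka consumer. The topic name, group id and bootstrap server are placeholders, and the exact API may differ slightly between zio-kafka versions:

```scala
import zio._
import zio.kafka.consumer._
import zio.kafka.serde.Serde

object PerPartitionExample extends ZIOAppDefault {

  // Placeholder connection settings; adjust to your environment.
  private val settings =
    ConsumerSettings(List("localhost:9092")).withGroupId("example-group")

  private val consumerLayer =
    ZLayer.scoped(Consumer.make(settings))

  // One ZStream per partition; the partitions are processed in parallel.
  private val consume =
    Consumer
      .partitionedStream(Subscription.topics("example-topic"), Serde.string, Serde.string)
      .flatMapPar(Int.MaxValue) { case (_, partitionStream) =>
        partitionStream
          .tap(record => ZIO.debug(record.value)) // your processing goes here
          .map(_.offset)
          .aggregateAsync(Consumer.offsetBatches) // batch offsets before committing
          .mapZIO(_.commit)
      }
      .runDrain

  override def run = consume.provide(consumerLayer)
}
```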
So the claim is that zio-kafka consumes faster than the regular Java client. But zio-kafka wraps that same Java Kafka client, so how can it be faster? The trick is that with zio-kafka, processing (meaning your code) runs in parallel with the code that pulls records from the Kafka broker.
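Stripped of zio-kafka specifics, the idea looks roughly like the following hand-rolled sketch (this is an illustration of pre-fetching with a queue, not zio-kafka's actual internals; `pollBatch` and `process` are stand-ins): one fiber keeps fetching batches and pushes them into a bounded queue, while the processing stream drains the queue concurrently, so the next fetch overlaps with processing the previous batch.

```scala
import zio._
import zio.stream._

object PrefetchSketch {

  // Stand-ins for the real Kafka poll and for user processing.
  def pollBatch: Task[Chunk[String]]      = ZIO.succeed(Chunk("record"))
  def process(record: String): Task[Unit] = ZIO.unit

  val pipeline: Task[Unit] =
    for {
      queue <- Queue.bounded[Chunk[String]](2)            // bounds how far ahead we pre-fetch
      _     <- pollBatch.flatMap(queue.offer).forever.fork // poller fiber: keeps fetching ahead
      _     <- ZStream
                 .fromQueue(queue)
                 .flattenChunks
                 .mapZIO(process) // user code runs while the next fetch is already in flight
                 .runDrain        // runs until interrupted, like a real consumer loop
    } yield ()
}
```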
Obviously, there is some overhead in distributing the received records to the streams that need to process them. So the question is: when is the extra overhead of zio-kafka smaller than the gain from parallel processing? Can we estimate this? It turns out we can! For this estimate we use the benchmarks that zio-kafka runs on GitHub. In particular, we look at these two benchmarks, with runs from 2024-11-30:
First we calculate the overhead of java-kafka. For this we assume that counting the number of records takes no time at all. This is reasonable: a count operation is just a few CPU instructions, nothing compared to fetching data over the network, even when the broker runs on the same machine. Therefore, in the figure, the 'processing' rectangle collapses to a thick vertical line.
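Written out as a rough formula (the symbols are mine, not from the benchmark output): the measured java-kafka benchmark time is the client's overhead (fetching over the network, bookkeeping) plus the processing time, and with processing taken as zero the measured time is the overhead itself.

```latex
t_{\text{benchmark}} \;=\; t_{\text{overhead}} + t_{\text{process}}
                     \;\approx\; t_{\text{overhead}}
\qquad \text{since } t_{\text{process}} \approx 0
```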