The Schema Registry API is not how you use schema with Kafka!

submitted by
Style Pass
2024-10-31 10:00:06

The Schema Registry — kindly open-sourced by Confluent — has a lovely, simple REST API. So when I sat down to learn how to work with schema in Apache Kafka, I was expecting a satisfying afternoon of quick progress on interesting code.

A day later, I was disappointed and lost. Since I’m in the business of selling a service for Apache Kafka, this is very embarrassing for me.

I remained in this state until it dawned on me that “using schema with Kafka” is not the same as “using the schema registry.” It actually means using KafkaAvroSerializer.java and KafkaAvroDeserializer.java.

Those two have a completely different API and, more importantly, a completely different intent than the schema registry. The schema registry simply stores schema. But KafkaAvroSerializer and KafkaAvroDeserializer implement a communication protocol. This protocol does two things: it helps consumers parse the serialized structured records sent by producers, and it prevents producers from harming consumers with changes to record schema.
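To make the parsing half of that protocol concrete, here is a minimal sketch of the framing Confluent documents for its serializers: a single magic byte of 0, then the schema ID as a 4-byte big-endian integer, then the Avro-encoded record. The function names `frame` and `unframe` are hypothetical, and the payload is treated as opaque bytes. This is not the code inside KafkaAvroSerializer, only an illustration of the header it writes.

```python
import struct
from typing import Tuple

MAGIC_BYTE = 0    # first byte of every framed message
HEADER_SIZE = 5   # 1 magic byte + 4-byte schema ID


def frame(schema_id: int, avro_payload: bytes) -> bytes:
    """Prefix an already-encoded Avro payload with the 5-byte header.

    Hypothetical helper: a real serializer would also register or look up
    the schema in the registry to obtain schema_id.
    """
    # ">bI" = big-endian, one signed byte, one unsigned 32-bit integer
    return struct.pack(">bI", MAGIC_BYTE, schema_id) + avro_payload


def unframe(message: bytes) -> Tuple[int, bytes]:
    """Split a framed message into (schema_id, avro_payload)."""
    if len(message) < HEADER_SIZE:
        raise ValueError("message is shorter than the 5-byte header")
    magic, schema_id = struct.unpack(">bI", message[:HEADER_SIZE])
    if magic != MAGIC_BYTE:
        raise ValueError("unexpected magic byte: %d" % magic)
    return schema_id, message[HEADER_SIZE:]


# The payload here is placeholder bytes, not the output of a real Avro encoder.
message = frame(42, b"\x06foo")
schema_id, payload = unframe(message)
```

On the consuming side, the schema ID recovered from the header is what gets looked up in the registry (the REST API exposes schemas by ID) so that the remaining bytes can be decoded. The second half of the protocol, rejecting incompatible schema changes when a producer registers a new schema, is not shown in this sketch.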

Working out this protocol with nothing but the schema registry REST API reference in hand is therefore a fool’s errand. This was not obvious to me, I think, because so much is written about the schema registry itself but so little about the protocols used by the serialization clients. I imagine many of us have set out on the same fool’s errand I just completed. Hence this post.
