Simple Postgres to ClickHouse replication featuring MinIO

Submitted by
Style Pass
2024-05-02 19:30:02

At PeerDB, we provide a fast and cost-effective way to replicate data from Postgres to data warehouses such as Snowflake, BigQuery, and ClickHouse, and to queues such as Kafka, Redpanda, and Google Pub/Sub, among others.

A few months ago, we added a ClickHouse connector for Postgres Change Data Capture (CDC). To our surprise, this connector gained substantial traction and adoption within our community, across both our fully managed service (PeerDB Cloud) and our Open Source offerings. Here is a story from one of our customers who uses the ClickHouse connector.

However, we heard one common piece of feedback from many of our Open Source users: the ClickHouse connector required an S3 bucket as a prerequisite, which added overhead. Non-AWS users and those without immediate access to S3 could not use the ClickHouse connector. This wasn't a problem in our fully managed offering (PeerDB Cloud), where we abstract the S3 bucket creation away from our customers.

This blog post describes how we solved this problem and made it extremely easy for our users to replicate data from Postgres to ClickHouse. We used MinIO, an open source, S3-compatible object store, to stage the intermediary Avro files as part of Change Data Capture (CDC) from Postgres to ClickHouse.
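To make the staging idea concrete, here is a minimal sketch of the general pattern: a batch of change rows is written as a file to an object store, and the destination then loads that file and removes it. This is not PeerDB's implementation. The names `InMemoryStage`, `stage_batch`, and `load_batch` are hypothetical, the stage is an in-memory stand-in for an S3-compatible bucket such as one served by MinIO, and JSON lines are used in place of Avro so that the example needs only the standard library.

```python
import json
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class InMemoryStage:
    """Hypothetical stand-in for an S3-compatible bucket (e.g. one served by MinIO)."""

    objects: Dict[str, bytes] = field(default_factory=dict)

    def put(self, key: str, body: bytes) -> None:
        self.objects[key] = body

    def get(self, key: str) -> bytes:
        return self.objects[key]

    def delete(self, key: str) -> None:
        del self.objects[key]


def stage_batch(stage: InMemoryStage, batch_id: int, rows: List[dict]) -> str:
    """Serialize one batch of change rows and write it to the stage."""
    # Assumption: JSON lines for simplicity; the post describes Avro files.
    key = f"batches/{batch_id}.jsonl"
    body = "\n".join(json.dumps(row, sort_keys=True) for row in rows)
    stage.put(key, body.encode("utf-8"))
    return key


def load_batch(stage: InMemoryStage, key: str, destination: List[dict]) -> int:
    """Read a staged file, append its rows to the destination, then delete the file."""
    lines = stage.get(key).decode("utf-8").splitlines()
    rows = [json.loads(line) for line in lines]
    destination.extend(rows)
    stage.delete(key)  # the staged file is only an intermediary
    return len(rows)


# Two change events for the same row, as a CDC stream might emit them.
changes = [
    {"op": "insert", "id": 1, "name": "alice"},
    {"op": "update", "id": 1, "name": "alicia"},
]

stage = InMemoryStage()
warehouse_table: List[dict] = []  # stand-in for the destination table

staged_key = stage_batch(stage, 1, changes)
loaded = load_batch(stage, staged_key, warehouse_table)
```

In a real deployment the stage would be reached over the S3 API, which is why an S3-compatible store can stand in for an S3 bucket without changing the shape of the pipeline.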
