
ClickHouse Object Storage Performance: MinIO vs. AWS S3

submitted by
Style Pass
2021-06-29 16:30:09

In our recent blog article, Integrating ClickHouse with MinIO, we introduced certified support for integrating ClickHouse's disk storage system and S3 table function with MinIO. Now that ClickHouse fully supports both AWS S3 and MinIO as S3-compatible object storage services, we will compare the performance of AWS S3 and MinIO when used to store table data from two of our standard datasets. We will be working with the OnTime dataset, which contains almost two hundred million rows of airline flight data, and the New York Taxi dataset, which contains just over 1.3 billion rows of New York taxi ride data. Instructions to download the datasets can be found at the links above.

We would like to thank our partner MinIO, Inc., for providing the Kubernetes lab environment as well as engineering advice to enable the MinIO performance tests.

To run our performance benchmarks on AWS S3, we start with the latest version of ClickHouse (21.5.5) running inside Kubernetes on an Amazon m5.8xlarge EC2 instance. This is a mid-range instance with 32 vCPUs, 128 GB of RAM, and EBS gp2 storage. The EC2 instance is located in the US-East-1 region, the same region as the AWS S3 storage bucket we will be using, which minimizes latency when querying table data stored in S3.
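For readers who want to run a similar comparison themselves, here is a minimal sketch of a timing harness in Python. It is not the tooling used for the benchmarks in this article. The names `time_query`, `ratio`, and the `run_query` callable are hypothetical, and the sketch assumes you supply your own function that sends a query to ClickHouse and waits for the result.

```python
import statistics
import time
from typing import Callable, Dict


def time_query(run_query: Callable[[], object], runs: int = 3) -> Dict[str, float]:
    """Call run_query `runs` times and summarize wall-clock seconds.

    run_query is a hypothetical stand-in for whatever executes one query
    against a table (for example, a table backed by AWS S3 or by MinIO).
    """
    if runs < 1:
        raise ValueError("runs must be at least 1")
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        run_query()  # block until the query has finished
        timings.append(time.perf_counter() - start)
    return {
        "min": min(timings),
        "median": statistics.median(timings),
        "max": max(timings),
    }


def ratio(candidate_seconds: float, baseline_seconds: float) -> float:
    """Return how many times slower (>1) or faster (<1) the candidate is."""
    if baseline_seconds <= 0:
        raise ValueError("baseline_seconds must be positive")
    return candidate_seconds / baseline_seconds
```

Repeating each query several times and reporting the minimum alongside the median helps separate cold reads from object storage from warm runs that benefit from caching.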
