PostHog uses ClickHouse to power our data analytics tooling, and we've learned a lot about it over the years. The goal of this manual is to share that knowledge externally and raise the baseline level of understanding for people starting to work with ClickHouse.
If you have extensive ClickHouse experience and want to contribute thoughts or tips of your own, please do so by opening a PR or issue on GitHub!
To solve this problem, we looked at a wide range of OLAP solutions, including Pinot, Presto, Druid, TimescaleDB, CitusDB, and ClickHouse. Some of our team had used these tools before at other companies, such as Uber, where Pinot and Presto are both used extensively.
ClickHouse was a good fit on all of these factors, so we started a more thorough investigation. We read up on benchmarks and researched the experience of companies such as Cloudflare, which uses ClickHouse to process 6 million requests per second. Eventually, we set up a test cluster to run our own benchmarks.
ClickHouse repeatedly performed an order of magnitude better than other tools we considered. We also discovered other perks, such as the fact that it is column-oriented and written in C++. We found these to be the key benefits of ClickHouse: