TimescaleDB 2.3 makes built-in columnar compression even better by enabling inserts directly into compressed hypertables, as well as automated compres

TimescaleDB 2.3: Improving columnar compression for time-series on PostgreSQL

submited by

Style Pass

2021-05-26 13:30:07

TimescaleDB 2.3 makes built-in columnar compression even better by enabling inserts directly into compressed hypertables, as well as automated compression policies on distributed hypertables.

Time-series data is relentless. In order to measure everything that matters, you need to ingest thousands, perhaps even millions, of data points per second. The more data you’re able to collect, the more in-depth your analysis is bound to be. Yet storing and analyzing this data does not come for free, and volume of time-series data can lead to significant storage cost.

TimescaleDB addressed this storage challenge through a novel approach to database compression. Rather than just compressing entire database pages using a single standard compression algorithm, TimescaleDB uses multiple best-in-class lossless compression algorithms (which it chooses automatically based on the type and cardinality of your data), along with a unique method to create hybrid row/columnar storage. This provides massive compression savings and speeds up common queries on older data.

We first introduced compression in TimescaleDB 1.5, amidst much excitement about how a row-oriented database could support columnar compression. Users were delighted with 92-96% compression rates on their time-series data (like jereze and Vitaliy), leading to both significant storage and cost savings for their databases, as well as query performance improvements. Our approach didn’t just save users storage (and thus money), but it made many queries faster!