Delta-delta encoding, Simple-8b, XOR-based compression, and more - These algorithms aren't magic, but combined they can save over 90% of storage costs and speed up queries. Here’s how they work.
Computing is based on a simple concept: the binary representation of information. And as computing infrastructure has gotten cheaper and more powerful, we have asked it to represent more and more of our information, in the form of data we collect (which is often time-series data).
But computing is not free. The more efficiently we can represent that information, the more we can save on storage, compute, and bandwidth. Enter compression: “the process of encoding information using fewer bits than the original representation.” (source)
Compression has played an important role in computing for several decades. As a concept, compression is even older: “Morse code, invented in 1838, is the earliest instance of data compression in that the most common letters in the English language such as “e” and “t” are given shorter Morse codes.” (source)
In this post, we set out to demystify compression. To do this, we explain how several lossless time-series compression algorithms work, and how you can apply them to your own projects.