
Time-series compression algorithms, explained

submitted by

Style Pass

2022-05-14 16:30:14

Delta-delta encoding, Simple-8b, XOR-based compression, and more - These algorithms aren't magic, but combined they can save over 90% of storage costs and speed up queries. Here’s how they work.

Computing is based on a simple concept: the binary representation of information. And as computing infrastructure has gotten cheaper and more powerful, we have asked it to represent more and more of our information, in the form of data we collect (which is often time-series data).

But computing is not free. The more efficiently we can represent that information, the more we can save on storage, compute, and bandwidth. Enter compression: “the process of encoding information using fewer bits than the original representation.” (source)

Compression has played an important role in computing for several decades. As a concept, compression is even older: “Morse code, invented in 1838, is the earliest instance of data compression in that the most common letters in the English language such as ‘e’ and ‘t’ are given shorter Morse codes.” (source)

In this post, we set out to demystify compression. To do this, we explain how several lossless time-series compression algorithms work, and how you can apply them to your own projects.
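To give a flavor of what such an algorithm looks like, here is a minimal sketch of delta-of-delta encoding, one of the techniques named in the summary above. It assumes integer timestamps arriving at a roughly regular interval; the function names are hypothetical and this is not taken from any particular database's implementation. The idea is to store the first value, then the change in the difference between consecutive values, which is mostly zeros when the sampling interval is steady.

```python
def delta_of_delta_encode(values):
    """Encode integers as [first value, first delta, second-order deltas...]."""
    if not values:
        return []
    encoded = [values[0]]
    prev_value = values[0]
    prev_delta = 0  # with a starting delta of 0, the second output is the first delta
    for value in values[1:]:
        delta = value - prev_value
        encoded.append(delta - prev_delta)
        prev_value, prev_delta = value, delta
    return encoded


def delta_of_delta_decode(encoded):
    """Invert delta_of_delta_encode exactly (the scheme is lossless)."""
    if not encoded:
        return []
    values = [encoded[0]]
    delta = 0
    for dod in encoded[1:]:
        delta += dod                      # rebuild the first-order delta
        values.append(values[-1] + delta)  # rebuild the original value
    return values


# Hypothetical timestamps sampled about every 10 seconds, with one late reading.
timestamps = [1000, 1010, 1020, 1030, 1041, 1051]
encoded = delta_of_delta_encode(timestamps)
print(encoded)  # [1000, 10, 0, 0, 1, -1]
print(delta_of_delta_decode(encoded) == timestamps)  # True
```

The encoded list is no shorter than the input; the saving comes afterwards, when the many small values (mostly 0, 1, and -1) are packed into far fewer bits than a full 64-bit integer each, for example by an integer-packing scheme such as Simple-8b.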
