Managing Your Data Lifecycle with Time to Live Tables

Your organization is growing with each passing day, and so is your data. More data brings more business opportunities, but it also brings higher storage costs. Do you want a better way to manage that cost? So do we, for our open source database, TiDB.

TiDB is a distributed SQL database designed for massive datasets. Our goal is to support large-scale data at a reasonable cost. At TiDB Hackathon 2020, we took a big step in that direction: we introduced the time to live (TTL) table, a feature that lets TiDB automatically manage the lifecycle of data according to its lifetime. This way, TiDB ensures its resources are spent on fresh, high-value data.
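To make this concrete, here is a minimal sketch of what declaring a TTL table might look like. The table name, columns, and the TTL table option below are illustrative assumptions written in MySQL-compatible DDL; the exact syntax of the Hackathon prototype may differ.

    -- Hypothetical example: rows whose created_at is older than 7 days
    -- become eligible for automatic deletion by the database.
    CREATE TABLE device_events (
        id         BIGINT PRIMARY KEY AUTO_INCREMENT,
        payload    JSON,
        created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
    ) TTL = created_at + INTERVAL 7 DAY;

    -- Applications insert and query as usual; expired rows are reclaimed
    -- in the background, with no manual cleanup jobs.
    INSERT INTO device_events (payload) VALUES ('{"temp": 21.5}');

The appeal of declaring the retention policy in the schema is that cleanup becomes the database's responsibility rather than a cron job the application team has to maintain.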

In this article, I'll describe the TTL table in detail and how we implemented it in TiDB. I'll also share examples of how TTL tables can be used with open source projects, including dimension reports, long-term storage of Kubernetes events, MQTT for IoT, and others. Time waits for no one, so let's get started.

The TiDB community has made many efforts to limit TiDB's storage costs. For example, we explored ways to manage data storage hierarchically, which allows the database to keep cold data on cheaper storage media. We also wanted to reduce costs by increasing the value of the data TiDB stores. In many cases, the value of a dataset is closely tied to its lifetime: the older the data, the less valuable it becomes. To help TiDB store high-value data, we introduced TTL tables into TiDB.
