SQLite is a very solid, small and fast database that is widely used by many applications. It is self contained (in a file) and doesn't require a lot

Simple, affordable, log management and analysis.

submited by
Style Pass
2024-05-14 03:30:02

SQLite is a very solid, small and fast database that is widely used by many applications. It is self contained (in a file) and doesn't require a lot of management compared to a traditional database (like MySQL). A busy SQLite database can easily handle GBs and even TBs of data, which can end up using a lot of space. And SQLite does not provide built-in compression functionality. To solve this problem, one of the solutions is to store the SQLite database inside a transparent and compressed file system, like ZFS or BRTFS. In this article, we will test it with ZFS as they support multiple compression algorithms and is available out of box on most Linux distributions and FreeBSD.

What's the size difference of a database in a compressed ZFS folder vs the original size? It really depends on the type of data you are storing, but let's do some tests on some of the databases that we have. First example, for a database stored at DB1.db, with mostly long text data (logs): $ ls -lh /uncompressed/ext4/
DB1: -rw-r--r-- 6.5G May 12 16:55 /uncompressed/ext4/DB1.db

$ sqlite3 /uncompressed/ext4/DB1.db
SQLite version 3.37.2 2022-01-06 13:25:41
Enter ".help" for usage hints.
sqlite  SELECT count(*) from OURROW;
2798011 So as you can see, the uncompressed DB1 has 6.5G in size and 2,798,011 total entries. Since SQLite is contained in a file, we can move the database to a compressed folder: $ cp -pr /uncompressed/ext4/DB1.db /compressed/zfs/lz4/DB1.db
$ du -h /compressed/zfs/lz4/DB1.db
586M /compressed/zfs/lz4/DB1.db And look at the difference. It went down from 6.5G to 586M, that's about 1/10 of size reduction. And the size shrunk even more when using the zstd-fast compression algorithm: $ du -h /compressed/zfs/zstd/DB1.db
396M /compressed/zfs/zstd/DB1.db That's a significant difference which seems to work very well for the type of data we are storing in those tabled (long text content). Reducing the data to ~1/10 of the original size is a big win for us. We tested with other types of data and we have seeing it range from 1/10 to about 1/3 of the original size. We also tried with a bigger SQLite database that we had and those were the results:: 112G /uncompressed/ext4/DB2.db
36G /compressed/zfs/lz4/DB2.db
27G /compressed/zfs/zstd/DB2.db The savings were not as big as the other one, but ~1/3 reduction for lz4 and ~1/4 reduction for zstd is still pretty good. On all our tests, the zstd compressed better than lz4. For reference, those are all our compression settings on ZFS: # zfs get all |grep compression
..
trunc-logging7/zfs/lz4 compression lz4 inherited from trunc-testing
trunc-logging7/zfs/zstd compression zstd-fast local
..

The storage gains are clear, but what are the impacts to the performance of SQLite? To test that, we decided to run sqlite on all 3 databases (uncompressed, compressed with lz4 and compressed with zstd-fast) to see the differences in query time. We enabled ".timer ON" on sqlite to allow is to track the running time of the queries. The database is just a dump of web logs (from nginx) on a FTS5 (full-text search) table with no indexes or optimizations. The goal was to make SQLite interate through all the entries to properly compare the results. Let's see how it goes.

Leave a Comment