In general compression utilities such as zip, gzip do not compress short strings well and often expand them. They also use lots of memory which makes

Search code, repositories, users, issues, pull requests...

submited by
Style Pass
2024-05-07 17:30:09

In general compression utilities such as zip, gzip do not compress short strings well and often expand them. They also use lots of memory which makes them unusable in constrained environments like Arduino. So Unishox algorithm was developed for individually compressing (and decompressing) short strings.

This is a C/C++ library. See here for CPython version and here for Javascript version which is interoperable with this library.

The contenders for Unishox are Smaz, Shoco, Unicode.org's SCSU and BOCU (implementations here and here) and AIMCS (Implementation here).

Note: Unishox provides the best compression for short text and not to be compared with general purpose compression algorithm like lz4, snappy, lzma, brottli and zstd.

The next version Unishox3 which includes multi-level static dictionaries residing in RAM or Flash memory provides much better compression than Unishox2. A preview is available in Unishox3_Alpha folder and a make file is available. To compile please use the following steps:

This is just a preview and the specification and dictionaries are expected to change before Unishox3 will be released. However, this folder will be retained so if someone used it for compressing strings, they can still use it for decompressing them.

Leave a Comment