In this section, we compare using bidiff to generate patches against two versions of well-known software, versus just compressing the newer version wi

Search code, repositories, users, issues, pull requests...

submited by
Style Pass
2024-11-06 02:00:03

In this section, we compare using bidiff to generate patches against two versions of well-known software, versus just compressing the newer version with zstd (hereafter referred to as "naive").

The essential part is contained in the bidiff crate itself. The serialization/deserialization code is provided as a (practical) example of how to store patch files. It is very simplistic: a magic number, a version number, and then a series of ADD, COPY and SEEK instructions, with variable-length integer encoding.

Note: bidiff and bipatch do not concern themselves with compression, but patch files MUST be compressed. Uncompressed, they are slightly larger than the "newer" file (but lower-entropy).

Diffing is a memory- and CPU-intensive task. The requirements are discussed below. Patching, however, can be done in streaming fashion and requires very little memory (only small buffers to perform addition between the "older" file and parts of the "patch" file).

bidiff is based on the bsdiff algorithm as described in Naïve Differences in Executable Code (2003) by Colin Percival, but the implementation differs in a few key areas.

Leave a Comment