While developing sdxtra, I ran into the problem of computing consistent digests for numeric values regardless of their types. This post is a brief summary of the solution I came up with.
In most programming languages, numeric values are represented by a variety of types, such as float64, int32, and uint64, to meet different requirements for precision, range, and memory usage. Internally, these types may differ in bit length, layout, or signedness. Therefore, values of different types may have different binary representations even when they denote the same mathematical number.
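To make this concrete, here is a minimal sketch using Python's standard `struct` module. It packs the number one as three different fixed-width types. The little-endian byte order and the variable names are my own choices for illustration, not something sdxtra prescribes.

```python
import struct

# The same mathematical number, 1, packed as three different types.
# "<" selects little-endian byte order with standard sizes.
one_as_int32 = struct.pack("<i", 1)      # 4-byte signed integer
one_as_uint64 = struct.pack("<Q", 1)     # 8-byte unsigned integer
one_as_float64 = struct.pack("<d", 1.0)  # 8-byte IEEE 754 double

print(one_as_int32.hex())    # 01000000
print(one_as_uint64.hex())   # 0100000000000000
print(one_as_float64.hex())  # 000000000000f03f
```

The two integer encodings differ only in length, while the float64 encoding has a completely different layout (sign, exponent, and mantissa fields), so none of the three byte strings are equal.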
Digesting is a process that maps arbitrary data to a fixed-length string, called a digest. It ensures that equal values are always associated with the same digest, while different values should, with very high probability, map to distinct digests. This property lets us compare values by comparing their digests, which is useful in many scenarios such as data deduplication and data indexing.
During the digesting process, one first converts the input data into a byte sequence following a designated scheme, and then feeds that sequence to a well-established algorithm such as SHA-256 to compute the digest string. The scheme should be chosen carefully, so that the resulting byte sequence faithfully encodes the information in the input data.
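The two steps can be sketched as follows, using Python's standard `hashlib` and `struct` modules. The encoding scheme here is a deliberately naive one that I made up for illustration (pack the value in its native fixed-width layout); it is not the scheme this post arrives at, and the function names are hypothetical.

```python
import hashlib
import struct

def digest_bytes(payload: bytes) -> str:
    """Step 2: hash an already-encoded byte sequence with SHA-256."""
    return hashlib.sha256(payload).hexdigest()

def naive_digest(value, fmt: str) -> str:
    """Step 1 (naive, illustrative scheme): pack the value using the
    given struct format, then digest the resulting bytes."""
    return digest_bytes(struct.pack(fmt, value))

digest_int32 = naive_digest(1, "<i")      # 1 encoded as int32
digest_float64 = naive_digest(1.0, "<d")  # 1 encoded as float64

# Equal bytes always give equal digests...
print(digest_int32 == naive_digest(1, "<i"))  # True
# ...but this scheme is type-dependent, so the same number
# encoded as two different types yields two different digests.
print(digest_int32 == digest_float64)         # False
```

Under this naive scheme the digest depends on the type as much as on the value, which is exactly the kind of behavior a type-agnostic scheme has to avoid.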