If you're developing an application and find yourself running a benchmark whose results are measured in nanoseconds... you should probably stop and get back to more important tasks. But here we are.
I'm using binary vector embeddings to build Scour, a service that scours noisy feeds for content related to your interests. Scour uses the Hamming distance to calculate the similarity between users' interests and each piece of content. (As a refresher, the Hamming distance between two equal-length bit vectors is simply the number of positions at which their bits differ.) I got nerd sniped into wondering which Hamming distance implementation in Rust is fastest, learned more about SIMD and auto-vectorization along the way, and ended up publishing a new (and extremely simple) implementation: hamming-bitwise-fast.
(Note that we are not comparing the distances, stringzilla, or triple_accel crates because they calculate the Hamming distance between strings rather than bit vectors.)
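To make the definition above concrete, here is a minimal sketch of a naive Hamming distance over bit vectors packed into byte slices: XOR each pair of bytes, then count the set bits in the result. The function name `hamming_naive` is made up for illustration, and this is not necessarily how hamming-bitwise-fast (or any other crate) implements it.

```rust
/// Naive Hamming distance between two bit vectors packed into byte slices.
/// `hamming_naive` is a hypothetical name used only for this illustration.
pub fn hamming_naive(a: &[u8], b: &[u8]) -> u32 {
    // The Hamming distance is only defined for vectors of equal length.
    assert_eq!(a.len(), b.len(), "bit vectors must have the same length");

    a.iter()
        .zip(b)
        // XOR leaves a 1 in every position where the two bytes differ;
        // count_ones() then counts those differing bits.
        .map(|(x, y)| (x ^ y).count_ones())
        .sum()
}
```

For example, `hamming_naive(&[0b1011_0001], &[0b1001_1001])` returns 2, because the two bytes differ in exactly two bit positions.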