Atomic linking is at the heart of HVM's implementation: it is what allows threads to collaborate towards massive parallelism. All major HVM versions s

VictorTaelin / hvm3_atomic_linker.md

submited by
Style Pass
2024-11-02 00:00:06

Atomic linking is at the heart of HVM's implementation: it is what allows threads to collaborate towards massive parallelism. All major HVM versions started with a better atomic linker. From slow, buggy locks (HVM1), to AtomicCAS (HVM1.5), to AtomicSwap (HVM2), the algorithm became simpler and faster over the years.

On the initial HVM3 implementation, I noticed that one of the cases on the atomic linker never happened. After some reasoning, I now understand why, and that information can be used to greatly simplify the algorithm. The result, I believe, is, for the first time, an "Optimal Atomic Linker", in the sense it can be performed with the minimal amount of CPU instructions. While the changes look small, I believe this optimal shape will result in significant improvements in performance, as it is the most frequent operation when evaluating HVM.

The main insight is that we now treat negative and positive terms differently. On interactions, instead of taking both (i.e., swapping by 0), we take only the positive term. Then, we 'move' it into the location of the negative term, which will then call 'link' if needed. I'll document the algorithm below, and provide a very informal proof.

Leave a Comment
Related Posts