It has been a minute1 since the last update on our Open Collective and Patreon campaigns. So I thought it would be a good idea to write a slightly more formal write-up on the progress we have made in the past year.
Atomic instructions are commonly seen in modern architectures to perform indivisible operations. However, historically speaking, atomic instructions have never really been a thing for m68k, since processors in this family are predominantly single-core2, which is the model we primarily focus on in this project. That said, as a backend we still need to lower atomic instructions passed down from earlier stages in the compilation pipeline. Otherwise, LLVM will simply bail out with a crash.
For atomic load and store, the story is a lot simpler: due to the aforementioned single-core nature, lowering them to normal MOV instructions should be sufficient, which was something D136525 did. In the same patch, the author, Sheng, also dealt with something trickier: atomic compare-exchange (cmpxchg) and its friends, like atomic fetch-and-add (or add-and-fetch). Despite being single-core, the processor can still run multi-tasking systems, so we need to make sure an atomic cmpxchg is immune to system routines like interrupts and/or context switches. To this end, 68020 and later processors are equipped with the CAS instruction, which, in addition to implementing cmpxchg, can be used as the substrate for fetch-and-X instructions. For older processors, we expanded these instructions into lock-free library calls (i.e. __sync_val_compare_and_swap and __sync_fetch_*). In addition, this patch also lowered atomic read-modify-write (RMW) operations and any atomic operations larger than 32 bits into library calls to libatomic, which are not lock-free3. Last but not least, 85b37d0 added the lowering for atomic swap operations.
D146996 dealt with a similar puzzle: atomic fence. As mentioned before, we don’t need to worry about memory operation ordering on an in-order single-core processor, like most members of the 68k family. Thus, this patch only needs to prevent compiler optimizations from reordering instructions across an atomic fence. I believe there is definitely a more sophisticated solution, like adding dependencies (e.g. SelectionDAG chains) between instructions placed before and after the fence…but, well, I was lazy, so I literally copied what m68k GCC does: lower an atomic fence into an inline assembly memory barrier, a.k.a __asm__ __volatile__ ("":::"memory") (more precisely, an inline assembly instruction in LLVM’s MachineIR).