Lawrence Livermore National Laboratory, Sandia National Laboratories, and Los Alamos National Laboratory are known by the shorthand “Tri-Labs” in the HPC community, but perhaps they should be called “Try-Labs,” because historically they have tried just about every new architecture to see what promise it might hold in advancing the missions of the US Department of Energy.
Sandia, which hosts the Vanguard program for testing out novel architectures, is coming back for seconds with the third generation of waferscale systems from Cerebras Systems, in the hopes of pushing past the performance barriers of traditional HPC codes on a machine that is actually designed to run AI training and inference.
Two years ago, Sandia acquired an undisclosed number of CS-2 systems from Cerebras, each pairing a host CPU with a WSE-2 waferscale processor, with the idea of offloading some matrix-dense HPC calculations to the 16-bit floating point cores on the WSE-2 engine.
Why would Sandia even think about cutting the precision of its calculations from 64-bit or 32-bit formats down to 16 bits, a reduction by a factor of four or two? Because those WSE-2 engines, as we detailed back in March 2022, cram 850,000 cores and 40 GB of on-chip SRAM to feed them, etched across 2.6 trillion transistors, into a square of silicon the size of a dinner plate, one that has 20 PB/sec of memory bandwidth and delivers 6.25 petaflops of oomph on dense matrices and 62.5 petaflops on sparse matrices.
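A quick back-of-envelope sketch in Python, using only the figures quoted above, shows what those numbers imply; the per-core splits at the end are our own division, not official Cerebras specs:

```python
# Back-of-envelope arithmetic on the WSE-2 figures quoted in this article.
cores = 850_000
sram_gb = 40               # on-chip SRAM
mem_bw_pb_s = 20           # aggregate SRAM bandwidth, PB/sec
dense_pflops = 6.25        # FP16 throughput on dense matrices
sparse_pflops = 62.5       # FP16 throughput on sparse matrices

# Dropping from FP64 or FP32 to FP16 cuts the bits per value:
print(64 / 16)             # 4.0 -> factor of four versus 64-bit
print(32 / 16)             # 2.0 -> factor of two versus 32-bit

# Sparsity buys a 10X speedup over the dense rate:
print(sparse_pflops / dense_pflops)        # 10.0

# Assumed even split of the wafer's resources across cores:
print(sram_gb * 1024**2 / cores)           # ~49.3 KB of SRAM per core
print(mem_bw_pb_s * 1e15 / cores / 1e9)    # ~23.5 GB/sec of bandwidth per core
```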