There has been a lot more churn in the November Top500 supercomputer rankings, which are the talk of the SC24 conference in Atlanta this week, than there was in the June list that came out at the ISC24 conference in Hamburg, Germany back in May, and there are some interesting developments in the new machinery that is being installed.
The big news, of course, is that the long-awaited “El Capitan” system being built by Hewlett Packard Enterprise with hybrid CPU-GPU compute engines from AMD is up and running and is, as expected, the new top flopper on the rankings. And it holds that spot by a wide margin over both its competition in the United States and the rumored specifications of exascale-class machines in China.
A substantial portion of El Capitan – we don’t yet know how big of a portion as we write this – with 43,808 of AMD’s “Antares-A” Instinct MI300A devices (by our math) has been tested by Lawrence Livermore National Laboratory on a variety of benchmarks, including the High Performance Linpack test that has been used to rank supercomputers since 1993. The part of El Capitan that was tested using HPL has a peak theoretical performance of 2,746.4 petaflops (about 2.75 exaflops), which is significantly higher than the 2.3 exaflops to 2.5 exaflops that we were expecting. (This is, of course, for floating point math at 64-bit precision.) The sustained performance on the HPL test is 1,742 petaflops, which yields a computational efficiency of 63.4 percent. This is about the level of efficiency that we expect when a new accelerated system comes to market (our touchstone is 65 percent), and we expect that in subsequent rankings in 2025 El Capitan will bring more of its theoretical capacity to bear on benchmarks as the system works its way towards acceptance by Lawrence Livermore.
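The efficiency figure is simple arithmetic: sustained HPL performance divided by theoretical peak. Here is a minimal Python sketch using only the numbers quoted above; the variable and function names are ours, purely for illustration, and the per-device figure is just the quoted peak spread evenly over our estimated device count.

```python
# Figures quoted in the text for the tested portion of El Capitan (FP64).
RPEAK_PFLOPS = 2746.4   # peak theoretical performance
RMAX_PFLOPS = 1742.0    # sustained performance on HPL
DEVICES = 43_808        # MI300A count, an estimate "by our math"
TOUCHSTONE_PCT = 65.0   # rule-of-thumb efficiency for a new accelerated system


def hpl_efficiency(rmax: float, rpeak: float) -> float:
    """Return sustained performance as a percentage of peak."""
    return 100.0 * rmax / rpeak


efficiency = hpl_efficiency(RMAX_PFLOPS, RPEAK_PFLOPS)
gap_to_touchstone = TOUCHSTONE_PCT - efficiency

# Peak per device in teraflops (1 petaflops = 1,000 teraflops).
peak_per_device_tflops = RPEAK_PFLOPS * 1000 / DEVICES

print(f"HPL efficiency: {efficiency:.1f} percent")
print(f"Gap to touchstone: {gap_to_touchstone:.1f} points")
print(f"Implied peak per device: {peak_per_device_tflops:.1f} teraflops")
```

Run as written, this reproduces the 63.4 percent figure and shows the machine sitting a bit under two points shy of the 65 percent touchstone.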
As a reminder, the MI300A was revealed alongside its MI300X sibling (which has eight GPU chiplets and no CPU cores) back in December 2023. The MI300A has three chiplets with two dozen “Genoa” Epyc cores in total and six chiplets of Antares GPU streaming multiprocessors running at 1.8 GHz. In the Cray EX systems, all of the MI300A compute engines are linked to each other with HPE’s “Rosetta” Slingshot 11 Ethernet interconnect. All told, there are 1.05 million Genoa cores and just a hair under 10 million streaming multiprocessors on the GPU chiplets in the section of El Capitan tested. This is obviously an enormous amount of concurrency to manage. But it is not crazy. The Sunway “TaihuLight” supercomputer at the National Supercomputing Center in Wuxi, China, which has been on the Top500 rankings since 2016 and is still the fifteenth most powerful machine in the world (of those tested using HPL, at least), has a total of 10.65 million cores.
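For those who want to check the concurrency math, here is a rough Python sketch. The device count is our own estimate from earlier in the story, and the figure of 228 streaming multiprocessors (AMD calls them compute units) per MI300A is an assumption taken from AMD's published specifications rather than from anything stated here.

```python
DEVICES = 43_808             # MI300A count in the tested portion (our estimate)
CPU_CORES_PER_DEVICE = 24    # three CPU chiplets, two dozen Genoa cores in total
GPU_UNITS_PER_DEVICE = 228   # assumption: compute units per MI300A per AMD's spec sheet

cpu_cores = DEVICES * CPU_CORES_PER_DEVICE
gpu_units = DEVICES * GPU_UNITS_PER_DEVICE

TAIHULIGHT_CORES = 10_650_000  # 10.65 million, as cited in the text

print(f"CPU cores: {cpu_cores:,}")
print(f"GPU streaming multiprocessors: {gpu_units:,}")
print(f"Combined, relative to TaihuLight: {(cpu_cores + gpu_units) / TAIHULIGHT_CORES:.2f}x")
```

Under those assumptions the totals come to 1,051,392 CPU cores and 9,988,224 GPU units, which squares with the 1.05 million and "just a hair under 10 million" figures.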