Ampere Readies 256-Core CPU Beast, Awaits The AI Inference Wave


For the past two decades, the game in compute engines has been to try to pack as many cores and additional functionality as possible into a socket and make the overall system price/performance come down per unit of power consumed and heat dissipated.

The first dual-core processors entered the datacenter in 2001, before Dennard scaling of chip clock speeds more or less ceased around four years later, which was the last free ride in architectural enhancements that chip architects had. Moore's Law was still going strong back then, but it was clearly entering middle age: the cost of transistors kept getting smaller with each manufacturing node, but at a decreasing rate. The cost per transistor started going up rather than down around the 10 nanometer barrier, and that rise will continue for the foreseeable future, until we find an alternative to CMOS chip etching. Which probably means for as long as any of us of a certain age will care.

And so we want more and more cores in our compute engines, and the socket full of chiplets is becoming the motherboard, like a black hole sucking the surrounding components into it, because anything that keeps the signaling inside the socket increases computational and economic efficiency, even if the move to chiplets creates all kinds of havoc with power and thermals. The interconnects are eating an increasing share of the socket power budget, but moving to chiplets increases yield, which lowers manufacturing costs and allows a kind of flexibility that we think the industry wants. Why should your compute engine socket only come with components from one chip maker? Your motherboard never did.
