
How To Build A Better “Blackwell” GPU Than Nvidia Did


While a lot of people focus on the floating point and integer processing architectures of various kinds of compute engines, we are spending more and more of our time looking at memory hierarchies and interconnect hierarchies. And that is because compute is easy, and data movement and memory are getting harder and harder.

To put some simple numbers on this: Over the past two decades, CPU and then GPU compute capacity has increased by a factor of 90,000X, but DRAM memory bandwidth has only increased by 30X, and interconnect bandwidth has likewise only increased by 30X. We have gotten better in some ways in recent years, but we think that the compute-memory balance is still far out of whack, and it means we are overspending on under-memoried compute engines for a lot of AI and HPC workloads.
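A quick back-of-the-envelope calculation makes the point. The 90,000X and 30X growth factors below come from the figures cited above; everything else in this sketch is just arithmetic on those two numbers, not additional data:

```python
# Back-of-the-envelope check on the compute/memory gap cited above.
# The 90,000x and 30x growth factors come from the article; the rest is
# simple arithmetic to illustrate how far the balance has drifted.

compute_growth = 90_000   # CPU/GPU compute capacity growth over ~two decades
dram_bw_growth = 30       # DRAM bandwidth growth over the same period
interconnect_growth = 30  # interconnect bandwidth growth over the same period

# The bytes of memory bandwidth available per unit of compute shrinks by the
# ratio of the two growth factors.
imbalance = compute_growth / dram_bw_growth
print(f"Bytes per flop today is roughly 1/{imbalance:,.0f} of what it was")
# -> compute has outrun memory bandwidth by a factor of about 3,000
```

In other words, for every unit of compute, a modern engine has roughly one three-thousandth of the relative memory bandwidth it had two decades ago, which is why data movement, not math, is the bottleneck.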

It is with this in mind that we consider the architectural innovations at the physical layer, or PHY, in networks that have been created by Eliyan and that are being cast in a different and very useful light this week at the MemCon 2024 conference. Co-founder and chief executive officer Ramin Farjadrad took some time to show us how the NuLink PHY and its use cases have evolved over time, and how they can be used to build better, cheaper, and more powerful compute engines than is possible with current packaging techniques based on silicon interposers.
