The future of AI needs more flexible GPU capacity | Modal Blog


The last couple of years of Gen AI frenzy have brought us some undeniably cool new products, like Copilot and Suno. One thing they all have in common is that they demand lots of compute – GPUs in particular. But the supply of GPUs is constrained, and this supply-demand imbalance has made the market for cloud GPUs behave very differently from other cloud markets.

Why is that? What should we do? And what can we expect going forward? Should startups keep buying long-term GPU reservations from cloud vendors? Or will there be other options in the future?

Today, most of that GPU demand comes from training. But training is a cost center, and eventually you need to recoup that cost through real revenue. How do you do that? Enter inference – the less sexy but money-making sibling of training.

So why is most GPU demand driven by training even though inference is where you make the money? I think a lot of it reflects where we are in the cycle – there's an expectation that the revenue potential of inference is big, but to capture it, you first have to spend a lot of money on training.

The economics of this – high upfront capital, but a large potential payoff – is something VCs understand quite well, which I think explains why model builders have had no trouble raising lots of money. But for the economics to work out eventually, we need to see a much larger share of GPU spend going toward inference.
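To make the shape of that trade concrete, here's a toy back-of-envelope sketch of when inference pays back a training run. Every number in it – the training cost, GPU-hour price, throughput, and per-request revenue – is a made-up assumption for illustration, not data from this post or from the market.

```python
# Toy break-even sketch. All numbers below are illustrative
# assumptions, not real market figures.

TRAINING_COST = 100e6           # hypothetical one-time training spend, USD
GPU_HOUR_COST = 4.0             # hypothetical cloud price per GPU-hour, USD
REQUESTS_PER_GPU_HOUR = 10_000  # hypothetical inference throughput
REVENUE_PER_REQUEST = 0.002     # hypothetical revenue per request, USD

# Margin earned per GPU-hour spent serving inference.
margin_per_gpu_hour = REQUESTS_PER_GPU_HOUR * REVENUE_PER_REQUEST - GPU_HOUR_COST

# GPU-hours of inference needed to pay back the training run.
breakeven_gpu_hours = TRAINING_COST / margin_per_gpu_hour

print(f"Margin per inference GPU-hour: ${margin_per_gpu_hour:.2f}")
print(f"GPU-hours to recoup training:  {breakeven_gpu_hours:,.0f}")
```

Under these made-up numbers it takes about 6.25 million GPU-hours of inference to recoup a single training run – which is exactly the point: the payoff only materializes if a large share of GPU spend shifts to inference.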
