
Need a Trillion-Parameter LLM? Google Cloud Is for You.

Submitted by
Style Pass
2024-11-26 04:30:04


At KubeCon+CloudNativeCon North America earlier this month, Google Cloud announced it had upgraded its Google Kubernetes Engine (GKE) to support clusters of up to 65,000 nodes. That’s a big leap up from its previous limit of 15,000 nodes. This enhancement is specifically designed to meet the growing demands of training and running trillion-parameter AI large language models (LLMs).

How big is that? The biggest current LLM is OpenAI’s GPT-4, with an estimated 1.7 trillion parameters. Next is Google Gemini with 1.56 trillion, and then Meta’s largest Llama model with 405 billion. According to Google, “We believe GKE offers more than 10X larger scale than the other two largest public cloud providers.” So, unless your business plan is to go toe-to-toe with the top LLMs, a trillion parameters should be more than enough.
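A quick back-of-envelope calculation shows why models at this parameter count demand clusters of this scale. The sketch below assumes 2 bytes per parameter (bfloat16, a common training precision); optimizer state, gradients, and activations add several multiples on top of the weights themselves:

```python
# Rough memory estimate for holding model weights alone,
# assuming bfloat16 storage (2 bytes per parameter).
def weight_memory_tb(parameters: float, bytes_per_param: int = 2) -> float:
    """Terabytes needed just to store the weights."""
    return parameters * bytes_per_param / 1e12

gpt4_estimate = weight_memory_tb(1.7e12)  # GPT-4's estimated size
llama_405b = weight_memory_tb(405e9)      # Meta's 405B-parameter model

print(f"1.7T params: {gpt4_estimate:.1f} TB of weights")   # ~3.4 TB
print(f"405B params: {llama_405b:.2f} TB of weights")      # ~0.81 TB
```

Several terabytes of weights alone far exceed any single machine’s accelerator memory, which is why training is sharded across thousands of nodes.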

Even a single GKE cluster can now manage AI models spread across 250,000 tensor processing units (TPUs), Google’s specialized AI processors. This is a fivefold increase from GKE’s previous benchmark, which supported 50,000 TPU chips in a single cluster.
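To put those chip counts in perspective, here is a hedged sketch of the aggregate accelerator memory such a cluster could offer. The 95 GB-per-chip figure is an assumption based on a TPU v5p-class part; actual HBM capacity varies by TPU generation, so treat this as an order-of-magnitude estimate:

```python
# Aggregate high-bandwidth memory (HBM) across a cluster's TPU fleet.
# ASSUMPTION: ~95 GB HBM per chip (TPU v5p-class); other generations
# differ, so this is an order-of-magnitude sketch only.
def cluster_hbm_pb(num_chips: int, hbm_gb_per_chip: float = 95.0) -> float:
    """Petabytes of HBM across the whole cluster."""
    return num_chips * hbm_gb_per_chip / 1e6

old_limit = cluster_hbm_pb(50_000)    # GKE's previous 50,000-chip benchmark
new_limit = cluster_hbm_pb(250_000)   # the new 250,000-chip ceiling

print(f"50k chips:  {old_limit:.2f} PB of HBM")            # ~4.75 PB
print(f"250k chips: {new_limit:.2f} PB of HBM")            # ~23.75 PB
print(f"scale-up:   {new_limit / old_limit:.0f}x")         # fivefold
```

Whatever the exact per-chip figure, the fivefold jump in chip count translates directly into a fivefold jump in the total memory and compute a single cluster can bring to bear on one model.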
