Networking is a critical part of any datacenter, especially with the rise of networking-intensive large language models. As such, it was a clear targe

Google Apollo: The >$3 Billion Game-Changer in Datacenter Networking

submited by
Style Pass
2023-03-17 22:30:04

Networking is a critical part of any datacenter, especially with the rise of networking-intensive large language models. As such, it was a clear target for Google’s infrastructure optimization efforts. Over the last year at conferences such as OFC and SIGCOMM, Google disclosed their custom networking stack, Jupiter, from in-house switches all the way through to custom reconfigurable software.

This custom networking stack enables them to save at least $3 billion versus the industry standard implemented by competitors such as Amazon and Microsoft. In addition to cost improvements, Google also achieves higher network performance and lower latencies! This custom networking stack started being deployed more than 5 years ago and is now implemented in most of Google’s datacenters. Google’s custom networking was an integral part of training their most advanced large language models, including PaLM.

Before diving too deep into how their custom networking stack works, let’s quickly discuss what it does and the industry implications. First off, Google claims their custom network improves throughput by 30%, use 40% less power, incurs 30% less Capex, reduces flow completion by 10%, and delivers 50x less downtime across their network.

Leave a Comment