INTELLECT-1 Release: The First Globally Trained 10B Parameter Model

We're excited to release INTELLECT-1, the first 10B-parameter language model collaboratively trained across the globe. This represents a 10× scale-up from our previous research and demonstrates that large-scale model training is no longer confined to large corporations but can be achieved through distributed, community-driven approaches. The next step is scaling this even further to frontier model sizes and, ultimately, open-source AGI.

We present the first large-scale experiment collaboratively training a 10-billion-parameter model over 1 trillion tokens across five countries and three continents, on up to 112 H100 GPUs simultaneously. We achieve an overall compute utilization of 83% across continents and 96% when training exclusively on nodes distributed across the entire United States, introducing minimal overhead compared to centralized training approaches.
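For context on these numbers: compute utilization here is, roughly, the share of wall-clock time spent on actual training computation rather than waiting on cross-node communication. A back-of-envelope sketch of that metric (the timings below are illustrative assumptions, not measured values from the run):

    # Hypothetical timings chosen only to illustrate the metric; they are not
    # measurements from the INTELLECT-1 run.
    def utilization(compute_s: float, comm_s: float) -> float:
        # Fraction of wall-clock time spent computing rather than communicating.
        return compute_s / (compute_s + comm_s)

    # E.g., ~38 min of local compute per round against ~8 min of
    # cross-continent communication works out to roughly 83%:
    print(f"{utilization(compute_s=38 * 60, comm_s=8 * 60):.0%}")  # -> 83%

The takeaway is that overhead stays small as long as communication time is amortized over long stretches of local computation.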

Our results show that INTELLECT-1 can maintain training convergence and high compute utilization despite severe bandwidth constraints and node volatility, opening new possibilities for decentralized, community-driven training of frontier foundation models.
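A standard way to tolerate low bandwidth is DiLoCo-style local updates: each node takes many local optimizer steps on its own data shard, and nodes only exchange an averaged "pseudo-gradient" (shared weights minus locally updated weights) every H steps, so one round of communication is amortized over hundreds of compute steps. The single-process sketch below simulates that pattern on a toy model; the model, optimizers, and hyperparameters are illustrative assumptions, not the actual INTELLECT-1 training stack.

    import copy
    import torch
    import torch.nn as nn

    torch.manual_seed(0)

    H = 100        # local steps between synchronizations (illustrative)
    WORKERS = 4    # simulated nodes
    ROUNDS = 3     # outer synchronization rounds

    global_model = nn.Linear(16, 1)  # toy stand-in for the 10B model
    # DiLoCo applies the averaged pseudo-gradient with Nesterov-momentum SGD.
    outer_opt = torch.optim.SGD(global_model.parameters(), lr=0.7,
                                momentum=0.9, nesterov=True)

    for r in range(ROUNDS):
        deltas = [torch.zeros_like(p) for p in global_model.parameters()]
        for _ in range(WORKERS):
            # Each node copies the shared weights and trains independently,
            # with zero communication during its H local steps.
            local = copy.deepcopy(global_model)
            inner_opt = torch.optim.AdamW(local.parameters(), lr=1e-3)
            for _ in range(H):
                x = torch.randn(32, 16)  # toy local data shard
                loss = nn.functional.mse_loss(local(x), x.sum(1, keepdim=True))
                inner_opt.zero_grad()
                loss.backward()
                inner_opt.step()
            # Pseudo-gradient: how far this node drifted from the shared weights.
            for d, pg, pl in zip(deltas, global_model.parameters(),
                                 local.parameters()):
                d += (pg.detach() - pl.detach()) / WORKERS
        # One cheap synchronization per H local steps: apply the averaged
        # pseudo-gradient with the outer optimizer. Over a WAN, averaging the
        # deltas is the only all-reduce needed per round.
        for p, d in zip(global_model.parameters(), deltas):
            p.grad = d
        outer_opt.step()
        outer_opt.zero_grad()
        print(f"round {r}: one sync after {H} local steps per node")

Because the synchronized payload is one set of weight deltas every H steps rather than gradients every step, per-step bandwidth requirements drop by roughly a factor of H, which is what makes cross-continent training with high utilization plausible.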
