DiLoCo: Distributed Model Training in Gonka
Large language models such as GPT or Qwen are typically trained on large GPU clusters connected by very fast interconnects. DiLoCo (Distributed Low-Communication training) takes a different approach: it allows such models to be trained over the regular internet, without a single data center.
Why Distributed Training is Needed
Modern AI models can contain hundreds of billions of parameters. Training such a model requires hundreds or even thousands of GPUs working in sync. The traditional approach is to place all of the GPUs in one data center and connect them with a high-speed fabric such as InfiniBand. This is expensive, limits scale, and creates a single point of failure. DiLoCo instead allows training to be distributed across clusters in different parts of the world.
How DiLoCo Works
Each GPU cluster (e.g., 8×H100) trains its own copy of the model locally using the AdamW optimizer. Roughly every 1,000 steps, the clusters synchronize: the change in each cluster's parameters is averaged and applied through a global (outer) optimizer that uses Nesterov momentum. Because the clusters communicate only at these synchronization points, the required bandwidth is low and a regular internet connection is sufficient. This is radically different from the classic approach, where GPUs exchange gradients at every step.
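The two-level loop described above can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not Gonka's implementation: the function names (`adamw_step`, `outer_step`, `diloco`), the quadratic loss, and all hyperparameter values are hypothetical, and each "cluster" is simulated as a list of floats with its own target.

```python
import math

def adamw_step(w, grad, state, lr=0.1, b1=0.9, b2=0.999, eps=1e-8, wd=0.0):
    """One AdamW update on a list of floats; `state` holds moments and step count."""
    state["t"] += 1
    t = state["t"]
    new_w = []
    for i, (wi, gi) in enumerate(zip(w, grad)):
        state["m"][i] = b1 * state["m"][i] + (1 - b1) * gi
        state["v"][i] = b2 * state["v"][i] + (1 - b2) * gi * gi
        m_hat = state["m"][i] / (1 - b1 ** t)
        v_hat = state["v"][i] / (1 - b2 ** t)
        # Decoupled weight decay: applied to the weight, not mixed into the gradient.
        new_w.append(wi - lr * (m_hat / (math.sqrt(v_hat) + eps) + wd * wi))
    return new_w

def outer_step(w_global, local_avg, velocity, lr=0.7, mu=0.9):
    """Global update: SGD with Nesterov momentum on the pseudo-gradient."""
    new_w, new_v = [], []
    for wg, wl, v in zip(w_global, local_avg, velocity):
        delta = wg - wl          # pseudo-gradient: global weights minus averaged local weights
        v = mu * v + delta
        new_v.append(v)
        new_w.append(wg - lr * (delta + mu * v))
    return new_w, new_v

def diloco(targets, dim, rounds=10, inner_steps=100):
    """Toy DiLoCo loop; cluster k minimizes 0.5 * ||w - targets[k]||^2 (an assumed loss)."""
    w_global = [0.0] * dim
    velocity = [0.0] * dim
    states = [{"t": 0, "m": [0.0] * dim, "v": [0.0] * dim} for _ in targets]
    for _round in range(rounds):
        local_weights = []
        for target, state in zip(targets, states):
            w = list(w_global)   # every cluster starts the round from the global weights
            for _step in range(inner_steps):   # no communication inside this loop
                grad = [wi - ti for wi, ti in zip(w, target)]
                w = adamw_step(w, grad, state)
            local_weights.append(w)
        # The only communication: average the local weights once per round.
        local_avg = [sum(w[i] for w in local_weights) / len(local_weights)
                     for i in range(dim)]
        w_global, velocity = outer_step(w_global, local_avg, velocity)
    return w_global

# Three simulated clusters with different data (targets); the shared optimum is their mean.
targets = [[1.0, -2.0], [3.0, 0.0], [2.0, 4.0]]
w_final = diloco(targets, dim=2)
print(w_final)
```

In this sketch the clusters exchange data once per `inner_steps` local updates rather than at every step, which is the property that keeps bandwidth requirements low. A real system would replace the toy loss with a language-model loss and the list averaging with a network all-reduce.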
What This Means for the Gonka Network
Thanks to DiLoCo, Gonka can train models with 30-50 billion parameters using host GPUs scattered all over the world. No single data center is needed, just clusters of 8 GPUs with an internet connection. This makes AI training truly decentralized and opens the way for models trained by the community itself.
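To get a feel for the communication savings, here is a rough back-of-the-envelope calculation. It is only a sketch under loud assumptions: every exchange is assumed to move one full parameter-sized payload at 2 bytes per parameter (for example bf16), with no compression and no all-reduce overhead, and the 1 Gbit/s link speed is a hypothetical figure. The function names are illustrative.

```python
def total_traffic_bytes(num_params, total_steps, sync_every, bytes_per_param=2):
    """Bytes sent by one cluster over `total_steps`, assuming one
    parameter-sized payload per synchronization (an assumption, see above)."""
    syncs = total_steps // sync_every
    return syncs * num_params * bytes_per_param

def transfer_seconds(payload_bytes, link_bits_per_second):
    """Time to push a payload through a link of the given speed."""
    return payload_bytes * 8 / link_bits_per_second

num_params = 30_000_000_000          # 30B parameters, the lower end of the range above
steps = 10_000

per_step = total_traffic_bytes(num_params, steps, sync_every=1)      # classic approach
rare_sync = total_traffic_bytes(num_params, steps, sync_every=1000)  # DiLoCo-style

payload = num_params * 2             # bytes in a single synchronization
print(per_step // rare_sync)                       # 1000
print(payload / 1e9)                               # 60.0 (GB per synchronization)
print(transfer_seconds(payload, 1_000_000_000))    # 480.0 (seconds at 1 Gbit/s)
```

Under these assumptions, synchronizing every 1,000 steps cuts total traffic by a factor of 1,000 compared with exchanging data at every step. The absolute numbers depend heavily on precision, compression, and link speed, so they should be read as an order-of-magnitude estimate only.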
DiLoCo is a technology for training AI models over the internet. GPU clusters work independently and synchronize rarely, allowing Gonka to train models without a centralized data center.