DiLoCo: Distributed Model Training in Gonka

Large language models like GPT or Qwen are trained on huge clusters of GPUs connected by ultra-fast channels. DiLoCo (Distributed Low-Communication training) changes the game: it makes it possible to train such models over the regular internet, without a single data center.

Why Distributed Training is Needed

Modern AI models contain hundreds of billions of parameters. Training such a model requires hundreds of GPUs working synchronously. The traditional approach is to assemble all GPUs in one data center and connect them with InfiniBand. This is expensive, limits scale, and creates a single point of failure. DiLoCo allows distributed training across clusters in different parts of the world.

How DiLoCo Works

Each GPU cluster (for example, 8×H100) trains the model locally with the AdamW optimizer. Roughly every 1,000 local steps, the clusters synchronize through a global (outer) optimizer that applies Nesterov momentum to the averaged parameter updates. Because only these infrequent updates cross the network, synchronization needs minimal bandwidth, and a regular internet connection is enough. This is radically different from the classic approach, where GPUs exchange gradients at every step.
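The two-level loop above can be sketched in a few lines. This is a minimal toy illustration under stated assumptions, not Gonka's implementation: plain SGD stands in for each cluster's AdamW, and every `grad_fn` is a hypothetical stand-in for a worker's local gradient computation.

```python
import numpy as np

def diloco_round(global_params, workers, inner_steps, inner_lr,
                 outer_lr, outer_momentum, velocity):
    """One DiLoCo round: every worker trains locally from the same
    starting point, then one outer Nesterov-momentum step is applied
    to the averaged 'pseudo-gradient'."""
    local_replicas = []
    for grad_fn in workers:
        # Local phase: copy the global weights and run many optimizer
        # steps with NO network traffic (SGD here as a stand-in for
        # the AdamW each cluster uses).
        theta = global_params.copy()
        for _ in range(inner_steps):
            theta -= inner_lr * grad_fn(theta)
        local_replicas.append(theta)

    # Pseudo-gradient: how far the averaged replicas drifted from the
    # global weights. Only one such vector per round crosses the
    # internet, which is why the bandwidth requirement is small.
    pseudo_grad = global_params - np.mean(local_replicas, axis=0)

    # Outer phase: Nesterov momentum on the pseudo-gradient.
    velocity = outer_momentum * velocity + pseudo_grad
    new_global = global_params - outer_lr * (outer_momentum * velocity
                                             + pseudo_grad)
    return new_global, velocity

# Toy demo: two "clusters" whose local losses pull toward different
# targets; the outer loop converges to the consensus solution.
targets = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
workers = [lambda th, t=t: 2.0 * (th - t) for t in targets]

theta, velocity = np.zeros(2), np.zeros(2)
for _ in range(30):
    theta, velocity = diloco_round(theta, workers, inner_steps=10,
                                   inner_lr=0.05, outer_lr=0.7,
                                   outer_momentum=0.9, velocity=velocity)
# theta ends up near the consensus optimum [0.5, 0.5]
```

The outer learning rate and momentum values here (0.7 and 0.9) are illustrative choices for the toy problem, not Gonka's settings.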

What this means for the Gonka network

Thanks to DiLoCo, Gonka can train models with 30-50 billion parameters using host GPUs scattered all over the world. No single data center is needed – just clusters of 8 GPUs with an internet connection. This makes AI training truly decentralized and opens the way for models trained by the community itself.

DiLoCo is a technology for training AI models over the internet. GPU clusters work independently and synchronize rarely, allowing Gonka to train models without a centralized data center.
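To make "minimal bandwidth" concrete, here is a back-of-the-envelope estimate of the communication volume. All numbers are illustrative assumptions (fp16 weights, the 30B size and ~1,000-step interval mentioned above), not measured Gonka figures.

```python
PARAMS = 30e9              # a 30B-parameter model, as in the text
BYTES_PER_PARAM = 2        # assuming fp16/bf16 weights
SYNC_INTERVAL = 1000       # inner steps between synchronizations

# Data each cluster ships per synchronization round.
payload_gb = PARAMS * BYTES_PER_PARAM / 1e9
print(f"per-sync payload: {payload_gb:.0f} GB")        # 60 GB

# A per-step gradient exchange would move a comparable payload
# 1,000x as often over the same training window, so rare
# synchronization cuts communication volume roughly 1,000-fold.
per_step_gb = payload_gb * SYNC_INTERVAL
print(f"same window, per-step exchange: {per_step_gb:.0f} GB")
```

The ~1,000-fold reduction in traffic is what lets an ordinary internet link replace an InfiniBand fabric.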

Want to learn more?

Understand the GNK economy or start earning right now.