Qwen3-235B: The Model Mined by Gonka
What is Qwen3-235B
Qwen3-235B-A22B-Instruct-2507-FP8 is a large language model (LLM) from the Qwen3 family, developed by the Qwen team at Alibaba Cloud. The full name breaks down as follows: Qwen3 is the third generation of the series; 235B means 235 billion parameters in total; A22B means 22 billion active parameters per request; Instruct marks a version trained to follow instructions; 2507 is the July 2025 release; FP8 denotes 8-bit quantization for memory optimization.
The key architectural feature is MoE (Mixture of Experts). Unlike 'dense' models (GPT-5.4, Claude Sonnet 4.5) where every token passes through all parameters, an MoE model activates only a subset of 'experts' — specialized neural network blocks — for each request. In the case of Qwen3-235B, out of 235 billion parameters, only 22 billion are activated per token — less than 10%. This delivers the quality level of models with 200B+ parameters at the computational cost of a 22B model.
Practically, this means the model is smarter than one might expect from its speed. It processes requests significantly faster than dense models of comparable quality, while requiring dramatically less VRAM for inference. This is why MoE became the dominant architecture for the largest models of 2025-2026.
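The routing idea behind MoE can be sketched in a few lines of Python. This is a toy illustration, not Qwen3's actual router: a gating layer scores every expert, and only the top-k highest-scoring experts run for a given token, so most parameters stay idle on each forward pass.

```python
def moe_forward(token, experts, gate_scores, k=2):
    """Toy MoE layer: run only the top-k scored experts for this token.

    experts:     list of callables, one per expert block
    gate_scores: one score per expert (in a real model these come from
                 a learned gating layer, not from us)
    """
    # Select the k highest-scoring experts.
    top_k = sorted(range(len(experts)),
                   key=lambda i: gate_scores[i], reverse=True)[:k]
    # Combine only those experts' outputs, weighted by their gate scores.
    total_weight = sum(gate_scores[i] for i in top_k)
    return sum(gate_scores[i] * experts[i](token) for i in top_k) / total_weight

# Eight toy "experts": each just scales its input differently.
experts = [lambda x, s=s: s * x for s in range(1, 9)]
scores = [0.05, 0.30, 0.10, 0.02, 0.25, 0.08, 0.15, 0.05]
out = moe_forward(1.0, experts, scores, k=2)  # only experts 1 and 4 fire
```

With k=2 of 8 experts, only a quarter of the expert parameters are touched per token; Qwen3-235B's ratio (22B of 235B) works the same way at a much larger scale.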
The context window of Qwen3-235B is 131,072 tokens (~100,000 words) — enough to analyze entire books, codebases, or long legal documents in a single query. The model supports 119 languages, including Russian, English, Chinese, Arabic, Hindi, and dozens of others — making it one of the most multilingual models on the market.
Characteristics and Benchmarks
Qwen3-235B competes with the largest closed and open models. Here's a comparison of key characteristics:
| Model | Parameters | Context | MoE | Open Source | Price (per 1M tokens) |
|---|---|---|---|---|---|
| Qwen3-235B (via JoinGonka) | 235B (22B active) | 131K | Yes | Yes (Apache 2.0) | $0.001 |
| GPT-5.4 (OpenAI) | ~1.8T (estimate) | 128K | Yes (presumed) | No | $2.50 |
| Claude Sonnet 4.5 (Anthropic) | Undisclosed | 200K | No (presumed) | No | $3.00 |
| Llama 4 Maverick (Meta) | 400B (17B active) | 1M | Yes | Yes (Llama License) | $0.20+ (hosting) |
| DeepSeek-R1 (DeepSeek) | 671B (37B active) | 128K | Yes | Yes (MIT) | $0.55 |
Qwen3-235B demonstrates a level of quality comparable to GPT-5.4 and Claude Sonnet 4.5 on most benchmarks, while its cost via JoinGonka Gateway is 2,500 times lower than that of GPT-5.4. This is possible due to two factors: the MoE architecture reduces computational costs, and the decentralized Gonka network eliminates data center margins.
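The price multipliers quoted here are simple arithmetic over the per-million-token rates from the comparison table:

```python
# Per-1M-token prices from the comparison table above (USD).
price_gonka = 0.001   # Qwen3-235B via JoinGonka
price_gpt = 2.50      # GPT-5.4
price_claude = 3.00   # Claude Sonnet 4.5

print(round(price_gpt / price_gonka))     # → 2500 (times cheaper than GPT-5.4)
print(round(price_claude / price_gonka))  # → 3000 (times cheaper than Claude Sonnet 4.5)
```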
On MMLU-Pro, HumanEval, MATH-500, and GSM8K benchmarks, the model ranks among the top three open-source models, trailing only DeepSeek-R1 in mathematical reasoning tasks. In code generation, translation, and instruction-following tasks, Qwen3-235B consistently outperforms Llama 4 Maverick and is comparable to Claude Sonnet 4.5.
How Gonka Uses Qwen3-235B
The Qwen3-235B model operates on the Gonka network in a distributed manner, via the DiLoCo protocol adapted for inference. The full model in FP8 format requires approximately 640 GB of video memory (VRAM), which cannot fit on a single GPU: even an H100 (80 GB) or H200 (141 GB) falls far short. The model is therefore split layer-wise (tensor parallelism + pipeline parallelism) across several MLNodes.
In practice, Qwen3-235B runs on a cluster of 8-16 GPU nodes, each with at least 40 GB of VRAM. Transfer Agents route a request to the right cluster, vLLM on each node processes its shard of the model, and the results are aggregated and returned to the user. The whole round trip takes hundreds of milliseconds; the user never notices that the request was processed by a dozen GPUs at different points on the planet.
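A back-of-the-envelope check of these cluster sizes (a sketch only: real deployments also need VRAM headroom for the KV cache and activations, so practical node counts run higher):

```python
import math

def min_nodes(model_vram_gb: float, vram_per_node_gb: float) -> int:
    """Smallest number of nodes whose combined VRAM can hold the model weights."""
    return math.ceil(model_vram_gb / vram_per_node_gb)

# ~640 GB of FP8 weights:
print(min_nodes(640, 40))  # 16 nodes at the 40 GB minimum
print(min_nodes(640, 80))  # 8 nodes of H100-class 80 GB GPUs
```

This matches the 8-16 node range above: the floor depends on how much VRAM each MLNode contributes.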
An important technical detail: Gonka uses vLLM as the engine for serving. vLLM is an open-source project that provides high-performance text generation through PagedAttention — an algorithm that optimizes VRAM usage when processing multiple requests in parallel. This allows the network to serve thousands of concurrent users without quality degradation.
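The idea behind PagedAttention can be illustrated with a toy block allocator. This is a simplification of what vLLM actually does: instead of reserving one large contiguous VRAM region per request, the KV cache is split into fixed-size blocks that are handed out on demand and returned when a sequence finishes, so memory is never wasted on unused context.

```python
class PagedKVCache:
    """Toy paged KV cache: fixed-size blocks allocated on demand per sequence."""

    def __init__(self, num_blocks: int, block_size: int):
        self.block_size = block_size
        self.free = list(range(num_blocks))  # pool of unused block ids
        self.tables = {}   # seq_id -> list of block ids (the "page table")
        self.lengths = {}  # seq_id -> tokens stored so far

    def append_token(self, seq_id: int) -> None:
        n = self.lengths.get(seq_id, 0)
        if n % self.block_size == 0:  # current block full (or first token)
            self.tables.setdefault(seq_id, []).append(self.free.pop())
        self.lengths[seq_id] = n + 1

    def free_seq(self, seq_id: int) -> None:
        # Finished sequences return their blocks to the shared pool.
        self.free.extend(self.tables.pop(seq_id, []))
        self.lengths.pop(seq_id, None)

cache = PagedKVCache(num_blocks=64, block_size=16)
for _ in range(40):            # a 40-token sequence needs ceil(40/16) = 3 blocks
    cache.append_token(seq_id=1)
```

Because blocks are pooled rather than pre-reserved per request, many concurrent sequences of very different lengths can share the same VRAM, which is what lets a node serve thousands of users in parallel.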
The model supports native tool calling — calling functions and tools directly from the model's response. This capability was added in Gonka via PR #767 with a threshold of 0.958 for detecting tool calls. This means developers can build AI agents that interact with external APIs, databases, and tools — all through a single request to Qwen3-235B.
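A tool-calling request has the same shape as a plain chat completion, plus a `tools` array in the standard OpenAI function-calling format. The sketch below builds such a payload for the Gateway endpoint; the `get_weather` function and its schema are invented here purely for illustration.

```python
import json

# Hypothetical tool definition (OpenAI function-calling schema).
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

payload = {
    "model": "qwen3-235b-a22b",
    "messages": [{"role": "user", "content": "What's the weather in Lisbon?"}],
    "tools": [get_weather_tool],
}

# POST this JSON to https://gate.joingonka.ai/api/v1/chat/completions with an
# Authorization: Bearer header; instead of plain text, the model may answer
# with a tool_calls entry naming the function and its arguments.
body = json.dumps(payload)
```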
The current Gonka network has over 4,000 GPUs (H100, H200, A100, RTX 4090 and others), combined into 120+ MLNode. This is one of the largest distributed GPU networks for AI inference in the world — and all this power is directed at serving Qwen3-235B.
How to Try Qwen3-235B
The easiest way to try Qwen3-235B is through the JoinGonka API Gateway. The Gateway provides an OpenAI-compatible API, which means any code written for OpenAI works with Qwen3-235B without changes — just replace the URL and API key.
Example request:

```bash
curl https://gate.joingonka.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-235b-a22b",
    "messages": [{"role": "user", "content": "Explain MoE architecture"}]
  }'
```

Cost: $0.001 per 1 million tokens — 2,500 times cheaper than GPT-5.4 ($2.50/1M) and 3,000 times cheaper than Claude Sonnet 4.5 ($3.00/1M). Upon registration, you receive 10 million free tokens for testing.
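The same call can be made from Python. The sketch below uses only the standard library and the endpoint and model name from the curl example; the OpenAI SDK works equally well when pointed at the Gateway base URL.

```python
import json
import urllib.request

API_KEY = "YOUR_API_KEY"  # issued in the JoinGonka Dashboard

def chat_request(prompt: str) -> urllib.request.Request:
    """Build the same chat-completion call as the curl example above."""
    body = json.dumps({
        "model": "qwen3-235b-a22b",
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        "https://gate.joingonka.ai/api/v1/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )

req = chat_request("Explain MoE architecture")
# To actually send it:
#   with urllib.request.urlopen(req) as resp:
#       print(json.load(resp)["choices"][0]["message"]["content"])
```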
The Gateway is compatible with popular development tools: Quick Start describes connection via Python, Node.js, and curl. IDE integrations are also supported — Cursor, Continue, Cline, Aider, and Claude Code — and frameworks for AI agents: LangChain, n8n, LibreChat, Open WebUI.
For a quick start:
- Register on gate.joingonka.ai (connect a wallet or create a new one)
- Get an API key in the Dashboard
- Replace `api.openai.com` with `gate.joingonka.ai/api` in your code
- Use the model `qwen3-235b-a22b`
Qwen3-235B through JoinGonka is enterprise-level AI at the price of a hobby project.