Qwen3-32B
AlibabaTEXT Open weight
- Provider
- Alibaba
- Modality
- TEXT
- Parameters
- 32.8B
- Context window
- 128,000 tokens
- Weights
- Open
- Released
- 28 Apr 2025
Qwen3-32B is the largest dense Qwen3 (32.8B, Apache 2.0) — a predictable, single-GPU-friendly model with hybrid thinking / non-thinking modes.
Best for
- Largest Qwen3 dense model — predictable latency, no MoE routing
- Fits on a single high-end GPU or two consumer GPUs
- Hybrid thinking / non-thinking modes
How it compares — and the India angle
- Dense 32B is the simplest Qwen3 to deploy on a single Indian-startup-budget GPU.
- Apache 2.0 with zero licensing cost and full data residency.
- MoE siblings offer more capability-per-dollar at long context if you can handle routing.
Benchmarks
Representative public scores (approximate, higher is better) for relative comparison. Check the provider for the latest official results.
GPQA Diamond67
AIME (math)81
MMLU-Pro80
How to access
Free to self-host (Apache 2.0). Largest Qwen3 dense model.
Access Qwen3-32B