Qwen3-32B

AlibabaTEXT Open weight
Provider
Alibaba
Modality
TEXT
Parameters
32.8B
Context window
128,000 tokens
Weights
Open
Released
28 Apr 2025
Compare Qwen3-32B with others

Qwen3-32B is the largest dense Qwen3 (32.8B, Apache 2.0) — a predictable, single-GPU-friendly model with hybrid thinking / non-thinking modes.

Best for

  • Largest Qwen3 dense model — predictable latency, no MoE routing
  • Fits on a single high-end GPU or two consumer GPUs
  • Hybrid thinking / non-thinking modes

How it compares — and the India angle

  • Dense 32B is the simplest Qwen3 to deploy on a single Indian-startup-budget GPU.
  • Apache 2.0 with zero licensing cost and full data residency.
  • MoE siblings offer more capability-per-dollar at long context if you can handle routing.

Benchmarks

Representative public scores (approximate, higher is better) for relative comparison. Check the provider for the latest official results.

GPQA Diamond67
AIME (math)81
MMLU-Pro80

How to access

Free to self-host (Apache 2.0). Largest Qwen3 dense model.

Access Qwen3-32B
Share: