Mistral Small 4
Mistral AI | MULTIMODAL | Open weight
- Provider: Mistral AI
- Modality: MULTIMODAL
- Parameters: 119B
- Context window: 262,144 tokens
- Weights: Open
- Released: 16 Mar 2026
Mistral Small 4 (March 2026) is the value champion: an Apache 2.0 mixture-of-experts (MoE) model with 119B total parameters and 6.5B active, which runs on one consumer GPU yet handles multimodal input, reasoning and coding.
Best for
- Cost-efficient production workhorse / default model
- Unifies instruct + reasoning + multimodal + agentic coding
- ~40% faster with ~3x the throughput of Small 3 (only 6.5B parameters active)
- High-volume, latency-sensitive, on-prem deployments
How it compares — and the India angle
- Strongest pick for Indian developers: Apache 2.0 + MoE efficiency makes self-hosting genuinely cheap.
- Self-hostable on a single RTX 4090-class GPU, avoiding USD API bills entirely.
- At ~$0.10 per 1M input tokens and ~$0.30 per 1M output tokens, the managed API is firmly budget-tier even if you don't self-host.
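The self-hosting point above can be sanity-checked with back-of-envelope arithmetic. The sketch below counts weight memory only (no KV cache or activations). The parameter counts come from this page. The precisions and the 24 GB VRAM figure for an RTX 4090 are assumptions for illustration, not a statement of how the model is actually packaged.

```python
# Rough weight-memory estimate for a 119B-total / 6.5B-active MoE model.
# Parameter counts are taken from the page; the precisions and the 24 GB
# VRAM budget (RTX 4090) are illustrative assumptions.

TOTAL_PARAMS = 119e9    # all experts must be stored somewhere
ACTIVE_PARAMS = 6.5e9   # parameters used per token
GPU_VRAM_GB = 24        # assumed single-GPU memory budget

# Assumed storage cost per parameter at common precisions.
BYTES_PER_PARAM = {"fp16/bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_gb(params: float, bytes_per_param: float) -> float:
    """Weight storage in GB (10^9 bytes), ignoring runtime overhead."""
    return params * bytes_per_param / 1e9


def footprint_table():
    """Return (precision, total_gb, active_gb, fits_in_vram) rows."""
    rows = []
    for name, nbytes in BYTES_PER_PARAM.items():
        total = weight_gb(TOTAL_PARAMS, nbytes)
        active = weight_gb(ACTIVE_PARAMS, nbytes)
        rows.append((name, total, active, total <= GPU_VRAM_GB))
    return rows


if __name__ == "__main__":
    print(f"active fraction: {ACTIVE_PARAMS / TOTAL_PARAMS:.1%}")
    for name, total, active, fits in footprint_table():
        print(f"{name:>9}: total {total:6.1f} GB, "
              f"active {active:5.2f} GB, fits in VRAM: {fits}")
```

Under these assumptions the full weights come to roughly 59.5 GB even at 4 bits per parameter, which is more than 24 GB. A single-card setup would therefore presumably rely on keeping inactive experts in system RAM or on heavier quantization. The page does not say which configuration it has in mind.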
Benchmarks
Representative public scores (approximate, higher is better) for relative comparison. Check the provider for the latest official results.
GPQA Diamond: 71
How to access
API: ~$0.10 per 1M input tokens, ~$0.30 per 1M output tokens. Weights: Apache 2.0; MoE with 6.5B active / 119B total parameters.
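To turn the per-token prices into a monthly figure, a small calculator is enough. This is a minimal sketch that uses the approximate rates quoted above. The workload (request count and tokens per request) is a hypothetical example, and the function name is illustrative, not part of any Mistral SDK.

```python
# Estimate managed-API spend from the approximate list prices quoted above.
# The workload numbers below are hypothetical, chosen only for illustration.

INPUT_USD_PER_M = 0.10    # ~USD per 1M input tokens (from the page)
OUTPUT_USD_PER_M = 0.30   # ~USD per 1M output tokens (from the page)


def api_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for the given token counts at the quoted rates."""
    return (input_tokens / 1_000_000) * INPUT_USD_PER_M \
        + (output_tokens / 1_000_000) * OUTPUT_USD_PER_M


if __name__ == "__main__":
    # Hypothetical workload: 1M requests per month,
    # 1,500 input tokens and 300 output tokens per request.
    requests = 1_000_000
    cost = api_cost_usd(requests * 1_500, requests * 300)
    print(f"estimated monthly cost: ${cost:.2f}")
```

For that hypothetical workload the estimate is about $240 per month (roughly $150 for input and $90 for output), so output tokens are a large share of the bill despite being a fifth of the volume.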