Llama 4 Scout

Provider: Meta
Modality: Multimodal
Parameters: 109B
Context window: 10,000,000 tokens
Weights: Open
Released: 5 Apr 2025

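Llama 4 Scout is the lightweight member of the Llama 4 family: a 109B-parameter mixture-of-experts model (17B active per token) with an advertised 10M-token context window. It is designed to fit on a single high-end GPU, which makes it the most self-hostable open multimodal Llama 4 model for cost-constrained teams.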

Best for

  • Industry-leading 10M-token context window for long documents or whole codebases
  • Single-GPU self-hosting, the easiest Llama 4 model to deploy
  • Cost-efficient multimodal assistant (17B active parameters per token)

How it compares — and the India angle

  • Its 10M-token context and single-GPU footprint make it the most practical Llama 4 model for Indian SMEs and colleges with limited hardware.
  • Open-weight and free to use under the Llama 4 community licence: self-host in-region for full data control instead of sending data to a US-hosted API.
  • Independent tests found real-world long-context retrieval weaker than the 10M headline figure, so validate on your own data before relying on it.
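
One way to validate long-context retrieval on your own data is a "needle in a haystack" check: hide a known fact at different depths of a long prompt and see whether the model returns it. The sketch below is a minimal illustration, not an official evaluation. The names `build_haystack`, `retrieval_accuracy`, and `fake_generate` are hypothetical, and `generate` stands for whatever function sends a prompt to your own Llama 4 Scout deployment and returns its text reply.

```python
import random
import re
from typing import Callable, Dict, List

FILLER = "The sky was grey over the harbour that morning."


def build_haystack(filler: str, needle: str, n_sentences: int, depth: float) -> str:
    """Repeat `filler` n_sentences times and insert `needle` at relative depth 0.0-1.0."""
    sentences = [filler] * n_sentences
    sentences.insert(int(depth * n_sentences), needle)
    return " ".join(sentences)


def retrieval_accuracy(
    generate: Callable[[str], str],
    n_sentences: int,
    depths: List[float],
    seed: int = 0,
) -> Dict[float, bool]:
    """For each depth, hide a random code in the context and check the reply contains it."""
    rng = random.Random(seed)
    results = {}
    for depth in depths:
        secret = str(rng.randint(100000, 999999))
        needle = f"The access code for the archive is {secret}."
        context = build_haystack(FILLER, needle, n_sentences, depth)
        prompt = (
            context
            + "\n\nQuestion: What is the access code for the archive? "
            + "Answer with the number only."
        )
        results[depth] = secret in generate(prompt)
    return results


def fake_generate(prompt: str) -> str:
    """Stand-in for a real model call (assumption): it only 'sees' the last 2,000 characters."""
    match = re.search(r"access code for the archive is (\d+)", prompt[-2000:])
    return match.group(1) if match else "unknown"
```

Replace `fake_generate` with a call to your own endpoint, swap the filler for your real documents, and scale `n_sentences` toward the context length you actually plan to use. Depths where the result is `False` show where retrieval starts to degrade.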

Benchmarks

Representative public scores (approximate, higher is better) for relative comparison. Check the provider for the latest official results.

MMLU: 80
GPQA Diamond: 57

How to access

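Free to self-host under the Llama 4 community licence. The model is a mixture-of-experts (MoE) with 17B active and 109B total parameters. All 109B weights must be held in memory even though only 17B are active per token, so fitting on a single high-end GPU depends in practice on quantizing the weights.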
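
A rough back-of-the-envelope estimate shows why precision matters for the single-GPU claim. The sketch below counts weight memory only, using the 109B total parameter count from this page. The 80 GB GPU capacity is an assumed example, the helper names are hypothetical, and the estimate ignores the KV cache, activations, and runtime overhead, all of which grow with context length.

```python
def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Memory needed for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


def fits_on_gpu(n_params: float, bits_per_param: int, gpu_gb: float = 80.0) -> bool:
    """True if the weights alone fit in `gpu_gb` (assumed capacity, no overhead counted)."""
    return weight_memory_gb(n_params, bits_per_param) <= gpu_gb


TOTAL_PARAMS = 109e9  # total MoE parameters, all of which must be resident in memory

if __name__ == "__main__":
    for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
        size = weight_memory_gb(TOTAL_PARAMS, bits)
        verdict = "fits" if fits_on_gpu(TOTAL_PARAMS, bits) else "does not fit"
        print(f"{label}: {size:.1f} GB of weights, {verdict} in an assumed 80 GB GPU")
```

Under these assumptions only the 4-bit case fits in 80 GB, and it leaves limited headroom for the KV cache, so very long contexts will likely need more memory than the weights-only figure suggests.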
