Llama 3.3 70B Instruct — Benchmarks

Benchmark scores for Llama 3.3 70B Instruct, aggregated from public leaderboards, along with its rank among open models and among all models. See hardware requirements for what you need to run it.

Overall rank: #47 of 73 open models · composite 40.1/100 across 6 benchmarks in 3 categories · methodology

Knowledge

Benchmark	Score	Open rank	All models
MMLU-Pro	65.9%	#40 / 119	#115 / 259
MMLU	86.3%	#2 / 76	#7 / 136

Math

Benchmark	Score	Open rank	All models
AIME 2024/2025	5.1%	#24 / 34	#133 / 155
MATH Level 5	41.6%	#16 / 32	#71 / 108

Reasoning

Benchmark	Score	Open rank	All models
SimpleBench	19.9%	#18 / 19	#84 / 90
GPQA Diamond	47.4%	#24 / 46	#130 / 182
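
The rank columns in the tables above use pools of different sizes (for example, 76 open models for MMLU but only 19 for SimpleBench), so raw positions are hard to compare across benchmarks. One way to put them on a common scale is to convert each "#position / total" entry into the share of models ranked below it. The sketch below does that for the open-rank column; the helper name `share_outranked` and the dictionary layout are illustrative assumptions, and this is not the site's composite methodology, which is not described here.

```python
# Open-rank entries copied from the tables above.
OPEN_RANKS = {
    "MMLU-Pro": "#40 / 119",
    "MMLU": "#2 / 76",
    "AIME 2024/2025": "#24 / 34",
    "MATH Level 5": "#16 / 32",
    "SimpleBench": "#18 / 19",
    "GPQA Diamond": "#24 / 46",
}


def share_outranked(rank: str) -> float:
    """Return the percentage of models in the pool ranked below this entry.

    `rank` is a string such as "#40 / 119" (position / pool size).
    Hypothetical helper, not part of any published methodology.
    """
    position_text, total_text = rank.split("/")
    position = int(position_text.strip().lstrip("#"))
    total = int(total_text.strip())
    return 100.0 * (total - position) / total


if __name__ == "__main__":
    for benchmark, rank in OPEN_RANKS.items():
        print(f"{benchmark}: {rank} -> {share_outranked(rank):.1f}% of open models outranked")
```

On this scale the MMLU entry sits near the top of its pool and the SimpleBench entry near the bottom, which matches the raw ranks while making the two directly comparable.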

Scores aggregated from public benchmark sources (each linked from the benchmark pages). llmrun does not run these benchmarks.