DeepSeek v3 — Benchmarks

Benchmark scores for DeepSeek v3, aggregated from public leaderboards, along with its rank among open models. See hardware requirements for what you need to run it.

Overall rank: #33 of 73 open models · composite 50/100 across 11 benchmarks in 4 categories · methodology

Coding

Benchmark	Score	Open rank	All models
Aider Polyglot	48.4%	#9 / 18	#40 / 69
SWE-bench Lite	36.7%	#2 / 3	#34 / 80

Knowledge

Benchmark	Score	Open rank	All models
MMLU	87.2%	#1 / 76	#3 / 136
HellaSwag	88.9%	#3 / 42	#5 / 76
MMLU-Pro	75.9%	#27 / 119	#83 / 259

Math

Benchmark	Score	Open rank	All models
AIME 2024/2025	15.8%	#18 / 34	#113 / 155
MATH Level 5	64.8%	#10 / 32	#50 / 108
FrontierMath	1.7%	#10 / 12	#85 / 101

Reasoning

Benchmark	Score	Open rank	All models
GPQA Diamond	56.5%	#16 / 46	#109 / 182
BIG-Bench Hard	87.5%	#1 / 37	#2 / 50
SimpleBench	18.9%	#19 / 19	#85 / 90
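
The rank columns in the tables above use different pool sizes (for example #9 / 18 versus #1 / 76), so raw positions are hard to compare across benchmarks. One way to put them on a common scale is a simple rank percentile. The sketch below is an illustration only: the function name `rank_percentile` and the linear formula are assumptions made here, and this is not the composite methodology behind the overall score on this page.

```python
def rank_percentile(rank_text: str) -> float:
    """Convert a rank string such as '#9 / 18' into a 0-100 percentile.

    100 means first place, 0 means last place. The linear scaling is an
    assumption for illustration, not this page's composite methodology.
    """
    position, total = (
        int(part.strip().lstrip("#")) for part in rank_text.split("/")
    )
    if total == 1:
        # A single-entry pool has no spread; treat the only entry as top.
        return 100.0
    return 100.0 * (total - position) / (total - 1)


# A few "Open rank" values copied from the tables above.
open_ranks = {
    "Aider Polyglot": "#9 / 18",
    "MMLU": "#1 / 76",
    "SimpleBench": "#19 / 19",
}

percentiles = {name: rank_percentile(text) for name, text in open_ranks.items()}

for name, value in percentiles.items():
    print(f"{name}: {value:.1f}")
```

Under this scaling, a first-place rank maps to 100 and a last-place rank maps to 0 regardless of how many models were scored, which makes the spread between benchmarks such as MMLU and SimpleBench easier to see at a glance.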

Scores are aggregated from public benchmark sources (each linked from the benchmark pages). llmrun does not run these benchmarks itself.