DeepSeek v3 — Benchmarks

Benchmark scores for DeepSeek v3, aggregated from public leaderboards, along with its rank among open models on each benchmark. See hardware requirements for what you need to run it.

Coding

Benchmark         Score    Rank
Aider Polyglot    48.4%    #9 / 17

Knowledge

Benchmark    Score    Rank
MMLU         87.2%    #1 / 74
HellaSwag    88.9%    #3 / 41

Math

Benchmark         Score    Rank
AIME 2024/2025    15.8%    #14 / 29
MATH Level 5      64.8%    #10 / 31
FrontierMath      1.7%     #10 / 12

Reasoning

Benchmark         Score    Rank
GPQA Diamond      56.5%    #12 / 41
BIG-Bench Hard    87.5%    #1 / 36
SimpleBench       18.9%    #15 / 15
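
The ranks above are taken over fields of very different sizes (12 to 74 models), so raw positions are hard to compare across benchmarks. A minimal sketch of one way to normalize them follows. It assumes a rank such as "#9 / 17" means position 9 out of 17 ranked models. The helper names parse_rank and fraction_outranked are hypothetical and are not part of any llmrun tooling.

```python
def parse_rank(rank: str) -> tuple[int, int]:
    """Split a rank string such as '#9 / 17' into (position, field_size)."""
    position, field_size = rank.lstrip("#").split("/")
    return int(position), int(field_size)


def fraction_outranked(rank: str) -> float:
    """Share of the other ranked models that sit below this one (1.0 = first, 0.0 = last)."""
    position, field_size = parse_rank(rank)
    if field_size <= 1:
        return 0.0  # a field of one model has nobody to outrank
    return (field_size - position) / (field_size - 1)


# Ranks copied from the tables above.
RANKS = {
    "Aider Polyglot": "#9 / 17",
    "MMLU": "#1 / 74",
    "HellaSwag": "#3 / 41",
    "AIME 2024/2025": "#14 / 29",
    "MATH Level 5": "#10 / 31",
    "FrontierMath": "#10 / 12",
    "GPQA Diamond": "#12 / 41",
    "BIG-Bench Hard": "#1 / 36",
    "SimpleBench": "#15 / 15",
}

# List benchmarks from strongest to weakest relative standing.
for name, rank in sorted(RANKS.items(), key=lambda kv: fraction_outranked(kv[1]), reverse=True):
    print(f"{name:<16} {rank:<10} {fraction_outranked(rank):.2f}")
```

This is only one possible normalization. It ignores score gaps between neighboring models, so two models with nearly identical scores can still land far apart.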

Scores aggregated from public benchmark sources (each linked from the benchmark pages). llmrun does not run these benchmarks.