DeepSeek R1 — Benchmarks

Benchmark scores for DeepSeek R1, aggregated from public leaderboards, together with its rank among open models on each benchmark. See the hardware requirements page for what you need to run it.

Coding

Benchmark          Score   Rank
Aider Polyglot     56.9%   #5 / 12
LiveBench Coding   66.7    #3 / 13

Math

Benchmark          Score   Rank
AIME 2024/2025     53.3%   #6 / 22
MATH Level 5       93.0%   #2 / 23
LiveBench Math     80.7    #1 / 13

Reasoning

Benchmark             Score   Rank
GPQA Diamond          69.2%   #7 / 28
ARC-AGI               15.8%   #4 / 7
LiveBench Reasoning   83.2    #2 / 13
SimpleBench           30.9%   #6 / 10

Scores aggregated from public benchmark sources (each linked from the benchmark pages). llmrun does not run these benchmarks.