DeepSeek R1 — Benchmarks
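Benchmark scores for DeepSeek R1, aggregated from public leaderboards, together with its rank among open models on each benchmark. Ranks are written as position / number of open models compared, so #5 / 12 means fifth of 12. See hardware requirements for what you need to run it.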
Coding
| Benchmark | Score | Rank |
|---|---|---|
| Aider Polyglot | 56.9% | #5 / 12 |
| LiveBench Coding | 66.7 | #3 / 13 |
Math
| Benchmark | Score | Rank |
|---|---|---|
| AIME 2024/2025 | 53.3% | #6 / 22 |
| MATH Level 5 | 93.0% | #2 / 23 |
| LiveBench Math | 80.7 | #1 / 13 |
Reasoning
| Benchmark | Score | Rank |
|---|---|---|
| GPQA Diamond | 69.2% | #7 / 28 |
| ARC-AGI | 15.8% | #4 / 7 |
| LiveBench Reasoning | 83.2 | #2 / 13 |
| SimpleBench | 30.9% | #6 / 10 |
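Because each benchmark compares a different number of models, the raw ranks in the tables above are not directly comparable (#4 / 7 and #7 / 28 describe very different standings). The following is a minimal sketch of one way to normalize them, assuming the Rank cells always follow the `#position / total` pattern seen here. The function names `parse_rank` and `share_outranked` are hypothetical helpers for illustration and are not part of llmrun.

```python
import re

# A few Rank cells copied from the tables above.
RANKS = {
    "Aider Polyglot": "#5 / 12",
    "LiveBench Math": "#1 / 13",
    "GPQA Diamond": "#7 / 28",
    "ARC-AGI": "#4 / 7",
}


def parse_rank(rank: str) -> tuple[int, int]:
    """Parse a rank cell such as '#5 / 12' into (position, total)."""
    match = re.fullmatch(r"#(\d+)\s*/\s*(\d+)", rank.strip())
    if match is None:
        raise ValueError(f"unrecognized rank format: {rank!r}")
    position, total = int(match.group(1)), int(match.group(2))
    if not 1 <= position <= total:
        raise ValueError(f"position out of range in {rank!r}")
    return position, total


def share_outranked(rank: str) -> float:
    """Fraction of the other compared models that rank below this one.

    1.0 means first place, 0.0 means last place. This normalization is
    an illustrative choice, not something the leaderboards define.
    """
    position, total = parse_rank(rank)
    if total == 1:
        return 1.0  # only one model listed, nothing to outrank
    return (total - position) / (total - 1)


if __name__ == "__main__":
    for benchmark, rank in RANKS.items():
        print(f"{benchmark}: {rank} -> {share_outranked(rank):.2f}")
```

Dividing by `total - 1` rather than `total` makes first place map to exactly 1.0 and last place to exactly 0.0 regardless of how many models a benchmark lists. Other normalizations are equally defensible.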
Scores are aggregated from public benchmark sources, each linked from the benchmark pages. llmrun does not run these benchmarks itself.