DeepSeek v3 — Benchmarks
Benchmark scores for DeepSeek v3, aggregated from public leaderboards, with its rank among open models on each benchmark. See hardware requirements for what you need to run it.
Coding
| Benchmark | Score | Rank |
|---|---|---|
| Aider Polyglot | 48.4% | #9 / 17 |
Knowledge
Math
| Benchmark | Score | Rank |
|---|---|---|
| AIME 2024/2025 | 15.8% | #14 / 29 |
| MATH Level 5 | 64.8% | #10 / 31 |
| FrontierMath | 1.7% | #10 / 12 |
Reasoning
| Benchmark | Score | Rank |
|---|---|---|
| GPQA Diamond | 56.5% | #12 / 41 |
| BIG-Bench Hard | 87.5% | #1 / 36 |
| SimpleBench | 18.9% | #15 / 15 |
Scores are aggregated from public benchmark sources (each linked from the benchmark pages). llmrun does not run these benchmarks itself.