Llama 3.1 Tulu 3 70B DPO — Benchmarks
Benchmark scores for Llama 3.1 Tulu 3 70B DPO, aggregated from public leaderboards, along with its rank among open models on each benchmark. See hardware requirements for what you need to run it.
Math
| Benchmark | Score | Rank |
|---|---|---|
| AIME 2024/2025 | 4.4% | #20 / 29 |
| MATH Level 5 | 42.7% | #14 / 31 |
Reasoning
| Benchmark | Score | Rank |
|---|---|---|
| GPQA Diamond | 46.3% | #20 / 41 |
Scores are aggregated from public benchmark sources (each linked from the benchmark pages). llmrun does not run these benchmarks.