Llama 3.3 70B Instruct — Benchmarks
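Benchmark scores for Llama 3.3 70B Instruct, aggregated from public leaderboards, together with its rank among open models on each benchmark. A rank such as #8 / 13 means 8th of the 13 models ranked on that benchmark. See hardware requirements for what you need to run it.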
Coding
| Benchmark | Score | Rank |
|---|---|---|
| LiveBench Coding | 36.6 | #8 / 13 |
Knowledge
| Benchmark | Score | Rank |
|---|---|---|
| MMLU | 86.3% | #1 / 36 |
Math
| Benchmark | Score | Rank |
|---|---|---|
| AIME 2024/2025 | 5.1% | #15 / 22 |
| MATH Level 5 | 41.6% | #12 / 23 |
| LiveBench Math | 42.2 | #9 / 13 |
Reasoning
| Benchmark | Score | Rank |
|---|---|---|
| GPQA Diamond | 47.4% | #15 / 28 |
| LiveBench Reasoning | 50.8 | #7 / 13 |
| SimpleBench | 19.9% | #10 / 10 |
Scores are aggregated from public benchmark sources (each linked from the benchmark pages). llmrun does not run these benchmarks itself.
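Because each benchmark ranks a different number of models (10 for SimpleBench, 36 for MMLU), raw ranks are hard to compare across tables. The following is a minimal sketch of one way to normalize them, assuming a rank string reads as "position / number of ranked models". The helper names `parse_rank` and `fraction_outranked` are hypothetical and not part of any llmrun tooling; the rank strings are copied from the tables above.

```python
import re

# Rank strings copied from the tables above, keyed by benchmark name.
RANKS = {
    "LiveBench Coding": "#8 / 13",
    "MMLU": "#1 / 36",
    "AIME 2024/2025": "#15 / 22",
    "MATH Level 5": "#12 / 23",
    "LiveBench Math": "#9 / 13",
    "GPQA Diamond": "#15 / 28",
    "LiveBench Reasoning": "#7 / 13",
    "SimpleBench": "#10 / 10",
}


def parse_rank(text):
    """Split a string like '#8 / 13' into (position, total)."""
    match = re.fullmatch(r"#(\d+)\s*/\s*(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized rank format: {text!r}")
    position, total = int(match.group(1)), int(match.group(2))
    if not 1 <= position <= total:
        raise ValueError(f"position out of range: {text!r}")
    return position, total


def fraction_outranked(position, total):
    """Share of the other ranked models placed below this one.

    1.0 means first place, 0.0 means last place. This is one possible
    normalization (an assumption of this sketch), not a published metric.
    """
    if total == 1:
        return 1.0
    return (total - position) / (total - 1)


if __name__ == "__main__":
    for name, rank in RANKS.items():
        position, total = parse_rank(rank)
        share = fraction_outranked(position, total)
        print(f"{name}: {rank} -> outranks {share:.0%} of the other models")
```

The normalization only accounts for position, not for how close the underlying scores are, so two adjacent ranks may reflect a very small score gap.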