Llama 3.1 405B Instruct — Benchmarks
Benchmark scores for Llama 3.1 405B Instruct, aggregated from public leaderboards, together with its rank among open models and among all models. See hardware requirements for what you need to run it.
Overall rank: #44 of 75 open models. Composite score: 46/100 across 5 benchmarks in 3 categories (see methodology).
Knowledge
| Benchmark | Score | Open rank | All models |
|---|---|---|---|
| MMLU | 84.5% | #6 / 76 | #13 / 136 |
Math
| Benchmark | Score | Open rank | All models |
|---|---|---|---|
| AIME 2024/2025 | 9.7% | #16 / 30 | #103 / 142 |
| MATH Level 5 | 49.8% | #14 / 32 | #64 / 108 |
Reasoning
| Benchmark | Score | Open rank | All models |
|---|---|---|---|
| GPQA Diamond | 50.9% | #16 / 42 | #106 / 170 |
| SimpleBench | 23.0% | #14 / 17 | #65 / 76 |
Scores aggregated from public benchmark sources (each linked from the benchmark pages). llmrun does not run these benchmarks.
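The open-model ranks in the tables use different field sizes (17 to 76 models), so they are hard to compare directly. As a rough illustration, the Python sketch below converts each rank into the share of the field that the model outranks. The helper name `share_outranked` and this normalization are assumptions made for the example; they are not llmrun's methodology and do not reproduce the composite score.

```python
# Open-model ranks copied from the tables: benchmark -> (rank, field size).
OPEN_RANKS = {
    "MMLU": (6, 76),
    "AIME 2024/2025": (16, 30),
    "MATH Level 5": (14, 32),
    "GPQA Diamond": (16, 42),
    "SimpleBench": (14, 17),
}


def share_outranked(rank: int, total: int) -> float:
    """Return the percentage of the field ranked below the given position.

    Hypothetical helper for illustration only; not the site's scoring method.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return 100.0 * (total - rank) / total


if __name__ == "__main__":
    for name, (rank, total) in OPEN_RANKS.items():
        pct = share_outranked(rank, total)
        print(f"{name}: #{rank} / {total} -> outranks {pct:.1f}% of open models")
```

Under this reading, the MMLU rank of #6 / 76 means the model outranks about 92.1% of the open models listed, while the SimpleBench rank of #14 / 17 corresponds to about 17.6%.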