Llama 3.3 70B Instruct — Benchmarks

Benchmark scores for Llama 3.3 70B Instruct, aggregated from public leaderboards, along with its rank among open models on each benchmark. See hardware requirements for what you need to run it.

Coding

Benchmark             Score   Rank
LiveBench Coding      36.6    #8 / 13

Knowledge

Benchmark             Score   Rank
MMLU                  86.3%   #1 / 36

Math

Benchmark             Score   Rank
AIME 2024/2025        5.1%    #15 / 22
MATH Level 5          41.6%   #12 / 23
LiveBench Math        42.2    #9 / 13

Reasoning

Benchmark             Score   Rank
GPQA Diamond          47.4%   #15 / 28
LiveBench Reasoning   50.8    #7 / 13
SimpleBench           19.9%   #10 / 10

Scores aggregated from public benchmark sources (each linked from the benchmark pages). llmrun does not run these benchmarks.