Llama 3.1 Tulu 3 70B DPO — Benchmarks

Benchmark scores for Llama 3.1 Tulu 3 70B DPO, aggregated from public leaderboards, with its rank among open models on each benchmark. See hardware requirements for what you need to run it.

Math

Benchmark        Score   Rank
AIME 2024/2025   4.4%    #20 / 29
MATH Level 5     42.7%   #14 / 31

Reasoning

Benchmark      Score   Rank
GPQA Diamond   46.3%   #20 / 41

Scores aggregated from public benchmark sources (each linked from the benchmark pages). llmrun does not run these benchmarks.