Kimi K2.5 — Benchmarks

Benchmark scores for Kimi K2.5, aggregated from public leaderboards, along with its rank among open models on each benchmark. See hardware requirements for what you need to run it.

Coding

Benchmark             Score   Rank
SWE-bench Verified    73.8%   #3 / 4
Terminal-Bench        43.2%   #3 / 15
LiveBench Coding      77.9    #2 / 23

Knowledge

Benchmark             Score   Rank
Humanity's Last Exam  24.4%   #1 / 4
SimpleQA              33.9%   #4 / 7

Math

Benchmark             Score   Rank
AIME 2024/2025        92.2%   #3 / 29
FrontierMath          27.9%   #3 / 12
LiveBench Math        84.9    #4 / 23

Reasoning

Benchmark             Score   Rank
GPQA Diamond          87.6%   #3 / 41
ARC-AGI               65.3%   #1 / 10
SimpleBench           46.8%   #5 / 15
LiveBench Reasoning   76.0    #4 / 23

Scores aggregated from public benchmark sources (each linked from the benchmark pages). llmrun does not run these benchmarks.