Phi 4 Reasoning — Benchmarks

Benchmark scores for Phi 4 Reasoning, aggregated from public leaderboards, with its rank among open models and among all models. See hardware requirements for what you need to run it.

Knowledge

Benchmark	Score	Rank (open models)	Rank (all models)
MMLU-Pro	74.3%	#22 / 97	#89 / 260

Scores are aggregated from public benchmark sources (each linked from the benchmark pages). llmrun does not run these benchmarks itself.