Math
FrontierMath Leaderboard
FrontierMath is a benchmark of exceptionally hard, original, research-level mathematics problems created with professional mathematicians. Even the strongest model tracked here solves only about half of them (52.4%), and the best open model about a third (33.5%), making it a frontier measure of genuine mathematical ability.
Source: epoch · 3 open models ranked + 97 proprietary · Data through May 2026
Open models ranked on FrontierMath
The # column shows each model's rank among open models / its rank overall (including proprietary models).
| # | Model | Score |
|---|---|---|
| 1 / 20 | GLM 5.1 · 753.9B | 33.5% |
| 2 / 47 | GLM 5 · 753.9B | 16.4% |
| 3 / 100 | Llama 4 Scout 17B 16E Instruct · 108.6B | 0.0% |
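The `open / overall` rank notation can be unpacked mechanically. The following is a minimal sketch, not part of the source: the `Row` type, the `parse_row` helper, and the `proprietary_ahead` property are hypothetical names introduced for illustration, and the arithmetic assumes ranks are strict (no ties). Under that assumption, the number of proprietary models ahead of an open model is its overall rank minus its open rank.

```python
from typing import NamedTuple

# The three table rows above, copied verbatim.
TABLE = """\
| 1 / 20 | GLM 5.1 · 753.9B | 33.5% |
| 2 / 47 | GLM 5 · 753.9B | 16.4% |
| 3 / 100 | Llama 4 Scout 17B 16E Instruct · 108.6B | 0.0% |"""


class Row(NamedTuple):
    open_rank: int      # rank among open models
    overall_rank: int   # rank among all models, proprietary included
    model: str
    score: float        # percentage of problems solved

    @property
    def proprietary_ahead(self) -> int:
        # Assumes strict ranking: (overall_rank - 1) models are ahead in
        # total, of which (open_rank - 1) are open; the rest are proprietary.
        return self.overall_rank - self.open_rank


def parse_row(line: str) -> Row:
    """Split one markdown table row into typed fields."""
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    rank_cell, model, score_cell = cells
    open_rank, overall_rank = (int(part) for part in rank_cell.split("/"))
    return Row(open_rank, overall_rank, model, float(score_cell.rstrip("%")))


rows = [parse_row(line) for line in TABLE.splitlines()]

for row in rows:
    print(f"{row.model}: {row.proprietary_ahead} proprietary models ranked higher")
```

For the last row this gives 100 - 3 = 97, which agrees with the 97 proprietary models reported in the source line.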
FrontierMath: frequently asked questions
- What is the best open LLM on FrontierMath?
- GLM 5.1 is the top open model on FrontierMath, scoring 33.5%. Among all models tested — including proprietary ones — it ranks #20.
- Can open models match proprietary models on FrontierMath?
- Not quite: the strongest proprietary model (gpt-5.5-pro-pre-release_high) scores 52.4%, which is 18.9 percentage points ahead of the best open model (GLM 5.1) at 33.5%. The open model, however, is one you can run yourself.
Scores are aggregated from epoch. llmrun does not run this benchmark itself; see the source for methodology, or the "about benchmarks" page for what it measures.