GPT OSS 120B — Benchmarks
Benchmark scores for GPT OSS 120B, aggregated from public leaderboards, along with its rank among open models on each benchmark. See hardware requirements for what you need to run it.
Coding
| Benchmark | Score | Rank |
|---|---|---|
| Terminal-Bench | 14.2% | #10 / 11 |
Reasoning
| Benchmark | Score | Rank |
|---|---|---|
| SimpleBench | 22.1% | #9 / 10 |
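
For readers who want to work with these rows programmatically, here is a minimal sketch of parsing the Score and Rank cells. It assumes that a rank such as `#10 / 11` means position 10 out of 11 ranked open models; the page does not spell that out. The names `ROWS`, `parse_rank`, and `parse_score` are hypothetical helpers, not part of any llmrun tooling.

```python
import re

# Rows copied from the tables on this page: (benchmark, score, rank).
ROWS = [
    ("Terminal-Bench", "14.2%", "#10 / 11"),
    ("SimpleBench", "22.1%", "#9 / 10"),
]


def parse_rank(rank: str) -> tuple[int, int]:
    """Split a '#position / total' string into (position, total).

    Assumption: the second number is the count of ranked models.
    """
    match = re.fullmatch(r"#(\d+)\s*/\s*(\d+)", rank.strip())
    if match is None:
        raise ValueError(f"unrecognized rank format: {rank!r}")
    return int(match.group(1)), int(match.group(2))


def parse_score(score: str) -> float:
    """Convert a percentage string such as '14.2%' to a float."""
    return float(score.strip().rstrip("%"))


if __name__ == "__main__":
    for name, score, rank in ROWS:
        position, total = parse_rank(rank)
        print(f"{name}: {parse_score(score):.1f}%, rank {position} of {total}")
```

Keeping the position and the total as separate integers makes it straightforward to compare ranks across benchmarks that cover different numbers of models.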
Scores are aggregated from public benchmark sources, each linked from the benchmark pages. llmrun does not run these benchmarks itself.