DeepSWE Preview — Benchmarks

Benchmark scores for DeepSWE Preview, aggregated from public leaderboards, along with its rank among open models and among all models. See hardware requirements for what you need to run it.

Coding

Benchmark | Score | Rank (open models) | Rank (all models)
SWE-bench Verified | 58.8% | #10 / 13 | #79 / 163

Scores aggregated from public benchmark sources (each linked from the benchmark pages). llmrun does not run these benchmarks.