Qwen Models — Hardware Requirements
23 Qwen models from deepcogito and the community, ranging from the smallest, which runs in 0.8 GB of VRAM, to the largest at 72.3B parameters. Every row links to full quantization tables and GPU compatibility.
All Qwen Models by Size
| Model | Params | Runs from (VRAM) | Context (tokens) |
|---|---|---|---|
| SpatialLM1.1 Qwen 0.5B | 604M | 1.5 GB | 33K |
| Qwen1.5 0.5B Chat | 620M | 0.8 GB | 33K |
| Nemotron Research Reasoning Qwen 1.5B | 1.8B | 1.1 GB | 131K |
| Qwen 1 8B | 1.8B | 0.9 GB | 8K |
| Qwen1.5 1.8B | 1.8B | 1.5 GB | 33K |
| Qwen1.5 MoE A2.7B Chat | 2.7B | 1.9 GB | 33K |
| Qwen35 4B Soyuz Merged | 4B | 8.5 GB | 262K |
| CyberSecQwen 4B | 4.0B | 2.2 GB | 262K |
| CodeQwen1.5 7B | 7.3B | 3.5 GB | 66K |
| Qwen1.5 7B Chat | 7.7B | 4.7 GB | 33K |
| Qwen1.5 7B | 7.7B | 4.7 GB | 33K |
| Qwen 7B | 7.7B | 3.6 GB | 33K |
| Qwen Marketing | 8.2B | 18.0 GB | n/a |
| Qwen1.5 14B Chat | 14.2B | 8.0 GB | 33K |
| Qwen 14B Chat | 14.2B | 6.6 GB | 8K |
| Qwen1.5 14B | 14.2B | 8.0 GB | 33K |
| Qwen 14B | 14.2B | 6.6 GB | 8K |
| Qwen1.5 MoE A2.7B | 14.3B | 6.8 GB | 8K |
| Cogito V1 Preview Qwen 32B | 32B | 10.4 GB | 131K |
| XiYanSQL QwenCoder 32B 2504 | 32B | 14.4 GB | 33K |
| Qwen1.5 32B Chat | 32.5B | 14.3 GB | 33K |
| Qwen1.5 32B | 32.5B | 14.3 GB | 33K |
| Qwen1.5 72B Chat | 72.3B | 35.5 GB | 33K |
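
The "Runs from" figures lend themselves to a simple budget check. The following is a minimal Python sketch, not part of the original page, that filters a handful of rows copied from the table by a VRAM budget. The names `Model`, `MODELS`, and `fits_in` are hypothetical, and the check compares against the minimum footprint only, so it says nothing about context length or runtime overhead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    params_b: float     # parameter count, in billions
    min_vram_gb: float  # "Runs from" value from the table, in GB


# A few rows copied from the table above (illustrative subset, not the full list).
MODELS = [
    Model("Qwen1.5 0.5B Chat", 0.62, 0.8),
    Model("Qwen1.5 7B Chat", 7.7, 4.7),
    Model("Qwen Marketing", 8.2, 18.0),
    Model("Qwen1.5 14B Chat", 14.2, 8.0),
    Model("Cogito V1 Preview Qwen 32B", 32.0, 10.4),
    Model("Qwen1.5 72B Chat", 72.3, 35.5),
]


def fits_in(models, vram_gb):
    """Return the models whose minimum footprint is within the VRAM budget."""
    return [m for m in models if m.min_vram_gb <= vram_gb]


if __name__ == "__main__":
    for m in fits_in(MODELS, 16.0):
        print(f"{m.name}: runs from {m.min_vram_gb} GB")
```

Fitting at the smallest quantization does not guarantee usable quality or room for a long context, so treat the result as a lower bound on what a GPU can load.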
How Qwen Compares — Benchmark Rating
Qwen 14B is the highest-rated of the Qwen models listed on this page, with an overall benchmark rating of 56.6/100, which places it #20 among 75 open models. The top proprietary model, GPT 5.5, scores 88.8.
| Model | Overall rating (/100) |
|---|---|
| GPT 5.5 (proprietary) | 88.8 |
| Claude Opus 4.7 (proprietary) | 87.6 |
| Claude Fable 5 (proprietary) | 86.6 |
| GPT 5.4 (proprietary) | 86.6 |
| Claude Opus 4.8 (proprietary) | 84.4 |
| DeepSeek V4 Pro | 77.5 |
| Qwen3.6 27B | 74.0 |
| StableBeluga2 | 69.1 |
| MiniMax M2.7 | 68.4 |
| Qwen 14B | 56.6 |
| Qwen 14B Chat | 56.5 |
| Qwen 7B | 41.1 |
| Qwen 1 8B | 17.9 |
Frequently Asked Questions
- How much VRAM do I need to run a Qwen model?
- The smallest Qwen model, Qwen1.5 0.5B Chat, runs from 0.8 GB of VRAM at an aggressive quantization. Larger family members need more, up to 35.5 GB for Qwen1.5 72B Chat; see the table above for every model.
- Which Qwen models can I run on a 16 GB GPU?
- 21 of 23 Qwen models fit in 16 GB of VRAM at some quantization, including Cogito V1 Preview Qwen 32B, Qwen1.5 0.5B Chat, and Qwen1.5 14B Chat. The two exceptions are Qwen Marketing (from 18.0 GB) and Qwen1.5 72B Chat (from 35.5 GB).
- What is the most popular Qwen model to run locally?
- Cogito V1 Preview Qwen 32B is the most downloaded Qwen model in local-friendly quantized formats. It runs from 10.4 GB of VRAM.
- How do Qwen models score on benchmarks?
- Qwen 14B leads the family with an overall benchmark rating of 56.6/100, ranking #20 among 75 open models, while the top proprietary model, GPT 5.5, scores 88.8. See the comparison chart above for the full standings.
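
To get a feel for what "an aggressive quantization" means in the VRAM answer above, the footprint can be turned into an approximate number of bits per weight. The sketch below is a back-of-the-envelope estimate, not the site's method. It assumes that 1 GB is 10^9 bytes, that the "runs from" figure is dominated by model weights, and that KV cache and runtime overhead can be ignored. The function name `implied_bits_per_weight` is hypothetical.

```python
def implied_bits_per_weight(size_gb: float, params_billions: float) -> float:
    """Estimate average bits per weight from a footprint and a parameter count.

    Assumes 1 GB = 1e9 bytes and that weights dominate the footprint,
    so bits = (size_gb * 1e9 * 8) / (params_billions * 1e9).
    """
    if params_billions <= 0:
        raise ValueError("params_billions must be positive")
    return size_gb * 8.0 / params_billions


if __name__ == "__main__":
    # (footprint in GB, parameters in billions), values quoted on this page.
    examples = {
        "Qwen1.5 72B Chat": (35.5, 72.3),
        "Qwen1.5 14B Chat": (8.0, 14.2),
        "Cogito V1 Preview Qwen 32B": (10.4, 32.0),
    }
    for name, (size_gb, params_b) in examples.items():
        bits = implied_bits_per_weight(size_gb, params_b)
        print(f"{name}: about {bits:.1f} bits per weight")
```

Under these assumptions an unquantized 16-bit model would come out at 16 bits per weight, so values in the low single digits indicate heavy compression relative to the original weights.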