GLM 4 Models — Hardware Requirements
14 GLM 4 models from zai-org and the community, ranging from 9.4B to 358.3B parameters. The lowest requirement is GLM 4.6V Flash, which runs from 3.2 GB of VRAM. Every row links to full quantization tables and GPU compatibility.
All GLM 4 Models by Size
| Model | Params | Runs from (VRAM) | Context |
|---|---|---|---|
| GLM 4 9B 0414 | 9.4B | 4.4 GB | 33K |
| GLM 4.6V Flash | 10.3B | 3.2 GB | 131K |
| GLM 4.7 Flash REAP 23B A3B | 23.0B | 7.4 GB | 203K |
| GLM 4.7 Flash Heretic | 29.9B | 13.8 GB | 203K |
| GLM 4.7 Flash Heretic 1.2.0 | 29.9B | 13.8 GB | 203K |
| GLM 4.7 Flash Ultimate Irrefusable Heretic | 29.9B | 13.8 GB | 203K |
| GLM 4.7 Flash | 31.2B | 9.7 GB | 203K |
| GLM 4.6V | 107.7B | 30.1 GB | 131K |
| GLM 4.5 Air | 110.5B | 30.8 GB | 131K |
| GLM 4.5 Air Derestricted | 110.5B | 47.4 GB | 131K |
| GLM 4.6 | 356.8B | 98.7 GB | 203K |
| GLM 4.6 Derestricted v3 | 356.8B | 152.3 GB | 203K |
| GLM 4.7 | 358.3B | 99.2 GB | 203K |
| GLM 4.5 | 358.3B | 99.2 GB | 131K |
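
To check which models fit a given GPU, the table can be filtered by its "Runs from" column. The following is a minimal sketch, not part of the original page: the `MODELS` list and the `models_that_fit` helper are hypothetical names, and the numbers are copied by hand from the table above. The listed figure is the floor at the most aggressive quantization, so higher-quality quantizations or long contexts may need more memory.

```python
# (model name, parameters in billions, listed minimum VRAM in GB),
# copied by hand from the table above.
MODELS = [
    ("GLM 4 9B 0414", 9.4, 4.4),
    ("GLM 4.6V Flash", 10.3, 3.2),
    ("GLM 4.7 Flash REAP 23B A3B", 23.0, 7.4),
    ("GLM 4.7 Flash Heretic", 29.9, 13.8),
    ("GLM 4.7 Flash Heretic 1.2.0", 29.9, 13.8),
    ("GLM 4.7 Flash Ultimate Irrefusable Heretic", 29.9, 13.8),
    ("GLM 4.7 Flash", 31.2, 9.7),
    ("GLM 4.6V", 107.7, 30.1),
    ("GLM 4.5 Air", 110.5, 30.8),
    ("GLM 4.5 Air Derestricted", 110.5, 47.4),
    ("GLM 4.6", 356.8, 98.7),
    ("GLM 4.6 Derestricted v3", 356.8, 152.3),
    ("GLM 4.7", 358.3, 99.2),
    ("GLM 4.5", 358.3, 99.2),
]


def models_that_fit(vram_gb, models=MODELS):
    """Return the names of models whose listed minimum VRAM is within budget."""
    return [name for name, _params, min_vram in models if min_vram <= vram_gb]


if __name__ == "__main__":
    # Example: list every model that fits a 16 GB card at some quantization.
    for name in models_that_fit(16):
        print(name)
```

Keeping the data as plain tuples makes it easy to swap in a different budget, for example `models_that_fit(48)` for a workstation card.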
How GLM 4 Compares — Benchmark Rating
GLM 4.6V is the highest-rated GLM 4 model, with an overall benchmark rating of 54.7/100, which places it #28 among 75 open models. The top proprietary model, GPT 5.5, scores 88.8.
| Model | Rating (out of 100) |
|---|---|
| GPT 5.5 (proprietary) | 88.8 |
| Claude Opus 4.7 (proprietary) | 87.6 |
| Claude Fable 5 (proprietary) | 86.6 |
| GPT 5.4 (proprietary) | 86.6 |
| Claude Opus 4.8 (proprietary) | 84.4 |
| DeepSeek V4 Pro | 77.5 |
| Qwen3.6 27B | 74.0 |
| StableBeluga2 | 69.1 |
| MiniMax M2.7 | 68.4 |
| GLM 4.6V | 54.7 |
| GLM 4.6 | 50.8 |
| GLM 4.7 | 50.1 |
Frequently Asked Questions
- How much VRAM do I need to run a GLM 4 model?
- The GLM 4 model with the lowest requirement, GLM 4.6V Flash, runs from 3.2 GB of VRAM at an aggressive quantization. Larger models need more, reaching 99.2 GB for GLM 4.7 and 152.3 GB for GLM 4.6 Derestricted v3. See the table above for every model.
- Which GLM 4 models can I run on a 16 GB GPU?
- 7 of 14 GLM 4 models fit in 16 GB of VRAM at some quantization, including GLM 4.7 Flash, GLM 4.6V Flash, and GLM 4.7 Flash REAP 23B A3B.
- What is the most popular GLM 4 model to run locally?
- GLM 4.7 Flash is the most downloaded GLM 4 model in local-friendly quantized formats. It runs from 9.7 GB of VRAM.
- How do GLM 4 models score on benchmarks?
- GLM 4.6V leads the family with an overall benchmark rating of 54.7/100, ranking #28 among 75 open models, while the top proprietary model, GPT 5.5, scores 88.8. See the comparison chart above for the full standings.
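
To put "aggressive quantization" in the VRAM answer above into numbers, the minimum footprint can be converted into an approximate average number of bits per parameter. This is a rough sketch under stated assumptions, not a method described on the page: `bits_per_weight` is a hypothetical helper, 1 GB is taken as 10^9 bytes, and the whole listed footprint is treated as model weights, which overstates the true figure if the number also covers runtime overhead.

```python
def bits_per_weight(size_gb, params_billions):
    """Approximate average bits stored per parameter.

    Assumes 1 GB = 10**9 bytes and that the entire footprint is weights:
    (size_gb * 1e9 bytes * 8 bits) / (params_billions * 1e9 params).
    """
    return size_gb * 8 / params_billions


if __name__ == "__main__":
    # Figures taken from the table: (minimum VRAM in GB, parameters in billions).
    examples = {
        "GLM 4.6V Flash": (3.2, 10.3),
        "GLM 4 9B 0414": (4.4, 9.4),
    }
    for name, (size_gb, params_b) in examples.items():
        print(f"{name}: about {bits_per_weight(size_gb, params_b):.1f} bits per weight")
```

For comparison, unquantized 16-bit weights use 16 bits per parameter, so a floor of roughly 2.5 bits for GLM 4.6V Flash corresponds to heavy compression.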