Granite Models — Hardware Requirements
9 Granite models from IBM and the community, ranging from 3.4B to 32.2B parameters. The smallest runs in 1.4 GB of VRAM. Every row links to full quantization tables and GPU compatibility.
All Granite Models by Size
| Model | Params | Runs from | Context | Publisher | Quant downloads |
|---|---|---|---|---|---|
| Granite 4.1 3B | 3.4B | 1.6 GB | 131K | ||
| Granite 4.0 Micro | 3.4B | 1.4 GB | 131K | ||
| Granite Switch 4.1 3B Preview | 4.1B | 8.8 GB | 131K | ||
| Granite 3.2 8B Instruct | 8.2B | 4.1 GB | 131K | ||
| Granite Guardian 3.2 8B Factuality Detection | 8.2B | 4.1 GB | 131K | ||
| Granite 3.3 8B Instruct | 8.2B | 2.9 GB | 131K | ||
| Granite Switch 4.1 8B Preview | 9.6B | 19.8 GB | 131K | ||
| Granite 4.1 30B | 28.9B | 13.1 GB | 131K | ||
| Granite Switch 4.1 30B Preview | 32.2B | 65.3 GB | 131K | ||
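
To check which models fit a given GPU, the "Runs from" column can be filtered against a VRAM budget. The following is a minimal sketch, not part of the original listing: the dictionary name `MIN_VRAM_GB` and the helper `models_that_fit` are hypothetical, and the figures are copied by hand from the table above. The table does not state whether "Runs from" includes context or KV-cache overhead, so treat each value as a lower bound.

```python
# "Runs from" values (GB of VRAM) copied from the table above.
# MIN_VRAM_GB and models_that_fit are illustrative names, not part of any library.
MIN_VRAM_GB = {
    "Granite 4.1 3B": 1.6,
    "Granite 4.0 Micro": 1.4,
    "Granite Switch 4.1 3B Preview": 8.8,
    "Granite 3.2 8B Instruct": 4.1,
    "Granite Guardian 3.2 8B Factuality Detection": 4.1,
    "Granite 3.3 8B Instruct": 2.9,
    "Granite Switch 4.1 8B Preview": 19.8,
    "Granite 4.1 30B": 13.1,
    "Granite Switch 4.1 30B Preview": 65.3,
}


def models_that_fit(vram_gb, table=MIN_VRAM_GB):
    """Return the models whose minimum footprint is <= vram_gb, smallest first."""
    fitting = [name for name, need in table.items() if need <= vram_gb]
    return sorted(fitting, key=table.get)


if __name__ == "__main__":
    budget = 16  # GB of VRAM on the target GPU
    fits = models_that_fit(budget)
    print(f"{len(fits)} of {len(MIN_VRAM_GB)} models fit in {budget} GB:")
    for name in fits:
        print(f"  {name}: from {MIN_VRAM_GB[name]} GB")
```

Changing `budget` to 8 or 24 gives the same kind of answer for other common GPU sizes. Models near the limit may still need a more aggressive quantization or a shorter context than the advertised 131K.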
Frequently Asked Questions
- How much VRAM do I need to run a Granite model?
- The smallest Granite model, Granite 4.0 Micro, runs from 1.4 GB of VRAM at an aggressive quantization. Larger models generally need more, up to 65.3 GB for Granite Switch 4.1 30B Preview. See the table above for the minimum for every model.
- Which Granite models can I run on a 16 GB GPU?
- 7 of 9 Granite models fit in 16 GB of VRAM at some quantization, including Granite 4.1 3B, Granite 4.0 Micro, and Granite 3.3 8B Instruct.
- What is the most popular Granite model to run locally?
- Granite 4.1 3B is the most downloaded Granite model in local-friendly quantized formats. It runs from 1.6 GB of VRAM.