Granite Models — Hardware Requirements

Nine Granite models from IBM and the community, ranging from 3.4B to 32.2B parameters; the smallest runs in as little as 1.4 GB of VRAM. Every row links to full quantization tables and GPU compatibility.

All Granite Models by Size

Model                                         Params  Context
Granite 4.1 3B                                3.4B    131K
Granite 4.0 Micro                             3.4B    131K
Granite Switch 4.1 3B Preview                 4.1B    131K
Granite 3.2 8B Instruct                       8.2B    131K
Granite Guardian 3.2 8B Factuality Detection  8.2B    131K
Granite 3.3 8B Instruct                       8.2B    131K
Granite Switch 4.1 8B Preview                 9.6B    131K
Granite 4.1 30B                               28.9B   131K
Granite Switch 4.1 30B Preview                32.2B   131K

Frequently Asked Questions

How much VRAM do I need to run a Granite model?
The smallest Granite model, Granite 4.0 Micro, runs in as little as 1.4 GB of VRAM at an aggressive quantization. Larger family members need proportionally more; the table above lists every model with its parameter count.
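As a rough back-of-the-envelope check, the memory taken by the weights alone scales with parameter count times bits per weight. The sketch below assumes only that relation; the function name `estimate_weight_gb` and the bit-widths shown are illustrative, and it ignores KV cache, activations, and runtime overhead, so it is not expected to reproduce this page's figures exactly.

```python
def estimate_weight_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes).

    Assumption: every parameter is stored at the same bit-width, which real
    quantization formats only approximate.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


if __name__ == "__main__":
    # 3.4B is the parameter count listed for Granite 4.0 Micro in the table.
    for bits in (16, 8, 4, 3):
        print(f"{bits:>2} bits/weight: {estimate_weight_gb(3.4, bits):.2f} GB")
```

Real usage is higher than this weights-only figure, and it grows with context length, so treat the result as a lower bound rather than a requirement.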
Which Granite models can I run on a 16 GB GPU?
Seven of the nine Granite models fit in 16 GB of VRAM at some quantization, including Granite 4.1 3B, Granite 4.0 Micro, and Granite 3.3 8B Instruct.
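To illustrate how a fit check of this kind might work, the sketch below filters the table's parameter counts against a VRAM budget. The 4 bits per weight and the 20% overhead factor are assumptions chosen for illustration, not the criteria this page uses, and `models_that_fit` is a hypothetical helper; other settings will give a different list.

```python
# Parameter counts (in billions) copied from the table on this page.
GRANITE_PARAMS_B = {
    "Granite 4.1 3B": 3.4,
    "Granite 4.0 Micro": 3.4,
    "Granite Switch 4.1 3B Preview": 4.1,
    "Granite 3.2 8B Instruct": 8.2,
    "Granite Guardian 3.2 8B Factuality Detection": 8.2,
    "Granite 3.3 8B Instruct": 8.2,
    "Granite Switch 4.1 8B Preview": 9.6,
    "Granite 4.1 30B": 28.9,
    "Granite Switch 4.1 30B Preview": 32.2,
}


def models_that_fit(budget_gb, bits_per_weight=4.0, overhead=1.2):
    """Return the models whose estimated footprint is within budget_gb.

    Footprint (GB) = params (billions) * bits / 8, scaled by an assumed
    overhead factor standing in for KV cache and runtime buffers.
    """
    fits = []
    for name, params_b in GRANITE_PARAMS_B.items():
        needed_gb = params_b * bits_per_weight / 8 * overhead
        if needed_gb <= budget_gb:
            fits.append(name)
    return fits


if __name__ == "__main__":
    for name in models_that_fit(16):
        print(name)
```

Lowering `bits_per_weight` or `overhead` admits larger models, which is why a fit claim only means something alongside the quantization it was measured at.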
What is the most popular Granite model to run locally?
Granite 4.1 3B is the most downloaded Granite model in local-friendly quantized formats. It runs in as little as 1.6 GB of VRAM.