How much VRAM do I need to run a Gemma 4 model?

The smallest Gemma 4 model, Gemma 4 E2B IT Qat Mobile Transformers, runs from 1.4 GB of VRAM at an aggressive quantization. Larger models need more, up to 80.9 GB for ExtGemma4 40 5B; see the table below for every model.
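
As a rough cross-check on the table's figures, the memory taken by model weights can be approximated as parameter count times bits per weight, divided by 8. The sketch below is a rule of thumb under stated assumptions, not the method used to produce the table: the function name `estimate_weight_gb` is hypothetical, 1 GB is taken as 1e9 bytes, and KV cache, activations, and runtime overhead are ignored, so real usage will be higher.

```python
def estimate_weight_gb(params_billions: float, bits_per_weight: float) -> float:
    """Weights-only estimate in GB (assumes 1 GB = 1e9 bytes).

    Ignores KV cache, activations, and runtime overhead.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Compare common precisions for a 12.0B-parameter model.
for bits in (16, 8, 4):
    print(f"12.0B at {bits}-bit: ~{estimate_weight_gb(12.0, bits):.1f} GB")
```

At 4 bits per weight, a 12.0B model comes out at about 6.0 GB of weights, which is close to the 6.1 GB listed for several 12.0B entries in the table.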

Which Gemma 4 models can I run on a 16 GB GPU?

41 of the 47 Gemma 4 models listed fit in 16 GB of VRAM at some quantization, including Gemma 4 26B A4B IT (from 8.0 GB), Gemma 4 31B IT (from 10.6 GB), and Gemma 4 12B IT (from 4.8 GB).
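
The same check can be run for any VRAM budget. The following is a minimal sketch that uses a handful of rows copied from the table on this page; the `MODELS` list and the `fits` helper are illustrative names, not part of any tool.

```python
# (model name, minimum VRAM in GB) copied from a few rows of the table.
MODELS = [
    ("Gemma 4 E2B IT Qat Mobile Transformers", 1.4),
    ("Gemma 4 12B IT", 4.8),
    ("Gemma 4 26B A4B IT", 8.0),
    ("Gemma 4 31B IT", 10.6),
    ("Gemma4 E4B MiniFantasy V1", 16.5),
    ("Gemma 4 19B", 38.7),
]


def fits(models, budget_gb):
    """Return the names of models whose minimum VRAM is within the budget."""
    return [name for name, vram_gb in models if vram_gb <= budget_gb]


for name in fits(MODELS, 16.0):
    print(name)
```

The "Runs from" figure is a minimum, so a model that only just fits may leave little room for long contexts.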

What is the most popular Gemma 4 model to run locally?

Gemma 4 26B A4B IT is the most downloaded Gemma 4 model in local-friendly quantized formats, with 3.3M quant downloads. It runs from 8.0 GB of VRAM.
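
To rank models by the "Quant downloads" column, the abbreviated counts (such as 3.3M or 548.6K) first have to be converted to numbers. This is a small sketch assuming only the K and M suffixes that appear in the table; `parse_count` and `DOWNLOADS` are hypothetical names, and rows with no download figure are left out.

```python
def parse_count(text: str) -> float:
    """Convert an abbreviated count such as '3.3M', '548.6K', or '818' to a number."""
    multipliers = {"K": 1e3, "M": 1e6}
    suffix = text[-1].upper()
    if suffix in multipliers:
        return float(text[:-1]) * multipliers[suffix]
    return float(text)


# Download figures copied from the table for the five most downloaded models.
DOWNLOADS = {
    "Gemma 4 26B A4B IT": "3.3M",
    "Gemma 4 31B IT": "3.0M",
    "Gemma 4 12B IT": "1.3M",
    "Gemma 4 E4B IT": "1.2M",
    "Gemma 4 E2B IT": "548.6K",
}

# Sort model names from most to least downloaded.
ranked = sorted(DOWNLOADS, key=lambda name: parse_count(DOWNLOADS[name]), reverse=True)
print(ranked[0])
```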

Gemma 4 Models — Hardware Requirements

47 Gemma 4 models from Google and the community, from the smallest, which runs in 1.4 GB of VRAM, up to 39.5B parameters. Every row links to full quantization tables and GPU compatibility.

All Gemma 4 Models by Size

Model	Params	Runs from (VRAM)	Context (tokens)	Publisher	Quant downloads
Gemma 4 E2B IT Qat Mobile Transformers	2.3B	1.4 GB	131K	Google	42.3K
Gemma 4 E4B IT Assistant	4B	2 GB	131K	Google	14.5K
OFFELLIA Gemma 4 E4B 8B Claude 4.6 Opus Reasoning MTP	4B	1.9 GB	—	Brunobkr	—
Turkish Gemma 4B T1 Scout	4.3B	2.5 GB	131K	ytu-ce-cosmos	—
Gemma 4 E2B IT Qat Q4 0 Unquantized	5.1B	2.5 GB	131K	Google	230.4K
Gemma 4 E2B IT Qat Q4 0 Unquantized Heretic	5.1B	2.5 GB	131K	coder3101	818
Gemma 4 E2B IT	5.1B	2.1 GB	131K	Google	548.6K
Gemma 4 E2B IT Uncensored	5.1B	2.5 GB	131K	TrevorJS	29.2K
Supergemma4 E4b Abliterated	7.5B	3.7 GB	131K	Jiunsong	—
Gemma 4 E4B IT Qat Q4 0 Unquantized	7.9B	3.9 GB	131K	Google	302.2K
Gemma 4 E4B IT OBLITERATED	8.0B	2.7 GB	131K	OBLITERATUS	10.0K
Gemma4 E4B MiniFantasy V1	8.0B	16.5 GB	131K	Nubinu	—
Gemma 4 E4B IT	8.0B	3.2 GB	131K	Google	1.2M
Gemma 4 E4B IT Ultra Uncensored Heretic	8.0B	3.9 GB	131K	llmfan46	88.9K
Gemma 4 E4B IT Uncensored	8.0B	3.9 GB	131K	TrevorJS	12.5K
Gemma 4 E4B Luchador	8.0B	16.5 GB	131K	rpDungeon	—
Gemma 4 12B IT Heretic	12.0B	6.1 GB	131K	igorls	36.7K
Gemma 4 12B IT	12.0B	4.8 GB	262K	Google	1.3M
Gemma 4 12B IT Qat Q4 0 Unquantized	12.0B	6.1 GB	262K	Google	419.8K
Gemma 4 12B Coder Fable5 Composer2.5 V1	12.0B	6.1 GB	262K	yuxinlu1	25.8K
Gemma 4 12B OBLITERATED	12.0B	4.3 GB	131K	OBLITERATUS	12.2K
Gemma 4 12B IT AEON Abliterated K4 BF16	12.0B	6.1 GB	262K	AEON-7	9.8K
Gemma 4 12B	12.0B	6.1 GB	262K	Google	—
Gemma 4 12B Agentic Fable5 Composer2.5 v2 3.5x Tau2	12.0B	6.1 GB	262K	yuxinlu1	—
Gemma 4 12B IT Abliterated Uncensored	12.0B	6.1 GB	131K	OpenYourMind	—
Gemma 4 12B IT Assistant	12B	5.4 GB	262K	Google	135
Gemma4 12B Mtp Assistant	12B	5.6 GB	—	sjakek	—
Gemma 4 19B	19.0B	38.7 GB	262K	0xSero	—
Gemma 4 26B A4B IT Ultra Uncensored Heretic	25.8B	11.6 GB	262K	llmfan46	31.4K
Gemma 4 26B A4B IT Uncensored	25.8B	11.6 GB	262K	TrevorJS	25.8K
Gemma 4 26B A4B IT Uncensored Heretic	25.8B	11.6 GB	262K	llmfan46	17.6K
Gemma 4 26B A4B IT Assistant	26B	11.4 GB	262K	Google	17.6K
Gemma 4 26B A4B IT DFlash	26B	11.4 GB	262K	z-lab	—
Gemma 4 26B A4B IT	26.5B	8.0 GB	262K	Google	3.3M
Gemma 4 26B A4B IT Qat Q4 0 Unquantized	26.5B	11.9 GB	262K	Google	591.4K
Gemma 4 26B A4B StyleTune v2	26.5B	53.7 GB	262K	Gryphe	—
Gemma 4 31B IT Qat Q4 0 Unquantized Assistant	31B	13.5 GB	131K	Google	5.5K
Gemma 4 31B IT Speculator.eagle3	31B	14.5 GB	—	RedHatAI	3.7K
Gemma 4 31B IT DFlash	31B	13.5 GB	262K	z-lab	—
Gemma 4 31B IT Control Vectors	31B	14.5 GB	—	gghfez	—
Gemma 4 31B IT Uncensored Heretic	31.3B	14.9 GB	262K	llmfan46	82.1K
Gemma 4 31B IT Heretic	31.3B	10.2 GB	262K	coder3101	9.5K
Gemma 4 31B IT	32.7B	10.6 GB	262K	Google	3.0M
Gemma 4 31B IT Qat Q4 0 Unquantized	32.7B	15.5 GB	262K	Google	357.8K
Gemma 4 31B IT Uncensored	32.7B	15.5 GB	262K	TrevorJS	10.9K
Gemma 4 31B StyleTune	32.7B	67.0 GB	262K	Gryphe	—
ExtGemma4 40 5B	39.5B	80.9 GB	262K	TOTORONG	—
