Question 1

How much VRAM do I need to run an Aya model?

Accepted Answer

Aya Expanse 8B has the lowest requirement in the family: the CohereForAI listing runs from 3.0 GB of VRAM at an aggressive quantization. The other entries need more, from 3.8 GB up to 17.7 GB for Aya 23 8B; see the table for every model.
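A figure like this can be sanity-checked with the usual rule of thumb that weight memory is roughly parameter count times bits per weight. The sketch below assumes that rule only. The function name is hypothetical, and this is not necessarily how the numbers on this page were computed. It ignores the KV cache, activations, and runtime overhead, so real usage will be higher.

```python
def estimate_weight_vram_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GB (1 GB = 1e9 bytes).

    Assumption: every weight is stored at the same bit width. Real quantized
    formats mix widths and add metadata, so treat the result as a lower bound.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


if __name__ == "__main__":
    # Weights-only estimates for an 8.0B-parameter model at a few bit widths.
    for bits in (16, 8, 4, 3):
        gb = estimate_weight_vram_gb(8.0, bits)
        print(f"8.0B params at {bits:>2} bits/weight: about {gb:.1f} GB")
```

Under this rule, an 8.0B model at about 3 bits per weight comes to roughly 3.0 GB, in line with the lowest figure quoted for Aya Expanse 8B.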

Question 2

Which Aya models can I run on a 16 GB GPU?

Accepted Answer

Five of the six Aya models fit in 16 GB of VRAM at some quantization, including both Aya Expanse 8B listings and Tiny Aya Base. The exception is Aya 23 8B, which runs from 17.7 GB.
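The same check can be run against any VRAM budget. This minimal sketch copies the "Runs from" values from the table on this page. The list layout and the function name are illustrative assumptions, not part of any published tool.

```python
# (model, publisher, runs_from_gb), copied from the table on this page.
MODELS = [
    ("Tiny Aya Base", "Cohere", 7.4),
    ("Tiny Aya Global", "Cohere", 7.4),
    ("Tiny Aya Water", "Cohere", 7.4),
    ("Aya Expanse 8B", "CohereForAI", 3.0),
    ("Aya Expanse 8B", "Cohere", 3.8),
    ("Aya 23 8B", "Cohere", 17.7),
]


def models_that_fit(models, vram_gb):
    """Return the entries whose minimum VRAM requirement is within the budget."""
    return [entry for entry in models if entry[2] <= vram_gb]


if __name__ == "__main__":
    budget_gb = 16.0
    fitting = models_that_fit(MODELS, budget_gb)
    print(f"{len(fitting)} of {len(MODELS)} models fit in {budget_gb:g} GB:")
    for name, publisher, gb in fitting:
        print(f"  {name} ({publisher}): runs from {gb} GB")
```

Changing `budget_gb` to 8.0 or 24.0 shows which entries fit on smaller or larger cards.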

Question 3

What is the most popular Aya model to run locally?

Accepted Answer

Aya Expanse 8B is the most downloaded Aya model in local-friendly quantized formats. It runs from 3.0 GB of VRAM.

Model	Params	Runs from	Context	Publisher	Quant downloads
Tiny Aya Base	3.3B	7.4 GB	—	Cohere	—
Tiny Aya Global	3.3B	7.4 GB	—	Cohere	—
Tiny Aya Water	3.3B	7.4 GB	—	Cohere	—
Aya Expanse 8B	8.0B	3.0 GB	—	CohereForAI	2.6K
Aya Expanse 8B	8.0B	3.8 GB	—	Cohere	—
Aya 23 8B	8.0B	17.7 GB	—	Cohere	—

Aya Models — Hardware Requirements
