Question 1

How much VRAM do I need to run a Yi model?

Accepted Answer

The smallest Yi model, Yi 6B, runs from 2.9 GB of VRAM at an aggressive quantization. Larger family members need more: Yi 9B runs from 4.1 GB, and Yi 34B and Yi 34B Chat run from 15.0 GB. See the table below for every model.
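As a rough cross-check of these figures, weight-only memory can be approximated as parameter count times bits per weight, divided by 8. The sketch below is only an estimate under stated assumptions: the function names and the 4-bit example are illustrative, 1 GB is taken as 10^9 bytes, and KV cache and runtime overhead are ignored. It is not necessarily how the listed values were derived.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight-only memory in GB (assuming 1 GB = 1e9 bytes).

    Ignores KV cache, activations, and runtime overhead, so real usage is higher.
    """
    bytes_total = params_billions * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9


def implied_bits_per_weight(params_billions: float, memory_gb: float) -> float:
    """Invert the estimate: bits per weight implied by a memory figure."""
    return memory_gb * 8 / params_billions


# Hypothetical 4-bit quantization of Yi 6B (6.1B parameters).
yi_6b_4bit = weight_memory_gb(6.1, 4.0)

# Bits per weight implied by the listed 2.9 GB figure for Yi 6B,
# assuming (not confirmed by the source) that the figure counts weights only.
yi_6b_bits = implied_bits_per_weight(6.1, 2.9)
```

Under these assumptions, 2.9 GB for 6.1B parameters works out to roughly 3.8 bits per weight, which is consistent with an aggressive quantization.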

Question 2

Which Yi models can I run on a 16 GB GPU?

Accepted Answer

All five Yi models fit in 16 GB of VRAM at some quantization. That includes the largest, Yi 34B and Yi 34B Chat, which run from 15.0 GB.
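The same check can be repeated for other GPU sizes by filtering the minimum-VRAM figures from the table on this page. The following is a minimal sketch: the `models_that_fit` helper and its `headroom_gb` parameter are hypothetical names introduced for illustration, and the figures are the "Runs from" values, which apply only at the most aggressive quantization.

```python
# Rows copied from the table on this page: (model, minimum VRAM in GB).
YI_MODELS = [
    ("Yi 6B", 2.9),
    ("Yi 6B Chat", 2.9),
    ("Yi 9B", 4.1),
    ("Yi 34B Chat", 15.0),
    ("Yi 34B", 15.0),
]


def models_that_fit(vram_gb: float, headroom_gb: float = 0.0) -> list[str]:
    """Return models whose minimum VRAM fits within vram_gb minus headroom."""
    budget = vram_gb - headroom_gb
    return [name for name, min_gb in YI_MODELS if min_gb <= budget]


fits_16 = models_that_fit(16.0)  # every model in the table
fits_8 = models_that_fit(8.0)    # only the 6B and 9B models
```

Setting `headroom_gb` to a few gigabytes gives a more conservative answer, since context length and runtime overhead also consume VRAM. With 2 GB of headroom on a 16 GB card, the 34B models no longer fit.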

Question 3

What is the most popular Yi model to run locally?

Accepted Answer

Yi 34B Chat is the most downloaded Yi model in local-friendly quantized formats. It runs from 15.0 GB of VRAM.

Question 4

How do Yi models score on benchmarks?

Accepted Answer

Yi 34B leads the family with an overall benchmark rating of 58.0/100, ranking #18 among 73 open models, while the top proprietary model, Claude Fable 5 Max, scores 89.9. See the comparison chart above for the full standings.

Model	Params	Runs from	Context	Publisher	Quant downloads
Yi 6B	6.1B	2.9 GB	4K	01.AI	675
Yi 6B Chat	6.1B	2.9 GB	4K	01.AI	281
Yi 9B	8.8B	4.1 GB	4K	01.AI	436
Yi 34B Chat	34.4B	15.0 GB	4K	01.AI	1.4K
Yi 34B	34.4B	15.0 GB	4K	01.AI	591

Yi Models — Hardware Requirements
