Question 1

How much VRAM do I need to run an InternLM model?

Accepted Answer

The InternLM model with the lowest requirement, Internlm3 8B Instruct, runs from 3.4 GB of VRAM at an aggressive quantization. The rest of the family needs more, up to 42.8 GB for Internlm 20B, depending on model size and the quantizations available. See the model table for every model.
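As a rough cross-check on figures like these, the memory taken by the weights alone can be estimated as parameter count times bytes per weight. The sketch below is a rule of thumb, not the method behind the numbers on this page: the function name `estimate_weight_gb` is made up for illustration, and the estimate ignores the KV cache, activations, and runtime overhead, so real requirements will be somewhat higher.

```python
def estimate_weight_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes).

    Rule-of-thumb assumption: every parameter is stored at the same bit width.
    KV cache, activations, and runtime overhead are not included.
    """
    bytes_per_weight = bits_per_weight / 8
    return params_billions * bytes_per_weight


# Internlm3 8B Instruct is listed at 8.8B parameters in the model table.
for bits in (16, 8, 4, 3):
    print(f"8.8B params at {bits}-bit: ~{estimate_weight_gb(8.8, bits):.1f} GB")
```

By this estimate, a 7B model at 16-bit needs about 14 GB for weights alone, which is in line with the 15.4 GB listed for Internlm 7B once some overhead is allowed for.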

Question 2

Which InternLM models can I run on a 16 GB GPU?

Accepted Answer

Five of the six InternLM models fit in 16 GB of VRAM at some quantization, including Internlm3 8B Instruct, Internlm2 5 7B Chat, and Internlm2 5 20B Chat.
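The same check can be repeated for any VRAM budget by filtering the "Runs from" column of the model table. The following is a minimal sketch: the dictionary simply copies the figures from that table, and `models_that_fit` is a hypothetical helper name, not part of any library.

```python
# "Runs from" figures (GB of VRAM) copied from the model table on this page.
RUNS_FROM_GB = {
    "Internlm2 5 7B Chat": 3.5,
    "Internlm 7B": 15.4,
    "Internlm3 8B Instruct": 3.4,
    "Internlm2 5 20B Chat": 9.1,
    "Internlm 20B": 42.8,
    "Internlm Chat 20B": 11.3,
}


def models_that_fit(budget_gb: float) -> list[str]:
    """Return models whose minimum VRAM is within the budget, smallest first."""
    fitting = [(gb, name) for name, gb in RUNS_FROM_GB.items() if gb <= budget_gb]
    return [name for gb, name in sorted(fitting)]


print(models_that_fit(16))
```

Keep in mind that "Runs from" is the minimum at the smallest available quantization. Internlm 7B, at 15.4 GB, technically fits in 16 GB but leaves very little headroom for context.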

Question 3

What is the most popular InternLM model to run locally?

Accepted Answer

Internlm3 8B Instruct is the most downloaded InternLM model in local-friendly quantized formats. It runs from 3.4 GB of VRAM.

Question 4

How do InternLM models score on benchmarks?

Accepted Answer

Internlm 20B leads the family with an overall benchmark rating of 62.1/100, ranking #15 among 73 open models, while the top proprietary model, Claude Fable 5 Max, scores 89.9. See the comparison chart above for the full standings.

Model	Params	Runs from	Context	Publisher	Quant downloads
Internlm2 5 7B Chat	7B	3.5 GB	33K	InternLM	—
Internlm 7B	7B	15.4 GB	2K	InternLM	—
Internlm3 8B Instruct	8.8B	3.4 GB	33K	InternLM	1.5K
Internlm2 5 20B Chat	19.9B	9.1 GB	33K	InternLM	—
Internlm 20B	20B	42.8 GB	4K	InternLM	—
Internlm Chat 20B	20B	11.3 GB	4K	InternLM	—
