Mellum Models — Hardware Requirements
7 Mellum models from JetBrains and the community, ranging from 4.0B to 12.1B parameters. The smallest runs from 2.8 GB of VRAM. Every row links to full quantization tables and GPU compatibility.
All Mellum Models by Size
| Model | Params | Runs from | Context | Publisher | Quant downloads |
|---|---|---|---|---|---|
| Mellum 4B Base | 4.0B | 2.8 GB | 8K | ||
| Mellum2 12B A2.5B Thinking | 12.1B | 5.5 GB | 131K | ||
| Mellum2 12B A2.5B Instruct | 12.1B | 5.5 GB | 131K | ||
| Mellum2 12B A2.5B Base | 12.1B | 24.7 GB | 131K | ||
| Mellum2 12B A2.5B Thinking SFT | 12.1B | 5.5 GB | 131K | ||
| Mellum2 12B A2.5B Base Pretrain | 12.1B | 24.7 GB | 131K | ||
| Mellum2 12B A2.5B Instruct SFT | 12.1B | 5.5 GB | 131K | ||
Frequently Asked Questions
- How much VRAM do I need to run a Mellum model?
- The smallest Mellum model, Mellum 4B Base, runs from 2.8 GB of VRAM at an aggressive quantization. The 12.1B-parameter Mellum2 Thinking and Instruct variants run from 5.5 GB, while Mellum2 12B A2.5B Base and Base Pretrain run from 24.7 GB. See the table above for every model.
- Which Mellum models can I run on a 16 GB GPU?
- 5 of 7 Mellum models fit in 16 GB of VRAM at some quantization, including Mellum2 12B A2.5B Thinking, Mellum2 12B A2.5B Instruct, and Mellum 4B Base. The two that do not fit, Mellum2 12B A2.5B Base and Base Pretrain, run from 24.7 GB.
- What is the most popular Mellum model to run locally?
- Mellum2 12B A2.5B Thinking is the most downloaded Mellum model in local-friendly quantized formats. It runs from 5.5 GB of VRAM.
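The 16 GB answer above can be reproduced for any VRAM budget by filtering the "Runs from" column of the table. The following is a minimal sketch, not an official tool: the dictionary is copied by hand from the table, and the function name `models_that_fit` is a hypothetical helper introduced here for illustration. It compares only the minimum footprint, so it assumes no extra headroom for long contexts or other runtime overhead.

```python
# "Runs from" values (GB of VRAM) copied by hand from the table above.
# These are the smallest listed footprints, not a guarantee for every workload.
MIN_VRAM_GB = {
    "Mellum 4B Base": 2.8,
    "Mellum2 12B A2.5B Thinking": 5.5,
    "Mellum2 12B A2.5B Instruct": 5.5,
    "Mellum2 12B A2.5B Base": 24.7,
    "Mellum2 12B A2.5B Thinking SFT": 5.5,
    "Mellum2 12B A2.5B Base Pretrain": 24.7,
    "Mellum2 12B A2.5B Instruct SFT": 5.5,
}


def models_that_fit(vram_gb, table=MIN_VRAM_GB):
    """Return the model names whose minimum footprint is at or below vram_gb.

    Hypothetical helper: it checks only the table's "Runs from" figure.
    """
    return [name for name, need in table.items() if need <= vram_gb]


if __name__ == "__main__":
    fits = models_that_fit(16.0)
    print(f"{len(fits)} of {len(MIN_VRAM_GB)} models fit in 16 GB")
    for name in fits:
        print(" -", name)
```

Calling `models_that_fit(8.0)` returns the same five models, since every quantized entry in the table runs from 5.5 GB or less.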