OLMo Models — Hardware Requirements
7 OLMo models from Allen AI and the community, ranging from 1.5B to 33.3B parameters. The smallest runs in as little as 1.2 GB of VRAM. Every row links to full quantization tables and GPU compatibility.
All OLMo Models by Size
| Model | Params | Runs from (min VRAM) | Context (tokens) | Publisher | Quant downloads |
|---|---|---|---|---|---|
| OLMo 2 0425 1B | 1.5B | 1.2 GB | 4K | ||
| OLMoE 1B 7B 0125 Instruct | 6.9B | 2.5 GB | 4K | ||
| Olmo Hybrid 7B | 7B | 15.3 GB | 66K | ||
| Olmo 3 7B Instruct | 7.3B | 3.4 GB | 66K | ||
| Olmo 3 1125 32B | 32.2B | 65.3 GB | 66K | ||
| Olmo 3.1 32B Think | 32.2B | 65.3 GB | 66K | ||
| FlexOlmo 7x7B 1T | 33.3B | 67.9 GB | 4K | ||
Frequently Asked Questions
- How much VRAM do I need to run an OLMo model?
- The smallest OLMo model, OLMo 2 0425 1B, runs from 1.2 GB of VRAM at an aggressive quantization. Larger family members need more, up to 67.9 GB for FlexOlmo 7x7B 1T; see the table above for every model.
- Which OLMo models can I run on a 16 GB GPU?
- 4 of 7 OLMo models fit in 16 GB of VRAM at some quantization: Olmo 3 7B Instruct, OLMoE 1B 7B 0125 Instruct, OLMo 2 0425 1B, and Olmo Hybrid 7B (which starts at 15.3 GB).
- What is the most popular OLMo model to run locally?
- Olmo 3 7B Instruct is the most downloaded OLMo model in local-friendly quantized formats. It runs from 3.4 GB of VRAM.
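
The "fits in N GB" answers above can be reproduced from the table with a simple filter. The following is a minimal sketch, not part of the site: the dictionary copies the "Runs from" figures listed above, and the helper name `models_that_fit` is hypothetical. It compares only the listed minimums against a budget, so a model close to the limit may still be impractical once context and runtime overhead are added.

```python
# Minimum VRAM figures (GB) copied from the "Runs from" column of the table above.
MIN_VRAM_GB = {
    "OLMo 2 0425 1B": 1.2,
    "OLMoE 1B 7B 0125 Instruct": 2.5,
    "Olmo Hybrid 7B": 15.3,
    "Olmo 3 7B Instruct": 3.4,
    "Olmo 3 1125 32B": 65.3,
    "Olmo 3.1 32B Think": 65.3,
    "FlexOlmo 7x7B 1T": 67.9,
}


def models_that_fit(budget_gb: float) -> list[str]:
    """Return models whose listed minimum is at or below budget_gb, smallest first.

    Hypothetical helper for illustration; it does not account for
    context length or runtime overhead.
    """
    fitting = [(gb, name) for name, gb in MIN_VRAM_GB.items() if gb <= budget_gb]
    return [name for gb, name in sorted(fitting)]


if __name__ == "__main__":
    for budget in (8, 16, 24, 80):
        names = models_that_fit(budget)
        print(f"{budget} GB: {len(names)} of {len(MIN_VRAM_GB)} fit: {', '.join(names)}")
```

With a 16 GB budget this returns four models, matching the count given in the FAQ.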