OLMo Models — Hardware Requirements

Seven OLMo models from Allen AI and the community, ranging from 1.5B to 33.3B parameters. The smallest runs in as little as 1.2 GB of VRAM. Every row links to full quantization tables and GPU compatibility.

All OLMo Models by Size

Model                        Params   Context
OLMo 2 0425 1B               1.5B     4K
OLMoE 1B 7B 0125 Instruct    6.9B     4K
Olmo Hybrid 7B               7B       66K
Olmo 3 7B Instruct           7.3B     66K
Olmo 3 1125 32B              32.2B    66K
Olmo 3.1 32B Think           32.2B    66K
FlexOlmo 7x7B 1T             33.3B    4K

Frequently Asked Questions

How much VRAM do I need to run an OLMo model?
The smallest OLMo model, OLMo 2 0425 1B, runs from 1.2 GB of VRAM at an aggressive quantization. Larger family members need more, roughly in proportion to their parameter count; each model's row in the table above links to its full requirements.
Which OLMo models can I run on a 16 GB GPU?
Four of the seven OLMo models fit in 16 GB of VRAM at some quantization, including Olmo 3 7B Instruct, OLMoE 1B 7B 0125 Instruct, and OLMo 2 0425 1B.
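As a rough sanity check on the answers above, weight memory can be approximated as parameter count times bits per weight, divided by 8 to get bytes. The Python sketch below applies that rule of thumb to the parameter counts in the table. The 4 bits per weight default, the 1 GB overhead allowance, and the function names are all illustrative assumptions, not values from this page. Real quantized files mix bit widths and need extra memory for the KV cache, so the output will not match the per-model quantization tables exactly.

```python
# Rule-of-thumb estimate: weight memory = parameters x bits per weight / 8 bytes.
# Parameter counts (in billions) come from the table above; the bits-per-weight
# and overhead defaults are assumptions chosen for illustration only.

MODELS = {
    "OLMo 2 0425 1B": 1.5,
    "OLMoE 1B 7B 0125 Instruct": 6.9,
    "Olmo Hybrid 7B": 7.0,
    "Olmo 3 7B Instruct": 7.3,
    "Olmo 3 1125 32B": 32.2,
    "Olmo 3.1 32B Think": 32.2,
    "FlexOlmo 7x7B 1T": 33.3,
}


def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: float,
         vram_gb: float, overhead_gb: float = 1.0) -> bool:
    """True if weights plus an assumed fixed overhead fit in vram_gb."""
    return weight_gb(params_billion, bits_per_weight) + overhead_gb <= vram_gb


def models_that_fit(vram_gb: float, bits_per_weight: float = 4.0,
                    overhead_gb: float = 1.0) -> list[str]:
    """Names of the models whose estimated footprint fits in vram_gb."""
    return [name for name, params in MODELS.items()
            if fits(params, bits_per_weight, vram_gb, overhead_gb)]


if __name__ == "__main__":
    for name, params in MODELS.items():
        print(f"{name:<28} ~{weight_gb(params, 4.0):5.2f} GB of weights at 4 bits/weight")
    print("Estimated to fit in 16 GB:", models_that_fit(16.0))
```

Under these assumptions, the four models at or below 7.3B parameters fit in 16 GB, while the 32B-class models do not. Treat the result as an estimate and check each model's quantization table for actual figures.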
What is the most popular OLMo model to run locally?
Olmo 3 7B Instruct is the most downloaded OLMo model in local-friendly quantized formats. It runs from 3.4 GB of VRAM.