SmolLM Models — Hardware Requirements

13 SmolLM models from Hugging Face and the community, ranging from 69M to 3.1B parameters. The smallest runs in as little as 0.4 GB of VRAM. Every row links to full quantization tables and GPU compatibility.

All SmolLM Models by Size

Model                    Params   Context
SmolLM2 70M              69M      8K
SmolLM2 135M Instruct    135M     8K
SmolLM 135M              135M     2K
SmolLM2 135M             135M     8K
SmolLM2 360M Instruct    362M     8K
SmolLM2 360M             362M     8K
SmolLM 360M Instruct     362M     2K
SmolLM2 1.7B Instruct    1.7B     8K
SmolLM2 1.7B             1.7B     8K
SmolLM 1.7B              1.7B     2K
SmolLM3 3B Base          3B       66K
SmolLM3 3B ONNX          3B       66K
SmolLM3 3B               3.1B     66K

Frequently Asked Questions

How much VRAM do I need to run a SmolLM model?
The smallest SmolLM model, SmolLM2 70M, runs in as little as 0.4 GB of VRAM at an aggressive quantization. Larger family members need more, roughly in proportion to their parameter count; see the table above for every model.
Which SmolLM models can I run on a 16 GB GPU?
All 13 SmolLM models fit in 16 GB of VRAM at some quantization, including SmolLM2 135M Instruct, SmolLM2 1.7B Instruct, and SmolLM2 360M Instruct.
What is the most popular SmolLM model to run locally?
SmolLM2 135M Instruct is the most downloaded SmolLM model in local-friendly quantized formats. It runs in as little as 0.4 GB of VRAM.
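
How can I roughly estimate VRAM for a given model and quantization?
A common back-of-the-envelope rule is that the weights alone take about (parameter count × bits per weight ÷ 8) bytes. The Python sketch below applies that rule to parameter counts from the table above. It is an illustration only, not the method used to produce the figures on this page: the function names and the 1 GB overhead allowance for the KV cache, activations, and runtime buffers are assumptions, and real usage varies with context length and inference runtime.

```python
# Rough VRAM estimate: weights only, plus a flat overhead allowance.
# Assumption: overhead_gb=1.0 is an illustrative placeholder, not a measured value.

# Parameter counts taken from the table above.
MODELS = {
    "SmolLM2 70M": 69e6,
    "SmolLM2 135M": 135e6,
    "SmolLM2 360M": 362e6,
    "SmolLM2 1.7B": 1.7e9,
    "SmolLM3 3B": 3.1e9,
}


def weight_gb(params: float, bits_per_weight: float) -> float:
    """Memory needed for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


def fits(params: float, bits_per_weight: float, vram_gb: float,
         overhead_gb: float = 1.0) -> bool:
    """True if weights plus the assumed overhead fit in vram_gb."""
    return weight_gb(params, bits_per_weight) + overhead_gb <= vram_gb


if __name__ == "__main__":
    for name, params in MODELS.items():
        fp16 = weight_gb(params, 16)
        q4 = weight_gb(params, 4)
        ok = fits(params, 16, vram_gb=16)
        print(f"{name}: fp16 ~{fp16:.2f} GB, 4-bit ~{q4:.2f} GB, "
              f"fits 16 GB at fp16: {ok}")
```

Under these assumptions, even the 3.1B model at 16 bits per weight stays well under 16 GB, which is consistent with the answer above that every model in the family fits on a 16 GB GPU at some quantization.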