Nemotron Models — Hardware Requirements
29 Nemotron models from NVIDIA and the community, ranging from the smallest, which runs in 1.0 GB of VRAM, to the largest at 560.5B parameters. Every row links to full quantization tables and GPU compatibility.
All Nemotron Models by Size
How Nemotron Compares — Benchmark Rating
NVIDIA Nemotron 3 Ultra 550B A55B BF16 is the highest-rated Nemotron model, with an overall benchmark rating of 54.5/100, which places it #29 among 75 open models. The top proprietary model, GPT 5.5, scores 88.8.
GPT 5.5 · proprietary: 88.8
Claude Opus 4.7 · proprietary: 87.6
Claude Fable 5 · proprietary: 86.6
GPT 5.4 · proprietary: 86.6
Claude Opus 4.8 · proprietary: 84.4
DeepSeek V4 Pro: 77.5
Qwen3.6 27B: 74.0
StableBeluga2: 69.1
MiniMax M2.7: 68.4
Frequently Asked Questions
- How much VRAM do I need to run a Nemotron model?
- The smallest Nemotron model, OpenMath Nemotron 1.5B, runs from 1.0 GB of VRAM at an aggressive quantization. Larger family members need proportionally more — see the table above for every model.
- Which Nemotron models can I run on a 16 GB GPU?
- 20 of 31 Nemotron models fit in 16 GB of VRAM at some quantization, including NVIDIA Nemotron 3 Nano 30B A3B BF16, Nemotron 3 Nano Omni 30B A3B Reasoning BF16, and NVIDIA Nemotron Nano 9B v2 Japanese.
- What is the most popular Nemotron model to run locally?
- NVIDIA Nemotron 3 Nano 30B A3B BF16 is the most downloaded Nemotron model in local-friendly quantized formats. It runs from 9.1 GB of VRAM.
- How do Nemotron models score on benchmarks?
- NVIDIA Nemotron 3 Ultra 550B A55B BF16 leads the family with an overall benchmark rating of 54.5/100, ranking #29 among 75 open models, while the top proprietary model, GPT 5.5, scores 88.8. See the comparison chart above for the full standings.
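The VRAM answers above depend on parameter count and quantization level. As a rough illustration of that relationship, the sketch below estimates weight memory as parameters times bits per weight, then checks it against a GPU budget. The function names, the 20% overhead allowance, and the chosen bit widths are assumptions made for this example, not the method used to produce the figures on this page, so the results will not match the listed numbers exactly.

```python
def estimate_weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GB (10^9 bytes).

    params_billion * 1e9 weights * bits / 8 bytes, divided by 1e9 bytes per GB.
    """
    return params_billion * bits_per_weight / 8.0


def fits_in_vram(params_billion: float, bits_per_weight: float,
                 vram_gb: float, overhead_fraction: float = 0.2) -> bool:
    """Return True if weights plus an assumed overhead fit in vram_gb.

    overhead_fraction is a hypothetical allowance for the KV cache,
    activations, and runtime buffers; real overhead varies with context
    length and inference engine.
    """
    needed = estimate_weight_gb(params_billion, bits_per_weight) * (1 + overhead_fraction)
    return needed <= vram_gb


if __name__ == "__main__":
    # Parameter counts are read off the model names; bit widths are illustrative.
    for label, params_b in [("1.5B", 1.5), ("9B", 9.0), ("30B", 30.0)]:
        for bits in (16, 8, 4):
            weights = estimate_weight_gb(params_b, bits)
            ok = fits_in_vram(params_b, bits, vram_gb=16.0)
            print(f"{label} at {bits}-bit: ~{weights:.2f} GB weights, fits in 16 GB: {ok}")
```

Under these assumptions a 30B model at 4 bits per weight needs about 15 GB for weights alone and misses a 16 GB budget once overhead is added, which suggests why fitting such a model in 16 GB calls for a more aggressive quantization.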