Nemotron Models — Hardware Requirements

31 Nemotron models from NVIDIA and the community, ranging from 1.5B to 560.5B parameters; the smallest runs in as little as 1.0 GB of VRAM. Every row links to full quantization tables and GPU compatibility.

All Nemotron Models by Size

Model | Params | Context
OpenMath Nemotron 1.5B | 1.5B | 131K
Nemotron Flash 3B | 2.7B | 29K
Nemotron Labs Diffusion 3B | 3.8B | 262K
NVIDIA Nemotron 3 Nano 4B BF16 | 4.0B | 262K
Nemotron Mini 4B Instruct | 4B | 4K
Nemotron Content Safety Reasoning 4B | 4.3B | 131K
Nemotron Cascade 8B | 8B | 33K
Nemotron H 8B Reasoning 128K | 8.1B | -
Nemotron Orchestrator 8B | 8.2B | 41K
Nemotron Terminal 8B | 8.2B | 41K
Nemotron Labs Diffusion 8B | 8.5B | 262K
NVIDIA Nemotron Nano 9B v2 Japanese | 8.9B | 131K
NVIDIA Nemotron Nano 9B v2 | 8.9B | 131K
NVIDIA Nemotron Nano 12B v2 | 12B | -
Nemotron Labs Diffusion 14B | 13.5B | 262K
Nemotron Terminal 14B | 14.8B | 41K
Elbaz NVIDIA Nemotron 3 Nano 30B A3B PRISM | 30B | -
NVIDIA Nemotron 3 Nano 30B A3B BF16 | 31.6B | 262K
NVIDIA Nemotron 3 Nano 30B A3B Base BF16 | 31.6B | -
Nemotron Cascade 2 30B A3B | 31.6B | 262K
Nemotron Terminal 32B | 32.8B | 41K
OpenReasoning Nemotron 32B | 32.8B | 131K
OpenCodeReasoning Nemotron 1.1 32B | 32.8B | 66K
Nemotron 3 Nano Omni 30B A3B Reasoning BF16 | 33.0B | -
Nemotron H 47B Reasoning 128K | 46.8B | -
NVIDIA Nemotron 3 Super 120B A12B BF16 Heretic | 120.7B | 262K
NVIDIA Nemotron 3 Super 120B A12B BF16 | 123.6B | 262K
NVIDIA Nemotron 3 Super 120B A12B Base BF16 | 123.6B | 1049K
NVIDIA Nemotron 3 Ultra 550B A55B BF16 | 560.5B | 262K
NVIDIA Nemotron 3 Ultra 550B A55B Base BF16 | 560.5B | 262K
NVIDIA Nemotron 3 Ultra 550B A55B GenRM | 560.5B | 262K

How Nemotron Compares — Benchmark Rating

NVIDIA Nemotron 3 Ultra 550B A55B BF16 is the highest-rated Nemotron model, with an overall benchmark rating of 54.5/100, which places it #29 among 75 open models. The top proprietary model, GPT 5.5, scores 88.8.

GPT 5.5 (proprietary): 88.8
Claude Opus 4.7 (proprietary): 87.6
Claude Fable 5 (proprietary): 86.6
GPT 5.4 (proprietary): 86.6
Claude Opus 4.8 (proprietary): 84.4

Ratings are a composite of normalized public benchmark scores.

Frequently Asked Questions

How much VRAM do I need to run a Nemotron model?
The smallest Nemotron model, OpenMath Nemotron 1.5B, runs from 1.0 GB of VRAM at an aggressive quantization. Larger family members need more, roughly in proportion to their parameter count and the number of bits stored per weight. The table above lists the parameter count of every model.
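As a rough illustration of why VRAM scales with model size, the sketch below estimates the memory taken by the weights alone as parameters times bits per weight. It is a back-of-the-envelope approximation, not the method behind the figures on this page: it ignores the KV cache, activations, and runtime overhead, and it assumes 1 GB = 1e9 bytes. The function name `estimate_weight_gb` and the bit widths are hypothetical choices for the example; the parameter counts come from the table above.

```python
def estimate_weight_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights only, in GB (1 GB = 1e9 bytes).

    Assumption: every parameter is stored at the same bit width, and nothing
    else (KV cache, activations, framework overhead) is counted.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


if __name__ == "__main__":
    # Parameter counts from the table above; bit widths are illustrative.
    models = [
        ("OpenMath Nemotron 1.5B", 1.5),
        ("NVIDIA Nemotron Nano 9B v2", 8.9),
        ("NVIDIA Nemotron 3 Nano 30B A3B BF16", 31.6),
    ]
    for name, params in models:
        for bits in (16, 8, 4):
            gb = estimate_weight_gb(params, bits)
            print(f"{name} at {bits} bits/weight: ~{gb:.1f} GB for weights")
```

Real quantization formats mix bit widths across layers and add metadata, so actual requirements will differ from this estimate; treat it as a lower bound on the order of magnitude.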
Which Nemotron models can I run on a 16 GB GPU?
20 of 31 Nemotron models fit in 16 GB of VRAM at some quantization, including NVIDIA Nemotron 3 Nano 30B A3B BF16, Nemotron 3 Nano Omni 30B A3B Reasoning BF16, and NVIDIA Nemotron Nano 9B v2 Japanese.
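To see how a fixed VRAM budget limits the usable quantization, the sketch below inverts the weights-only estimate: given a budget and a parameter count, it computes the largest average bits per weight that would still fit. The page does not state its own criterion for "fits", so this is only an assumed approximation. The names `max_bits_per_weight` and `fits`, and the optional `reserve_gb` headroom parameter, are hypothetical; 1 GB is taken as 1e9 bytes.

```python
def max_bits_per_weight(params_billions: float, vram_gb: float,
                        reserve_gb: float = 0.0) -> float:
    """Largest average bits per weight whose weights alone fit in the budget.

    Assumption: only weights are counted; reserve_gb is optional headroom
    set aside for KV cache and runtime overhead.
    """
    usable_bytes = max(vram_gb - reserve_gb, 0.0) * 1e9
    return usable_bytes * 8 / (params_billions * 1e9)


def fits(params_billions: float, vram_gb: float, bits_per_weight: float,
         reserve_gb: float = 0.0) -> bool:
    """True if weights at the given bit width fit within the budget."""
    return bits_per_weight <= max_bits_per_weight(
        params_billions, vram_gb, reserve_gb)


if __name__ == "__main__":
    budget_gb = 16.0
    # Parameter counts from the table above.
    for name, params in [
        ("NVIDIA Nemotron Nano 9B v2 Japanese", 8.9),
        ("NVIDIA Nemotron 3 Nano 30B A3B BF16", 31.6),
        ("NVIDIA Nemotron 3 Super 120B A12B BF16", 123.6),
    ]:
        limit = max_bits_per_weight(params, budget_gb)
        print(f"{name}: up to ~{limit:.2f} bits/weight in {budget_gb:.0f} GB")
```

Under these assumptions, a model of roughly 30B parameters needs an average of about 4 bits per weight or fewer to fit in 16 GB, which is consistent with such models appearing in the list only at lower-precision quantizations.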
What is the most popular Nemotron model to run locally?
NVIDIA Nemotron 3 Nano 30B A3B BF16 is the most downloaded Nemotron model in local-friendly quantized formats. It runs from 9.1 GB of VRAM.
How do Nemotron models score on benchmarks?
NVIDIA Nemotron 3 Ultra 550B A55B BF16 leads the family with an overall benchmark rating of 54.5/100, ranking #29 among 75 open models, while the top proprietary model, GPT 5.5, scores 88.8. See the comparison chart above for the full standings.