GPUs with 16+ GB VRAM

Browse 70 GPUs with 16 GB or more of VRAM that can run LLMs locally. Compare VRAM, memory bandwidth, and AI performance.


Which GPU Do You Need for AI?

VRAM capacity is the most important specification for running LLMs locally, because the model weights have to fit in GPU memory to run entirely on the GPU. Most 7B-parameter models require 4–8 GB of VRAM at common quantization levels, while 70B models need 24–48 GB. Memory bandwidth determines how fast the model generates tokens: higher bandwidth means faster responses.
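As a rough illustration of the sizing rule above, the sketch below estimates VRAM from parameter count and quantization width, and gives a bandwidth-based ceiling on token rate. The function names, the 20% overhead allowance, and the assumption that each generated token reads every weight once are simplifications introduced here for illustration, not figures from this page. Real usage varies with context length, runtime, and quantization format.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def estimated_vram_gb(params_billion: float, bits_per_weight: float,
                      overhead: float = 1.2) -> float:
    """Weights plus an assumed 20% allowance for KV cache and runtime buffers."""
    return weight_memory_gb(params_billion, bits_per_weight) * overhead


def max_tokens_per_second(bandwidth_gb_s: float, params_billion: float,
                          bits_per_weight: float) -> float:
    """Rough upper bound on decode speed, assuming every weight is read once per token."""
    return bandwidth_gb_s / weight_memory_gb(params_billion, bits_per_weight)


if __name__ == "__main__":
    for name, params in [("7B", 7), ("70B", 70)]:
        for bits in (4, 8):
            vram = estimated_vram_gb(params, bits)
            print(f"{name} @ {bits}-bit: ~{vram:.1f} GB VRAM")
    # Bandwidth figure taken from the NVIDIA T4 entry in this list (320.0 GB/s).
    ceiling = max_tokens_per_second(320.0, 7, 4)
    print(f"T4, 7B @ 4-bit: at most ~{ceiling:.0f} tokens/s")
```

Treat the token rate as a ceiling rather than a prediction: measured throughput is usually lower, since compute, kernel efficiency, and KV-cache reads also take time.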

NVIDIA RTX PRO 6000 Blackwell Max-Q Workstation Edition

NVIDIA · Blackwell

1792.0 GB/s · 24,064 CUDA · 300W TDP · $8,565

NVIDIA RTX PRO 6000 Blackwell Server Edition

NVIDIA · Blackwell

1597.0 GB/s · 24,064 CUDA · 600W TDP

NVIDIA RTX PRO 6000 Blackwell Workstation Edition

NVIDIA · Blackwell

1792.0 GB/s · 24,064 CUDA · 600W TDP · $8,565

NVIDIA T4

NVIDIA · Turing

320.0 GB/s · 2,560 CUDA · 70W TDP

NVIDIA TITAN RTX

NVIDIA · Turing

672.0 GB/s · 4,608 CUDA · 280W TDP · $2,499

NVIDIA Tesla M40 24GB

NVIDIA · Maxwell

288.0 GB/s · 3,072 CUDA · 250W TDP

NVIDIA Tesla P100 PCIe 16GB

NVIDIA · Pascal

732.0 GB/s · 3,584 CUDA · 250W TDP

NVIDIA Tesla P40

NVIDIA · Pascal

346.0 GB/s · 3,840 CUDA · 3,840 SP · 250W TDP

NVIDIA Tesla V100 PCIe 16GB

NVIDIA · Volta

900.0 GB/s · 5,120 CUDA · 250W TDP

NVIDIA V100 SXM2 32GB

NVIDIA · Volta

900.0 GB/s · 5,120 CUDA · 300W TDP