GPUs with 12 GB or More VRAM
Browse 47 GPUs with at least 12 GB of VRAM suitable for running LLMs locally. Compare VRAM, memory bandwidth, and AI performance.
Which GPU Do You Need for AI?
The amount of VRAM is the most important specification for running LLMs locally. Most 7B-parameter models require 4–8 GB of VRAM at common quantization levels, while 70B models need 24–48 GB. Memory bandwidth determines how fast the model generates tokens: each generated token requires reading essentially the full set of weights from memory, so tokens per second scale roughly with bandwidth.
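The two rules of thumb above can be sketched as back-of-the-envelope formulas. This is a rough sketch, not a benchmark: the 20% overhead factor for KV cache and activations is an assumption, and the tokens-per-second figure is a bandwidth-bound upper limit that real inference stacks will not fully reach.

```python
def estimate_vram_gb(params_b: float, bits_per_weight: float,
                     overhead: float = 1.2) -> float:
    """VRAM needed to hold the weights, plus ~20% headroom for
    KV cache and activations (the overhead factor is an assumption)."""
    weight_bytes = params_b * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead / 1e9


def max_tokens_per_sec(bandwidth_gb_s: float, params_b: float,
                       bits_per_weight: float) -> float:
    """Decoding is memory-bandwidth bound: each token reads every
    weight once, so tokens/s <= bandwidth / weight size."""
    weight_gb = params_b * 1e9 * bits_per_weight / 8 / 1e9
    return bandwidth_gb_s / weight_gb


# Example: a 7B model quantized to 4 bits per weight needs about 4.2 GB,
# and on an RTX 5070 Ti (896 GB/s) tops out around 256 tokens/s.
print(round(estimate_vram_gb(7, 4), 1))      # ~4.2
print(round(max_tokens_per_sec(896, 7, 4)))  # ~256
```

The same arithmetic explains the figures in the paragraph above: at 8 bits per weight the 7B model's weights alone are 7 GB, which is why 7B models land in the 4–8 GB range depending on quantization.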
GPU List
NVIDIA GeForce RTX 5070
NVIDIA · Blackwell
672.0 GB/s · 6,144 CUDA cores · 250 W TDP · $549
NVIDIA GeForce RTX 5070 Ti
NVIDIA · Blackwell
896.0 GB/s · 8,960 CUDA cores · 300 W TDP · $749
NVIDIA GeForce RTX 5080
NVIDIA · Blackwell
960.0 GB/s · 10,752 CUDA cores · 360 W TDP · $999
NVIDIA GeForce RTX 5090
NVIDIA · Blackwell
1,792.0 GB/s · 21,760 CUDA cores · 575 W TDP · $1,999
NVIDIA H100 PCIe
NVIDIA · Hopper
2,039.0 GB/s · 14,592 CUDA cores · 350 W TDP
NVIDIA H100 SXM
NVIDIA · Hopper
3,352.0 GB/s · 16,896 CUDA cores · 700 W TDP
NVIDIA L4
NVIDIA · Ada Lovelace
300.0 GB/s · 7,424 CUDA cores · 72 W TDP
NVIDIA L40
NVIDIA · Ada Lovelace
864.0 GB/s · 18,176 CUDA cores · 300 W TDP
NVIDIA L40S
NVIDIA · Ada Lovelace
864.0 GB/s · 18,176 CUDA cores · 350 W TDP
NVIDIA RTX 4000 Ada Generation
NVIDIA · Ada Lovelace
360.0 GB/s · 6,144 CUDA cores · 130 W TDP · $1,250
NVIDIA RTX 5000 Ada Generation
NVIDIA · Ada Lovelace
576.0 GB/s · 12,800 CUDA cores · 250 W TDP · $4,000
NVIDIA RTX 6000 Ada Generation
NVIDIA · Ada Lovelace
960.0 GB/s · 18,176 CUDA cores · 300 W TDP · $6,799
NVIDIA RTX A4000
NVIDIA · Ampere
448.0 GB/s · 6,144 CUDA cores · 140 W TDP · $1,000
NVIDIA RTX A5000
NVIDIA · Ampere
768.0 GB/s · 8,192 CUDA cores · 230 W TDP · $2,250
NVIDIA RTX A6000
NVIDIA · Ampere
768.0 GB/s · 10,752 CUDA cores · 300 W TDP · $4,649
NVIDIA T4
NVIDIA · Turing
320.0 GB/s · 2,560 CUDA cores · 70 W TDP
NVIDIA V100 SXM2 32GB
NVIDIA · Volta
900.0 GB/s · 5,120 CUDA cores · 300 W TDP