GPUs with 40+ GB VRAM
Browse 13 GPUs with 40 GB or more of VRAM suitable for running LLMs locally. Compare VRAM, memory bandwidth, and AI performance.
Which GPU Do You Need for AI?
The amount of VRAM is the most important specification for running LLMs locally. Most 7B-parameter models need 4–8 GB of VRAM at common quantization levels, while 70B models need 24–48 GB. Memory bandwidth determines how fast the model generates tokens: higher bandwidth means faster responses.
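As a rough rule of thumb, the VRAM a model needs is its parameter count times the bytes per parameter at the chosen quantization, plus some overhead for the KV cache and runtime buffers. Here is a minimal sketch of that estimate in Python; the bytes-per-parameter values and the flat 20% overhead factor are illustrative assumptions, not measured figures:

```python
# Back-of-the-envelope VRAM estimate for running an LLM locally.
# Assumptions (illustrative): weights dominate memory use, and the KV cache
# plus runtime buffers add a flat ~20% on top of the weight size.

BYTES_PER_PARAM = {
    "fp16": 2.0,  # half precision, no quantization
    "q8": 1.0,    # 8-bit quantization
    "q4": 0.5,    # 4-bit quantization
}

def estimate_vram_gb(params_billion: float, quant: str, overhead: float = 0.20) -> float:
    """Estimate GB of VRAM needed for a model of the given size and quantization."""
    weights_gb = params_billion * BYTES_PER_PARAM[quant]  # 1B params at 1 B/param is ~1 GB
    return weights_gb * (1 + overhead)

for size, quant in [(7, "q4"), (7, "q8"), (70, "q4")]:
    print(f"{size}B @ {quant}: ~{estimate_vram_gb(size, quant):.1f} GB")
# 7B @ q4: ~4.2 GB, 7B @ q8: ~8.4 GB, 70B @ q4: ~42.0 GB
```

These estimates line up with the ranges above: a 4-bit 7B model fits in about 4 GB, while a 4-bit 70B model needs roughly 42 GB and therefore lands in the 40+ GB class of GPUs listed below.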
GPU List
| GPU | Vendor | Architecture | VRAM | Memory bandwidth | Cores | TDP | Price |
|---|---|---|---|---|---|---|---|
| AMD Instinct MI210 | AMD | CDNA 2 | 64 GB | 1638.4 GB/s | 6,656 SP | 300 W | |
| AMD Instinct MI250X | AMD | CDNA 2 | 128 GB | 3276.8 GB/s | 14,080 SP | 560 W | |
| AMD Instinct MI300X | AMD | CDNA 3 | 192 GB | 5300.0 GB/s | 19,456 SP | 750 W | |
| AMD Radeon PRO W7900 | AMD | RDNA 3 | 48 GB | 864.0 GB/s | 6,144 SP | 295 W | $3,999 |
| NVIDIA A100 40GB PCIe | NVIDIA | Ampere | 40 GB | 1555.0 GB/s | 6,912 CUDA | 250 W | |
| NVIDIA A100 80GB SXM | NVIDIA | Ampere | 80 GB | 2039.0 GB/s | 6,912 CUDA | 400 W | |
| NVIDIA A40 | NVIDIA | Ampere | 48 GB | 696.0 GB/s | 10,752 CUDA | 300 W | |
| NVIDIA H100 PCIe | NVIDIA | Hopper | 80 GB | 2039.0 GB/s | 14,592 CUDA | 350 W | |
| NVIDIA H100 SXM | NVIDIA | Hopper | 80 GB | 3352.0 GB/s | 16,896 CUDA | 700 W | |
| NVIDIA L40 | NVIDIA | Ada Lovelace | 48 GB | 864.0 GB/s | 18,176 CUDA | 300 W | |
| NVIDIA L40S | NVIDIA | Ada Lovelace | 48 GB | 864.0 GB/s | 18,176 CUDA | 350 W | |
| NVIDIA RTX 6000 Ada Generation | NVIDIA | Ada Lovelace | 48 GB | 960.0 GB/s | 18,176 CUDA | 300 W | $6,799 |
| NVIDIA RTX A6000 | NVIDIA | Ampere | 48 GB | 768.0 GB/s | 10,752 CUDA | 300 W | $4,649 |