
DeepSeek V3.2 — Hardware Requirements & GPU Compatibility


DeepSeek V3.2 is the latest iteration of DeepSeek's general-purpose flagship, building on the V3 architecture with 685.4 billion total parameters in a mixture-of-experts configuration. This update refines the model's conversational abilities, instruction following, and multilingual performance compared to earlier V3 releases. Running V3.2 locally requires significant GPU resources due to the large total parameter count, though the MoE design means only a subset of parameters are active for any given token. Users with multi-GPU workstations or servers can run quantized versions effectively, making this one of the most powerful open-weight chat models available for self-hosted deployment.

273.6K downloads · 1.3K likes · Released Dec 2025 · 164K context

Specifications

Publisher: DeepSeek
Family: DeepSeek V3
Parameters: 685.4B
Architecture: DeepseekV32ForCausalLM
Context Length: 163,840 tokens
Vocabulary Size: 129,280
Release Date: 2025-12-01
License: MIT


How Much VRAM Does DeepSeek V3.2 Need?


Quantization   Bits   VRAM
IQ2_XXS        2.20   192.4 GB
IQ2_M          2.70   235.2 GB
IQ3_XXS        3.10   269.5 GB
Q2_K           3.40   295.2 GB
Q3_K_S         3.50   303.7 GB
Q3_K_M         3.90   338.0 GB
Q4_0           4.00   346.6 GB
IQ4_XS         4.30   372.3 GB
Q4_1           4.50   389.4 GB
Q4_K_S         4.50   389.4 GB
IQ4_NL         4.50   389.4 GB
Q4_K_M         4.80   415.1 GB
Q5_K_S         5.50   475.1 GB
Q5_K_M         5.70   492.2 GB
Q6_K           6.60   569.3 GB
Q8_0           8.00   689.3 GB
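As a sanity check, the VRAM column can be reproduced from the bit widths: weights take total parameters times bits per weight divided by 8, plus roughly 3.9 GB of KV cache and framework overhead at a 2K context (the page's own breakdown). A minimal sketch, using a subset of the table's rows (a few rows, such as Q6_K, differ by 0.1 GB due to rounding):

```python
# Reproduce the VRAM column: weights plus ~3.9 GB of KV cache
# and framework overhead at a 2K context. Quant names and bit
# widths are taken from the table above.

PARAMS_B = 685.4      # total parameters, in billions
OVERHEAD_GB = 3.9     # KV cache at 2K context + ~0.3 GB framework

QUANTS = {
    "IQ2_XXS": 2.20, "Q2_K": 3.40, "Q4_K_M": 4.80,
    "Q5_K_S": 5.50, "Q5_K_M": 5.70, "Q8_0": 8.00,
}

def vram_gb(bits_per_weight: float) -> float:
    """Weights (params * bits / 8, in GB) plus fixed overhead."""
    weights_gb = PARAMS_B * bits_per_weight / 8
    return round(weights_gb + OVERHEAD_GB, 1)

for name, bits in QUANTS.items():
    print(f"{name:8s} {vram_gb(bits):6.1f} GB")
```

For example, `vram_gb(4.80)` gives 415.1 GB, matching the Q4_K_M row.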

Which GPUs Can Run DeepSeek V3.2?

Q4_K_M · 415.1 GB

DeepSeek V3.2 (Q4_K_M) requires 415.1 GB of VRAM to load the model weights. For comfortable inference with headroom for KV cache and system overhead, 540+ GB is recommended. Using the full 164K context window can add up to 283.0 GB, bringing total usage to 698.1 GB. No single GPU has enough memory — multi-GPU or cluster setups are needed.
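To turn those totals into a card count, divide by per-GPU VRAM and round up. A rough capacity-only sketch (it ignores per-GPU activation buffers and tensor-parallel padding, so real deployments may need more cards than this lower bound):

```python
import math

WEIGHTS_GB = 415.1   # Q4_K_M weights + base overhead, from above
FULL_CTX_GB = 698.1  # with the full 164K-token KV cache

def gpus_needed(total_gb: float, vram_per_gpu_gb: float) -> int:
    """Minimum card count by raw capacity (a lower bound)."""
    return math.ceil(total_gb / vram_per_gpu_gb)

print(gpus_needed(WEIGHTS_GB, 80))   # 80 GB cards, weights only -> 6
print(gpus_needed(FULL_CTX_GB, 80))  # 80 GB cards, full context -> 9
```

With 80 GB cards (e.g. H100 80GB), the Q4_K_M weights alone need six cards, and nine once the full 164K context is budgeted for.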

Which Devices Can Run DeepSeek V3.2?

Q4_K_M · 415.1 GB

Two multi-GPU systems have enough pooled GPU memory to run DeepSeek V3.2 at Q4_K_M: the NVIDIA DGX H100 and the NVIDIA DGX A100 640GB (each with 8 × 80 GB).


Frequently Asked Questions

How much VRAM does DeepSeek V3.2 need?

DeepSeek V3.2 requires 415.1 GB of VRAM at Q4_K_M, or 689.3 GB at Q8_0. Full 164K context adds up to 283.0 GB (698.1 GB total).

VRAM = Weights + KV Cache + Overhead

Weights = 685.4B × 4.8 bits ÷ 8 = 411.2 GB

KV Cache + Overhead = 3.9 GB (at 2K context, including ~0.3 GB framework overhead)

KV Cache + Overhead = 286.9 GB (at the full 164K context)
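This breakdown maps directly onto a small helper. The KV-cache figures below (3.6 GB at 2K, 286.6 GB at 164K) are the page's "KV Cache + Overhead" totals minus the ~0.3 GB framework share:

```python
def estimate_vram_gb(params_b: float, bits_per_weight: float,
                     kv_cache_gb: float, overhead_gb: float = 0.3) -> float:
    """VRAM = weights + KV cache + framework overhead, in GB.
    params_b is the total parameter count in billions."""
    weights_gb = params_b * bits_per_weight / 8
    return weights_gb + kv_cache_gb + overhead_gb

# Q4_K_M (4.8 bits/weight) at 2K context vs. full 164K context:
print(round(estimate_vram_gb(685.4, 4.8, 3.6), 1))    # -> 415.1
print(round(estimate_vram_gb(685.4, 4.8, 286.6), 1))  # -> 698.1
```

Both calls reproduce the page's headline numbers for Q4_K_M.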



Can NVIDIA GeForce RTX 5090 run DeepSeek V3.2?

No — DeepSeek V3.2 requires at least 192.4 GB at IQ2_XXS, which exceeds the NVIDIA GeForce RTX 5090's 32 GB of VRAM.

What's the best quantization for DeepSeek V3.2?

For DeepSeek V3.2, Q4_K_M (415.1 GB) offers the best balance of quality and VRAM usage. Q5_K_S (475.1 GB) provides better quality if you have the VRAM. The smallest option is IQ2_XXS at 192.4 GB.

VRAM requirement by quantization:

IQ2_XXS   192.4 GB
Q3_K_S    303.7 GB
Q4_1      389.4 GB
Q4_K_M    415.1 GB ★
Q5_K_S    475.1 GB
Q8_0      689.3 GB

★ Recommended — best balance of quality and VRAM usage.


Can I run DeepSeek V3.2 on a Mac?

DeepSeek V3.2 requires at least 192.4 GB at IQ2_XXS, which exceeds the unified memory of most consumer Macs. You would need a Mac Studio or Mac Pro with a high-memory configuration.

Can I run DeepSeek V3.2 locally?

Yes, DeepSeek V3.2 can run locally, but not on typical consumer hardware: at Q4_K_M quantization it needs 415.1 GB of VRAM, which calls for a multi-GPU server or cluster. Popular tools include Ollama, LM Studio, and llama.cpp.

What's the download size of DeepSeek V3.2?

At Q4_K_M, the download is about 411.24 GB. The 8-bit Q8_0 version is 685.40 GB. The smallest option (IQ2_XXS) is 188.48 GB.
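These file sizes follow the same parameters-times-bits rule, with no KV-cache or runtime overhead, since a download is just the weights on disk. A quick sketch (actual GGUF files carry a little extra metadata, so treat these as approximations):

```python
PARAMS_B = 685.4  # total parameters, in billions

def download_gb(bits_per_weight: float) -> float:
    """Approximate weight-file size in GB: params * bits / 8.
    Ignores the small metadata/tokenizer overhead in the file."""
    return round(PARAMS_B * bits_per_weight / 8, 2)

print(download_gb(4.8))  # Q4_K_M -> 411.24
print(download_gb(8.0))  # Q8_0   -> 685.4
```

The 4.8-bit and 8-bit results match the figures quoted above.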