All LLM Models

Browse 32 LLM models with VRAM requirements, quantization options, and hardware compatibility.


Understanding LLM VRAM Requirements

How much VRAM you need depends mainly on the model's parameter count and its quantization level, plus some overhead for the context (KV cache) and runtime buffers. Quantization reduces the precision of model weights, trading a small quality loss for significantly lower VRAM usage. For example, a 7B-parameter model needs ~14 GB for its weights at FP16 (2 bytes per parameter) but only ~4 GB at Q4_K_M quantization.
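The arithmetic behind the 7B example can be sketched in a few lines of Python. This is a rough estimate of weight memory only, not the method this site uses for its figures. The function name `estimate_weight_gb` and the table `BITS_PER_WEIGHT` are hypothetical, and the bits-per-weight value for Q4_K_M is an approximate average assumed for illustration; real file sizes vary by model architecture.

```python
# Approximate average bits stored per weight for each format.
# FP16 is exact (16 bits). Q8_0 stores 32 weights plus a 16-bit scale (8.5 bits).
# The Q4_K_M value is an assumed approximation; it mixes block types.
BITS_PER_WEIGHT = {
    "FP16": 16.0,
    "Q8_0": 8.5,
    "Q4_K_M": 4.85,  # assumption, varies slightly per model
}


def estimate_weight_gb(params_billions: float, quant: str) -> float:
    """Estimate memory for the weights alone, in GB (1 GB = 1e9 bytes).

    Excludes KV cache, activations, and runtime overhead, so the
    real VRAM requirement will be somewhat higher.
    """
    bits = BITS_PER_WEIGHT[quant]
    total_bytes = params_billions * 1e9 * bits / 8
    return total_bytes / 1e9


if __name__ == "__main__":
    print(f"{estimate_weight_gb(7, 'FP16'):.1f} GB")    # → 14.0 GB
    print(f"{estimate_weight_gb(7, 'Q4_K_M'):.1f} GB")  # → 4.2 GB
```

Treat the result as a lower bound: long context windows in particular can add several gigabytes of KV cache on top of the weights.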

DeepSeek v2

DeepSeek · 235.7B · runs from 103.0 GB


DeepSeek v2 is a 235.7B-parameter open language model from DeepSeek in the DeepSeek V2 family. It supports a context window of up to 163,840 tokens. See its VRAM requirements by quantization and which GPUs and Macs can run it locally below.

Chat

DeepSeek Coder v2 Lite Base

DeepSeek · 15.7B · runs from 7.4 GB


DeepSeek Coder v2 Lite Base is a 15.7B-parameter open language model from DeepSeek in the DeepSeek Coder family. It supports a context window of up to 163,840 tokens. See its VRAM requirements by quantization and which GPUs and Macs can run it locally below.

Chat · Code
