GPT OSS 120B vs GPT OSS 20B Heretic

Side-by-side comparison of VRAM requirements, quantization, context length, and hardware compatibility.

GPT OSS 120B

OpenAI · 120.4B

Chat

GPT OSS 20B Heretic

p-e-w · 20.9B

Chat

Specifications

| Specification | GPT OSS 120B | GPT OSS 20B Heretic |
|---|---|---|
| Parameters | 120.4B | 20.9B |
| Context | 131K | 131K |
| Architecture | GptOssForCausalLM | GptOssForCausalLM |
| License | Apache 2.0 | Apache 2.0 |
| Downloads | 4.5M | 913 |
| Released | Aug 2025 | Nov 2025 |

VRAM by Quantization: GPT OSS 120B vs GPT OSS 20B Heretic

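| Quantization | Bits | GPT OSS 120B VRAM | GPT OSS 20B Heretic VRAM |
|---|---|---|---|
| Q2_K | 3.40 | 51.6 GB | not listed |
| Q3_K_M | 3.90 | 59.1 GB | not listed |
| Q3_K_S | 3.50 | 53.1 GB | not listed |
| Q4_0 | 4.00 | 60.6 GB | not listed |
| Q4_K_M | 4.80 | 72.7 GB | not listed |
| Q5_K_M | 5.70 | 86.2 GB | not listed |
| Q6_K | 6.60 | 99.8 GB | not listed |
| Q8_0 | 8.00 | 120.8 GB | 21.3 GB |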
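
The listed figures track a simple rule of thumb: parameter count times bits per weight, divided by 8, plus a small fixed overhead. The sketch below is an approximation inferred from the numbers in the table, not the page's documented method. The function name `estimate_vram_gb` and the 0.4 GB overhead are assumptions chosen to match the listed values, and the estimate ignores KV cache growth at long context lengths.

```python
def estimate_vram_gb(params_billions: float,
                     bits_per_weight: float,
                     overhead_gb: float = 0.4) -> float:
    """Rough VRAM estimate in GB for a quantized model.

    params_billions: parameter count in billions (e.g. 120.4).
    bits_per_weight: effective bits per weight for the quantization.
    overhead_gb: assumed fixed overhead; 0.4 GB is a guess fitted to the table.
    """
    # Billions of parameters * bits / 8 gives gigabytes of weights.
    weights_gb = params_billions * bits_per_weight / 8
    return weights_gb + overhead_gb


if __name__ == "__main__":
    # Bits-per-weight values are taken from the table above.
    for name, bits in [("Q2_K", 3.40), ("Q4_0", 4.00), ("Q8_0", 8.00)]:
        print(f"GPT OSS 120B {name}: {estimate_vram_gb(120.4, bits):.1f} GB")
    print(f"GPT OSS 20B Heretic Q8_0: {estimate_vram_gb(20.9, 8.00):.1f} GB")
```

With these assumptions the estimate lands within about 0.1 GB of every listed value, which suggests it could be used to fill in quantizations the table does not list for the 20B model, though such values would be estimates rather than published figures.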

Verdict

GPT OSS 20B Heretic needs far less VRAM at Q8_0 (21.3 GB vs 120.8 GB), so it fits on smaller GPUs. GPT OSS 120B is much more widely downloaded (4.5M vs 913 downloads).

Frequently Asked Questions

Which needs less VRAM, GPT OSS 120B or GPT OSS 20B Heretic?

At Q8_0, GPT OSS 120B needs 120.8 GB and GPT OSS 20B Heretic needs 21.3 GB, so GPT OSS 20B Heretic is the lighter option to run locally.
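
To check which listed quantizations fit a given memory budget, a small lookup over the table's figures is enough. This is a minimal sketch: the dictionary `LISTED_VRAM_GB` and the helper `quantizations_that_fit` are hypothetical names, the 24 GB and 80 GB budgets are only illustrative, and the comparison ignores any extra memory for long contexts or other processes.

```python
# VRAM figures (GB) copied from the comparison table.
# Only Q8_0 is listed for GPT OSS 20B Heretic.
LISTED_VRAM_GB = {
    "GPT OSS 120B": {
        "Q2_K": 51.6, "Q3_K_M": 59.1, "Q3_K_S": 53.1, "Q4_0": 60.6,
        "Q4_K_M": 72.7, "Q5_K_M": 86.2, "Q6_K": 99.8, "Q8_0": 120.8,
    },
    "GPT OSS 20B Heretic": {"Q8_0": 21.3},
}


def quantizations_that_fit(model: str, available_gb: float) -> list[str]:
    """Return the listed quantizations whose VRAM need is within the budget."""
    return [
        quant
        for quant, needed_gb in LISTED_VRAM_GB[model].items()
        if needed_gb <= available_gb
    ]


if __name__ == "__main__":
    for budget_gb in (24.0, 80.0):  # illustrative budgets, not from the source
        for model in LISTED_VRAM_GB:
            print(budget_gb, model, quantizations_that_fit(model, budget_gb))
```

On a 24 GB budget only GPT OSS 20B Heretic at Q8_0 fits. GPT OSS 120B needs at least 51.6 GB even at Q2_K.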

Which has a longer context window, GPT OSS 120B or GPT OSS 20B Heretic?

Neither. Both GPT OSS 120B and GPT OSS 20B Heretic support a context window of 131,072 tokens.

What is the difference between GPT OSS 120B and GPT OSS 20B Heretic?

Both belong to the GPT-OSS family: GPT OSS 120B is a 120.4B model from OpenAI, while GPT OSS 20B Heretic is a 20.9B model from p-e-w. Compare their VRAM requirements above to see which fits your GPU or Mac.