GLM 4.6 Derestricted v3 vs GLM 4 9B 0414

Side-by-side comparison of VRAM requirements, quantization, context length, and hardware compatibility.

GLM 4.6 Derestricted v3
ArliAI · 356.8B · Chat

GLM 4 9B 0414
zai-org · 9.4B · Chat

Specifications

Field           GLM 4.6 Derestricted v3    GLM 4 9B 0414
Parameters      356.8B                     9.4B
Context         203K                       33K
Architecture    Glm4MoeForCausalLM         Glm4ForCausalLM
License         MIT                        MIT
Downloads       1.3K                       14.7K
Released        Dec 2025                   Apr 2025

VRAM by Quantization: GLM 4.6 Derestricted v3 vs GLM 4 9B 0414

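Quantization    Bits    GLM 4.6 Derestricted v3 VRAM    GLM 4 9B 0414 VRAM
Q2_K            3.40    (not listed)                    4.4 GB
Q3_K_M          3.90    (not listed)                    5.0 GB
Q3_K_S          3.50    (not listed)                    4.5 GB
Q4_0            4.00    (not listed)                    5.1 GB
Q4_K_M          4.80    (not listed)                    6.0 GB
Q5_K_M          5.70    (not listed)                    7.1 GB
Q6_K            6.60    (not listed)                    8.1 GB
Q8_0            8.00    (not listed)                    9.8 GB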
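
As a rough cross-check on the table, the weight footprint of a quantized model can be approximated as parameters × bits per weight ÷ 8. The sketch below assumes the Bits column is the average number of bits per weight and that 1 GB = 10^9 bytes; the function name weights_gb is hypothetical. It counts weights only, not KV cache or runtime buffers, and the page does not state how its own figures are derived, so small differences from the listed values are to be expected.

```python
# Weights-only estimate: parameters * bits per weight / 8 bytes.
# Assumptions (not stated on the page): "Bits" is average bits per weight,
# 1 GB = 1e9 bytes, and KV cache / runtime overhead are excluded.

BITS_PER_WEIGHT = {
    "Q2_K": 3.40, "Q3_K_M": 3.90, "Q3_K_S": 3.50, "Q4_0": 4.00,
    "Q4_K_M": 4.80, "Q5_K_M": 5.70, "Q6_K": 6.60, "Q8_0": 8.00,
}


def weights_gb(params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GB."""
    return params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    # Parameter counts taken from the Specifications section.
    for name, params in [("GLM 4.6 Derestricted v3", 356.8e9),
                         ("GLM 4 9B 0414", 9.4e9)]:
        print(name)
        for quant, bits in BITS_PER_WEIGHT.items():
            print(f"  {quant:<7} ~{weights_gb(params, bits):6.1f} GB (weights only)")
```

Under these assumptions, the 356.8B model's weights alone come to roughly 214 GB at Q4_K_M (4.80 bits), while the 9.4B model's come to about 5.6 GB at the same quantization.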

Verdict

GLM 4.6 Derestricted v3 supports the longer context window (203K tokens versus 33K). GLM 4 9B 0414 is the more widely downloaded of the two (14.7K downloads versus 1.3K).

Frequently Asked Questions

Which has a longer context window, GLM 4.6 Derestricted v3 or GLM 4 9B 0414?

GLM 4.6 Derestricted v3 has the longer context window: it supports 202,752 tokens, while GLM 4 9B 0414 supports 32,768 tokens.

What is the difference between GLM 4.6 Derestricted v3 and GLM 4 9B 0414?

GLM 4.6 Derestricted v3 is a 356.8B model from ArliAI (GLM family), while GLM 4 9B 0414 is a 9.4B model from zai-org (GLM family). Compare their VRAM requirements above to see which fits your GPU or Mac.
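
To make the "which fits your GPU" check concrete, here is a minimal sketch that picks the largest quantization fitting a given VRAM budget. The figures are transcribed from the VRAM table above and read as the GLM 4 9B 0414 column, which is an assumption since each row lists a single value. The function name largest_fitting_quant and the 1 GB headroom default are hypothetical and not taken from the page.

```python
from typing import Dict, Optional

# VRAM figures transcribed from the table above, read as the
# GLM 4 9B 0414 column (an assumption; each row lists one value).
GLM4_9B_VRAM_GB: Dict[str, float] = {
    "Q2_K": 4.4, "Q3_K_S": 4.5, "Q3_K_M": 5.0, "Q4_0": 5.1,
    "Q4_K_M": 6.0, "Q5_K_M": 7.1, "Q6_K": 8.1, "Q8_0": 9.8,
}


def largest_fitting_quant(vram_gb: float,
                          table: Dict[str, float],
                          headroom_gb: float = 1.0) -> Optional[str]:
    """Return the quantization with the highest listed VRAM that still fits.

    headroom_gb is a hypothetical safety margin for context and other
    processes; it is not a figure from the page. Returns None if nothing fits.
    """
    budget = vram_gb - headroom_gb
    fitting = [(need, quant) for quant, need in table.items() if need <= budget]
    return max(fitting)[1] if fitting else None


if __name__ == "__main__":
    for gpu_gb in (6.0, 8.0, 12.0):
        print(gpu_gb, "GB ->", largest_fitting_quant(gpu_gb, GLM4_9B_VRAM_GB))
```

With the default headroom, an 8 GB card selects Q4_K_M, since Q5_K_M at 7.1 GB would exceed the 7 GB budget. Setting headroom_gb to 0 reproduces a plain comparison against the listed figures.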