Llama 3.1 70B vs Hermes 2 Theta Llama 3 70B

Side-by-side comparison of VRAM requirements, quantization, context length, and hardware compatibility.

Llama 3.1 70B

Meta · 70.6B

Chat

Hermes 2 Theta Llama 3 70B

Nous Research · 70.6B

Chat

Specifications

	Llama 3.1 70B	Hermes 2 Theta Llama 3 70B
Parameters	70.6B	70.6B
Context	—	8K
Architecture	—	LlamaForCausalLM
License	Llama 3.1 Community	Llama 3 Community
Downloads	90.8K	1.3K
Released	Sep 2024	—

VRAM by Quantization: Llama 3.1 70B vs Hermes 2 Theta Llama 3 70B

Quantization	Bits per weight	Llama 3.1 70B VRAM	Hermes 2 Theta Llama 3 70B VRAM
Q2_K	3.40	33.0 GB	31.0 GB
Q3_K_S	3.50	34.0 GB	31.8 GB
Q3_K_M	3.90	37.8 GB	35.4 GB
Q4_0	4.00	—	36.3 GB
Q4_K_M	4.80	46.6 GB	43.3 GB
Q5_K_M	5.70	55.3 GB	51.2 GB
Q6_K	6.60	64.0 GB	59.2 GB
Q8_0	8.00	77.6 GB	71.5 GB
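
The page does not say how these figures are derived. As a rough cross-check, the memory taken by the weights alone can be estimated as parameters × bits per weight ÷ 8. The sketch below applies that to the 70.6B parameter count and the bits-per-weight values in the table. The function name and the use of decimal gigabytes (10^9 bytes) are assumptions made for illustration, and the estimate leaves out KV cache and runtime overhead, so it should be read as a lower bound rather than a reproduction of the table.

```python
PARAMS_BILLION = 70.6  # parameter count listed for both models

# Bits-per-weight values copied from the table above.
BITS_PER_WEIGHT = {
    "Q2_K": 3.40,
    "Q3_K_S": 3.50,
    "Q3_K_M": 3.90,
    "Q4_0": 4.00,
    "Q4_K_M": 4.80,
    "Q5_K_M": 5.70,
    "Q6_K": 6.60,
    "Q8_0": 8.00,
}


def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Weights-only estimate in decimal GB (assumed unit): params * bits / 8."""
    return params_billion * bits_per_weight / 8


if __name__ == "__main__":
    for quant, bits in BITS_PER_WEIGHT.items():
        estimate = weight_memory_gb(PARAMS_BILLION, bits)
        print(f"{quant:8s} {estimate:5.1f} GB (weights only)")
```

At Q4_K_M this gives about 42.4 GB for the weights, against 43.3 GB and 46.6 GB in the table. The gap is presumably context and runtime overhead, though the page does not confirm that.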

Verdict

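Hermes 2 Theta Llama 3 70B needs less VRAM at Q4_K_M (43.3 GB vs 46.6 GB, a difference of 3.3 GB) and is the lighter of the two at every quantization listed for both models, which leaves a little more headroom on a memory-limited GPU or Mac. Llama 3.1 70B is the more widely downloaded of the two (90.8K vs 1.3K downloads).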
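
To check which quantization fits a particular memory budget, the figures from the VRAM table can be put into a small lookup. This is a minimal sketch: the function name and the budgets in the usage note are hypothetical, and the usable memory on a real GPU or Mac is typically somewhat below its nominal size, so treat the result as a first filter only.

```python
from typing import Optional

# VRAM figures in GB copied from the table above; None marks a missing entry.
VRAM_GB = {
    "Llama 3.1 70B": {
        "Q2_K": 33.0, "Q3_K_S": 34.0, "Q3_K_M": 37.8, "Q4_0": None,
        "Q4_K_M": 46.6, "Q5_K_M": 55.3, "Q6_K": 64.0, "Q8_0": 77.6,
    },
    "Hermes 2 Theta Llama 3 70B": {
        "Q2_K": 31.0, "Q3_K_S": 31.8, "Q3_K_M": 35.4, "Q4_0": 36.3,
        "Q4_K_M": 43.3, "Q5_K_M": 51.2, "Q6_K": 59.2, "Q8_0": 71.5,
    },
}


def largest_fitting_quant(model: str, budget_gb: float) -> Optional[str]:
    """Return the quantization with the highest listed VRAM that fits, or None."""
    fitting = {
        quant: gb
        for quant, gb in VRAM_GB[model].items()
        if gb is not None and gb <= budget_gb
    }
    if not fitting:
        return None
    return max(fitting, key=fitting.get)
```

With a hypothetical 48 GB budget, both models come back as Q4_K_M. With 36 GB, the lookup returns Q3_K_S for Llama 3.1 70B and Q3_K_M for Hermes 2 Theta Llama 3 70B. With 24 GB it returns `None` for both, since no listed quantization fits.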

Frequently Asked Questions

Which needs less VRAM, Llama 3.1 70B or Hermes 2 Theta Llama 3 70B?

At Q4_K_M, Llama 3.1 70B needs 46.6 GB and Hermes 2 Theta Llama 3 70B needs 43.3 GB, so Hermes 2 Theta Llama 3 70B is the lighter option to run locally.

What is the difference between Llama 3.1 70B and Hermes 2 Theta Llama 3 70B?

Both are 70.6B-parameter chat models in the Llama 3 family. Llama 3.1 70B is published by Meta under the Llama 3.1 Community license, while Hermes 2 Theta Llama 3 70B is published by Nous Research under the Llama 3 Community license and lists an 8K context. Compare their VRAM requirements above to see which fits your GPU or Mac.