Question 1

Which needs less VRAM, Llama 3.3 70B Instruct Abliterated or Llama 3.3 70B Instruct?

Accepted Answer

At Q4_K_M, Llama 3.3 70B Instruct Abliterated needs 43.3 GB and Llama 3.3 70B Instruct needs 46.6 GB, so Llama 3.3 70B Instruct Abliterated is the lighter option to run locally by 3.3 GB. It also has the lower figure at every other quantization listed.

Question 2

Which has a longer context window, Llama 3.3 70B Instruct Abliterated or Llama 3.3 70B Instruct?

Accepted Answer

Neither. Llama 3.3 70B Instruct Abliterated and Llama 3.3 70B Instruct both support a context window of 131,072 tokens.

Question 3

What is the difference between Llama 3.3 70B Instruct Abliterated and Llama 3.3 70B Instruct?

Accepted Answer

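Llama 3.3 70B Instruct Abliterated is a 70.6B model from huihui-ai (Llama 3 family), while Llama 3.3 70B Instruct is a 70.6B model from Meta (Llama 3 family). The two share the same parameter count, context window, and license; they differ in publisher, download count, and listed VRAM requirements. Compare the VRAM requirements in the tables below to see which fits your GPU or Mac.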
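To make the "which fits your GPU or Mac" check concrete, here is a minimal sketch that picks the largest quantization whose listed VRAM figure fits a given budget. The numbers are copied from the VRAM-by-quantization table in this document; the names `VRAM_GB`, `ABLITERATED`, `INSTRUCT`, and `largest_fit` are hypothetical, and the sketch assumes the listed figure is the whole requirement, which real usage (context length, other processes) may exceed.

```python
# Listed VRAM in GB per quantization: (Abliterated, Instruct).
# Values copied from the table in this document.
VRAM_GB = {
    "Q2_K": (31.0, 33.0),
    "Q3_K_S": (31.8, 34.0),
    "Q3_K_M": (35.4, 37.8),
    "Q4_0": (36.3, 38.8),
    "Q4_K_M": (43.3, 46.6),
    "Q5_K_M": (51.2, 55.3),
    "Q6_K": (59.2, 64.0),
    "Q8_0": (71.5, 77.6),
}

ABLITERATED, INSTRUCT = 0, 1  # column indexes into the tuples above


def largest_fit(budget_gb, column):
    """Return the quantization with the highest listed VRAM that still fits, or None."""
    fits = [(vram[column], name) for name, vram in VRAM_GB.items()
            if vram[column] <= budget_gb]
    return max(fits)[1] if fits else None


if __name__ == "__main__":
    for budget in (24, 45, 48, 80):
        print(budget, largest_fit(budget, ABLITERATED), largest_fit(budget, INSTRUCT))
```

With a 45 GB budget, for example, the listed figures let the Abliterated variant run at Q4_K_M while the Meta release has to drop to Q4_0.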

Specification	Llama 3.3 70B Instruct Abliterated	Llama 3.3 70B Instruct
Parameters	70.6B	70.6B
Context	131K	131K
Architecture	LlamaForCausalLM	—
License	llama3.3	llama3.3
Downloads	4.3K	802.3K
Released	Dec 2024	Dec 2024

Quantization	Bits per weight	Llama 3.3 70B Instruct Abliterated VRAM	Llama 3.3 70B Instruct VRAM
Q2_K	3.40	31.0 GB	33.0 GB
Q3_K_S	3.50	31.8 GB	34.0 GB
Q3_K_M	3.90	35.4 GB	37.8 GB
Q4_0	4.00	36.3 GB	38.8 GB
Q4_K_M	4.80	43.3 GB	46.6 GB
Q5_K_M	5.70	51.2 GB	55.3 GB
Q6_K	6.60	59.2 GB	64.0 GB
Q8_0	8.00	71.5 GB	77.6 GB
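As a rough cross-check on the table, weight memory can be approximated as parameter count times bits per weight, divided by 8 bits per byte. The sketch below adds a flat 10% overhead factor. That factor is an assumption inferred from the numbers, not a formula stated by the source: it lands within about 0.1 GB of the Llama 3.3 70B Instruct column but does not reproduce the Abliterated column. The function name and the `overhead` parameter are illustrative.

```python
def estimate_vram_gb(params_billion: float, bits_per_weight: float,
                     overhead: float = 1.10) -> float:
    """Approximate VRAM in GB (decimal) for a quantized model.

    overhead is an assumed multiplier for runtime buffers, not a measured value.
    """
    # billions of params * bits per weight / 8 bits per byte = GB of weights
    weights_gb = params_billion * bits_per_weight / 8
    return weights_gb * overhead


if __name__ == "__main__":
    # Bits-per-weight values copied from the table in this document.
    for name, bits in [("Q2_K", 3.40), ("Q4_K_M", 4.80), ("Q8_0", 8.00)]:
        print(f"{name}: {estimate_vram_gb(70.6, bits):.1f} GB")
```

Setting `overhead=1.0` gives the bare weight size, which is a useful lower bound when a runtime reports its own buffer sizes separately.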
