Question 1

How much VRAM does QUEST 35B RL need?

Accepted Answer

QUEST 35B RL requires 70.6 GB of VRAM at BF16. Full 262K context adds up to 10.7 GB (81.3 GB total).
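The total quoted above is the weight footprint plus the worst-case context overhead. The sketch below reproduces that sum from the two figures in this answer. The function name, its parameters, and the linear scaling for shorter contexts are illustrative assumptions, not something the source states.

```python
def total_vram_gb(weights_gb: float,
                  full_context_overhead_gb: float,
                  context_fraction: float = 1.0) -> float:
    """Estimate total VRAM in GB.

    context_fraction is the share of the full context window in use.
    ASSUMPTION: overhead scales linearly with context length.
    """
    if not 0.0 <= context_fraction <= 1.0:
        raise ValueError("context_fraction must be between 0 and 1")
    return weights_gb + full_context_overhead_gb * context_fraction


# Figures quoted in the answer above.
BF16_WEIGHTS_GB = 70.6
FULL_CONTEXT_OVERHEAD_GB = 10.7

print(f"{total_vram_gb(BF16_WEIGHTS_GB, FULL_CONTEXT_OVERHEAD_GB):.1f} GB")  # 81.3 GB
```

Passing a smaller `context_fraction` gives a rough figure for shorter prompts, under the linear-scaling assumption noted in the code.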

Question 2

Can NVIDIA GeForce RTX 5090 run QUEST 35B RL?

Accepted Answer

No — QUEST 35B RL requires at least 70.6 GB at BF16, which exceeds the NVIDIA GeForce RTX 5090's 32 GB of VRAM.

Question 3

Can I run QUEST 35B RL on a Mac?

Accepted Answer

QUEST 35B RL requires at least 70.6 GB at BF16, which exceeds the unified memory of most consumer Macs. You would need a Mac Studio or Mac Pro with a high-memory configuration.

Question 4

Can I run QUEST 35B RL locally?

Accepted Answer

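Yes, QUEST 35B RL can run locally, but only on hardware with enough memory. At BF16 precision it needs 70.6 GB of VRAM, which is more than any single consumer GPU provides, so you need a multi-GPU setup, professional hardware, or a unified-memory device with sufficient capacity. Popular tools for local inference include Ollama, LM Studio, and llama.cpp.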

Question 5

How fast is QUEST 35B RL?

Accepted Answer

At BF16, QUEST 35B RL can reach ~62 tok/s on AMD Instinct MI350X. Speed depends mainly on GPU memory bandwidth. Real-world results are typically within ±20% of this estimate.
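To make the ±20% band concrete, the sketch below turns the ~62 tok/s estimate into a range. It also shows the usual bandwidth-bound rule of thumb behind "speed depends mainly on GPU memory bandwidth". That rule assumes every weight is read once per generated token, which is an assumption here and not something the source confirms for this model. The function names and the 1000 GB/s input are hypothetical.

```python
def tolerance_range(estimate_tok_s: float, tolerance: float = 0.20) -> tuple[float, float]:
    """Return the (low, high) band around a throughput estimate."""
    return estimate_tok_s * (1 - tolerance), estimate_tok_s * (1 + tolerance)


def bandwidth_bound_tok_s(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Upper-bound decode speed if each token reads all weights once.

    ASSUMPTION: memory-bound decoding with the full weight set read per
    token; real throughput is lower due to overheads.
    """
    return bandwidth_gb_s / weights_gb


low, high = tolerance_range(62.0)
print(f"{low:.1f} to {high:.1f} tok/s")  # 49.6 to 74.4 tok/s

# Hypothetical device with 1000 GB/s of memory bandwidth (illustrative only).
print(f"{bandwidth_bound_tok_s(1000.0, 70.6):.1f} tok/s upper bound")
```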

Question 6

What's the download size of QUEST 35B RL?

Accepted Answer

At BF16, the download is about 70.21 GB.

Question 7

Which GPUs can run QUEST 35B RL?

Accepted Answer

No single consumer GPU has enough VRAM to run QUEST 35B RL at BF16 (70.6 GB). Multi-GPU or professional hardware is required.
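As a rough guide to what "multi-GPU" means here, the sketch below estimates the minimum number of identical GPUs whose combined VRAM covers the BF16 footprint. It uses the 32 GB RTX 5090 figure from the earlier answer as the example card. It ignores per-GPU overhead, activation memory, and context overhead, so treat the result as a lower bound. The function name is illustrative.

```python
import math


def min_gpus_needed(required_gb: float, vram_per_gpu_gb: float) -> int:
    """Smallest number of identical GPUs whose total VRAM >= required_gb.

    ASSUMPTION: weights split evenly with no per-GPU overhead.
    """
    if required_gb <= 0 or vram_per_gpu_gb <= 0:
        raise ValueError("memory sizes must be positive")
    return math.ceil(required_gb / vram_per_gpu_gb)


BF16_WEIGHTS_GB = 70.6   # from the answer above
RTX_5090_VRAM_GB = 32.0  # from the RTX 5090 answer

print(min_gpus_needed(BF16_WEIGHTS_GB, RTX_5090_VRAM_GB))  # 3
```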

Question 8

Which devices can run QUEST 35B RL?

Accepted Answer

19 devices with unified memory can run QUEST 35B RL at BF16 (70.6 GB), including ASUS Ascent GX10, ASUS ROG Flow Z13 (2025, Ryzen AI Max+ 395, 128 GB), Beelink GTR9 Pro (Ryzen AI Max+ 395, 128 GB), and Framework Desktop (Ryzen AI Max+ 395, 128 GB). Apple Silicon Macs also use unified memory shared between CPU and GPU, making them well-suited for local LLM inference when configured with enough memory.

QUEST 35B RL — Hardware Requirements & GPU Compatibility
