Llama 3.2 90B Vision Instruct — Hardware Requirements & GPU Compatibility

Vision

Llama 3.2 90B Vision Instruct is an 88.6B-parameter open vision-language model from Meta in the Llama 3 family. At BF16 it needs about 194.9 GB of VRAM; the sections below list which GPUs and other devices can run it.

1.0K downloads 358 likes

Specifications

Publisher: Meta
Family: Llama 3
Parameters: 88.6B
License: llama3.2

How Much VRAM Does Llama 3.2 90B Vision Instruct Need?

Quantization | Bits | VRAM
BF16 | 16.00 | 194.9 GB

Which GPUs Can Run Llama 3.2 90B Vision Instruct?

BF16 · 194.9 GB

Llama 3.2 90B Vision Instruct (BF16) requires 194.9 GB of VRAM to load the model weights. For comfortable inference with headroom for KV cache and system overhead, 254+ GB is recommended. No single GPU has enough memory — multi-GPU or cluster setups are needed.
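
As a rough illustration of what "multi-GPU" means here, the sketch below divides the two memory figures above by a per-GPU capacity and rounds up. The 80 GB per-GPU value is an assumption chosen for illustration, and the helper name gpus_needed is hypothetical. The calculation ignores per-GPU runtime overhead and any constraints a particular sharding scheme may impose on GPU counts, so treat the result as a lower bound.

```python
import math


def gpus_needed(required_gb: float, per_gpu_gb: float) -> int:
    """Smallest number of identical GPUs whose combined memory covers required_gb."""
    return math.ceil(required_gb / per_gpu_gb)


LOAD_GB = 194.9         # BF16 requirement stated on this page
RECOMMENDED_GB = 254.0  # recommended figure with headroom, from this page
PER_GPU_GB = 80.0       # assumption: an 80 GB data-center GPU

min_gpus = gpus_needed(LOAD_GB, PER_GPU_GB)
comfortable_gpus = gpus_needed(RECOMMENDED_GB, PER_GPU_GB)

print(f"Minimum GPUs to load the model: {min_gpus}")
print(f"GPUs for the recommended headroom: {comfortable_gpus}")
```

Under these assumptions, three 80 GB GPUs cover the 194.9 GB figure and four cover the recommended 254 GB.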

Which Devices Can Run Llama 3.2 90B Vision Instruct?

BF16 · 194.9 GB

Two multi-GPU systems have enough combined GPU memory to run Llama 3.2 90B Vision Instruct: NVIDIA DGX H100 and NVIDIA DGX A100 640GB. Both pool memory across eight 80 GB GPUs; neither uses unified memory.

Frequently Asked Questions

How much VRAM does Llama 3.2 90B Vision Instruct need?

Llama 3.2 90B Vision Instruct requires 194.9 GB of VRAM at BF16.

VRAM = Weights + KV Cache + Overhead

Weights = 88.6B × 16 bits ÷ 8 = 177.2 GB

KV Cache + Overhead = 17.7 GB (KV cache at 2K context plus ~0.3 GB of framework overhead)

VRAM = 177.2 GB + 17.7 GB = 194.9 GB
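
The same arithmetic can be written as a short Python sketch. The function names weights_gb and total_vram_gb are hypothetical, and the 17.7 GB cache-and-overhead term is taken directly from this page rather than derived from the model architecture.

```python
def weights_gb(params_billions: float, bits_per_param: float) -> float:
    """Weight memory in GB: parameters (in billions) times bits per parameter, divided by 8."""
    return params_billions * bits_per_param / 8


def total_vram_gb(params_billions: float, bits_per_param: float,
                  kv_cache_and_overhead_gb: float) -> float:
    """Total VRAM estimate: weights plus KV cache and framework overhead."""
    return weights_gb(params_billions, bits_per_param) + kv_cache_and_overhead_gb


# Figures from this page: 88.6B parameters, BF16 (16 bits), 17.7 GB at 2K context.
weights = weights_gb(88.6, 16)
total = total_vram_gb(88.6, 16, 17.7)

print(f"Weights: {weights:.1f} GB")
print(f"Total:   {total:.1f} GB")
```

Because the KV cache grows with context length, the 17.7 GB term applies only to the 2K-context case quoted above; longer contexts would need a larger value.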

VRAM usage by quantization: BF16, 194.9 GB

Can NVIDIA GeForce RTX 5090 run Llama 3.2 90B Vision Instruct?

No — Llama 3.2 90B Vision Instruct requires at least 194.9 GB at BF16, which exceeds the NVIDIA GeForce RTX 5090's 32 GB of VRAM.

Can I run Llama 3.2 90B Vision Instruct on a Mac?

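Only on a very high-memory configuration. Llama 3.2 90B Vision Instruct requires at least 194.9 GB at BF16, which exceeds the unified memory of most Macs. You would need a Mac whose unified memory exceeds 194.9 GB, such as a top-end Mac Studio configuration; a 192 GB machine falls just short at BF16.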

Can I run Llama 3.2 90B Vision Instruct locally?

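Yes, but not on typical consumer hardware at BF16. At BF16 precision Llama 3.2 90B Vision Instruct needs 194.9 GB of VRAM, more than any single consumer GPU provides, so a multi-GPU or high-memory setup is required. Popular tools for running models locally include Ollama, LM Studio, and llama.cpp.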

What's the download size of Llama 3.2 90B Vision Instruct?

At BF16, the download is about 177.19 GB.

Which GPUs can run Llama 3.2 90B Vision Instruct?

No single consumer GPU has enough VRAM to run Llama 3.2 90B Vision Instruct at BF16 (194.9 GB). Multi-GPU or professional hardware is required.

Which devices can run Llama 3.2 90B Vision Instruct?

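Two multi-GPU systems can run Llama 3.2 90B Vision Instruct at BF16 (194.9 GB): NVIDIA DGX A100 640GB and NVIDIA DGX H100. Each pools 640 GB of GPU memory across eight 80 GB GPUs rather than using unified memory. Apple Silicon Macs do use unified memory shared between CPU and GPU, which makes them well-suited for local LLM inference when the configuration has enough memory for the model.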
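
The device check behind this answer amounts to comparing each device's total memory with the 194.9 GB requirement. The sketch below is a minimal illustration, not this site's actual logic: the function name devices_that_fit is hypothetical, the RTX 5090 figure is the 32 GB quoted earlier on this page, and the DGX figures assume eight 80 GB GPUs per system (640 GB in total).

```python
REQUIRED_GB = 194.9  # BF16 requirement stated on this page

# Total accelerator memory per device, in GB.
DEVICE_MEMORY_GB = {
    "NVIDIA GeForce RTX 5090": 32,     # from this page
    "NVIDIA DGX H100": 8 * 80,         # assumption: eight 80 GB GPUs
    "NVIDIA DGX A100 640GB": 8 * 80,   # assumption: eight 80 GB GPUs
}


def devices_that_fit(required_gb: float, devices: dict[str, float]) -> list[str]:
    """Return the names of devices whose total memory covers required_gb, sorted by name."""
    return sorted(name for name, mem in devices.items() if mem >= required_gb)


compatible = devices_that_fit(REQUIRED_GB, DEVICE_MEMORY_GB)
print(compatible)
```

A stricter check could compare against the recommended 254 GB instead of the bare 194.9 GB to leave room for KV cache growth.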