Macaron V1 Preview 749B — Hardware Requirements & GPU Compatibility
Macaron V1 Preview 749B is a 753.9B-parameter open language model from mindlab-research. It supports a context window of up to 202,752 tokens. At BF16 it needs about 1511.95 GB of VRAM; the sections below show which GPUs and Macs can run it.
Specifications
- Publisher
- mindlab-research
- Parameters
- 753.9B
- Architecture
- GlmMoeDsaForCausalLM
- Context Length
- 202,752 tokens
- Vocabulary Size
- 154,880
- Release Date
- 2026-06-07
- License
- MIT
Get Started
HuggingFace
How Much VRAM Does Macaron V1 Preview 749B Need?
| Quantization | Bits | VRAM | + Context | File Size | Quality |
|---|---|---|---|---|---|
| BF16 | 16.00 | 1512.0 GB | 1896.7 GB | 1507.73 GB | Brain floating point 16 — preferred for training |
Which GPUs Can Run Macaron V1 Preview 749B?
BF16 · 1512.0 GB
Macaron V1 Preview 749B (BF16) requires 1512.0 GB of VRAM to load the model weights. For comfortable inference with headroom for KV cache and system overhead, 1966+ GB is recommended. Using the full 203K context window can add up to 384.7 GB, bringing total usage to 1896.7 GB. No single GPU has enough memory, so a multi-GPU or cluster setup is needed.
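As a rough illustration of what "multi-GPU" means at this size, the sketch below divides the required memory by the memory of one device and rounds up. The function name `min_devices` and the 80 GB device size are assumptions made for illustration; the 32 GB figure is the RTX 5090 capacity quoted on this page. The result is a lower bound only: it ignores parallelism overhead, activation memory, and any per-device duplication.

```python
import math


def min_devices(required_gb: float, per_device_gb: float) -> int:
    """Lower bound on device count: total memory needed / memory per device."""
    if per_device_gb <= 0:
        raise ValueError("per_device_gb must be positive")
    return math.ceil(required_gb / per_device_gb)


# Figures from this page: 1512.0 GB (BF16 weights), 1896.7 GB (full context).
print(min_devices(1512.0, 32))   # 32 GB cards (RTX 5090 capacity from this page)
print(min_devices(1512.0, 80))   # hypothetical 80 GB accelerator
print(min_devices(1896.7, 80))   # same hypothetical device, full context
```

In practice the device count is usually rounded up further to fit whatever parallelism layout the serving stack supports.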
Frequently Asked Questions
- How much VRAM does Macaron V1 Preview 749B need?
Macaron V1 Preview 749B requires 1512.0 GB of VRAM at BF16. Full 203K context adds up to 384.7 GB (1896.7 GB total).
VRAM = Weights + KV Cache + Overhead
Weights = 753.9B parameters × 16 bits ÷ 8 bits per byte ≈ 1507.7 GB
KV Cache + Overhead ≈ 4.3 GB (at 2K context + ~0.3 GB framework)
KV Cache + Overhead ≈ 389 GB (at full 203K context)
VRAM usage by quantization
BF16: 1512.0 GB
BF16 + full context: 1896.7 GB
- Can NVIDIA GeForce RTX 5090 run Macaron V1 Preview 749B?
No — Macaron V1 Preview 749B requires at least 1512.0 GB at BF16, which exceeds the NVIDIA GeForce RTX 5090's 32 GB of VRAM.
- Can I run Macaron V1 Preview 749B on a Mac?
No. Macaron V1 Preview 749B requires at least 1512.0 GB at BF16, which exceeds the unified memory of any single Mac, including high-memory Mac Studio and Mac Pro configurations.
- Can I run Macaron V1 Preview 749B locally?
Not on typical consumer hardware at BF16. The model needs 1512.0 GB of VRAM, far more than any consumer GPU or Mac provides, so running it locally means a multi-GPU server or cluster. Popular local-inference tools include Ollama, LM Studio, and llama.cpp.
- What's the download size of Macaron V1 Preview 749B?
At BF16, the download is about 1507.73 GB.
- Which GPUs can run Macaron V1 Preview 749B?
No single consumer GPU has enough VRAM to run Macaron V1 Preview 749B at BF16 (1512.0 GB). Multi-GPU or professional hardware is required.
- Which devices can run Macaron V1 Preview 749B?
Macaron V1 Preview 749B requires at least 1512.0 GB at BF16, which exceeds the memory of consumer devices, including high-memory Mac Studio and Mac Pro configurations. A multi-GPU server or cluster is needed.
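The formula from the first FAQ answer (VRAM = Weights + KV Cache + Overhead) can be sketched in a few lines. This is a minimal estimator, not the site's actual calculator: the function names are hypothetical, and the KV cache and overhead figures are simply the ones quoted on this page rather than values derived from the architecture. Using the rounded 753.9B parameter count gives 1507.8 GB of weights, 0.1 GB above the page's 1507.7 GB, which presumably comes from an unrounded count.

```python
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Weight memory in GB: parameters (in billions) x bits per weight / 8."""
    return params_billion * bits_per_weight / 8


def total_vram_gb(params_billion: float, bits_per_weight: float,
                  kv_and_overhead_gb: float) -> float:
    """VRAM = Weights + (KV cache + overhead), following the page's formula."""
    return weights_gb(params_billion, bits_per_weight) + kv_and_overhead_gb


# KV cache + overhead values quoted on this page:
SHORT_CONTEXT_GB = 4.3    # at 2K context, including ~0.3 GB framework
FULL_CONTEXT_GB = 389.0   # at the full 203K context

print(round(weights_gb(753.9, 16), 1))
print(round(total_vram_gb(753.9, 16, SHORT_CONTEXT_GB), 1))
print(round(total_vram_gb(753.9, 16, FULL_CONTEXT_GB), 1))
```

Passing a different `bits_per_weight` gives a first-order estimate for other precisions, though this page lists only BF16.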