
Macaron V1 Preview 749B — Hardware Requirements & GPU Compatibility


Macaron V1 Preview 749B is a 753.9B-parameter open language model from mindlab-research. It supports a context window of up to 202,752 tokens. At BF16 it needs about 1512 GB of VRAM (1511.95 GB before rounding), which is more than any single GPU or Mac provides. The sections below break down the memory estimate and the hardware it implies.

998 downloads · 25 likes · 203K context
Based on GLM 5.1

Specifications

Publisher: mindlab-research
Parameters: 753.9B
Architecture: GlmMoeDsaForCausalLM
Context Length: 202,752 tokens
Vocabulary Size: 154,880
Release Date: 2026-06-07
License: MIT


How Much VRAM Does Macaron V1 Preview 749B Need?


Quantization   Bits    VRAM
BF16           16.00   1512.0 GB

Which GPUs Can Run Macaron V1 Preview 749B?

BF16 · 1512.0 GB

Macaron V1 Preview 749B (BF16) requires 1512.0 GB of VRAM at a short (2K-token) context: about 1507.7 GB for the model weights plus about 4.3 GB for KV cache and framework overhead. For comfortable inference with extra headroom, at least 1966 GB is recommended. Using the full 203K context window can add up to 384.7 GB, bringing total usage to 1896.7 GB. No single GPU has enough memory, so a multi-GPU or cluster setup is needed.
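As a rough illustration of what a multi-GPU setup means at this scale, the sketch below divides the memory figures quoted on this page by a per-device capacity and rounds up. The `min_devices` helper and the 80 GB per-device capacity are illustrative assumptions, not data from this page. The result is only a lower bound, because it ignores activation memory, per-device overhead, and any constraints a parallelism scheme places on device counts.

```python
import math


def min_devices(required_gb: float, per_device_gb: float) -> int:
    """Lower bound on device count: required memory / per-device memory, rounded up."""
    if per_device_gb <= 0:
        raise ValueError("per_device_gb must be positive")
    return math.ceil(required_gb / per_device_gb)


# Memory figures quoted on this page (GB, BF16).
SHORT_CONTEXT_GB = 1512.0  # weights plus KV cache and overhead at 2K context
FULL_CONTEXT_GB = 1896.7   # with the full 203K context window
RECOMMENDED_GB = 1966.0    # recommended headroom

# Assumed per-device capacity, for illustration only.
PER_DEVICE_GB = 80.0

for label, need in [
    ("short context", SHORT_CONTEXT_GB),
    ("full context", FULL_CONTEXT_GB),
    ("recommended", RECOMMENDED_GB),
]:
    count = min_devices(need, PER_DEVICE_GB)
    print(f"{label}: at least {count} devices of {PER_DEVICE_GB:.0f} GB")
```

Under the 80 GB assumption this gives at least 19 devices for the short-context figure, 24 for the full context window, and 25 for the recommended headroom. Real deployments usually need more than the lower bound.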


Frequently Asked Questions

How much VRAM does Macaron V1 Preview 749B need?

Macaron V1 Preview 749B requires 1512.0 GB of VRAM at BF16. Full 203K context adds up to 384.7 GB (1896.7 GB total).

VRAM = Weights + KV Cache + Overhead

Weights = 753.9B × 16 bits ÷ 8 ≈ 1507.7 GB

KV Cache + Overhead ≈ 4.3 GB (KV cache at 2K context plus ~0.3 GB framework overhead)

KV Cache + Overhead ≈ 389 GB (at full 203K context)
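The arithmetic above can be reproduced with a short script. This is a minimal sketch of the stated formula using the figures quoted on this page; the function names are hypothetical. The rounded parameter count (753.9B) gives 1507.8 GB of weights, while the page quotes 1507.73 GB, which suggests the unrounded parameter count is slightly below 753.9B. The totals therefore use the page's own weight figure.

```python
def weights_gb(params_billions: float, bits_per_param: float) -> float:
    """Weight memory in GB (1 GB = 10^9 bytes): parameters x bits / 8."""
    return params_billions * bits_per_param / 8


def total_vram_gb(weights: float, kv_cache_and_overhead: float) -> float:
    """VRAM = Weights + KV Cache + Overhead."""
    return weights + kv_cache_and_overhead


# Weights from the rounded parameter count shown in the specifications.
rounded_weights = weights_gb(753.9, 16)

# Figures quoted on this page (GB).
PAGE_WEIGHTS_GB = 1507.73
KV_OVERHEAD_2K_GB = 4.3      # KV cache at 2K context plus ~0.3 GB framework
KV_OVERHEAD_FULL_GB = 389.0  # KV cache and overhead at full 203K context

short_total = total_vram_gb(PAGE_WEIGHTS_GB, KV_OVERHEAD_2K_GB)
full_total = total_vram_gb(PAGE_WEIGHTS_GB, KV_OVERHEAD_FULL_GB)

print(f"weights from rounded count: {rounded_weights:.1f} GB")  # 1507.8 GB
print(f"total at 2K context:        {short_total:.1f} GB")      # 1512.0 GB
print(f"total at full context:      {full_total:.1f} GB")       # 1896.7 GB
```

The difference between the two totals, 384.7 GB, matches the extra memory the page attributes to the full context window.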

VRAM usage by quantization (BF16): 1512.0 GB at 2K context, 1896.7 GB at full 203K context


Can NVIDIA GeForce RTX 5090 run Macaron V1 Preview 749B?

No — Macaron V1 Preview 749B requires at least 1512.0 GB at BF16, which exceeds the NVIDIA GeForce RTX 5090's 32 GB of VRAM.

Can I run Macaron V1 Preview 749B on a Mac?

No. Macaron V1 Preview 749B requires at least 1512.0 GB at BF16, which exceeds the unified memory of any single Mac, including high-memory Mac Studio and Mac Pro configurations.

Can I run Macaron V1 Preview 749B locally?

Not on typical consumer hardware. At BF16 precision Macaron V1 Preview 749B needs 1512.0 GB of VRAM, which calls for a multi-GPU server or cluster. Popular tools for running models locally include Ollama, LM Studio, and llama.cpp.

What's the download size of Macaron V1 Preview 749B?

At BF16, the download is about 1507.73 GB.

Which GPUs can run Macaron V1 Preview 749B?

No single GPU has enough VRAM to run Macaron V1 Preview 749B at BF16 (1512.0 GB). A multi-GPU or cluster setup built on professional hardware is required.

Which devices can run Macaron V1 Preview 749B?

Macaron V1 Preview 749B requires at least 1512.0 GB at BF16, which exceeds the memory of any single consumer device, including high-memory Mac Studio and Mac Pro configurations. A multi-GPU server or cluster is needed.