Mac RAM to AI model size — the reference table
- 16 GB handles Nano 4B and Lite 9B. 24 GB fits the 27B coding models with little room to spare, and 32 GB runs them comfortably. 64 GB is where Plus 397B becomes an option.
- On disk: Nano is 2.4 GB, Core 27B is 15.1 GB, Vision 35B is 19 GB. Plus 397B wants 209 GB.
- Speed on my M1 Ultra: Nano hits 71.7 tok/s, Core 27B does 20.7, Plus 397B crawls at 2.1.
- Quick math: usable model size is about your unified RAM minus a few GB for macOS. Plus 397B is the exception, because it streams most of its weights from the SSD.
Your Mac's unified memory decides which local models you can actually run. The two tables below map it out: what fits in your RAM, how big each model is on disk, and the generation speed I measured on an M1 Ultra. Use them to size up the Mac you own, or the one you're about to buy.
RAM → what you can run
| Unified RAM | Typical Mac | Models that run |
|---|---|---|
| 16 GB | MacBook Air | Nano 4B, Lite 9B |
| 24 GB | MacBook Pro | + Quick 26B, Core 27B, Code 27B, Vision 35B (tight) |
| 32 GB | MacBook Pro | 27B coding + Vision, comfortably |
| 64 GB | Mac Studio / MacBook Pro | All tiers, including Plus 397B |
| 96+ GB | Mac Studio | Plus 397B with headroom for long context |
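The RAM table above follows from a simple fit check. As a rough sketch, the rule can be written as below. It assumes a 4 GB reserve for macOS (the "few GB" is not pinned down more precisely) and uses on-disk size as a stand-in for resident memory. The names `MODELS`, `RESERVE_GB`, and `models_that_fit` are illustrative. Code 27B is left out because no disk size is listed for it, and Plus 397B is gated on its stated 64 GB floor rather than on size, since it relies on paged streaming.

```python
# Disk sizes (GB) from the model size table, used as a proxy for resident memory.
MODELS = {
    "Nano 4B": 2.4,
    "Lite 9B": 5.0,
    "Core 27B": 15.1,
    "Quick 26B": 15.6,
    "Vision 35B": 19.0,
}
RESERVE_GB = 4.0        # assumption: "a few GB" set aside for macOS
PAGED_MIN_RAM_GB = 64   # Plus 397B floor, taken from the RAM table


def models_that_fit(ram_gb: float) -> list[str]:
    """Return the tiers whose weights fit in RAM after the macOS reserve."""
    budget = ram_gb - RESERVE_GB
    fits = [name for name, size in MODELS.items() if size <= budget]
    if ram_gb >= PAGED_MIN_RAM_GB:
        fits.append("Plus 397B (paged)")
    return fits


for ram in (16, 24, 32, 64):
    print(ram, "GB:", ", ".join(models_that_fit(ram)))
```

With these assumptions, Vision 35B leaves only about 1 GB of budget on a 24 GB machine, which is why that row is marked tight.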
Model size and speed (measured, M1 Ultra)
| Model | Params | Disk | Decode tok/s |
|---|---|---|---|
| Nano | 4B | 2.4 GB | 71.7 |
| Lite | 9B | 5 GB | 53.4 |
| Quick | 26B MoE | 15.6 GB | 14.6 |
| Core | 27B | 15.1 GB | 20.7 |
| Vision | 35B MoE | 19 GB | 16.3 |
| Plus | 397B MoE | 209 GB | 2.1 |
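Numbers are from an M1 Ultra, MLX 4-bit, batch size 1. Other chips will differ: single-stream decode is largely limited by memory bandwidth, so higher-bandwidth chips run faster and lower-bandwidth ones (including base-tier M2/M3/M4) run slower than an M1 Ultra. The ranking between models stays the same.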
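As a quick sanity check on the 4-bit figure, the disk sizes and parameter counts in the table imply an effective bits-per-weight value for each model. This sketch assumes 1 GB = 10^9 bytes and treats the nominal parameter counts as exact. `bits_per_weight` is an illustrative helper.

```python
# (model, parameters in billions, disk size in GB) from the table above
ROWS = [
    ("Nano", 4, 2.4),
    ("Lite", 9, 5.0),
    ("Quick", 26, 15.6),
    ("Core", 27, 15.1),
    ("Vision", 35, 19.0),
    ("Plus", 397, 209.0),
]


def bits_per_weight(params_b: float, disk_gb: float) -> float:
    """Effective bits stored per parameter, assuming 1 GB = 1e9 bytes."""
    return disk_gb * 8 / params_b


for name, params_b, disk_gb in ROWS:
    print(f"{name}: {bits_per_weight(params_b, disk_gb):.2f} bits/weight")
```

Every row lands between roughly 4.2 and 4.8 bits per weight. That is consistent with 4-bit weights plus some overhead, plausibly quantization scales and any layers kept at higher precision.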
How a 209 GB model runs on a 64 GB Mac
Plus 397B is 209 GB on disk. That's way past 64 GB of RAM, and yet it runs. The trick is that it's a Mixture-of-Experts model, so only a slice of its parameters fire on any given token. Outlier's V9 paged engine holds the active experts in memory and streams the rest off the SSD, which keeps peak memory hovering around 11 GB. See how paged MoE inference works.
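To make the idea concrete, here is a minimal sketch of expert paging with a least-recently-used cache. It is not Outlier's V9 engine, whose internals are not described here. `ExpertCache`, `load_expert`, `fake_load`, and the capacity value are all hypothetical, and the disk read is simulated with a plain function.

```python
from collections import OrderedDict
from typing import Callable


class ExpertCache:
    """Keep at most `capacity` experts in memory; evict the least recently used."""

    def __init__(self, capacity: int, load_expert: Callable[[int], bytes]):
        self.capacity = capacity
        self.load_expert = load_expert   # hypothetical: reads one expert from SSD
        self.resident = OrderedDict()    # expert_id -> weights, oldest first
        self.disk_reads = 0

    def get(self, expert_id: int) -> bytes:
        if expert_id in self.resident:
            self.resident.move_to_end(expert_id)   # mark as recently used
            return self.resident[expert_id]
        weights = self.load_expert(expert_id)      # cache miss: stream from disk
        self.disk_reads += 1
        self.resident[expert_id] = weights
        if len(self.resident) > self.capacity:
            self.resident.popitem(last=False)      # evict the oldest entry
        return weights


def fake_load(expert_id: int) -> bytes:
    # Stand-in for an SSD read; a real engine would read or map a weight shard.
    return bytes([expert_id % 256]) * 16


cache = ExpertCache(capacity=2, load_expert=fake_load)
for token_experts in ([0, 1], [1, 2], [0, 1]):   # experts routed for each token
    for eid in token_experts:
        cache.get(eid)
print("disk reads:", cache.disk_reads, "resident:", list(cache.resident))
```

Memory use is bounded by `capacity` rather than by the total number of experts. The cost is extra disk reads whenever routing moves to an expert that is not resident, which fits the low decode speed reported for Plus 397B.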
Frequently asked questions
What size AI model can my Mac run?
Roughly your unified RAM minus a few GB for macOS. 16 GB runs 4B–9B models, 32 GB runs 27B-class models, and 64 GB runs a 397B Mixture-of-Experts model via paged streaming.
How fast is local AI on a Mac?
On an M1 Ultra: about 71.7 tok/s for a 4B model, 20.7 for a 27B model, and 2.1 for the 397B model via paged streaming. Speed depends mostly on model size and on the chip's memory bandwidth, which rises with chip tier (Pro, Max, Ultra) and often with generation.
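To turn these rates into wait times, divide the response length by the decode speed. The sketch below assumes a 500-token answer, an arbitrary length chosen for illustration, and ignores prompt processing time, which the measurements here do not cover.

```python
# Decode rates (tokens per second) measured on an M1 Ultra
DECODE_TOK_S = {"Nano 4B": 71.7, "Core 27B": 20.7, "Plus 397B": 2.1}
ANSWER_TOKENS = 500  # assumption: illustrative response length


def seconds_for(tokens: int, tok_per_s: float) -> float:
    """Time to decode `tokens` tokens at a steady rate, excluding prompt processing."""
    return tokens / tok_per_s


for name, rate in DECODE_TOK_S.items():
    print(f"{name}: {seconds_for(ANSWER_TOKENS, rate):.0f} s")
```

At these rates a 500-token answer takes about 7 seconds on Nano, about 24 seconds on Core 27B, and close to 4 minutes on Plus 397B.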
How much disk do AI models need?
From about 2.4 GB for a 4B model to 209 GB for the 397B model. Most users keep a couple of small models plus one 15 GB coding model. You only download the tiers you use.
Try Outlier free
Nano and Lite are free: local, private, no account. Pro at $20/mo or $149/yr adds everything (all 7 model tiers, including Plus 397B). Lifetime Pro starts at $99 (Founding 200, first 200 seats) or $200 (Founders 500). Apple Silicon only.