How to run AI on a Mac with 8GB of RAM

Quick answer

An 8GB Mac runs Outlier Nano (1.5B) at 32 tok/s and Lite (3B) at roughly 18–20 tok/s, both measured on an M4 MacBook Air. 7B models are tight but workable if you quit other apps first. Anything above 7B needs 16GB or more.

The M4 MacBook Air ships with 8GB by default — it's the entry config Apple sells to most people, and it was the standard for M1 Airs too. Plenty of buyers wonder whether that's enough for local AI, or whether they made a mistake skipping the 16GB upgrade. Short answer: 8GB works, within clear limits. Here's exactly what those limits are.

The RAM math: what's actually free

On an Apple Silicon Mac, the CPU, GPU, and Neural Engine all share one unified memory pool. That's good news for AI performance — the model doesn't have to copy data across a PCIe bus. The less good news: macOS itself takes a bite before your model ever loads.

With a typical app load (browser, a few tabs, maybe a notes app), macOS and your running apps consume roughly 2–4GB of that 8GB. That leaves somewhere between 4 and 6GB available for a model. Some models fit easily in that window. Others don't fit at all.

The number that matters for local AI is wired memory: the portion the GPU holds and won't swap out. At 4-bit quantization (the standard compression format for running models locally), each weight takes half a byte instead of the two bytes it takes at 16-bit precision. That is why a 7B-parameter model needs about 4.5GB wired, runtime overhead included, rather than 7GB or more.
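The arithmetic behind that claim can be sketched in a few lines of Python. This is a rough estimate, not how Outlier measures memory: the function name is made up for illustration, and it counts raw weight storage only. The gap between its 4-bit result and the ~4.5GB wired figure quoted for 7B is assumed here to be runtime overhead such as context buffers.

```python
def weights_gb(params_billions: float, bits_per_weight: float) -> float:
    """Raw weight storage in GB (1 GB = 1e9 bytes), ignoring runtime overhead."""
    bytes_per_weight = bits_per_weight / 8
    # billions of parameters * bytes per parameter = gigabytes
    return params_billions * bytes_per_weight


# Compare the same 7B model at three common precisions.
for bits in (16, 8, 4):
    print(f"7B at {bits}-bit: {weights_gb(7, bits):.1f} GB of weights")
```

At 4 bits the weights alone come to 3.5GB, which is why a 7B model can fit on an 8GB machine at all. At 16 bits the same model would need 14GB for weights before anything else loads.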

Which model sizes fit on 8GB

Here's how common model sizes map to real wired memory usage at 4-bit quantization, and whether they're practical on an 8GB machine:

| Model size | Wired RAM (4-bit) | Min RAM recommended | Notes |
|---|---|---|---|
| 1.5B (Outlier Nano) | ~1.2GB | 8GB | Runs comfortably with room to spare |
| 3B (Outlier Lite) | ~2.0GB | 8GB | Runs fine; leaves headroom for other apps |
| 7B | ~4.5GB | 8GB | Tight; close Chrome and heavy apps first |
| 13B | ~8–9GB | 16GB | Too large; will thrash swap on 8GB |
| 27B (Outlier Core) | ~17GB | 24GB+ | Requires a machine with substantially more RAM |
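One way to read the table is to check each wired figure against the 4–6GB window an 8GB Mac typically has free. The sketch below does that with the table's own numbers. The function name, the three verdict labels, and the choice of the low end (8GB) for the 13B row are illustrative assumptions, not anything Outlier ships.

```python
# Wired-memory figures from the table, in GB at 4-bit quantization.
# 13B uses the low end of its ~8-9GB range (an assumption for this sketch).
WIRED_GB = {"1.5B": 1.2, "3B": 2.0, "7B": 4.5, "13B": 8.0, "27B": 17.0}


def fit_verdict(wired_gb: float, free_low_gb: float = 4.0, free_high_gb: float = 6.0) -> str:
    """Classify a model against the range of memory an 8GB Mac usually has free."""
    if wired_gb <= free_low_gb:
        return "fits"        # fits even with a typical app load running
    if wired_gb <= free_high_gb:
        return "tight"       # fits only after quitting heavy apps
    return "too large"       # exceeds free memory; macOS would swap


for size, wired in WIRED_GB.items():
    print(f"{size}: {wired}GB wired -> {fit_verdict(wired)}")
```

The 7B row is the only one that lands in the middle band, which matches the table's "tight" note: it fits only when free memory is near the top of the 4–6GB range.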

The cutoff is real: if a model's wired footprint exceeds available free memory, macOS falls back to swapping — writing model weights to the SSD and reading them back on demand. That slows inference to roughly 1–3 tok/s, causes the fan to spin up, and generates noticeable heat. It's not a crash, but it's not usable either. The fix is staying within your memory budget, not fighting it.
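To put those token rates in terms of waiting time, here is a small worked calculation. The 300-token reply length is an assumption chosen for illustration (roughly a few paragraphs). The rates are the ones quoted in this article.

```python
def seconds_for_reply(reply_tokens: int, tokens_per_second: float) -> float:
    """Time to generate a reply of the given length at a steady token rate."""
    return reply_tokens / tokens_per_second


REPLY_TOKENS = 300  # assumed reply length, not a measured value

for label, rate in [
    ("Nano, fully in memory", 32),
    ("swapping, best case", 3),
    ("swapping, worst case", 1),
]:
    print(f"{label}: {seconds_for_reply(REPLY_TOKENS, rate):.0f} s")
```

Under that assumption, a reply that takes under ten seconds in memory takes between 100 and 300 seconds once the model is swapping.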

What Nano and Lite are actually good for

Small doesn't mean useless. Nano (1.5B) and Lite (3B) handle a wide range of everyday tasks well:

Where Nano and Lite will show their limits: complex multi-step coding problems, deep reasoning chains, or tasks that need the model to hold a lot of context and think carefully. For those, a 27B or larger model makes a noticeable difference, but at roughly 17GB wired that's a 24GB-or-more conversation, not an 8GB one. For daily writing, research, and quick coding help, the small tiers are genuinely capable.

Pushing it with 7B: what to close, what to expect

A 7B model at 4-bit quantization uses around 4.5GB wired. That fits on an 8GB Mac — barely. Whether it runs well depends entirely on what else you have open.

Before loading a 7B model on an 8GB machine, quit (not minimize — actually quit) the apps eating memory:

With a clean slate, you should have 5–6GB free, which is enough margin for a 7B model to load without hitting swap. Inference won't be as snappy as Nano — expect something closer to 10–15 tok/s depending on context length — but it will run, and the quality step-up over Lite is real for tasks that need it.

The moment macOS starts swapping — you'll notice the fan, the heat, and responses trickling in one token at a time — step back down to Nano or Lite. No model is worth that.
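If you'd rather check swap with a number than by listening for the fan, macOS reports swap usage through `sysctl vm.swapusage`. The sketch below parses that output. The sample line is illustrative rather than captured from a real run, the helper names are made up, and the K and G unit handling is a defensive assumption, since the command normally reports megabytes.

```python
import re
import subprocess


def parse_swap_used_mb(sysctl_line: str) -> float:
    """Extract the 'used' figure, in MB, from `sysctl vm.swapusage` output."""
    match = re.search(r"used = ([\d.]+)([KMG])", sysctl_line)
    if match is None:
        raise ValueError("unexpected vm.swapusage format")
    value, unit = float(match.group(1)), match.group(2)
    scale_to_mb = {"K": 1 / 1024, "M": 1.0, "G": 1024.0}
    return value * scale_to_mb[unit]


def current_swap_used_mb() -> float:
    """Run sysctl and return swap in use (macOS only)."""
    result = subprocess.run(
        ["sysctl", "vm.swapusage"], capture_output=True, text=True, check=True
    )
    return parse_swap_used_mb(result.stdout)


# Illustrative sample of the output format, not a real measurement.
sample = "vm.swapusage: total = 2048.00M  used = 1131.25M  free = 916.75M  (encrypted)"
print(f"swap in use: {parse_swap_used_mb(sample):.2f} MB")
```

A nonzero value alone doesn't prove the model is swapping, since macOS may already have paged out idle apps. The useful signal is the number climbing while the model is generating: call `current_swap_used_mb()` before and during a reply and compare.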

When you'll want more RAM

8GB is a workable machine for local AI. It's not a fully unconstrained one. Here's when the 16GB upgrade genuinely earns its cost:

If you're primarily using AI for chat, writing, and Q&A — and you're happy with Nano and Lite — 8GB is enough and the $200 upgrade isn't necessary. If you want to run 7B routinely without thinking about it, or if you're curious about larger models, 16GB removes the friction. It's a real trade-off, not a marketing upsell.

Step by step: running AI on your 8GB Mac with Outlier

  1. Download Outlier from outlier.host. It's a signed Mac app — no terminal, no Python, no account required.
  2. Open Outlier. On first launch, it will download the Nano model. The file is small (around 1.2GB) and downloads once.
  3. Start chatting. Nano loads fast and responds at 32 tok/s on M4 Air — faster than typical reading pace. Try a summary, a draft, a question.
  4. Switch to Lite when you want more depth on a task. The 3B model uses ~2GB wired and is included free alongside Nano.
  5. To try 7B, quit your heavy background apps first (see above), then select a 7B model in Outlier's model picker. Watch Activity Monitor → Memory if you want to verify you're not swapping.

After the one-time model download, everything runs on your chip, stays on your disk, and never touches the internet. The Wi-Fi-off, airplane-mode test works out of the box.

Receipts: Nano speed (32 tok/s) measured on M4 MacBook Air 8GB running Outlier. Lite speed (~18–20 tok/s) measured on the same machine. Wired RAM figures are approximate at 4-bit quantization; actual usage varies by model architecture and context length.

Download Outlier — free

Nano and Lite are free, local, and private. No account. No subscription. Pro adds larger models for $20/mo or $149/yr. Apple Silicon only.
