
How to run AI on a Mac with 8GB of RAM

Outlier · solo-built in Grand Rapids · Published 2026-06-19 · Last updated 2026-06-19

Quick answer

An 8GB Mac runs Outlier Nano (1.5B) at 32 tok/s and Lite (3B) at roughly 18–20 tok/s. 7B models are tight but workable if you close other apps. Anything above 7B needs 16GB or more.

The M4 MacBook Air ships with 8GB by default — it's the entry config Apple sells to most people, and it was the standard for M1 Airs too. Plenty of buyers wonder whether that's enough for local AI, or whether they made a mistake skipping the 16GB upgrade. Short answer: 8GB works, within clear limits. Here's exactly what those limits are.

The RAM math: what's actually free

On an Apple Silicon Mac, the CPU, GPU, and Neural Engine all share one unified memory pool. That's good news for AI performance — the model doesn't have to copy data across a PCIe bus. The less good news: macOS itself takes a bite before your model ever loads.

With a typical app load (browser, a few tabs, maybe a notes app), macOS and your running apps consume roughly 2–4GB of that 8GB. That leaves somewhere between 4 and 6GB available for a model. Some models fit easily in that window. Others don't fit at all.

The number that matters for local AI is wired memory: the portion that stays pinned in RAM for the GPU and won't be swapped out. At 4-bit quantization (the standard compression format for running models locally), each weight takes about half a byte instead of the two bytes of a 16-bit weight, so wired usage is far lower than the full-precision size. That's why a 7B-parameter model needs around 4.5GB rather than the roughly 14GB its 16-bit weights would occupy.
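To make that arithmetic concrete, here is a minimal sketch of the weights-only calculation. The function name is made up for illustration, and it deliberately ignores runtime overhead; the gap between the ~3.5GB it gives for 7B at 4-bit and the ~4.5GB wired figure quoted in this article is presumably that overhead, which varies by model architecture and context length.

```python
def weights_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes).

    Illustrative helper only: real wired usage is higher because the
    runtime also keeps context and working buffers in memory.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Compare the same 7B model at three precisions.
for bits in (16, 8, 4):
    print(f"7B at {bits}-bit: ~{weights_gb(7, bits):.1f} GB of weights")
```

The same function works for any size: swap in 1.5, 3, 13, or 27 for the parameter count to see why each step up the table costs what it does.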

Which model sizes fit on 8GB

Here's how common model sizes map to real wired memory usage at 4-bit quantization, and whether they're practical on an 8GB machine:

Model size	Wired RAM (4-bit)	Min RAM recommended	Notes
1.5B (Outlier Nano)	~1.2GB	8GB	Runs comfortably with room to spare
3B (Outlier Lite)	~2.0GB	8GB	Runs fine; leaves headroom for other apps
7B	~4.5GB	8GB	Tight — close Chrome and heavy apps first
13B	~8–9GB	16GB	Too large; will thrash swap on 8GB
27B (Outlier Core)	~17GB	24GB+	Requires a machine with substantially more RAM

The cutoff is real: if a model's wired footprint exceeds available free memory, macOS falls back to swapping, writing model weights to the SSD and reading them back on demand. That slows inference to roughly 1–3 tok/s and makes the machine run hot (on Macs that have a fan, it spins up; the MacBook Air is fanless and simply heats up). It's not a crash, but it's not usable either. The fix is staying within your memory budget, not fighting it.
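The budget check behind the table can be sketched in a few lines. The wired figures come from the table (with 8.5GB as the midpoint for 13B) and the 2–4GB overhead range comes from the RAM math section; the function, the three labels, and the assumption that overhead stays the same on larger machines are illustrative choices, not how Outlier itself decides anything.

```python
# Wired footprints at 4-bit quantization, from the table (midpoint for 13B).
WIRED_GB = {
    "1.5B (Nano)": 1.2,
    "3B (Lite)": 2.0,
    "7B": 4.5,
    "13B": 8.5,
    "27B (Core)": 17.0,
}

# macOS plus running apps: roughly 2-4GB, per the RAM math section.
OVERHEAD_LIGHT_GB = 2.0    # after quitting heavy apps
OVERHEAD_TYPICAL_GB = 4.0  # browser, a few tabs, a notes app


def classify(wired_gb: float, total_gb: float = 8.0) -> str:
    """Label a model as 'fits', 'tight', or 'swaps' for a given RAM size."""
    free_typical = total_gb - OVERHEAD_TYPICAL_GB
    free_best = total_gb - OVERHEAD_LIGHT_GB
    if wired_gb <= free_typical:
        return "fits"    # loads with a normal app load running
    if wired_gb <= free_best:
        return "tight"   # loads only after closing other apps
    return "swaps"       # exceeds even the best-case budget


for name, wired in WIRED_GB.items():
    print(f"{name}: {classify(wired)} on 8GB")
```

Passing a different `total_gb` (16 or 24) shows where each larger model stops swapping under the same assumptions.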

What Nano and Lite are actually good for

Small doesn't mean useless. Nano (1.5B) and Lite (3B) handle a wide range of everyday tasks well:

Chat and Q&A. Answering questions, explaining concepts, working through ideas. At 32 tok/s, Nano generates text faster than you can read it.
Writing and editing. Drafting emails, rewriting paragraphs, tightening copy. These are exactly the tasks where speed matters and where a small model has enough context to do good work.
Summarization. Paste in a long document, get the key points. Works well locally with no data leaving your machine.
Straightforward Q&A over your own files. Private, offline, and fast.

Where Nano and Lite will show their limits: complex multi-step coding problems, deep reasoning chains, or tasks that need the model to hold a lot of context and think carefully. For those, a 27B or larger model makes a noticeable difference, but with a wired footprint around 17GB that's a 24GB-or-more conversation, not an 8GB one. For daily writing, research, and quick coding help, the small tiers are genuinely capable.

Pushing it with 7B: what to close, what to expect

A 7B model at 4-bit quantization uses around 4.5GB wired. That fits on an 8GB Mac — barely. Whether it runs well depends entirely on what else you have open.

Before loading a 7B model on an 8GB machine, quit (not minimize — actually quit) the apps eating memory:

Close Safari or Chrome, or at minimum close all but a handful of tabs
Quit Slack, Teams, or any Electron-based app
Quit mail clients, music apps, and anything running in the background

With a clean slate, you should have 5–6GB free, which is enough margin for a 7B model to load without hitting swap. Inference won't be as snappy as Nano — expect something closer to 10–15 tok/s depending on context length — but it will run, and the quality step-up over Lite is real for tasks that need it.

The moment macOS starts swapping (you'll notice the heat, the fan on Macs that have one, and responses trickling in one token at a time), step back down to Nano or Lite. No model is worth that.
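To put those speeds in perspective, this rough sketch converts tok/s into wait time for a single reply. The rates are the ones quoted in this article; the 300-token reply length is an arbitrary assumption for illustration, and the names are made up.

```python
# (slowest, fastest) tok/s for each scenario, as quoted in this article.
RATES_TOK_PER_S = {
    "Nano (1.5B)": (32, 32),
    "Lite (3B)": (18, 20),
    "7B, other apps closed": (10, 15),
    "any model while swapping": (1, 3),
}

REPLY_TOKENS = 300  # hypothetical reply length, roughly a few paragraphs


def reply_seconds(tokens: int, tok_per_s: float) -> float:
    """Time to generate `tokens` tokens at a steady rate."""
    return tokens / tok_per_s


for name, (slow, fast) in RATES_TOK_PER_S.items():
    best = reply_seconds(REPLY_TOKENS, fast)
    worst = reply_seconds(REPLY_TOKENS, slow)
    print(f"{name}: {best:.0f}-{worst:.0f} s for {REPLY_TOKENS} tokens")
```

Under these assumptions a 7B reply takes a few tens of seconds, while the same reply during swapping takes minutes, which is the practical meaning of "not usable".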

When you'll want more RAM

8GB is a workable machine for local AI, but not an unconstrained one. Here's when the 16GB upgrade genuinely earns its cost:

You want to run a 7B model without closing everything else first.
You want to try 13B or larger models, which simply don't fit on 8GB without swap thrashing.
You do long context work (large codebases, multi-document summarization) where memory pressure is higher.
You switch between AI and other memory-hungry apps constantly and don't want to manage what's open.

If you're primarily using AI for chat, writing, and Q&A — and you're happy with Nano and Lite — 8GB is enough and the $200 upgrade isn't necessary. If you want to run 7B routinely without thinking about it, or if you're curious about larger models, 16GB removes the friction. It's a real trade-off, not a marketing upsell.

Step by step: running AI on your 8GB Mac with Outlier

Download Outlier from outlier.host. It's a signed Mac app — no terminal, no Python, no account required.
Open Outlier. On first launch, it will download the Nano model. The file is small (around 1.2GB) and downloads once.
Start chatting. Nano loads fast and responds at 32 tok/s on M4 Air — faster than typical reading pace. Try a summary, a draft, a question.
Switch to Lite when you want more depth on a task. The 3B model uses ~2GB wired and is included free alongside Nano.
To try 7B, quit your heavy background apps first (see above), then select a 7B model in Outlier's model picker. Watch Activity Monitor → Memory if you want to verify you're not swapping.
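If you'd rather check swap from a script than from Activity Monitor, here is a hedged sketch that reads the macOS `sysctl vm.swapusage` value. It assumes the output looks like `vm.swapusage: total = 2048.00M  used = 1234.50M  free = 813.50M  (encrypted)`; the function names are illustrative, and none of this is part of Outlier.

```python
import re
import subprocess

# Unit suffixes assumed in the sysctl output, converted to megabytes.
_UNITS_TO_MB = {"K": 1 / 1024, "M": 1.0, "G": 1024.0}


def parse_swap_used_mb(sysctl_output: str) -> float:
    """Extract the 'used' figure from `sysctl vm.swapusage` output, in MB."""
    match = re.search(r"used\s*=\s*([\d.]+)([KMG])", sysctl_output)
    if match is None:
        raise ValueError(f"unrecognized vm.swapusage output: {sysctl_output!r}")
    value, unit = match.groups()
    return float(value) * _UNITS_TO_MB[unit]


def swap_used_mb() -> float:
    """Run sysctl (macOS only) and return swap currently in use, in MB."""
    result = subprocess.run(
        ["sysctl", "vm.swapusage"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_swap_used_mb(result.stdout)
```

Call `swap_used_mb()` once before loading the 7B model and again while it is generating. A small nonzero number is normal on a busy Mac; a figure that keeps climbing during generation is the sign to step back down to Lite or Nano.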

Once the models are downloaded, everything runs on your chip, stays on your disk, and never touches the internet. Turn Wi-Fi off and it keeps working.

Receipts: Nano speed (32 tok/s) measured on M4 MacBook Air 8GB running Outlier. Lite speed (~18–20 tok/s) measured on the same machine. Wired RAM figures are approximate at 4-bit quantization; actual usage varies by model architecture and context length.

Download Outlier — free

Nano and Lite are free, local, and private. No account. No subscription. Pro adds larger models for $20/mo or $149/yr. Apple Silicon only.
