Practical guides to running big local models, coding agents, deep research, and computer use offline on Apple Silicon.
Claude Code is a coding agent that edits files and runs commands. Here's how to get the same workflow with a local model on macOS, fully offline.
Computer use lets an AI see your screen and operate your Mac. Here's how to run that workflow locally, what's possible offline, and the guardrails.
Which model tiers run comfortably on each M-series Mac Studio, the RAM tradeoffs, and actual tok/s numbers measured on an M1 Ultra.
A coding agent that edits files, runs tests, and reads your project — no API key, no cloud, no rate limit. Setup, the MCP layer, and what it can and…
A practical guide to running an AI coding workflow with no internet, no API keys, and no cloud, on M1/M2/M3/M4 Macs. Costs, tradeoffs, and the actual…
What 'private local AI' actually guarantees on a Mac: the data path, the telemetry surface, the usage model, and where the boundary really sits.
Qwen3.5-397B-A17B at MLX 4-bit is a 209 GB model. Here's how to run it on 64 GB of unified memory using paged expert streaming, and the tradeoff.
Qwen3.5-397B-A17B normally needs hundreds of GB of RAM. Here's how paged expert streaming runs it on a 64 GB Mac Studio at ~2 tokens/sec.
Free Nano + Lite. Pro $20/mo or $149/yr adds everything (Plus 397B included). Lifetime Pro from $99 (Founding 200) or $200 (Founders 500). Apple Silicon only.