MLX is Apple’s array framework for Apple Silicon. It keeps arrays in the memory shared by the CPU and GPU, avoids the host-to-device copy that CUDA frameworks require on discrete GPUs, and ships a quantization toolkit that targets the 4-bit dense format Outlier uses for every shipping tier except Plus.
The decision to run a model locally on a Mac comes down to three numbers: weight size on disk, peak generation memory, and the memory bandwidth feeding the decode loop. MLX’s 4-bit quantization and unified-memory design bear directly on each of those.
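As a rough illustration of how the first and third numbers relate, the sketch below estimates the size of 4-bit group-quantized weights and the decode-speed ceiling that follows from memory bandwidth. The group size of 64, the 32 bits of per-group metadata (one 16-bit scale plus one 16-bit bias), the 7B parameter count, and the 100 GB/s bandwidth are all illustrative assumptions, not Outlier measurements. Peak generation memory also includes the KV cache and activations, which this sketch does not model.

```python
def quantized_weight_bytes(n_params, bits=4, group_size=64, meta_bits=32):
    """Approximate size in bytes of group-quantized weights.

    Each group of `group_size` weights stores `bits` per weight plus
    `meta_bits` of per-group metadata (assumed here: a 16-bit scale
    and a 16-bit bias).
    """
    bits_per_weight = bits + meta_bits / group_size
    return n_params * bits_per_weight / 8


def decode_tokens_per_second(weight_bytes, bandwidth_bytes_per_s):
    """Upper bound on decode speed if every weight is read once per token."""
    return bandwidth_bytes_per_s / weight_bytes


GB = 1e9
n_params = 7e9                                   # hypothetical 7B dense model
size = quantized_weight_bytes(n_params)          # 4.5 bits per weight
tps = decode_tokens_per_second(size, 100 * GB)   # hypothetical 100 GB/s
print(f"{size / GB:.2f} GB of weights, at most {tps:.1f} tokens/s")
```

The ceiling is optimistic: real decode loops also read the KV cache and pay kernel-launch overhead, so measured throughput will sit below it.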
MLX is Apple’s array framework, released open-source in late 2023. Its core design choice is unified-memory-first: an array lives in one address space and is accessible to both the CPU and the GPU without an explicit copy. On the Mac that means model weights load straight into the address space the GPU decodes against, with no PCIe round-trip.
mlx_lm is the language-model package built on top of MLX. Outlier ships mlx_lm 0.31.3 inside the signed DMG; no separate Python install is required.
Standard mlx_lm 4-bit quantization is the canonical format for every Outlier tier except Plus, whose checkpoint exceeds the unified-memory budget and is loaded through a paged loader instead.
The Plus tier’s custom V9 paged loader sits beside mlx_lm, not inside it; mlx_lm cannot stream a 209 GB checkpoint out of the box.
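To make the 4-bit format concrete, here is a minimal pure-Python sketch of affine group-wise quantization, the general technique behind 4-bit weight formats of this kind. It is an illustration under stated assumptions, not mlx_lm’s kernel: the function names are hypothetical, and the byte packing shown (two codes per byte, low nibble first) is chosen for readability rather than to match MLX’s on-disk layout.

```python
def quantize_group(weights, bits=4):
    """Quantize one group of floats to `bits`-bit codes plus a scale and bias."""
    levels = (1 << bits) - 1                 # 15 distinct steps for 4-bit
    lo, hi = min(weights), max(weights)
    scale = ((hi - lo) / levels) or 1.0      # constant group: any scale works
    codes = [min(levels, round((w - lo) / scale)) for w in weights]
    return codes, scale, lo                  # bias is the group minimum


def dequantize_group(codes, scale, bias):
    """Reconstruct approximate weights: w ~= code * scale + bias."""
    return [c * scale + bias for c in codes]


def pack_nibbles(codes):
    """Pack pairs of 4-bit codes into bytes, low nibble first (illustrative)."""
    if len(codes) % 2:
        codes = codes + [0]                  # pad to an even count
    return bytes(codes[i] | (codes[i + 1] << 4) for i in range(0, len(codes), 2))


group = [-1.0, -0.5, 0.0, 0.5, 1.0, 0.25, -0.75, 0.1]
codes, scale, bias = quantize_group(group)
restored = dequantize_group(codes, scale, bias)
packed = pack_nibbles(codes)
print(len(packed), "bytes for", len(group), "weights")
```

Each restored weight differs from the original by at most half a quantization step (`scale / 2`), which is why smaller groups, with their tighter min-max range, trade extra metadata for lower error.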
MLX is sometimes invoked as a marketing word rather than a technical one. The version cited above, mlx_lm 0.31.3 inside the signed DMG, is the one actually shipped. If a cleaner number appears in someone’s pitch deck, ask for the provenance file that produced it; if there is no provenance file, treat the number as marketing.
Apple’s open-source MLX project on GitHub is the upstream; mlx_lm 0.31.3 is the specific version Outlier bundles. The integration points live in desktop_app/backend/server.py in the standard tier load path.
The Outlier sidecar is a FastAPI server packaged by PyInstaller into a single binary at Contents/Resources/outlier-cli/. mlx_lm 0.31.3 is bundled inside, with all submodules collected at build time. The Tauri front end speaks to the sidecar over http://127.0.0.1:8766; the sidecar in turn calls into mlx_lm for the standard tiers (Nano, Lite, Quick, Core, Code, Vision).
The Plus tier is the exception. mlx_lm cannot stream a 209 GB checkpoint that does not fit in unified memory, so the V9 paged loader sits next to it and intercepts the model-load and SwitchGLU forward paths. From the front end’s point of view, both engines look the same.
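The general idea behind paging a checkpoint that does not fit in memory can be sketched as a byte-budgeted least-recently-used cache. This is a generic illustration only: the internals of the V9 paged loader are not described here, and the class name, the `load_page` callback, and the page identifiers are all hypothetical.

```python
from collections import OrderedDict


class PagedWeights:
    """Keep at most `budget_bytes` of weight pages resident, evicting LRU pages."""

    def __init__(self, budget_bytes, load_page):
        self.budget_bytes = budget_bytes
        self.load_page = load_page          # callable: page_id -> bytes-like
        self.resident = OrderedDict()       # page_id -> page, oldest first
        self.used_bytes = 0

    def get(self, page_id):
        if page_id in self.resident:
            self.resident.move_to_end(page_id)   # mark as most recently used
            return self.resident[page_id]
        page = self.load_page(page_id)
        # Evict least recently used pages until the new page fits.
        while self.resident and self.used_bytes + len(page) > self.budget_bytes:
            _, evicted = self.resident.popitem(last=False)
            self.used_bytes -= len(evicted)
        self.resident[page_id] = page
        self.used_bytes += len(page)
        return page


loads = []

def fake_disk_read(page_id):
    """Stand-in for reading one page of weights from disk."""
    loads.append(page_id)
    return bytes(4)


cache = PagedWeights(budget_bytes=8, load_page=fake_disk_read)
for page_id in ["e0", "e1", "e0", "e2", "e1"]:
    cache.get(page_id)
print(loads)
```

In this toy run the budget holds two pages, so requesting a third evicts whichever page was touched least recently, and a later request for the evicted page triggers a second disk read.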
Every Outlier tier uses MLX 4-bit at the leaf level: Nano, Lite, Quick, Core, Code, and Vision via standard mlx_lm; Plus via the V9 paged loader sitting next to mlx_lm in the same FastAPI sidecar.
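The tier-to-engine split described above can be written down as a small routing table. The tier names come from this page; the function name and the engine labels are hypothetical and are not taken from the sidecar’s code.

```python
# Tiers served by the standard mlx_lm load path, as listed on this page.
STANDARD_TIERS = {"Nano", "Lite", "Quick", "Core", "Code", "Vision"}


def engine_for_tier(tier):
    """Return a label for the engine that loads the given tier (hypothetical)."""
    if tier in STANDARD_TIERS:
        return "mlx_lm"
    if tier == "Plus":
        return "v9_paged_loader"
    raise ValueError(f"unknown tier: {tier!r}")


for tier in ["Nano", "Plus"]:
    print(tier, "->", engine_for_tier(tier))
```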
An 8 GB M1 Mac, the smallest M1 memory configuration, running the Nano tier is sufficient to exercise everything MLX provides for Outlier’s standard tiers. The Plus-tier additions ride on top.
Download Outlier for Mac. Requires Apple Silicon (M1, M2, M3, or M4); Intel Macs are not supported. macOS 12+.
Outlier runs entirely on your Mac. No prompts leave the device. macOS 12+ on Apple Silicon (arm64). Apache 2.0 model weights.