Ternary quantization stores model weights as three values (typically -1, 0, +1) plus a per-channel scale. The compression is aggressive (around 1.6 bits per weight) and the matmul becomes a sign-and-add instead of a multiply, which is friendly to commodity hardware.
The decision to run a model locally on a Mac comes down to three numbers: weight size on disk, peak generation memory, and the memory bandwidth feeding the decode loop. The concept above bears directly on each of those.
Ternary quantization replaces each floating-point weight with one of three values (commonly −1, 0, +1) plus a per-channel or per-group scale factor. The storage footprint drops to roughly 1.6 bits per weight, and the multiply-accumulate kernel becomes a sign-and-add — friendly to commodity CPUs without dedicated matmul hardware.
The distillation pipeline that recovers quality after the round-down to ternary is the research direction Outlier’s 3 provisional patents (April 2026) cover.
Outlier’s shipping models are MLX 4-bit, not ternary; ternary is a research direction covered by 3 provisional patents filed in April 2026.
Today’s shipping Outlier tiers are MLX 4-bit, not ternary. Ternary is the next research milestone, not the current shipping format.
Patents: USPTO #64/026,886, #64/030,368, #64/034,028 (3 provisional, 61 claims, filed April 3, 6, 9 of 2026).
This concept is sometimes invoked as a marketing word for “what is ternary quantization”. The number cited above — Outlier’s shipping models are MLX 4-bit, not ternary; ternary is a researc… — is the empirically measured one. If a cleaner number appears in someone’s pitch deck, ask for the provenance file that produced it; if there is no provenance file, treat the number as marketing.
The provisional patent texts (USPTO #64/026,886, #64/030,368, #64/034,028) are the canonical reference for the ternary direction. The shipping app source does not implement ternary today.
The matmul kernel for ternary weights is structurally different from a 4-bit kernel. Where 4-bit needs an integer multiply per element (against the dequantized scale), ternary reduces to a sign flip plus an accumulation. That is friendly to vector units without dedicated low-precision matmul (Apple’s P-cores, ARM Neoverse, x86 AVX2).
The catch is quality. Naive round-to-ternary loses several points of MMLU. The patent filings cover a distillation pipeline that recovers most of the lost quality at training time. The shipping Outlier tiers are MLX 4-bit precisely because the ternary research is not yet at parity for the 27B class on which Core/Code ride.
Today this concept does not ship in any Outlier tier. The shipping tiers are MLX 4-bit. The ternary direction is research and is what the three April-2026 patent filings cover.
None of the shipping Outlier tiers run ternary. To exercise the concept today you would run a separate ternary-quantized checkpoint outside Outlier; the shipping app does not surface this.
Outlier’s shipping models are MLX 4-bit, not ternary; ternary is a research direction covered by 3 provisional patents filed in April 2026.
Download Outlier for MacRequires Apple Silicon (M1, M2, M3, or M4) — Intel Macs are not supported. macOS 12+.
Outlier runs entirely on your Mac. No prompts leave the device. macOS 12+ on Apple Silicon (arm64). Apache 2.0 model weights. Back to home.