Does this affect which tier I should choose?

Yes. Concept-level differences map directly to tier choice; the run pages on this site work the mapping out for the common Mac SKUs in the lineup.

Is the shipping Outlier model based on this?

Read the body above for the specific answer; some of these concepts ship today and some are research directions covered by the April 2026 patent filings.

Where can I see the raw provenance?

Every benchmark on the site links to a source file path in the public Outlier repo, with the original command and stderr preserved.


What is K_override on the Plus tier?

Last updated 2026-06-18 · Outlier v1.11.469

Quick answer

K_override sets how many experts the paged engine keeps resident in RAM at any moment. Higher K reduces cache misses but inflates RAM and can regress decode speed if the model is compute-bound rather than I/O-bound.

Why does K_override on the Plus tier matter for local AI on Apple Silicon?

The decision to run a model locally on a Mac comes down to three numbers: weight size on disk, peak generation memory, and the memory bandwidth feeding the decode loop. The concept above bears directly on each of those.

K_override sets the resident expert pool size on Outlier’s Plus tier. The model’s declared num_experts_per_tok is 10; K_override is how many experts the engine keeps in RAM at any moment. Higher K reduces cache misses on cold expert selections but inflates RAM and can regress decode speed if the model is compute-bound at this size.

The 2026-05-01 sweep tested K=4, K=20, K=32, and K=48 on a 5-prompt aggregate. K=4 fails coherence (3/5). K=32 regresses 2.5% while using 60% more RAM. K=48 adds 1.3%, which is within noise, at a 33.78 GB peak. K=20 gives 1.59 tok/s with a 14.04 GB peak generation footprint and is the locked value.
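The reasoning behind locking K=20 can be sketched as a small selection rule: drop any point that fails coherence, treat speed differences inside a noise band as ties, and take the lowest-RAM point among the ties. This is an illustration, not the logic in k_sweep.py. It assumes the percentages above are relative to K=20, derives the K=32 peak from "60% more RAM" rather than from a measurement, and uses a hypothetical 2% noise band (the source says only that 1.3% is within noise).

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class SweepPoint:
    k: int
    rel_speed: Optional[float]  # decode speed relative to K=20 (1.0 == 1.59 tok/s)
    peak_gb: Optional[float]    # peak generation memory in GB
    coherent: bool

K20_PEAK_GB = 14.04

SWEEP = [
    SweepPoint(4, None, None, coherent=False),            # fails coherence; speed/RAM not quoted
    SweepPoint(20, 1.000, K20_PEAK_GB, coherent=True),
    SweepPoint(32, 0.975, round(K20_PEAK_GB * 1.6, 2), coherent=True),  # derived, not measured
    SweepPoint(48, 1.013, 33.78, coherent=True),
]

def pick_k(points, noise=0.02):
    """Return the lowest-RAM coherent point whose speed is within `noise` of the best."""
    usable = [p for p in points
              if p.coherent and p.rel_speed is not None and p.peak_gb is not None]
    best = max(p.rel_speed for p in usable)
    tied = [p for p in usable if (best - p.rel_speed) / best <= noise]
    return min(tied, key=lambda p: p.peak_gb)

if __name__ == "__main__":
    print(pick_k(SWEEP))
```

Under these assumptions K=48 is nominally fastest, but K=20 sits inside the noise band at well under half the memory, so the rule lands on K=20.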

What is the concrete number?

On Outlier’s Plus tier, K=20 produces 1.59 tok/s with a 14.04 GB peak generation footprint, locked in by a 4-point sweep on 2026-05-01.

How does this play out in the Outlier shipping lineup?

K=4 is ruled out by a hard model constraint, not by tuning: the router selects the top 10 experts per token, and those cannot fit in a resident pool of 4 without first-token control-token escapes.
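A toy model makes the pool-size constraint concrete. The sketch below assumes a least-recently-used eviction policy and on-demand loading, neither of which is confirmed for Outlier's paged engine, and the class and method names are hypothetical. It shows only the cache thrashing, not the coherence failure itself.

```python
from collections import OrderedDict

class ResidentPool:
    """Toy LRU pool holding at most `capacity` experts in RAM."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._resident = OrderedDict()

    def route(self, expert_ids):
        """Touch the experts one token selects; return how many were misses."""
        misses = 0
        for eid in expert_ids:
            if eid in self._resident:
                self._resident.move_to_end(eid)         # mark as recently used
            else:
                misses += 1
                self._resident[eid] = True              # load from disk (simulated)
                if len(self._resident) > self.capacity:
                    self._resident.popitem(last=False)  # evict least recently used
        return misses

if __name__ == "__main__":
    token_experts = list(range(10))  # one token routed to 10 experts
    for k in (4, 20):
        pool = ResidentPool(k)
        print(k, [pool.route(token_experts) for _ in range(3)])
```

With a pool of 4, even a token that selects the same 10 experts every time misses on all 10, because each load evicts an expert the token still needs. With a pool of 20, only the first token pays the cold-miss cost.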

What is the v1.9 implication?

Source: sprints/v18_plus_ship/artifacts/K_SWEEP_RESULTS.md (2026-05-01, 5-prompt agg, isolated subprocess per K, mx.metal.get_peak_memory).

What does K_override on the Plus tier not mean?

K_override is sometimes invoked as a marketing term. The number cited above (K=20 at 1.59 tok/s with a 14.04 GB peak generation footprint on Outlier’s Plus tier) is the empirically measured one. If a cleaner number appears in someone’s pitch deck, ask for the provenance file that produced it; if there is no provenance file, treat the number as marketing.

Where can I read more about K_override on the Plus tier?

The K-sweep raw artifacts (k_sweep_K4.json, K20.json, K32.json, K48.json plus matching .log files) live under sprints/v18_plus_ship/artifacts/; the harness is k_sweep.py in the same directory.
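The "isolated subprocess per K" pattern named in the source line can be sketched with the standard library alone. Running each K in a fresh interpreter presumably keeps one run's peak-memory counter and caches from leaking into the next. The child program below is a stand-in and does not reproduce k_sweep.py: a real child would load the model with the given K, run the prompts, and report mx.metal.get_peak_memory(), whereas this one only echoes K back as JSON.

```python
import json
import subprocess
import sys

# Stand-in child program (hypothetical). It reports no real measurement.
CHILD = r"""
import json, sys
k = int(sys.argv[1])
print(json.dumps({"k": k, "peak_bytes": None}))
"""

def run_one(k):
    """Run one sweep point in a fresh interpreter and parse its JSON report."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD, str(k)],
        capture_output=True,
        text=True,
        check=True,  # raise if the child exits non-zero
    )
    return json.loads(proc.stdout)

def sweep(ks):
    """Run every K sequentially, one process each, keyed by K."""
    return {k: run_one(k) for k in ks}

if __name__ == "__main__":
    for k, report in sweep([4, 20, 32, 48]).items():
        print(k, report)
```

Keeping stdout as the only channel for the JSON report leaves stderr free for logs, which matches the one .json plus one .log per K layout of the artifacts.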

What other knobs were tested alongside K?

The 2026-04-30 Plus-tier work also probed cache_gb (4, 8, 12), OUTLIER_MMAP_EXPERTS (0, 1), and lazy_load (True, False). cache_gb=8 dominated 4 (cold-miss penalties) and tied 12 (no cache-hit lift at 8-prompt agg). OUTLIER_MMAP_EXPERTS=1 regressed throughput 8× on macOS and was retired. lazy_load=True was kept because the streaming-loader fix (commit 95c8cc8 on the v17 branch) made it strictly better.

The locked Plus-tier configuration is therefore K_override=20, cache_gb=8.0, OUTLIER_MMAP_EXPERTS=0, lazy_load=True; this is what ships in v1.11.469.
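For anyone re-running the sweep, the locked values can be pinned in one place and compared against a candidate configuration. The sketch below is illustrative: the class, field, and function names are hypothetical, and how Outlier actually accepts these settings (environment variable, argument, or config file) is not stated here.

```python
from dataclasses import asdict, dataclass

@dataclass(frozen=True)
class PlusTierConfig:
    k_override: int = 20
    cache_gb: float = 8.0
    mmap_experts: bool = False  # corresponds to OUTLIER_MMAP_EXPERTS=0
    lazy_load: bool = True

LOCKED = PlusTierConfig()

def drift(candidate):
    """Map each field that differs from the locked config to (locked, candidate)."""
    locked = asdict(LOCKED)
    return {
        name: (locked[name], value)
        for name, value in asdict(candidate).items()
        if value != locked[name]
    }

if __name__ == "__main__":
    print(drift(PlusTierConfig(k_override=32, cache_gb=12.0)))
```

An empty result means the candidate matches the shipped configuration; anything else names exactly which knobs moved, which is useful when a benchmark number does not reproduce.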

How does K_override on the Plus tier connect to specific tiers?

K_override is Plus-tier-only. The other six tiers either load entirely into unified memory or need not page experts; their inference path does not even consult the K_override knob.

What is the smallest configuration that exercises this concept?

You need at least 32 GB unified memory to load Plus with K_override=20. K_override=4 is testable at lower RAM but fails coherence; the locked production knob is K=20 and that is what the bench machine runs.

One unique number