The workload here is pandas-style transforms on a small dataframe. The hosted LLM APIs category answers this with a remote model and an account; Outlier answers it with the on-device lite tier. This page is the side-by-side for data cleaning workloads specifically.
For data cleaning, the deciding axis between Outlier and hosted LLM APIs is the data path. Hosted LLM APIs are pay-per-token services where each request is logged on the provider side. Outlier keeps the data cleaning prompt and response on the Mac and decodes at a rate bounded by the chip's local memory bandwidth; on the recommended lite tier, that means working out of the on-disk checkpoint with no network round-trip per turn.
Hosted LLM APIs require a network round-trip on every prompt. For a data cleaning workflow, the practical consequences are tail-latency variance (the network adds variance to each turn that the client cannot bound) and exposure to provider-side logging of the data cleaning prompts. Outlier's chat path on the lite tier issues no outbound HTTPS once the model is on disk; the only network request in the lifecycle is the one-time 5.04 GB tier download from Hugging Face.
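One way to check the tail-latency point on your own setup is to time each turn and compare the median with a high percentile. The sketch below is a minimal harness under stated assumptions: `run_turn` is a hypothetical stand-in for whatever sends one prompt and waits for the full reply (a hosted API request or a local decode), and nothing here is part of Outlier or any provider SDK.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_turns(run_turn: Callable[[], object], turns: int = 20) -> List[float]:
    """Call run_turn `turns` times and return wall-clock seconds per call."""
    latencies = []
    for _ in range(turns):
        start = time.perf_counter()
        run_turn()  # hypothetical: one prompt in, one full reply out
        latencies.append(time.perf_counter() - start)
    return latencies


def summarize(latencies: List[float]) -> Dict[str, float]:
    """Return the median, 95th percentile, and maximum latency."""
    # quantiles(n=20) yields 19 cut points; the last one is the 95th percentile.
    cuts = statistics.quantiles(latencies, n=20, method="inclusive")
    return {
        "p50": statistics.median(latencies),
        "p95": cuts[-1],
        "max": max(latencies),
    }
```

Running `summarize(time_turns(run_turn))` once per backend gives comparable numbers; a large gap between `p50` and `p95` is the tail-latency variance described above.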
If you are coming from hosted LLM APIs for data cleaning, the right starting point on Outlier is the lite tier: 5.04 GB on disk, sitting at the quality-vs-speed inflection point for data cleaning-shaped prompts. Users of hosted LLM APIs typically want what they had plus privacy; the lite tier is the closest match for that without giving up answer quality. Heavier work moves up to the higher tiers in the same app; the Quick tier's weak code performance rules it out for code-shaped data cleaning.
Moving a data cleaning workflow from hosted LLM APIs to Outlier is a one-time DMG install plus a 5.04 GB pull for the lite tier. The sign-in step that hosted LLM APIs typically require has no equivalent on the Outlier side: there is no account, no per-token meter, and no rate-limit page to redirect through. After install, the data cleaning loop goes straight from opening a prompt to local decode.
For a data cleaning workload moving off hosted LLM APIs onto the lite tier: Lite sits on a 9B base, a step up in reasoning and code over Nano while still fitting a 16 GB Mac. See the MLX explainer for the per-tier breakdown. The one formally measured Outlier accuracy figure is for Nano: HumanEval 81.1% (pass@1, full 164-problem set).
Pandas-style work is interactive: try a transform, inspect, refine. Local decode keeps the loop tight; round-trip variance drops.
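As an illustration of one pass through that try, inspect, refine loop, here is a minimal sketch of the kind of cleaning step a prompt might ask for. It uses only the Python standard library so it runs without installing anything; in pandas the same steps would map onto `Series.str.strip`, `pandas.to_numeric`, and `DataFrame.drop_duplicates`. The sample rows and the `clean` and `inspect` helpers are made up for illustration.

```python
from typing import Dict, List

# Made-up sample rows, for illustration only.
RAW_ROWS: List[Dict[str, str]] = [
    {"name": "  Ada ", "age": "36"},
    {"name": "ada", "age": "36"},
    {"name": "Grace", "age": "n/a"},
    {"name": "", "age": "41"},
]


def clean(rows: List[Dict[str, str]]) -> List[dict]:
    """Transform: normalize names, parse ages, drop blank names and duplicates."""
    seen = set()
    out = []
    for row in rows:
        name = (row.get("name") or "").strip().title()
        if not name:
            continue  # drop rows with no usable name
        age_text = (row.get("age") or "").strip()
        age = int(age_text) if age_text.isdigit() else None  # unparseable -> missing
        key = (name, age)
        if key in seen:
            continue  # drop exact duplicates after normalization
        seen.add(key)
        out.append({"name": name, "age": age})
    return out


def inspect(rows: List[dict]) -> Dict[str, int]:
    """Inspect: count rows and missing ages to decide what to refine next."""
    return {"rows": len(rows), "missing_age": sum(r["age"] is None for r in rows)}
```

Calling `inspect(clean(RAW_ROWS))` after each change to `clean` is the inspect step; the counts it returns suggest what the next refinement should handle.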
For data cleaning specifically, hosted LLM API tools share a common operational shape: a sign-in, an auth token bound to that sign-in, some kind of metered usage, and a content policy that applies to the data cleaning prompts you submit. Outlier's local-only chat path does not surface any of those: the data cleaning workflow runs against the on-disk lite tier, and no token leaves the device.
This page positions Outlier as an alternative to hosted LLM APIs for data cleaning workflows, not as a drop-in replacement. Specific product surfaces in the hosted LLM APIs category — IDE-integrated suggestions, web-based shared sessions, team-managed prompt libraries — are out of scope for the local app loop and we do not claim equivalence for those when data cleaning is part of a larger team workflow.
For data cleaning: one network round-trip per prompt with hosted LLM APIs versus zero round-trips with Outlier on the lite tier — the difference is unbounded latency variance against bandwidth-bound, repeatable local throughput.
Download Outlier for Mac
Requires Apple Silicon (M1, M2, M3, or M4); Intel Macs are not supported. macOS 12+.
Outlier runs entirely on your Mac. No prompts leave the device. macOS 12+ on Apple Silicon (arm64). Apache 2.0 model weights.