mac-code — Free Local AI Agent on Apple Silicon

Skill by ara.so — Daily 2026 Skills collection.

Run a 35B reasoning model locally on your Mac for $0/month. mac-code is a CLI AI coding agent (Claude Code alternative) that routes tasks — web search, shell commands, file edits, chat — through a local LLM. Supports llama.cpp (30 tok/s) and MLX (64K context, persistent KV cache) backends on Apple Silicon.

What It Does

LLM-as-router: The model classifies every prompt as search, shell, or chat and routes accordingly
35B MoE at 30 tok/s via llama.cpp + IQ2_M quantization (fits in 16 GB RAM)
Full Q4 35B on a 16 GB machine via the custom MoE Expert Sniper (1.54 tok/s, with only 1.42 GB of RAM in use)
9B at 64K context via quantized KV cache (q4_0 keys/values)
MLX backend adds persistent KV cache save/load, context compression, R2 sync
Tools: DuckDuckGo search, shell execution, file read/write
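
The LLM-as-router idea above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not mac-code's actual implementation: the prompt wording, the names `classify`, `route`, `fake_llm`, and `HANDLERS`, and the fallback to chat on an unusable reply are all hypothetical. The `llm` argument stands in for whatever callable sends text to the local model (llama.cpp or MLX) and returns its reply.

```python
from typing import Callable, Dict

# The three routes named in the feature list above.
ROUTES = ("search", "shell", "chat")

# Hypothetical classification prompt; the real wording is not given in this document.
ROUTER_PROMPT = (
    "Classify the user request as exactly one word: search, shell, or chat.\n"
    "Request: {prompt}\nLabel:"
)


def classify(prompt: str, llm: Callable[[str], str]) -> str:
    """Ask the model for a route label; fall back to 'chat' if the reply is unusable."""
    reply = llm(ROUTER_PROMPT.format(prompt=prompt)).strip().lower()
    words = reply.split()
    first = words[0].strip(".,:;!\"'") if words else ""
    return first if first in ROUTES else "chat"


def route(prompt: str, llm: Callable[[str], str],
          handlers: Dict[str, Callable[[str], str]]) -> str:
    """Classify the prompt, then hand it to the matching handler."""
    return handlers[classify(prompt, llm)](prompt)


def fake_llm(text: str) -> str:
    """Stand-in for the local model so the sketch runs without a backend."""
    request = text.split("Request: ", 1)[1].lower()
    if "latest" in request:
        return "search"
    if "list files" in request:
        return "Shell."
    return "I am not sure"


# Placeholder handlers; real ones would call DuckDuckGo, a shell, or the chat model.
HANDLERS: Dict[str, Callable[[str], str]] = {
    "search": lambda p: f"[search] {p}",
    "shell": lambda p: f"[shell] {p}",
    "chat": lambda p: f"[chat] {p}",
}

if __name__ == "__main__":
    for request in ("latest llama.cpp release", "list files in this folder", "hello"):
        print(route(request, fake_llm, HANDLERS))
```

Normalizing the reply and defaulting to chat keeps a malformed label from ever triggering a shell command, which seems the safer choice when a small quantized model is doing the classification.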

