TokenOptimizer

When to Use

You have a coding task and want to send it to an LLM API with less context (fewer tokens, lower cost).
You want automatic fallback from a cheap provider to a more capable one when credits run out.
You want local LLM preprocessing to score relevance and compress context before it hits a paid API.
You need to stay within a token budget while keeping the most important context.

Quick Start

# Optimize a prompt with default strategies
token_optimizer optimize --input "Fix the bug in auth" --context src/auth.rs

# Analyze cache potential for Anthropic
token_optimizer cache-optimize --task "Add feature" --context types.rs --static-indices "0"

# Launch interactive shell (auto-selects provider)
token_optimizer interactive

# Show current config
token_optimizer config show primary

Capabilities

Capability	Description
StripWhitespace	Remove redundant whitespace, preserving code blocks
RemoveComments	Strip `//`, `/* */`, `#` comments from code
TruncateContext	Boundary-aware truncation using tiktoken token counts and priority-based boundary detection (code structure > paragraph > sentence > line > word)
Abbreviate	Shorten common programming terms in task text
LlmCompress	Compress context via local Ollama LLM
RelevanceFilter	Hybrid keyword + LLM relevance scoring; works without local LLM via keyword-only mode
ExtractSignatures	Keep only function/class/struct signatures
Deduplicate	Remove exact, whitespace-normalized, and near-duplicate context items
CachePrompting	Anthropic-compatible cache breakpoints for static content
Provider Fallback	Automatic primary -> fallback -> local provider pipeline

token-optimizer

TokenOptimizer

When to Use

Quick Start

Capabilities