cost-tracking
Cost Tracking Framework
When This Activates
This skill activates when:
- The user asks about API costs or spending
- The user raises concerns about expensive operations
- Token usage needs to be optimized
Token Cost Reference
Claude Pricing (Approximate)
| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| Opus | ~$15 | ~$75 |
| Sonnet | ~$3 | ~$15 |
| Haiku | ~$0.25 | ~$1.25 |
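To make the table concrete, here is a minimal sketch of how a per-request cost could be computed from these approximate prices. The function name `estimate_cost` and the dictionary layout are illustrative assumptions, not part of claude-dash, and the prices are only as accurate as the table above.

```python
# Approximate USD prices per 1M tokens (input, output), copied from the table above.
PRICES_PER_MILLION = {
    "opus": (15.00, 75.00),
    "sonnet": (3.00, 15.00),
    "haiku": (0.25, 1.25),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the approximate USD cost of one request (hypothetical helper)."""
    input_price, output_price = PRICES_PER_MILLION[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # 10K input tokens and 2K output tokens on Sonnet.
    print(f"${estimate_cost('sonnet', 10_000, 2_000):.4f}")  # prints $0.0600
```

Output tokens cost five times as much as input tokens on every model listed, so a short prompt with a long answer can cost more than a long prompt with a short answer.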
Typical Operation Costs
| Operation | Tokens | Approximate Cost |
|---|---|---|
| Simple question | 500-2K | $0.01-0.05 |
| File read + analysis | 2-10K | $0.05-0.25 |
| Code generation | 5-20K | $0.15-0.50 |
| Multi-file refactor | 20-100K | $0.50-2.50 |
| Long conversation | 50-200K | $1.00-5.00 |
Cost Optimization Strategies
1. Route to Local LLM (FREE)
Use local_ask for simple tasks:
# FREE - no API cost
local_ask question="where is the login function?"
local_ask question="explain this error" mode=explain
local_review file="src/auth.ts" focus=bugs
Good for local:
- Simple lookups ("where is X?")
- Code explanations
- Commit message generation
- Quick code reviews
2. Use Memory Tools First
Queries against pre-indexed memory return immediately and incur no API cost:
# Instant, no API cost
memory_query "authentication flow"
memory_functions name="handleLogin"
smart_read path="src/auth.ts" detail=summary
3. Reduce Context Size
- Use smart_read with detail=summary before detail=full
- Truncate large files to relevant sections
- Clear conversation when changing topics
4. Batch Related Questions
Instead of 5 separate messages, combine:
"Can you: 1) explain the auth flow, 2) find the login
component, 3) check for security issues, and 4) suggest
improvements?"
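The saving from batching comes from how conversations are billed: each new turn resends the accumulated history as input. The sketch below compares total input tokens for separate versus batched questions. It assumes every turn resends the full history and ignores any prompt caching; the function names and token counts are hypothetical.

```python
def input_tokens_separate(context_tokens: int, question_tokens: int,
                          answer_tokens: int, questions: int) -> int:
    """Total input tokens when each question is its own turn.

    Assumes every turn resends the shared context plus all earlier
    questions and answers (no prompt caching).
    """
    total = 0
    history = context_tokens
    for _ in range(questions):
        history += question_tokens   # the new question joins the history
        total += history             # the whole history is billed as input
        history += answer_tokens     # the answer is resent on later turns
    return total


def input_tokens_batched(context_tokens: int, question_tokens: int,
                         questions: int) -> int:
    """Total input tokens when all questions go out in a single message."""
    return context_tokens + questions * question_tokens


if __name__ == "__main__":
    # Hypothetical sizes: 10K shared context, 50-token questions, 500-token answers.
    print(input_tokens_separate(10_000, 50, 500, 4))  # 43500
    print(input_tokens_batched(10_000, 50, 4))        # 10200
```

Under these assumptions, four separate turns bill roughly four times the input tokens of one batched message, because the shared context is sent four times instead of once.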
Gateway Metrics
Check current efficiency:
gateway_metrics format=summary
Returns:
- Cache hit rate
- Token savings
- Routing breakdown (local vs API)
Cost Estimation
Before expensive operations:
This refactor will touch ~20 files.
Estimated cost: $0.50-1.00
Proceed? [Y/n]
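A confirmation step like the one above could be sketched as follows. Both helpers are hypothetical illustrations rather than claude-dash functions; the only convention assumed is that the capital `Y` in `[Y/n]` marks yes as the default for an empty answer.

```python
def format_cost_prompt(file_count: int, low_usd: float, high_usd: float) -> str:
    """Build the confirmation text shown before an expensive operation."""
    return (
        f"This refactor will touch ~{file_count} files.\n"
        f"Estimated cost: ${low_usd:.2f}-{high_usd:.2f}\n"
        "Proceed? [Y/n]"
    )


def is_confirmed(answer: str) -> bool:
    """Treat an empty answer as yes, matching the capital Y default in [Y/n]."""
    return answer.strip().lower() in ("", "y", "yes")


if __name__ == "__main__":
    print(format_cost_prompt(20, 0.50, 1.00))
    if is_confirmed(input("> ")):
        print("Proceeding.")
    else:
        print("Cancelled.")
```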
Budget Awareness
Daily Patterns
- Morning: Fresh context, lower cost
- Long sessions: Context grows, higher cost
- After compaction: Reset context, lower cost
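The pattern above can be illustrated with a small model of context size per turn. This is a rough sketch under simplifying assumptions: every turn adds a fixed number of tokens, the whole context is sent as input each turn, and compaction shrinks the context to a fixed summary size. The function name and all numbers are hypothetical.

```python
from typing import List, Optional


def per_turn_input_tokens(turns: int, tokens_added_per_turn: int,
                          compact_every: Optional[int] = None,
                          compacted_tokens: int = 0) -> List[int]:
    """Return the context size (input tokens) sent on each turn."""
    sizes = []
    context = 0
    for turn in range(1, turns + 1):
        context += tokens_added_per_turn
        sizes.append(context)
        # After compaction, the next turn starts from the smaller summary.
        if compact_every and turn % compact_every == 0:
            context = compacted_tokens
    return sizes


if __name__ == "__main__":
    without = per_turn_input_tokens(6, 2_000)
    with_compaction = per_turn_input_tokens(6, 2_000, compact_every=3,
                                            compacted_tokens=1_000)
    print(without, sum(without))                  # total 42000
    print(with_compaction, sum(with_compaction))  # total 27000
```

Without compaction, the input billed per turn grows linearly, so the session total grows roughly quadratically with the number of turns.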
High-Cost Triggers
- "Analyze entire codebase"
- "Review all files in directory"
- "Generate comprehensive documentation"
- Very long conversations (>50 turns)
Saving Tips
- Start fresh for new topics - Don't carry irrelevant context
- Use subagents - They have focused context
- Check memory first - Summaries save full file reads
- Compress transcripts - Archived sessions are compressed
- Use the local LLM for simple tasks - Ollama is free
Monitoring Commands
# Check gateway efficiency
python3 ~/.claude-dash/learning/efficiency_tracker.py --report
# View session sizes
du -sh ~/.claude-dash/sessions/*