Agent Cost Optimizer

Overview

agent-cost-optimizer provides comprehensive cost tracking, budget enforcement, and ROI measurement for AI agent operations.

Purpose: Control and optimize AI spending while maximizing value delivered

Pattern: Task-based (7 operations for cost management)

Key Innovation: Real-time cost tracking with automatic budget enforcement and cost-effective fallbacks

Industry Context (2025):

  • Average AI spending: $85,521/month (36% YoY increase)
  • Only 50% of organizations can measure AI ROI
  • IT teams struggle with hidden costs

Solution: Comprehensive cost management from tracking to optimization


When to Use

Use agent-cost-optimizer when:

  • Tracking AI costs across skills
  • Preventing budget overruns
  • Measuring ROI (cost vs. value delivered)
  • Optimizing model selection (Opus vs. Sonnet vs. Haiku)
  • Planning AI budgets
  • Cost-effective development
  • Enterprise cost accountability

Prerequisites

Required

  • AI API access (Anthropic, OpenAI, Google)
  • Cost tracking capability (API usage data)

Optional

  • Paid.ai or AgentOps integration (advanced cost tracking)
  • Prometheus/Grafana (cost visualization)
  • Budget approval workflow

Cost Operations

Operation 1: Track Token Usage

Purpose: Monitor token consumption per skill invocation

Process:

  1. Initialize Tracking:

    {
      "tracking_id": "track_20250126_1200",
      "skill": "multi-ai-verification",
      "started_at": "2025-01-26T12:00:00Z",
      "tokens": {
        "prompt": 0,
        "completion": 0,
        "total": 0
      },
      "cost": {
        "amount_usd": 0.00,
        "model": "claude-sonnet-4-5"
      }
    }
    
  2. Track During Execution:

    // After each AI call
    trackTokens({
      prompt_tokens: response.usage.input_tokens,
      completion_tokens: response.usage.output_tokens,
      model: 'claude-sonnet-4-5'
    });
    
    // Update running totals
    
  3. Finalize Tracking:

    {
      "tracking_id": "track_20250126_1200",
      "skill": "multi-ai-verification",
      "completed_at": "2025-01-26T12:45:00Z",
      "duration_minutes": 45,
      "tokens": {
        "prompt": 15234,
        "completion": 8932,
        "total": 24166
      },
      "cost": {
        "amount_usd": 0.073,
        "model": "claude-sonnet-4-5",
        "rate": "$3 per million tokens"
      }
    }
    
  4. Save to Cost Log:

    # Append to daily cost log (one compact JSON object per line)
    jq -c . tracking.json >> .cost-tracking/$(date +%Y-%m-%d).json
    

Outputs:

  • Token usage per invocation
  • Cost per invocation
  • Model used
  • Daily/monthly aggregates

Validation:

  • Tracking initialized
  • Tokens counted accurately
  • Cost calculated correctly
  • Logs saved

Time Estimate: Automatic (integrated into skills)
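
A minimal sketch of this tracking lifecycle in Node.js is shown below. The startTracking, trackTokens, and finalizeTracking names are illustrative rather than a fixed API; the pricing map mirrors the per-model rates in Operation 2, and log entries are appended one JSON object per line as in step 4.

// Minimal per-invocation token tracker (illustrative; names are not a fixed API).
const fs = require('fs');
const path = require('path');

const PRICING_PER_MTOK = {              // mirrors Operation 2 (USD per million tokens)
  'claude-opus-4-5': { input: 15, output: 75 },
  'claude-sonnet-4-5': { input: 3, output: 15 },
  'claude-haiku-4-5': { input: 0.80, output: 4 }
};

function startTracking(skill, model) {
  return {
    tracking_id: `track_${Date.now()}`,
    skill,
    started_at: new Date().toISOString(),
    tokens: { prompt: 0, completion: 0, total: 0 },
    cost: { amount_usd: 0, model }
  };
}

function trackTokens(tracking, usage) {
  // usage comes from the provider response, e.g. { input_tokens, output_tokens }
  tracking.tokens.prompt += usage.input_tokens;
  tracking.tokens.completion += usage.output_tokens;
  tracking.tokens.total = tracking.tokens.prompt + tracking.tokens.completion;

  const rates = PRICING_PER_MTOK[tracking.cost.model];
  tracking.cost.amount_usd =
    (tracking.tokens.prompt / 1_000_000) * rates.input +
    (tracking.tokens.completion / 1_000_000) * rates.output;
}

function finalizeTracking(tracking, dir = '.cost-tracking') {
  tracking.completed_at = new Date().toISOString();
  fs.mkdirSync(dir, { recursive: true });
  const file = path.join(dir, `${new Date().toISOString().slice(0, 10)}.json`);
  fs.appendFileSync(file, JSON.stringify(tracking) + '\n'); // one compact object per line
  return tracking;
}

A skill would call startTracking once per invocation, trackTokens after every AI call, and finalizeTracking when the invocation completes.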


Operation 2: Calculate Costs

Purpose: Compute accurate costs based on provider pricing

Pricing (2025 rates):

Anthropic (Claude):

| Model | Input (per MTok) | Output (per MTok) |
|---|---|---|
| Claude Opus 4.5 | $15 | $75 |
| Claude Sonnet 4.5 | $3 | $15 |
| Claude Haiku 4.5 | $0.80 | $4 |

OpenAI (Codex):

| Model | Input (per MTok) | Output (per MTok) |
|---|---|---|
| GPT-5.1-codex | $5 | $15 |
| o3 | $10 | $40 |
| o4-mini | $1.50 | $6 |

Google (Gemini):

| Model | Input (per MTok) | Output (per MTok) |
|---|---|---|
| Gemini 2.5 Pro | $1.25 | $5 |
| Gemini 2.5 Flash | $0.15 | $0.60 |

Process:

function calculateCost(usage, model) {
  const pricing = {
    'claude-sonnet-4-5': { input: 3, output: 15 },
    'claude-haiku-4-5': { input: 0.80, output: 4 },
    'claude-opus-4-5': { input: 15, output: 75 },
    // ... more models
  };

  const rates = pricing[model];
  const inputCost = (usage.prompt_tokens / 1_000_000) * rates.input;
  const outputCost = (usage.completion_tokens / 1_000_000) * rates.output;

  return {
    input_cost: inputCost,
    output_cost: outputCost,
    total_cost: inputCost + outputCost,
    currency: 'USD'
  };
}
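
As a quick check of the function above (token counts are made up):

const cost = calculateCost(
  { prompt_tokens: 10_000, completion_tokens: 2_000 },
  'claude-sonnet-4-5'
);
// → { input_cost: 0.03, output_cost: 0.03, total_cost: 0.06, currency: 'USD' }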

Outputs:

  • Accurate cost per invocation
  • Model-specific pricing
  • Input vs. output cost breakdown

Operation 3: Enforce Budget Caps

Purpose: Prevent exceeding monthly budget limits

Process:

  1. Set Budget:

    {
      "monthly_budget_usd": 100,
      "skill_budgets": {
        "multi-ai-verification": 70,
        "multi-ai-research": 20,
        "multi-ai-testing": 10
      },
      "alert_thresholds": {
        "warning": 0.80,
        "critical": 0.95
      }
    }
    
  2. Check Before Operation:

    async function checkBudget(skill, estimated_cost) {
      const usage = getCurrentMonthUsage();
      const remaining = budget.monthly_budget_usd - usage.total_cost;
    
      if (estimated_cost > remaining) {
        return {
          allowed: false,
          reason: `Budget exceeded: $${usage.total_cost}/$${budget.monthly_budget_usd}`,
          overage: estimated_cost - remaining
        };
      }
    
      if (usage.total_cost / budget.monthly_budget_usd > 0.80) {
        return {
          allowed: true,
          warning: `80% of monthly budget used ($${usage.total_cost}/$${budget.monthly_budget_usd})`
        };
      }
    
      return { allowed: true };
    }
    
  3. Enforce:

    const budgetCheck = await checkBudget('multi-ai-verification', 0.50);
    
    if (!budgetCheck.allowed) {
      console.log(`❌ Budget exceeded: ${budgetCheck.reason}`);
      console.log(`Options:`);
      console.log(`  1. Use cheaper model (Sonnet → Haiku)`);
      console.log(`  2. Skip optional verification layers`);
      console.log(`  3. Request budget increase`);
      return;
    }
    
    if (budgetCheck.warning) {
      console.log(`⚠️ ${budgetCheck.warning}`);
    }
    
    // Proceed with operation
    

Outputs:

  • Budget status checked
  • Operations blocked if over budget
  • Warnings at 80%
  • Cost-effective alternatives suggested
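
The checkBudget sketch above assumes a getCurrentMonthUsage helper. One minimal way to implement it, assuming the one-object-per-line daily logs written in Operation 1, step 4:

// Sum this month's spend from the daily cost logs (assumes one JSON object per line).
const fs = require('fs');
const path = require('path');

function getCurrentMonthUsage(dir = '.cost-tracking') {
  const monthPrefix = new Date().toISOString().slice(0, 7); // e.g. "2025-01"
  let total_cost = 0;
  let invocations = 0;

  if (!fs.existsSync(dir)) return { total_cost, invocations };

  for (const file of fs.readdirSync(dir)) {
    if (!file.startsWith(monthPrefix)) continue;
    const lines = fs.readFileSync(path.join(dir, file), 'utf8').split('\n');
    for (const line of lines) {
      if (!line.trim()) continue;
      const entry = JSON.parse(line);
      total_cost += entry.cost.amount_usd;
      invocations += 1;
    }
  }
  return { total_cost, invocations };
}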

Operation 4: Optimize Model Selection

Purpose: Choose cost-effective model for each task

Decision Matrix:

| Task Type | Recommended Model | Cost | Rationale |
|---|---|---|---|
| Simple verification (Layer 1-2) | Haiku | $ | Rules-based, fast, cheap |
| Code generation | Sonnet | $$ | Balanced quality/cost |
| Complex reasoning (architecture) | Opus | $$$ | Best quality, worth premium |
| LLM-as-judge | Sonnet or external model | $$ | Good judgment, reasonable cost |
| Test generation | Sonnet | $$ | Comprehensive coverage needed |
| Research | Sonnet/Haiku mix | $-$$ | Haiku for search, Sonnet for synthesis |

Auto-Optimization:

function selectModel(task_type, criticality, budget_remaining) {
  // Critical + budget OK → Use Opus
  if (criticality === 'critical' && budget_remaining > 20) {
    return 'claude-opus-4-5';
  }

  // Budget low or simple task → Use Haiku
  // (checked before the standard case so a depleted budget downgrades non-critical work)
  if (budget_remaining < 5 || task_type === 'simple') {
    return 'claude-haiku-4-5';
  }

  // Standard → Use Sonnet
  if (criticality === 'standard') {
    return 'claude-sonnet-4-5';
  }

  return 'claude-sonnet-4-5'; // Default
}
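
A few illustrative calls (task types and budget values are hypothetical):

selectModel('architecture', 'critical', 45);    // → 'claude-opus-4-5'
selectModel('simple', 'low', 3);                // → 'claude-haiku-4-5'
selectModel('code_generation', 'standard', 60); // → 'claude-sonnet-4-5'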

Outputs:

  • Optimal model selected
  • Cost minimized
  • Quality maintained

Operation 5: Cost-Effective Caching

Purpose: Avoid re-computing identical operations

Process:

  1. Cache Key Generation:

    const crypto = require('crypto');

    function generateCacheKey(operation, inputs) {
      // Hash the inputs to create a unique, deterministic key
      const content_hash = crypto
        .createHash('sha256')
        .update(JSON.stringify(inputs))
        .digest('hex');

      return `${operation}_${content_hash}`;
    }
    
  2. Check Cache Before Operation:

    const cacheKey = generateCacheKey('verify_code', {
      files: ['src/auth.ts'],
      file_hashes: {'src/auth.ts': 'abc123'}
    });
    
    const cached = readCache(cacheKey);
    
    if (cached && !isExpired(cached, 24)) {
      // Use cached result
      console.log('📦 Using cached verification result');
      return cached.result;
    }
    
    // Cache miss → run verification
    const result = await runVerification();
    
    // Save to cache
    saveCache(cacheKey, result, { ttl_hours: 24 });
    
  3. Cache Structure:

    {
      "cache_key": "verify_code_abc123def456",
      "created_at": "2025-01-26T12:00:00Z",
      "expires_at": "2025-01-27T12:00:00Z",
      "inputs": {
        "files": ["src/auth.ts"],
        "file_hashes": {"src/auth.ts": "abc123"}
      },
      "result": {
        "quality_score": 92,
        "layers_passed": 5,
        "issues": []
      },
      "cost_saved": 0.073
    }
    

Outputs:

  • 90% reduction in re-verification costs
  • Instant results for unchanged code
  • Cache hit/miss tracking

Validation:

  • Cache key correctly identifies identical operations
  • File changes invalidate cache
  • Expired cache not used
  • Cost savings tracked
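
The readCache, saveCache, and isExpired helpers used above are not defined elsewhere in this skill; a minimal file-backed sketch, assuming a hypothetical .ai-cache/ directory, could look like this:

// Minimal file-backed cache (illustrative; directory name and API are assumptions).
const fs = require('fs');
const path = require('path');

const CACHE_DIR = '.ai-cache';

function cachePath(cacheKey) {
  return path.join(CACHE_DIR, `${cacheKey}.json`);
}

function readCache(cacheKey) {
  const file = cachePath(cacheKey);
  return fs.existsSync(file) ? JSON.parse(fs.readFileSync(file, 'utf8')) : null;
}

function isExpired(entry, ttlHours) {
  const ageHours = (Date.now() - new Date(entry.created_at).getTime()) / 3_600_000;
  return ageHours > ttlHours;
}

function saveCache(cacheKey, result, { ttl_hours = 24 } = {}) {
  fs.mkdirSync(CACHE_DIR, { recursive: true });
  const entry = {
    cache_key: cacheKey,
    created_at: new Date().toISOString(),
    expires_at: new Date(Date.now() + ttl_hours * 3_600_000).toISOString(),
    result
  };
  fs.writeFileSync(cachePath(cacheKey), JSON.stringify(entry, null, 2));
  return entry;
}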

Operation 6: Measure ROI

Purpose: Calculate return on investment for AI spending

Process:

  1. Track Time Saved:

    {
      "task": "Implement user authentication",
      "without_ai": {
        "estimated_hours": 40,
        "developer_rate": 100,
        "total_cost": 4000
      },
      "with_ai": {
        "actual_hours": 11.3,
        "developer_rate": 100,
        "developer_cost": 1130,
        "ai_cost": 2.50,
        "total_cost": 1132.50
      },
      "roi": {
        "time_saved_hours": 28.7,
        "cost_saved": 2867.50,
        "roi_percentage": 253,
        "payback_period_hours": 0.025
      }
    }
    
  2. Calculate ROI:

    function calculateROI(task) {
      const time_saved = task.without_ai.estimated_hours - task.with_ai.actual_hours;
      const cost_saved = task.without_ai.total_cost - task.with_ai.total_cost;
      const roi_percentage = (cost_saved / task.with_ai.ai_cost) * 100;
    
      return {
        time_saved_hours: time_saved,
        cost_saved_usd: cost_saved,
        roi_percentage: roi_percentage,
        payback_period: task.with_ai.ai_cost / (task.without_ai.developer_rate * (time_saved / 40)) // weeks
      };
    }
    
  3. Monthly ROI Report:

    # Monthly ROI Report - January 2025
    
    ## AI Spending
    - Total AI costs: $87.50
    - Breakdown:
      - multi-ai-verification: $62.30 (71%)
      - multi-ai-research: $18.40 (21%)
      - multi-ai-testing: $6.80 (8%)
    
    ## Time Savings
    - Tasks completed: 8
    - Total hours saved: 156 hours
    - Average savings per task: 19.5 hours
    
    ## Cost Savings
    - Developer cost avoided: $15,600 (156h × $100/h)
    - AI costs: $87.50
    - Net savings: $15,512.50
    
    ## ROI
    - ROI: 17,728% ($177 saved per $1 spent)
    - Payback period: 0.03 weeks (immediate)
    - Value multiplier: 178x
    
    ## Recommendations
    - ✅ Current spending highly cost-effective
    - Continue using AI for all qualifying tasks
    - Consider increasing budget (high ROI)
    

Outputs:

  • Comprehensive ROI analysis
  • Time and cost savings quantified
  • Monthly reports
  • Business justification for AI spending
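
Applying calculateROI to the authentication example from step 1:

const roi = calculateROI({
  without_ai: { estimated_hours: 40, developer_rate: 100, total_cost: 4000 },
  with_ai: { actual_hours: 11.3, ai_cost: 2.50, total_cost: 1132.50 }
});
// → time_saved_hours: 28.7, cost_saved_usd: 2867.5,
//   roi_percentage: 114700, payback_period: ~0.03 (weeks)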

Operation 7: Predict Costs

Purpose: Estimate costs before starting expensive operations

Process:

  1. Historical Data:

    {
      "operation": "multi-ai-verification",
      "mode": "all_5_layers",
      "historical_costs": [
        {"date": "2025-01-15", "tokens": 24166, "cost": 0.073},
        {"date": "2025-01-18", "tokens": 21893, "cost": 0.066},
        {"date": "2025-01-20", "tokens": 26543, "cost": 0.080}
      ],
      "avg_cost": 0.073,
      "std_dev": 0.007
    }
    
  2. Predict Before Operation:

    const prediction = predictCost('multi-ai-verification', {
      mode: 'all_5_layers',
      code_size_lines: 850
    });
    
    console.log(`💰 Estimated cost: $${prediction.estimated_cost} ± $${prediction.std_dev}`);
    console.log(`   Range: $${prediction.min_cost} - $${prediction.max_cost}`);
    console.log(`   Confidence: ${prediction.confidence}%`);
    
    // Check budget
    if (prediction.estimated_cost > budget_remaining) {
      console.log(`⚠️ Estimated cost exceeds remaining budget`);
      console.log(`Options:`);
      console.log(`  1. Use Haiku (estimated: $${prediction.estimated_cost * 0.27})`);
      console.log(`  2. Skip Layer 5 (save ~60%: $${prediction.estimated_cost * 0.4})`);
      console.log(`  3. Increase budget`);
    }
    

Outputs:

  • Cost predictions with confidence intervals
  • Budget impact assessment
  • Cost-effective alternatives suggested
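
predictCost is referenced above but not defined; a naive sketch based on the historical mean and sample standard deviation is shown below. It takes the historical records explicitly as a third argument and ignores code_size_lines, which a fuller model would weight in.

// Naive cost predictor from historical runs of the same operation/mode (illustrative).
function predictCost(operation, options, history) {
  const costs = history.map(h => h.cost);
  const mean = costs.reduce((sum, c) => sum + c, 0) / costs.length;
  const variance =
    costs.length > 1
      ? costs.reduce((sum, c) => sum + (c - mean) ** 2, 0) / (costs.length - 1)
      : 0;
  const std_dev = Math.sqrt(variance);

  return {
    estimated_cost: Number(mean.toFixed(3)),
    std_dev: Number(std_dev.toFixed(3)),
    min_cost: Number(Math.max(0, mean - 2 * std_dev).toFixed(3)),
    max_cost: Number((mean + 2 * std_dev).toFixed(3)),
    confidence: 95 // ±2 standard deviations, assuming roughly normal history
  };
}

predictCost('multi-ai-verification', { mode: 'all_5_layers' }, [
  { cost: 0.073 }, { cost: 0.066 }, { cost: 0.080 }
]);
// → { estimated_cost: 0.073, std_dev: 0.007, min_cost: 0.059, max_cost: 0.087, confidence: 95 }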

Cost Optimization Strategies

Strategy 1: Model Selection

Baseline (All Sonnet): $100/month

Optimized (Smart selection):

  • Layer 1-2: Haiku ($20)
  • Layer 3-4: Sonnet ($40)
  • Layer 5: Sonnet ($30)
  • Total: $90/month (10% savings)

Aggressive (Maximum savings):

  • Layer 1-2: Haiku ($20)
  • Layer 3-4: Haiku ($15)
  • Layer 5: Sonnet ($30)
  • Total: $65/month (35% savings)

Trade-off: Some quality reduction at Layers 3-4


Strategy 2: Caching

Without Caching: Re-verify same code multiple times

With Caching (24-hour TTL):

  • First verification: $0.073
  • Same code next 24h: $0 (cache hit)
  • Savings: 90% on unchanged code

Implementation:

const cacheKey = hash(files_to_verify);
const cached = readCache(cacheKey); // same readCache/isExpired helpers as in Operation 5

if (cached && !isExpired(cached, 24)) {
  return cached.result; // $0 cost
}

const result = await verify(); // $0.073 cost
saveCache(cacheKey, result);

Strategy 3: Ensemble Optimization

Baseline (Always 5-agent ensemble):

  • 5 agents × $0.073 = $0.365 per verification

Optimized (Conditional ensemble):

  • Critical features: 5 agents ($0.365)
  • Standard features: 3 agents ($0.219)
  • Simple changes: 1 agent ($0.073)
  • Average savings: 60%

Decision Logic:

function shouldUseEnsemble(criticality, code_size, budget) {
  if (criticality === 'critical') return 5;
  if (criticality === 'standard' && budget > 20) return 3;
  return 1; // Simple changes or low remaining budget: single agent
}

Strategy 4: Layer Skipping

Full Verification (All 5 layers): ~$0.073

Fast-Track (Layers 1-2 only):

  • Rules + Functional only
  • ~$0.015 (80% savings)
  • Use for: minor changes, docs, config

Decision:

let mode, estimated_cost;

if (lines_changed < 50 && files_changed.every(f => !isCritical(f))) {
  // Fast-track: Layers 1-2 only
  mode = 'fast_track';
  estimated_cost = 0.015;
} else {
  // Full verification
  mode = 'standard';
  estimated_cost = 0.073;
}

Budget Management

Monthly Budget Planning

Sample Budget ($100/month):

{
  "monthly_budget_usd": 100,
  "allocation": {
    "multi-ai-verification": {
      "budget": 70,
      "rationale": "Most expensive (LLM-as-judge)"
    },
    "multi-ai-research": {
      "budget": 20,
      "rationale": "Occasional use, tri-AI"
    },
    "multi-ai-testing": {
      "budget": 10,
      "rationale": "Mostly automated"
    },
    "buffer": 10
  },
  "assumptions": {
    "features_per_month": 8,
    "verifications_per_feature": 1.5,
    "research_per_month": 2
  }
}

Budget Tracking

Daily:

# Check today's spending
cat .cost-tracking/$(date +%Y-%m-%d).json | jq -s 'map(.cost.amount_usd) | add'

# → 3.45 ($3.45 spent today)

Monthly:

# Check month-to-date
cat .cost-tracking/2025-01-*.json | jq -s 'map(.cost.amount_usd) | add'

# → 67.80 ($67.80 this month, 68% of budget)

Projection:

// Project end-of-month
const days_elapsed = 26;
const days_in_month = 31;
const current_spend = 67.80;

const projected = (current_spend / days_elapsed) * days_in_month;
// = $80.84 projected (within budget ✅)
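
The same projection can be derived from the calendar instead of hard-coded day counts (a small hypothetical helper):

// Project end-of-month spend from month-to-date spend and today's date.
function projectEndOfMonth(currentSpend, now = new Date()) {
  const daysElapsed = now.getUTCDate();
  const daysInMonth = new Date(Date.UTC(now.getUTCFullYear(), now.getUTCMonth() + 1, 0)).getUTCDate();
  return (currentSpend / daysElapsed) * daysInMonth;
}

projectEndOfMonth(67.80, new Date(Date.UTC(2025, 0, 26))); // ≈ 80.84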

Cost Alerting

Alert Levels

80% Budget (Warning):

⚠️  BUDGET ALERT: 80% Used

**Current**: $80.00 / $100.00 (80%)
**Remaining**: $20.00
**Days left**: 5

**Projected EOM**: ~$95 (within budget)

**Recommendations**:
- Monitor spending closely
- Use Haiku for simple tasks
- Cache aggressively
- Skip optional layers where safe

95% Budget (Critical):

🚨 CRITICAL: 95% Budget Used

**Current**: $95.00 / $100.00 (95%)
**Remaining**: $5.00
**Days left**: 5

**Projected EOM**: $110 (OVER BUDGET)

**Actions Required**:
1. Pause non-critical verifications
2. Use Haiku exclusively
3. Request budget increase OR
4. Defer work to next month

**Auto-throttling**: Enabled
- Only critical operations allowed
- All optional layers disabled
- Ensemble verification disabled

Budget Exceeded:

❌ BUDGET EXCEEDED

**Current**: $102.50 / $100.00 (102.5%)
**Overage**: $2.50

**Operations BLOCKED** until:
1. Budget increased OR
2. Next month (resets automatically)

**Emergency Override**: Requires approval
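
These levels map directly onto the alert_thresholds from Operation 3. A minimal sketch of the mapping (function name is illustrative):

// Map budget utilization to an alert level (thresholds from Operation 3's config).
function getAlertLevel(spendToDate, monthlyBudget, thresholds = { warning: 0.80, critical: 0.95 }) {
  const used = spendToDate / monthlyBudget;
  if (used > 1.0) return 'blocked';                    // over budget: block operations
  if (used >= thresholds.critical) return 'critical';  // throttle, Haiku only
  if (used >= thresholds.warning) return 'warning';    // optimize aggressively
  return 'normal';
}

getAlertLevel(80, 100);    // → 'warning'
getAlertLevel(95, 100);    // → 'critical'
getAlertLevel(102.5, 100); // → 'blocked'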

ROI Calculation Framework

Value Metrics

Quantifiable Value:

  • Time saved (hours)
  • Cost saved (developer time avoided)
  • Quality improvement (fewer bugs in production)
  • Faster time-to-market (days)

Formula:

ROI = ((Value Delivered - AI Costs) / AI Costs) × 100%

Example:

Feature without AI: 40 hours × $100/hour = $4,000
Feature with AI: 11.3 hours × $100/hour ($1,130) + $2.50 AI = $1,132.50

Value Delivered (developer cost avoided) = $4,000 - $1,130 = $2,870
AI Costs = $2.50
ROI = (($2,870 - $2.50) / $2.50) × 100% ≈ 114,700%

Monthly Reporting

Template:

# AI ROI Report - January 2025

## Summary
- **AI Spending**: $87.50
- **Value Delivered**: $15,600 (156 hours saved × $100/hour)
- **ROI**: 17,728%
- **Payback**: Immediate

## Details

### Features Delivered (8)
1. User authentication - 11.3h (was 40h), ROI: ~114,700%
2. Payment integration - 8.5h (was 32h), ROI: 108,235%
[... more ...]

### Cost Breakdown
- Verification: $62.30 (71%) - Highest cost, highest value
- Research: $18.40 (21%) - Occasional, high impact
- Testing: $6.80 (8%) - Mostly automated, low cost

### Savings
- Time: 156 hours saved
- Cost: $15,512.50 net savings
- Quality: 4 bugs prevented (saved ~20 hours)

### Recommendations
- ✅ ROI is excellent (17,728%)
- Consider increasing budget (high returns)
- Current spending optimal

Quick Reference

Cost Operations

| Operation | Purpose | Time | Automation |
|---|---|---|---|
| Track | Monitor token usage | Automatic | 100% |
| Calculate | Compute costs | Automatic | 100% |
| Enforce | Budget caps | Automatic | 100% |
| Optimize | Model selection | Semi-auto | 70% |
| Cache | Avoid re-compute | Automatic | 100% |
| Measure ROI | Value analysis | Manual | 30% |
| Predict | Cost estimation | Automatic | 90% |

Cost Optimization Strategies

| Strategy | Savings | Trade-off | Recommended For |
|---|---|---|---|
| Model selection | 10-35% | Some quality loss | All features |
| Caching | 90% | Stale results risk | Unchanged code |
| Ensemble optimization | 60% | Lower confidence | Non-critical |
| Layer skipping | 80% | Less thorough | Minor changes |

Budget Thresholds

  • < 80%: Normal operation
  • 80-95%: Warning, optimize
  • 95-100%: Critical, throttle
  • > 100%: Block operations

agent-cost-optimizer ensures cost-effective AI operations through real-time tracking, budget enforcement, model optimization, and ROI measurement - preventing budget overruns while maximizing value delivered.

For cost reports, see examples/. For optimization strategies, see the Cost Optimization Strategies section above.
