Model Registry Maintainer

This skill provides guidance for maintaining MassGen's model registry across two key files:

  1. massgen/backend/capabilities.py - Models, capabilities, release dates
  2. massgen/token_manager/token_manager.py - Pricing, context windows

When to Use This Skill

  • New model released by a provider
  • Model pricing changes
  • Context window limits updated
  • Model capabilities changed
  • New provider/backend added

Two Files to Maintain

File 1: capabilities.py (Models & Features)

What it contains:

  • List of available models per provider
  • Model capabilities (web search, code execution, vision, etc.)
  • Release dates
  • Default models

Used by:

  • Config builder (--quickstart, --generate-config)
  • Documentation generation
  • Backend validation

Always update this file for new models.

File 2: token_manager.py (Pricing & Limits)

What it contains:

  • Hardcoded pricing/context windows for models NOT in LiteLLM database
  • On-demand loading from LiteLLM database (500+ models)

Used by:

  • Cost estimation
  • Token counting
  • Context management

Pricing resolution order:

  1. LiteLLM database (fetched on-demand, cached 1 hour)
  2. Hardcoded PROVIDER_PRICING (fallback only)
  3. Pattern matching heuristics
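
The fallback chain above can be sketched as follows. The lookup helpers here are hypothetical stand-ins for illustration, not MassGen's actual internals:

```python
# Sketch of the pricing resolution order: try each source in turn,
# return the first hit. Helper names and data are illustrative.
from typing import Optional

HARDCODED = {"gpt-4o": (0.0025, 0.01)}  # input/output per 1K tokens

def litellm_lookup(model: str) -> Optional[tuple]:
    # Would consult the cached LiteLLM database first.
    return None  # simulate a cache miss here

def hardcoded_lookup(model: str) -> Optional[tuple]:
    return HARDCODED.get(model)

def pattern_match(model: str) -> Optional[tuple]:
    # Last resort: heuristic match on a known model-name prefix.
    for known, pricing in HARDCODED.items():
        if model.startswith(known):
            return pricing
    return None

def resolve_pricing(model: str) -> Optional[tuple]:
    for source in (litellm_lookup, hardcoded_lookup, pattern_match):
        pricing = source(model)
        if pricing is not None:
            return pricing
    return None

print(resolve_pricing("gpt-4o-2024-08-06"))  # resolved via pattern matching
```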

Only update PROVIDER_PRICING if:

  • Model is NOT in LiteLLM database
  • LiteLLM pricing is incorrect/outdated
  • Model is custom/internal to your organization

Information to Gather for New Models

1. Release Date

2. Context Window

  • Input context size (tokens)
  • Max output tokens
  • Look for: "context window", "max tokens", "input/output limits"

3. Pricing

4. Capabilities

  • Web search, code execution, vision, reasoning, etc.
  • Check official API documentation

5. Model Name

  • Exact API identifier (case-sensitive)
  • Check provider's model documentation
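
Before editing either file, the checklist above can be captured in a small record. The field names here are illustrative, not part of MassGen:

```python
# Hypothetical record for the gathered model information.
from dataclasses import dataclass

@dataclass(frozen=True)
class NewModelInfo:
    name: str             # exact API identifier (case-sensitive)
    release_date: str     # "YYYY-MM"
    context_window: int   # input context size in tokens
    max_output: int       # max output tokens
    input_per_1k: float   # USD per 1K input tokens
    output_per_1k: float  # USD per 1K output tokens
    capabilities: frozenset

info = NewModelInfo(
    name="new-model-name",
    release_date="2025-12",
    context_window=300000,
    max_output=150000,
    input_per_1k=0.00125,
    output_per_1k=0.01,
    capabilities=frozenset({"web_search", "vision"}),
)
print(info.name)
```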

Adding a New Model - Complete Workflow

Step 1: Add to capabilities.py

Add model to the models list and model_release_dates:

# massgen/backend/capabilities.py

"openai": BackendCapabilities(
    # ... existing fields ...
    models=[
        "new-model-name",  # Add here (newest first)
        "gpt-5.1",
        # ... existing models ...
    ],
    model_release_dates={
        "new-model-name": "2025-12",  # Add here
        "gpt-5.1": "2025-11",
        # ... existing dates ...
    },
)

Step 2: Check if pricing is in LiteLLM (Usually Skip)

First, check if the model is already in LiteLLM database:

import requests

url = "https://raw.githubusercontent.com/BerriAI/litellm/main/model_prices_and_context_window.json"
pricing_db = requests.get(url, timeout=30).json()

if "new-model-name" in pricing_db:
    print("✅ Model found in LiteLLM - no need to update token_manager.py")
    print(f"Pricing: ${pricing_db['new-model-name']['input_cost_per_token'] * 1000}/1K input")
else:
    print("❌ Model NOT in LiteLLM - need to add to PROVIDER_PRICING")

Only if NOT in LiteLLM, add to PROVIDER_PRICING:

# massgen/token_manager/token_manager.py

PROVIDER_PRICING: Dict[str, Dict[str, ModelPricing]] = {
    "OpenAI": {
        # Format: ModelPricing(input_per_1k, output_per_1k, context_window, max_output)
        "new-model-name": ModelPricing(0.00125, 0.01, 300000, 150000),
        # ... existing models ...
    },
}

Provider name mapping:

  • "OpenAI" (not "openai")
  • "Anthropic" (not "claude")
  • "Google" (not "gemini")
  • "xAI" (not "grok")

Step 3: Update Capabilities (if new features)

If the model introduces new capabilities:

supported_capabilities={
    "web_search",
    "code_execution",
    "new_capability",  # Add here
}

Step 4: Update Default Model (if appropriate)

Only change if the new model should be the recommended default:

default_model="new-model-name"

Step 5: Validate and Test

# Run capabilities tests
uv run pytest massgen/tests/test_backend_capabilities.py -v

# Test config generation with new model
massgen --generate-config ./test.yaml --config-backend openai --config-model new-model-name

# Verify the config was created successfully
cat ./test.yaml

Step 6: Regenerate Documentation

uv run python docs/scripts/generate_backend_tables.py
cd docs && make html

Current Model Data

OpenAI Models (as of Nov 2025)

In capabilities.py:

models=[
    "gpt-5.1",        # 2025-11
    "gpt-5-codex",    # 2025-09
    "gpt-5",          # 2025-08
    "gpt-5-mini",     # 2025-08
    "gpt-5-nano",     # 2025-08
    "gpt-4.1",        # 2025-04
    "gpt-4.1-mini",   # 2025-04
    "gpt-4.1-nano",   # 2025-04
    "gpt-4o",         # 2024-05
    "gpt-4o-mini",    # 2024-07
    "o4-mini",        # 2025-04
]

In token_manager.py (add missing models):

"OpenAI": {
    "gpt-5": ModelPricing(0.00125, 0.01, 400000, 128000),
    "gpt-5-mini": ModelPricing(0.00025, 0.002, 400000, 128000),
    "gpt-5-nano": ModelPricing(0.00005, 0.0004, 400000, 128000),
    "gpt-4o": ModelPricing(0.0025, 0.01, 128000, 16384),
    "gpt-4o-mini": ModelPricing(0.00015, 0.0006, 128000, 16384),
    # Missing: gpt-5.1, gpt-5-codex, gpt-4.1 family, o4-mini
}

Claude Models (as of Nov 2025)

In capabilities.py:

models=[
    "claude-haiku-4-5-20251001",    # 2025-10
    "claude-sonnet-4-5-20250929",   # 2025-09
    "claude-opus-4-1-20250805",     # 2025-08
    "claude-sonnet-4-20250514",     # 2025-05
]

In token_manager.py:

"Anthropic": {
    "claude-haiku-4-5": ModelPricing(0.001, 0.005, 200000, 65536),
    "claude-sonnet-4-5": ModelPricing(0.003, 0.015, 200000, 65536),
    "claude-opus-4.1": ModelPricing(0.015, 0.075, 200000, 32768),
    "claude-sonnet-4": ModelPricing(0.003, 0.015, 200000, 8192),
}

Gemini Models (as of Nov 2025)

In capabilities.py:

models=[
    "gemini-3-pro-preview",  # 2025-11
    "gemini-2.5-flash",      # 2025-06
    "gemini-2.5-pro",        # 2025-06
]

In token_manager.py (missing gemini-2.5 and gemini-3):

"Google": {
    "gemini-1.5-pro": ModelPricing(0.00125, 0.005, 2097152, 8192),
    "gemini-1.5-flash": ModelPricing(0.000075, 0.0003, 1048576, 8192),
    # Missing: gemini-2.5-pro, gemini-2.5-flash, gemini-3-pro-preview
}

Grok Models (as of Nov 2025)

In capabilities.py:

models=[
    "grok-4-1-fast-reasoning",      # 2025-11
    "grok-4-1-fast-non-reasoning",  # 2025-11
    "grok-code-fast-1",             # 2025-08
    "grok-4",                       # 2025-07
    "grok-4-fast",                  # 2025-09
    "grok-3",                       # 2025-02
    "grok-3-mini",                  # 2025-05
]

In token_manager.py (missing grok-3, grok-4 families):

"xAI": {
    "grok-2-latest": ModelPricing(0.005, 0.015, 131072, 131072),
    "grok-2": ModelPricing(0.005, 0.015, 131072, 131072),
    "grok-2-mini": ModelPricing(0.001, 0.003, 131072, 65536),
    # Missing: grok-3, grok-4, grok-4-1 families
}

Model Name Matching

Important: The names in PROVIDER_PRICING use simplified patterns:

  • "gpt-5" matches gpt-5, gpt-5-preview, gpt-5-*
  • "claude-sonnet-4-5" matches claude-sonnet-4-5-* (any date suffix)
  • "gemini-2.5-pro" is exact match

The token manager uses prefix matching for flexibility.
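
A minimal sketch of that prefix matching, assuming the longest matching key wins (illustrative, not the actual token_manager implementation):

```python
# Prefer the longest pricing key that the model name starts with,
# so "claude-sonnet-4-5-20250929" hits "claude-sonnet-4-5", not
# "claude-sonnet-4". Pricing values are placeholder strings.
PRICING = {
    "gpt-5": "gpt-5 pricing",
    "claude-sonnet-4-5": "sonnet 4.5 pricing",
    "claude-sonnet-4": "sonnet 4 pricing",
}

def match_pricing(model: str):
    candidates = [k for k in PRICING if model == k or model.startswith(k + "-")]
    if not candidates:
        return None
    return PRICING[max(candidates, key=len)]

print(match_pricing("claude-sonnet-4-5-20250929"))  # sonnet 4.5 pricing
```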

Common Tasks

Task: Add brand new GPT-5.2 model

  1. Research: Release date, pricing, context window, capabilities
  2. Add to capabilities.py models list and release_dates
  3. Add to token_manager.py PROVIDER_PRICING["OpenAI"]
  4. Run tests
  5. Regenerate docs

Task: Update pricing for existing model

  1. Verify new pricing from official source
  2. Update only token_manager.py PROVIDER_PRICING
  3. No need to touch capabilities.py
  4. Document change in notes if significant

Task: Add new capability to model

  1. Update supported_capabilities in capabilities.py
  2. Add to notes explaining when/how capability works
  3. Update backend implementation if needed
  4. Run tests

Validation Commands

# Test capabilities registry
uv run pytest massgen/tests/test_backend_capabilities.py -v

# Test token manager
uv run pytest massgen/tests/test_token_manager.py -v

# Generate config with new model
massgen --generate-config ./test.yaml --config-backend openai --config-model new-model

# Build docs to verify tables
cd docs && make html

Programmatic Model Updates

LiteLLM Pricing Database (RECOMMENDED)

The easiest way to get comprehensive model pricing and context window data:

URL: https://raw.githubusercontent.com/BerriAI/litellm/main/model_prices_and_context_window.json

Coverage: 500+ models across 30+ providers including:

  • OpenAI, Anthropic, Google, xAI
  • Together AI, Groq, Cerebras, Fireworks
  • AWS Bedrock, Azure, Cohere, and more

Data Available:

{
  "gpt-4o": {
    "input_cost_per_token": 0.0000025,
    "output_cost_per_token": 0.00001,
    "max_input_tokens": 128000,
    "max_output_tokens": 16384,
    "supports_vision": true,
    "supports_function_calling": true,
    "supports_prompt_caching": true
  }
}

Usage:

import requests

# Fetch latest pricing
url = "https://raw.githubusercontent.com/BerriAI/litellm/main/model_prices_and_context_window.json"
pricing_db = requests.get(url, timeout=30).json()

# Get info for a model (guard against models absent from the database)
model_info = pricing_db.get("gpt-4o")
if model_info is not None:
    input_per_1k = model_info["input_cost_per_token"] * 1000
    output_per_1k = model_info["output_cost_per_token"] * 1000

Update token_manager.py from LiteLLM:

  • Convert per-token costs to per-1K costs
  • Extract context window and max output tokens
  • Keep models in reverse chronological order
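
The conversion step can be worked through with the gpt-4o entry shown earlier; `ModelPricing` is represented as a plain string here for illustration:

```python
# Convert a LiteLLM entry (per-token costs) into a per-1K ModelPricing line.
entry = {
    "input_cost_per_token": 0.0000025,
    "output_cost_per_token": 0.00001,
    "max_input_tokens": 128000,
    "max_output_tokens": 16384,
}

input_per_1k = round(entry["input_cost_per_token"] * 1000, 6)   # per-token -> per-1K
output_per_1k = round(entry["output_cost_per_token"] * 1000, 6)
print(f'ModelPricing({input_per_1k}, {output_per_1k}, '
      f'{entry["max_input_tokens"]}, {entry["max_output_tokens"]})')
```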

OpenRouter API (Real-Time)

For the most up-to-date model list with live pricing:

Endpoint: https://openrouter.ai/api/v1/models

Data Available:

  • Real-time pricing (prompt, completion, reasoning, caching)
  • Context windows and max completion tokens
  • Model capabilities and modalities
  • 200+ models from multiple providers

Usage:

import requests
import os

headers = {"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"}
response = requests.get("https://openrouter.ai/api/v1/models", headers=headers, timeout=30)
response.raise_for_status()
models = response.json()["data"]

for model in models:
    print(f"{model['id']}: ${model['pricing']['prompt']} input, ${model['pricing']['completion']} output")

Provider-Specific APIs

Provider       Models API                                                Pricing in API?   Recommendation
OpenAI         https://api.openai.com/v1/models                          ❌ No             Use LiteLLM
Claude         No public API                                             ❌ No             Use LiteLLM
Gemini         https://generativelanguage.googleapis.com/v1beta/models   ❌ No             API + LiteLLM
Grok (xAI)     https://api.x.ai/v1/models                                ❌ No             Use LiteLLM
Together AI    https://api.together.xyz/v1/models                        ✅ Yes            API directly
Groq           https://api.groq.com/openai/v1/models                     ❌ No             Use LiteLLM
Cerebras       https://api.cerebras.ai/v1/models                         ❌ No             Use LiteLLM
Fireworks      https://api.fireworks.ai/v1/accounts/{id}/models          ❌ No             Use LiteLLM
Azure OpenAI   Azure Management API                                      ❌ Complex        Manual
Claude Code    No API                                                    ❌ No             Manual

Automation Script

Create scripts/update_model_pricing.py to automate updates:

#!/usr/bin/env python3
"""Update token_manager.py pricing from LiteLLM database."""

import requests

# Fetch LiteLLM database
url = "https://raw.githubusercontent.com/BerriAI/litellm/main/model_prices_and_context_window.json"
pricing_db = requests.get(url, timeout=30).json()

# Generate ModelPricing entries for each provider of interest
for provider in ("openai", "anthropic"):
    provider_models = {
        k: v
        for k, v in pricing_db.items()
        if v.get("litellm_provider") == provider
    }
    print(f"# --- {provider} ---")
    for model_name, info in sorted(provider_models.items()):
        # Some entries (e.g. embedding models) omit cost fields; default to 0
        input_per_1k = info.get("input_cost_per_token", 0) * 1000
        output_per_1k = info.get("output_cost_per_token", 0) * 1000
        context = info.get("max_input_tokens", 0)
        max_output = info.get("max_output_tokens", 0)

        print(f'    "{model_name}": ModelPricing({input_per_1k}, {output_per_1k}, {context}, {max_output}),')

Run weekly to keep pricing current:

uv run python scripts/update_model_pricing.py

Important Maintenance Notes

  • Keep models in reverse chronological order - Newest first
  • Use exact API names - Match provider documentation exactly
  • Verify pricing units - Always per 1K tokens in token_manager.py
  • Document uncertainties - If info is estimated/unofficial, note it
  • Update both files - Don't forget token_manager.py when adding models
  • Use LiteLLM for pricing - Comprehensive and frequently updated
  • Test after updates - Run pytest to verify no breaking changes