# xAI Grok Models Guide

Complete guide to selecting the right Grok model for your use case, with pricing and capability comparisons.

## Model Quick Reference

| Model | Best For | Input $/1M | Output $/1M | Context |
|---|---|---|---|---|
| `grok-4-1-fast` | Tool calling, agents | $0.20 | $0.50 | 2M |
| `grok-4` | Complex reasoning | $3.00 | $15.00 | 256K |
| `grok-3-fast` | General tasks | $0.20 | $0.50 | 131K |
| `grok-3-mini` | Lightweight tasks | $0.30 | $0.50 | 131K |
| `grok-2-vision` | Image analysis | $2.00 | $10.00 | 32K |

## Model Selection Decision Tree

```text
What's your primary need?
├─► Tool calling / Agent workflows
│   └─► grok-4-1-fast ($0.20/$0.50)
├─► Complex reasoning / Analysis
│   └─► grok-4 ($3.00/$15.00)
├─► General chat / Simple tasks
│   └─► grok-3-fast ($0.20/$0.50)
├─► High volume / Cost sensitive
│   └─► grok-3-mini ($0.30/$0.50)
└─► Image/Vision tasks
    └─► grok-2-vision ($2.00/$10.00)
```
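For programmatic routing, the tree above can be sketched as a small lookup helper (illustrative only; the `choose_model` name and the need labels are not part of any xAI SDK):

```python
# Map each primary need from the decision tree to a model ID.
MODEL_BY_NEED = {
    "tools": "grok-4-1-fast",     # tool calling / agent workflows
    "reasoning": "grok-4",        # complex reasoning / analysis
    "general": "grok-3-fast",     # general chat / simple tasks
    "bulk": "grok-3-mini",        # high volume / cost sensitive
    "vision": "grok-2-vision",    # image / vision tasks
}

def choose_model(need: str) -> str:
    """Return the recommended Grok model for a primary need."""
    return MODEL_BY_NEED.get(need, "grok-3-fast")  # general-purpose default
```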

## Detailed Model Profiles

### grok-4-1-fast (Recommended for Most Uses)

**Best for:** Tool calling, agentic workflows, real-time search

```python
# Best choice for X search and sentiment analysis.
# `client` is an OpenAI-compatible client pointed at the xAI API
# (see the API Usage Example below).
response = client.chat.completions.create(
    model="grok-4-1-fast",
    messages=[{"role": "user", "content": "Search X for AAPL sentiment"}]
)
```

**Features:**

- 2 million token context window
- Optimized for tool calling
- Fast response times
- Best price/performance ratio

**Variants:**

- `grok-4-1-fast-reasoning` - maximum intelligence
- `grok-4-1-fast-non-reasoning` - instant responses
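A minimal way to switch between the two variants per request (the `hard` flag and helper name are illustrative):

```python
def grok_4_1_variant(hard: bool) -> str:
    """Pick the reasoning variant for hard problems and the
    non-reasoning variant when latency matters most."""
    return "grok-4-1-fast-reasoning" if hard else "grok-4-1-fast-non-reasoning"
```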

### grok-4

**Best for:** Deep analysis, complex reasoning, research

```python
# Use for complex multi-step analysis
response = client.chat.completions.create(
    model="grok-4",
    messages=[{"role": "user", "content": "Analyze market trends..."}]
)
```

**Features:**

- Highest reasoning capability
- Best for complex tasks
- 256K context window

### grok-3-fast

**Best for:** General purpose, balanced performance

```python
# Good default choice for most tasks
response = client.chat.completions.create(
    model="grok-3-fast",
    messages=[{"role": "user", "content": "Summarize this..."}]
)
```

**Features:**

- Fast responses
- 131K context
- Good balance of speed/quality

### grok-3-mini

**Best for:** High-volume, cost-sensitive applications

```python
# Use for bulk processing
response = client.chat.completions.create(
    model="grok-3-mini",
    messages=[{"role": "user", "content": "Classify: ..."}]
)
```

**Features:**

- Lowest latency
- Most cost-effective
- Good for simple tasks

### grok-2-vision

**Best for:** Image analysis, charts, screenshots

```python
import base64

# Encode image as base64 for inline transmission
with open("chart.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="grok-2-vision",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Analyze this chart"},
            {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{image_data}"}}
        ]
    }]
)
```

## Cost Optimization Strategies

### 1. Use the Right Model

```python
# For filtering/classification - use mini
filter_response = client.chat.completions.create(
    model="grok-3-mini",
    messages=[{"role": "user", "content": f"Is this relevant? Answer yes or no: {text}"}]
)
is_relevant = filter_response.choices[0].message.content.strip().lower().startswith("yes")

# For analysis - use fast
if is_relevant:
    analysis = client.chat.completions.create(
        model="grok-4-1-fast",
        messages=[{"role": "user", "content": f"Analyze: {text}"}]
    )
```

### 2. Leverage Caching

Cached input tokens are 75% cheaper:

- Regular: $0.20/1M
- Cached: $0.05/1M
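The savings scale with your cache hit rate; a quick sketch of the blended input price using the rates above (the function name is illustrative):

```python
def blended_input_cost(tokens: int, cache_hit_rate: float,
                       regular_per_m: float = 0.20,
                       cached_per_m: float = 0.05) -> float:
    """Dollar cost for `tokens` input tokens at a given cache hit rate."""
    cached = tokens * cache_hit_rate
    fresh = tokens - cached
    return (fresh * regular_per_m + cached * cached_per_m) / 1_000_000
```

At a 50% hit rate, 1M input tokens cost $0.125 instead of $0.20.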

### 3. Batch Similar Requests

```python
# Instead of 10 separate calls, batch them into one prompt
texts = ["text1", "text2", "text3"]
batch_prompt = "Analyze these texts:\n" + "\n".join(texts)

response = client.chat.completions.create(
    model="grok-3-fast",
    messages=[{"role": "user", "content": batch_prompt}]
)
```

## Tool Calling Costs

| Tool | Cost per 1,000 calls |
|---|---|
| X Search | $5.00 |
| Web Search | $5.00 |
| Code Execution | $5.00 |
| Document Search | $2.50 |
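Tool charges apply per invocation, on top of token costs. A small estimator using the rates above (the dictionary keys are illustrative names, not API identifiers):

```python
# Per-1,000-call rates from the table above.
TOOL_COST_PER_1K = {
    "x_search": 5.00,
    "web_search": 5.00,
    "code_execution": 5.00,
    "document_search": 2.50,
}

def tool_cost(tool: str, calls: int) -> float:
    """Dollar cost for `calls` invocations of a tool."""
    return TOOL_COST_PER_1K[tool] * calls / 1000
```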

## Context Window Comparison

| Model | Context | Pages of Text | Hours of Audio |
|---|---|---|---|
| `grok-4-1-fast` | 2M | ~6,000 | ~50 |
| `grok-4` | 256K | ~800 | ~6 |
| `grok-3-fast` | 131K | ~400 | ~3 |
| `grok-2-vision` | 32K | ~100 | ~1 |
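The page figures imply roughly 330 tokens per page (2M / ~6,000). Assuming that ratio, a rough converter:

```python
def approx_pages(context_tokens: int, tokens_per_page: int = 333) -> int:
    """Rough 'pages of text' estimate implied by the table above."""
    return context_tokens // tokens_per_page
```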

## Model Capabilities Matrix

| Capability | 4.1 Fast | 4 | 3 Fast | 3 Mini | 2 Vision |
|---|---|---|---|---|---|
| Tool Calling | ⭐⭐⭐ | ⭐⭐ | | | |
| Reasoning | ⭐⭐ | ⭐⭐⭐ | ⭐⭐ | ⭐⭐ | |
| Speed | ⭐⭐⭐ | | ⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐ |
| Cost | ⭐⭐⭐ | | ⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐ |
| Vision | | | | | ⭐⭐⭐ |
| X Search | ⭐⭐⭐ | ⭐⭐ | ⭐⭐ | | |

## Recommended Configurations

### Financial Sentiment Pipeline

```python
MODELS = {
    "filter": "grok-3-mini",      # Fast filtering
    "analyze": "grok-4-1-fast",   # Tool calling + analysis
    "deep": "grok-4",             # Complex reasoning (rare)
}
```
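A sketch of how an item might flow through those stages (the `models_for_item` helper is illustrative; actual calls would go through the client shown in the API Usage Example):

```python
# Stage-to-model mapping, as in the pipeline config above.
MODELS = {
    "filter": "grok-3-mini",
    "analyze": "grok-4-1-fast",
    "deep": "grok-4",
}

def models_for_item(needs_deep_analysis: bool) -> list[str]:
    """Resolve which models an item passes through, in order."""
    stages = ["filter", "analyze"] + (["deep"] if needs_deep_analysis else [])
    return [MODELS[s] for s in stages]
```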

### High-Volume Processing

```python
MODELS = {
    "bulk": "grok-3-mini",
    "quality_check": "grok-3-fast",
}
```

### Research & Analysis

```python
MODELS = {
    "search": "grok-4-1-fast",
    "analyze": "grok-4",
    "summarize": "grok-3-fast",
}
```

## API Usage Example

```python
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("XAI_API_KEY"),
    base_url="https://api.x.ai/v1"
)

# List available models
models = client.models.list()
for model in models.data:
    print(model.id)

# Use a specific model
response = client.chat.completions.create(
    model="grok-4-1-fast",
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=100
)
```

## Related Skills

- `xai-auth` - Authentication setup
- `xai-agent-tools` - Tool calling
- `xai-sentiment` - Sentiment analysis
