# xAI Grok Models Guide
Complete guide to selecting the right Grok model for your use case, with pricing and capability comparisons.
## Model Quick Reference
| Model | Best For | Input $/1M | Output $/1M | Context |
|---|---|---|---|---|
| grok-4-1-fast | Tool calling, agents | $0.20 | $0.50 | 2M |
| grok-4 | Complex reasoning | $3.00 | $15.00 | 256K |
| grok-3-fast | General tasks | $0.20 | $0.50 | 131K |
| grok-3-mini | Lightweight tasks | $0.30 | $0.50 | 131K |
| grok-2-vision | Image analysis | $2.00 | $10.00 | 32K |
## Model Selection Decision Tree
```
What's your primary need?
│
├─► Tool calling / Agent workflows
│   └─► grok-4-1-fast ($0.20/$0.50)
│
├─► Complex reasoning / Analysis
│   └─► grok-4 ($3.00/$15.00)
│
├─► General chat / Simple tasks
│   └─► grok-3-fast ($0.20/$0.50)
│
├─► High volume / Cost sensitive
│   └─► grok-3-mini ($0.30/$0.50)
│
└─► Image/Vision tasks
    └─► grok-2-vision ($2.00/$10.00)
```
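The decision tree above can be sketched as a small helper. The use-case keys below are illustrative labels for this guide, not xAI API parameters:

```python
# Hypothetical mapping from use case to recommended model.
# The keys are illustrative labels, not part of the xAI API.
MODEL_BY_USE_CASE = {
    "tool_calling": "grok-4-1-fast",
    "complex_reasoning": "grok-4",
    "general": "grok-3-fast",
    "high_volume": "grok-3-mini",
    "vision": "grok-2-vision",
}

def choose_model(use_case: str) -> str:
    """Return the recommended model, falling back to the general default."""
    return MODEL_BY_USE_CASE.get(use_case, "grok-3-fast")

print(choose_model("tool_calling"))  # grok-4-1-fast
print(choose_model("unknown"))       # grok-3-fast
```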
## Detailed Model Profiles

### grok-4-1-fast (Recommended for Most Uses)

**Best for:** Tool calling, agentic workflows, real-time search
```python
# Best choice for X search and sentiment analysis
# (assumes `client` is configured as in the API Usage Example below)
response = client.chat.completions.create(
    model="grok-4-1-fast",
    messages=[{"role": "user", "content": "Search X for AAPL sentiment"}]
)
```
**Features:**
- 2 million token context window
- Optimized for tool calling
- Fast response times
- Best price/performance ratio

**Variants:**
- `grok-4-1-fast-reasoning` - Maximum intelligence
- `grok-4-1-fast-non-reasoning` - Instant responses
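Since grok-4-1-fast is optimized for tool calling, a request can carry a `tools` array in the OpenAI-compatible function-calling format. The sketch below builds such a request; `get_stock_price` is a hypothetical tool of our own, not an xAI built-in:

```python
# Sketch of an OpenAI-compatible tool-calling request for grok-4-1-fast.
# `get_stock_price` is a hypothetical user-defined tool, not an xAI built-in.
def build_tool_request(user_message: str) -> dict:
    return {
        "model": "grok-4-1-fast",
        "messages": [{"role": "user", "content": user_message}],
        "tools": [{
            "type": "function",
            "function": {
                "name": "get_stock_price",
                "description": "Look up the latest price for a ticker",
                "parameters": {
                    "type": "object",
                    "properties": {"ticker": {"type": "string"}},
                    "required": ["ticker"],
                },
            },
        }],
    }

# Usage (assumes `client` from the API Usage Example below):
# response = client.chat.completions.create(**build_tool_request("Price of AAPL?"))
```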
### grok-4

**Best for:** Deep analysis, complex reasoning, research
```python
# Use for complex multi-step analysis
response = client.chat.completions.create(
    model="grok-4",
    messages=[{"role": "user", "content": "Analyze market trends..."}]
)
```
**Features:**
- Highest reasoning capability
- Best for complex tasks
- 256K context window
### grok-3-fast

**Best for:** General purpose, balanced performance
```python
# Good default choice for most tasks
response = client.chat.completions.create(
    model="grok-3-fast",
    messages=[{"role": "user", "content": "Summarize this..."}]
)
```
**Features:**
- Fast responses
- 131K context
- Good balance of speed/quality
### grok-3-mini

**Best for:** High-volume, cost-sensitive applications
```python
# Use for bulk processing
response = client.chat.completions.create(
    model="grok-3-mini",
    messages=[{"role": "user", "content": "Classify: ..."}]
)
```
**Features:**
- Lowest latency
- Most cost-effective
- Good for simple tasks
### grok-2-vision

**Best for:** Image analysis, charts, screenshots
```python
import base64

# Encode image
with open("chart.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="grok-2-vision",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Analyze this chart"},
            {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{image_data}"}}
        ]
    }]
)
```
## Cost Optimization Strategies

### 1. Use the Right Model
```python
# For filtering/classification - use mini
# (`text` is the item being screened)
filter_response = client.chat.completions.create(
    model="grok-3-mini",
    messages=[{"role": "user", "content": f"Is this relevant? {text}"}]
)

# For analysis - use fast
# (`is_relevant` is parsed from the filter response above)
if is_relevant:
    analysis = client.chat.completions.create(
        model="grok-4-1-fast",
        messages=[{"role": "user", "content": f"Analyze: {text}"}]
    )
```
### 2. Leverage Caching
Cached input tokens are 75% cheaper:
- Regular: $0.20/1M
- Cached: $0.05/1M
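As a rough sketch, the effect of caching on input cost works out like this. The rates default to the grok-4-1-fast figures above; the helper itself is illustrative:

```python
# Estimate input cost in dollars given how many tokens hit the cache.
# Default rates are the grok-4-1-fast figures above; helper is illustrative.
def input_cost(total_tokens: int, cached_tokens: int,
               rate: float = 0.20, cached_rate: float = 0.05) -> float:
    """Cost per request, with rates expressed in $ per 1M tokens."""
    fresh = total_tokens - cached_tokens
    return round((fresh * rate + cached_tokens * cached_rate) / 1_000_000, 6)

print(input_cost(1_000_000, 0))        # 0.2   -- no cache hits
print(input_cost(1_000_000, 500_000))  # 0.125 -- half cached
```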
### 3. Batch Similar Requests
```python
# Instead of 10 separate calls, batch them
texts = ["text1", "text2", "text3"]
batch_prompt = "Analyze these texts:\n" + "\n".join(texts)

response = client.chat.completions.create(
    model="grok-3-fast",
    messages=[{"role": "user", "content": batch_prompt}]
)
```
## Tool Calling Costs
| Tool | Cost per 1,000 calls |
|---|---|
| X Search | $5.00 |
| Web Search | $5.00 |
| Code Execution | $5.00 |
| Document Search | $2.50 |
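Projected tool spend for a given call volume follows directly from the per-1,000-call rates above; the dictionary keys and helper are illustrative:

```python
# Per-1,000-call tool rates from the table above; helper is illustrative.
TOOL_RATE_PER_1K = {
    "x_search": 5.00,
    "web_search": 5.00,
    "code_execution": 5.00,
    "document_search": 2.50,
}

def tool_cost(tool: str, calls: int) -> float:
    """Dollar cost for `calls` invocations of `tool`."""
    return TOOL_RATE_PER_1K[tool] * calls / 1000

print(tool_cost("x_search", 1000))        # 5.0
print(tool_cost("document_search", 400))  # 1.0
```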
## Context Window Comparison
| Model | Context | Pages of Text | Hours of Audio |
|---|---|---|---|
| grok-4-1-fast | 2M | ~6,000 | ~50 |
| grok-4 | 256K | ~800 | ~6 |
| grok-3-fast | 131K | ~400 | ~3 |
| grok-2-vision | 32K | ~100 | ~1 |
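Whether a document fits a model's window can be sanity-checked with the rough "~4 characters per token" heuristic. Both the heuristic and the helper below are approximations, not an official tokenizer:

```python
# Context windows from the table above, in tokens.
CONTEXT_WINDOW = {
    "grok-4-1-fast": 2_000_000,
    "grok-4": 256_000,
    "grok-3-fast": 131_000,
    "grok-2-vision": 32_000,
}

def fits_context(text: str, model: str, reserve: int = 4_096) -> bool:
    """Rough check using the ~4 chars/token heuristic; `reserve` leaves
    headroom for the response. Not a substitute for a real tokenizer."""
    estimated_tokens = len(text) // 4
    return estimated_tokens + reserve <= CONTEXT_WINDOW[model]

print(fits_context("word " * 10_000, model="grok-2-vision"))  # True
```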
## Model Capabilities Matrix
| Capability | 4.1 Fast | 4 | 3 Fast | 3 Mini | 2 Vision |
|---|---|---|---|---|---|
| Tool Calling | ⭐⭐⭐ | ⭐⭐ | ⭐ | ⭐ | ❌ |
| Reasoning | ⭐⭐ | ⭐⭐⭐ | ⭐⭐ | ⭐ | ⭐⭐ |
| Speed | ⭐⭐⭐ | ⭐ | ⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐ |
| Cost | ⭐⭐⭐ | ⭐ | ⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐ |
| Vision | ❌ | ❌ | ❌ | ❌ | ⭐⭐⭐ |
| X Search | ⭐⭐⭐ | ⭐⭐ | ⭐⭐ | ⭐ | ❌ |
## Recommended Configurations

### Financial Sentiment Pipeline
```python
MODELS = {
    "filter": "grok-3-mini",      # Fast filtering
    "analyze": "grok-4-1-fast",   # Tool calling + analysis
    "deep": "grok-4"              # Complex reasoning (rare)
}
```
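A mapping like the one above can drive a simple stage router. The stage names and the escalation rule below are illustrative, not part of any API:

```python
# Stage-to-model routing sketch; stage names and escalation
# logic are illustrative, not part of the xAI API.
MODELS = {
    "filter": "grok-3-mini",      # Fast filtering
    "analyze": "grok-4-1-fast",   # Tool calling + analysis
    "deep": "grok-4",             # Complex reasoning (rare)
}

def model_for_stage(stage: str, escalate: bool = False) -> str:
    """Pick the model for a pipeline stage; `escalate` bumps
    analysis up to the deep-reasoning tier."""
    if stage == "analyze" and escalate:
        return MODELS["deep"]
    return MODELS[stage]

print(model_for_stage("filter"))                  # grok-3-mini
print(model_for_stage("analyze", escalate=True))  # grok-4
```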
### High-Volume Processing
```python
MODELS = {
    "bulk": "grok-3-mini",
    "quality_check": "grok-3-fast"
}
```
### Research & Analysis
```python
MODELS = {
    "search": "grok-4-1-fast",
    "analyze": "grok-4",
    "summarize": "grok-3-fast"
}
```
## API Usage Example
```python
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("XAI_API_KEY"),
    base_url="https://api.x.ai/v1"
)

# List available models
models = client.models.list()
for model in models.data:
    print(model.id)

# Use specific model
response = client.chat.completions.create(
    model="grok-4-1-fast",
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=100
)
```
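For production use, wrapping calls in a retry with exponential backoff is a common pattern. The sketch below is generic, not specific to the xAI API; tune the exception types to your client library:

```python
import time

def with_retries(fn, attempts: int = 3, base_delay: float = 1.0):
    """Call `fn`, retrying on exceptions with exponential backoff.
    Generic sketch -- narrow `Exception` to your client's error types."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)

# Usage (assumes `client` from the example above):
# reply = with_retries(lambda: client.chat.completions.create(
#     model="grok-4-1-fast",
#     messages=[{"role": "user", "content": "Hello!"}],
# ))
```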
## Related Skills

- `xai-auth` - Authentication setup
- `xai-agent-tools` - Tool calling
- `xai-sentiment` - Sentiment analysis