image-generate
Image Generation Skill
Generate images from text descriptions using AI.
When to Use
✅ USE this skill when:
- User asks to "create an image of..."
- "Generate a picture showing..."
- Need illustrations for content
- Visual brainstorming or concept art
When NOT to Use
❌ DON'T use this skill when:
- Editing existing images → use image editing tools
- Analyzing image content → use vision/OCR tools
- Creating diagrams/charts → use charting libraries
- Text-to-speech → use TTS services
Setup
Requires the OPENAI_API_KEY environment variable.
export OPENAI_API_KEY="sk-..."
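If the variable is missing, the problem otherwise only shows up as an authentication error from the API, so it can help to fail early. The sketch below is one way to do that in Node.js. `getApiKey` is a hypothetical helper name, not part of any SDK, and it only checks that the variable is present and non-empty, not that the key is valid.

```javascript
// Hypothetical helper: read OPENAI_API_KEY and fail early if it is missing.
// `env` defaults to process.env; passing a plain object makes it easy to test.
function getApiKey(env = process.env) {
  const key = (env.OPENAI_API_KEY || '').trim();
  if (key === '') {
    throw new Error('OPENAI_API_KEY is not set; export it before calling the API');
  }
  return key;
}
```

The returned value can then be used in the `Authorization: Bearer ...` header in place of reading `process.env` directly.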
API Usage
Basic Image Generation
curl -X POST "https://api.openai.com/v1/images/generations" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "dall-e-3",
    "prompt": "A sunset over mountains",
    "n": 1,
    "size": "1024x1024"
  }'
Response Format
{
  "created": 1234567890,
  "data": [
    {
      "url": "https://...",
      "revised_prompt": "..."
    }
  ]
}
Size Options
- 1024x1024 (default)
- 1024x1792 (portrait)
- 1792x1024 (landscape)
Quality Options
- standard (default, faster)
- hd (higher quality, slower)
Style Options
- vivid (default, dramatic)
- natural (more natural, less hyper-real)
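Since each option accepts only a few values, a request can be checked locally before it is sent. The following is a minimal sketch, assuming the lists above are the complete set of DALL-E 3 values. `resolveImageOptions` is a hypothetical helper name introduced here for illustration.

```javascript
// Values taken from the option lists above (DALL-E 3); the first entry is the default.
const SIZES = ['1024x1024', '1024x1792', '1792x1024'];
const QUALITIES = ['standard', 'hd'];
const STYLES = ['vivid', 'natural'];

// Hypothetical helper: fill in the documented defaults and reject unknown values
// before a request is sent.
function resolveImageOptions(options = {}) {
  const resolved = {
    size: options.size ?? SIZES[0],
    quality: options.quality ?? QUALITIES[0],
    style: options.style ?? STYLES[0]
  };
  const allowed = { size: SIZES, quality: QUALITIES, style: STYLES };
  for (const [name, value] of Object.entries(resolved)) {
    if (!allowed[name].includes(value)) {
      throw new Error(
        `Invalid ${name} "${value}"; expected one of: ${allowed[name].join(', ')}`
      );
    }
  }
  return resolved;
}
```

Calling `resolveImageOptions({ size: '1792x1024', quality: 'hd' })` returns the two given values plus the default style, while an unsupported size such as `512x512` throws before any network call is made.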
Advanced Options
With Custom Parameters
curl -X POST "https://api.openai.com/v1/images/generations" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "dall-e-3",
    "prompt": "A cyberpunk city at night",
    "n": 1,
    "size": "1792x1024",
    "quality": "hd",
    "style": "vivid"
  }'
Multiple Variations (DALL-E 2)
# First generate the base image and extract its URL
# (assumes jq is installed for JSON parsing)
IMAGE_URL=$(curl -s -X POST "https://api.openai.com/v1/images/generations" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "dall-e-2",
    "prompt": "A cat",
    "n": 1,
    "size": "1024x1024"
  }' | jq -r '.data[0].url')
# Download it: the variations endpoint takes a local square PNG upload, not a URL
curl -s -o image.png "$IMAGE_URL"
# Then create variations from the downloaded file
curl -X POST "https://api.openai.com/v1/images/variations" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -F "image=@image.png" \
  -F "n=4" \
  -F "size=1024x1024"
Node.js Implementation
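// Requires node-fetch v2 (v3 is ESM-only and cannot be loaded with require).
// On Node.js 18+ the built-in global fetch works too, so this line can be dropped.
const fetch = require('node-fetch');
const fs = require('fs');

async function generateImage(prompt, options = {}) {
  const model = options.model || 'dall-e-3';
  const body = {
    model,
    prompt,
    n: options.count || 1,
    size: options.size || '1024x1024'
  };
  // quality and style are DALL-E 3 parameters, so send them only for that model
  if (model === 'dall-e-3') {
    body.quality = options.quality || 'standard';
    body.style = options.style || 'vivid';
  }

  const response = await fetch('https://api.openai.com/v1/images/generations', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.OPENAI_API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify(body)
  });

  const data = await response.json();
  if (!response.ok) {
    // Keep the API error code on the Error so callers can branch on it
    const error = new Error(data.error?.message || 'Image generation failed');
    error.code = data.error?.code;
    throw error;
  }
  return {
    urls: data.data.map(img => img.url),
    revisedPrompts: data.data.map(img => img.revised_prompt),
    created: data.created
  };
}

// Download image helper
async function downloadImage(url, outputPath) {
  const response = await fetch(url);
  if (!response.ok) {
    throw new Error(`Download failed with status ${response.status}`);
  }
  // arrayBuffer() works with both node-fetch and the built-in fetch
  const buffer = Buffer.from(await response.arrayBuffer());
  fs.writeFileSync(outputPath, buffer);
  return outputPath;
}

// Usage (wrapped in an async function because CommonJS has no top-level await)
async function main() {
  const result = await generateImage('A futuristic city with flying cars', {
    size: '1792x1024',
    quality: 'hd',
    style: 'vivid'
  });
  console.log('Generated:', result.urls[0]);
  await downloadImage(result.urls[0], '/tmp/generated.png');
}

main().catch(console.error);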
Prompt Tips
Good Prompts
- Be specific: "A red apple on a wooden table" vs "apple"
- Include style: "photorealistic", "oil painting", "digital art"
- Set the mood: "sunset lighting", "dramatic shadows"
- Define composition: "close-up", "wide angle", "portrait"
Prompt Template
[subject] + [action/state] + [environment] + [lighting] + [style]
Example: "A wise owl (subject) perched on a branch (action)
in an enchanted forest (environment) with moonlight filtering
through leaves (lighting), digital painting style (style)"
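The template can also be applied programmatically. The sketch below assembles a prompt from the five slots. `buildPrompt` is a hypothetical helper name, and joining the slots with spaces and a final comma before the style is an assumption modeled on the example above, not a format the API requires.

```javascript
// Hypothetical helper that assembles a prompt from the template's five slots.
// Only `subject` is required; empty or missing slots are skipped.
function buildPrompt({ subject, action, environment, lighting, style } = {}) {
  const clean = (part) => (typeof part === 'string' ? part.trim() : '');
  if (clean(subject) === '') {
    throw new Error('A prompt needs at least a subject');
  }
  const scene = [subject, action, environment, lighting]
    .map(clean)
    .filter((part) => part !== '')
    .join(' ');
  // The style goes last, separated by a comma as in the example above.
  return clean(style) === '' ? scene : `${scene}, ${clean(style)}`;
}

const owlPrompt = buildPrompt({
  subject: 'A wise owl',
  action: 'perched on a branch',
  environment: 'in an enchanted forest',
  lighting: 'with moonlight filtering through leaves',
  style: 'digital painting style'
});
console.log(owlPrompt);
```

Keeping the slots separate makes it easy to vary one element, such as the style, while holding the rest of the scene constant.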
Avoid
- Too many subjects (keep it focused)
- Contradictory descriptions
- Overly complex scenes
- Text within images (DALL-E struggles with text)
Error Handling
async function safeGenerateImage(prompt) {
  try {
    const result = await generateImage(prompt);
    return { success: true, ...result };
  } catch (error) {
    // Match on the API error code when the error carries one,
    // and fall back to the message text otherwise
    const detail = `${error.code || ''} ${error.message || ''}`;
    if (detail.includes('content_policy')) {
      return { success: false, error: 'Content violated policy' };
    }
    if (detail.includes('rate_limit')) {
      return { success: false, error: 'Rate limit exceeded' };
    }
    return { success: false, error: error.message };
  }
}
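Rate-limit failures are usually transient, so retrying after a pause is a common complement to reporting them. The following is a minimal sketch of exponential backoff. `retryOnRateLimit` and `backoffDelayMs` are hypothetical names, the delay values are arbitrary defaults, and the check assumes the error's code or message contains "rate_limit", as in the handler above.

```javascript
const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Delay before retry number `attempt` (0-based):
// baseMs, 2 * baseMs, 4 * baseMs, ... capped at maxMs.
function backoffDelayMs(attempt, baseMs = 1000, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Hypothetical wrapper: retry `fn` only when the failure looks like a rate limit.
// Any other error, or a rate limit after the last retry, is rethrown unchanged.
async function retryOnRateLimit(
  fn,
  { retries = 3, baseMs = 1000, sleep = defaultSleep } = {}
) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (error) {
      const detail = `${error.code || ''} ${error.message || ''}`;
      if (!detail.includes('rate_limit') || attempt >= retries) {
        throw error;
      }
      await sleep(backoffDelayMs(attempt, baseMs));
    }
  }
}
```

Inside an async function this could wrap the generator, for example `await retryOnRateLimit(() => generateImage('A sunset over mountains'))`. Content policy errors are deliberately not retried, since repeating the same prompt would presumably fail the same way.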
Cost Reference
- DALL-E 3 Standard: ~$0.040/image (1024x1024)
- DALL-E 3 HD: ~$0.080/image (1024x1024)
- DALL-E 2: ~$0.020/image (1024x1024)
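For batch jobs it can be useful to estimate spend before sending requests. The sketch below hard-codes the approximate 1024x1024 prices listed above. `estimateCostCents` and `formatUsd` are hypothetical helpers, and the figures should be checked against current pricing rather than treated as authoritative.

```javascript
// Per-image prices in US cents for 1024x1024, copied from the list above.
// They are approximate and may be out of date.
const PRICE_CENTS = {
  'dall-e-3': { standard: 4, hd: 8 },
  'dall-e-2': { standard: 2 }
};

// Hypothetical estimator. Working in integer cents avoids floating-point drift.
function estimateCostCents(model, quality = 'standard', count = 1) {
  const price = PRICE_CENTS[model]?.[quality];
  if (price === undefined) {
    throw new Error(`No price listed for ${model} (${quality})`);
  }
  if (!Number.isInteger(count) || count < 1) {
    throw new Error('count must be a positive integer');
  }
  return price * count;
}

function formatUsd(cents) {
  return `$${(cents / 100).toFixed(2)}`;
}

// Five HD images with DALL-E 3
console.log(formatUsd(estimateCostCents('dall-e-3', 'hd', 5))); // prints "$0.40"
```

Larger sizes are not covered by the list above, so the estimator rejects anything it has no price for instead of guessing.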
Quick Response Template
"Generate an image of [X]"
const result = await generateImage(prompt, { size: '1024x1024' });
return `🎨 **Image Generated**
**Prompt:** ${prompt}
**Revised:** ${result.revisedPrompts[0]}

[Download](${result.urls[0]})
`;
Notes
- DALL-E 3 takes ~10-30 seconds per image
- Image URLs expire after ~1 hour, so download immediately
- Content policy restrictions apply (no violence, celebrities, etc.)
- Max prompt length: 4000 characters for DALL-E 3, 1000 for DALL-E 2
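Because a single generation can take tens of seconds, callers may want an upper bound on how long they wait. The sketch below wraps any promise with a timeout using `Promise.race`. `withTimeout` is a hypothetical helper, and any specific limit (for example 60000 ms) is an assumption to tune rather than a documented value.

```javascript
// Hypothetical helper: reject if `promise` has not settled within `ms` milliseconds.
// Note that this only stops waiting; it does not cancel the underlying request.
function withTimeout(promise, ms, label = 'operation') {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms} ms`)),
      ms
    );
  });
  // Clear the timer either way so it cannot keep the process alive
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

Inside an async function this might be used as `await withTimeout(generateImage(prompt), 60000, 'image generation')`.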