gpt-image-2-gen
GPT Image 2 Generation
An interactive AI image generation assistant powered by the GPT Image 2 model via EvoLink API.
When to Activate This Skill
Activate this skill when the user asks to:
- Generate / create / make an image or picture
- Edit / modify an existing image
- Use GPT Image 2 or gpt-image-2
- Create AI art, illustrations, logos, icons, or any visual content
- Batch-generate image variations
Keywords: image, picture, illustration, photo, art, logo, icon, generate image, create image, image editing, gpt-image, text-to-image
Script Location
IMPORTANT: All script paths in this file are relative to the directory containing this SKILL.md file. Before running any script, resolve the absolute path:
SKILL_DIR = directory containing this SKILL.md file
Script = {SKILL_DIR}/scripts/gpt-image-gen.sh
For example, if this SKILL.md is at ~/.claude/skills/gpt-image-2-gen/SKILL.md, then the script is at ~/.claude/skills/gpt-image-2-gen/scripts/gpt-image-gen.sh.
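The path resolution above can be sketched as a small shell helper. The function name resolve_script_path is hypothetical and not part of the skill; it only assumes the scripts/gpt-image-gen.sh layout described here.

```shell
# Hypothetical helper: print the absolute path of the generation script
# for a given SKILL.md path (absolute or relative).
resolve_script_path() {
  skill_md=$1
  # cd + pwd turns the containing directory into an absolute path.
  skill_dir=$(cd "$(dirname "$skill_md")" && pwd) || return 1
  printf '%s\n' "$skill_dir/scripts/gpt-image-gen.sh"
}
```

Note that a leading ~ is only expanded by the shell when it is unquoted, so pass "$HOME/.claude/..." rather than "~/.claude/..." inside quotes.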
After Installation
When this skill is first loaded, proactively greet the user and start the setup:
- Check if EVOLINK_API_KEY is set (run: echo $EVOLINK_API_KEY)
  - If not set: "To generate images, you'll need an EvoLink API key. It takes 30 seconds to get one. Just sign up at evolink.ai/signup. Want me to walk you through it?"
  - If already set: "You're all set! What kind of image would you like to create?"
That's it. One question. The user is now in the flow.
Do NOT list features, show a menu, or dump instructions. Just ask one question to move forward.
Core Principles
- Guide, don't decide — Present options and let the user decide. Don't make assumptions about their preferences.
- Let the user drive the creative vision — If they have an idea, use their words. If they need inspiration, offer suggestions and let them choose or refine.
- Smart context awareness — Recognize what the user has already provided and only ask about missing pieces.
- Intent first — If the user's intent is unclear, confirm what they want before proceeding.
Flow
Step 1: Check for API Key
If the user hasn't provided an API key or set EVOLINK_API_KEY:
- Tell them they need an EvoLink API Key
- Guide them to register at https://evolink.ai and get a key from the dashboard
- Once they provide a key, proceed to Step 2
If the key is already set or provided, skip directly to Step 2.
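A minimal sketch of the key check, assuming only that the key lives in the EVOLINK_API_KEY environment variable. The helper name api_key_status is hypothetical. It tests for presence rather than echoing the value, since echoing the variable would print the secret into the conversation.

```shell
# Hypothetical helper: report whether EVOLINK_API_KEY is set and non-empty,
# without ever printing the key itself.
api_key_status() {
  if [ -n "${EVOLINK_API_KEY:-}" ]; then
    echo "set"
  else
    echo "missing"
  fi
}
```

If the result is "missing", stay in Step 1; if it is "set", skip to Step 2.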
Step 2: Understand Intent
Assess what the user wants based on their message:
- Intent is clear (e.g., "generate an image of a sunset") -> Go to Step 3
- Intent is ambiguous (e.g., "I want to try GPT Image 2") -> Ask what they'd like to do: generate a new image, edit an existing image, learn about model capabilities, etc.
Step 3: Gather Missing Information
Check what the user has already provided and only ask about what's missing:
| Parameter | What to tell the user | Required? |
|---|---|---|
| Mode / intent | Two modes available: (1) Text-to-image — describe what you want, get an image; (2) Image editing — provide reference image(s) and describe the edits. Determine which mode fits from context, or ask if unclear. | Yes |
| Image content (prompt) | Ask what they'd like to see. If they need inspiration, suggest a few ideas for them to pick from or build on. Max 32,000 characters. | Yes |
| Size | Supports ratio format (1:1, 16:9, 9:16, etc.) or exact pixels (e.g. 1024x1024). Default: auto (model decides). Only mention if relevant. | Optional |
| Resolution | When using ratio format: 1K (~1MP), 2K (~4MP), 4K (~8.3MP). Default: 1K. Only mention for high-quality needs. | Optional |
| Quality | low (fast, cheap), medium (balanced, default), high (best, ~4x cost). Only mention if relevant. | Optional |
| Count | How many images to generate (1-10). Default: 1. Only ask if the user implies wanting multiple variants. | Optional |
| Reference images | For image editing: 1-16 images (JPEG/PNG/WebP, <=50MB each). Ask for URLs if editing mode is detected. | Conditional |
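The ranges in the table can be checked before calling the script. The validators below are a hypothetical sketch built only from the values listed above; the script itself may apply its own, possibly different, checks.

```shell
# Hypothetical validators based on the table above.
valid_quality() {
  case $1 in
    low|medium|high) return 0 ;;
    *) return 1 ;;
  esac
}

valid_resolution() {
  case $1 in
    1K|2K|4K) return 0 ;;
    *) return 1 ;;
  esac
}

valid_count() {
  case $1 in
    ''|*[!0-9]*) return 1 ;;   # empty or contains a non-digit
  esac
  [ "$1" -ge 1 ] && [ "$1" -le 10 ]
}
```

Each function returns success (exit status 0) for an accepted value, so it can be used directly in an if condition.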
Smart gathering rules — STRICT:
- Ask ALL missing parameters in ONE single message. Never split into multiple rounds of questions.
- Never ask the same question twice. If the user already answered a parameter, it is final — do not re-ask it.
- Offer defaults upfront so users can say "default is fine": 1:1 / 1K / medium quality / 1 image. If the user says "default" or "just go", use these values immediately.
- User gives everything at once -> Confirm and generate immediately, no further questions.
- User gives partial info -> Ask only the remaining missing required fields, all in one message.
- If user provides image URLs, auto-detect editing mode — no need to ask explicitly.
- Size, Resolution, Quality, Count all have sensible defaults — if the user skips them, use defaults and proceed.
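The gathering rules above can be sketched in shell. Both helper names are hypothetical. The sketch assumes that any http(s) URL in the user's message is a reference image, and that --resolution should be left out for exact pixel sizes because resolution is described as applying to ratio format.

```shell
# Hypothetical helper: guess the mode from the user's message.
# Assumption: any http(s) URL in the message is a reference image.
detect_mode() {
  case $1 in
    *http://*|*https://*) echo "edit" ;;
    *) echo "text-to-image" ;;
  esac
}

# Hypothetical helper: fill skipped optional parameters with the defaults
# offered above (1:1 / 1K / medium / 1) and print the resulting flags.
default_flags() {
  size=$1
  resolution=$2
  quality=$3
  count=$4
  [ -n "$size" ] || size="1:1"
  [ -n "$resolution" ] || resolution="1K"
  [ -n "$quality" ] || quality="medium"
  [ -n "$count" ] || count="1"
  case $size in
    # Exact pixel size (WxH): skip --resolution, which applies to ratio sizes.
    *x*) echo "--size $size --quality $quality --count $count" ;;
    *)   echo "--size $size --resolution $resolution --quality $quality --count $count" ;;
  esac
}
```

For example, default_flags "" "" "" "" prints the all-defaults flag set, which matches the "default is fine" case.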
Step 4: Generate
Once all required information is confirmed:
- Before running the script, immediately tell the user: "Got it! Starting your image now — this usually takes 15-60 seconds. I'll keep you posted while it generates."
- Run the generation script once. NEVER run it a second time unless the user explicitly asks to retry.
- When complete, share the image URL(s) (valid for 24 hours) and generation time.
Execution Pattern by Agent
Claude Code — use the Bash tool with run_in_background: true, then read the output file to check progress:
# Step 1: Run in background (resolves SKILL_DIR first)
Bash(command: "{SKILL_DIR}/scripts/gpt-image-gen.sh \"prompt\" --size 1:1 --quality medium", run_in_background: true, timeout: 300000)
# Step 2: Wait ~30 seconds, then read the background task output file to check for IMAGE_URL= or STATUS_UPDATE lines
# Step 3: When you see IMAGE_URL=<url>, relay it to the user
OpenClaw / OpenCode / Cursor — run the script as a blocking shell command (the script handles its own polling internally, typically completes in 15-120 seconds):
EVOLINK_API_KEY=$EVOLINK_API_KEY {SKILL_DIR}/scripts/gpt-image-gen.sh "prompt" --size 1:1 --quality medium
All agents — critical rules:
- Once you see TASK_SUBMITTED: in the output, the task is already queued on the server. Do NOT run the script again; retrying wastes the user's API credits.
- The script blocks until the image is ready (up to 5 minutes). Do NOT kill it prematurely.
- If the script times out with POLL_TIMEOUT, the image may still be processing. Tell the user to check https://evolink.ai/dashboard.
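One way to enforce the no-retry rule is to capture the script's output in a log file and inspect it before considering another run. This is a hypothetical sketch: the helper name retry_is_safe and the log-file approach are assumptions, not something the script provides.

```shell
# Hypothetical guard: a retry is only safe if the captured output never
# reached TASK_SUBMITTED:, i.e. nothing was queued on the server.
retry_is_safe() {
  [ -r "$1" ] || return 1        # no readable log: state unknown, do not retry
  if grep -q '^TASK_SUBMITTED:' "$1"; then
    return 1                     # task already queued, a retry would cost credits
  fi
  return 0
}

# Example usage (not executed here):
#   "{SKILL_DIR}/scripts/gpt-image-gen.sh" "prompt" > run.log 2>&1
#   retry_is_safe run.log && echo "nothing was submitted; a retry is possible"
```

Even when the guard reports that a retry is safe, the earlier rule still applies: only rerun when the user explicitly asks.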
Script Usage
Remember: Replace {SKILL_DIR} with the actual directory containing this SKILL.md file.
# Set API key
export EVOLINK_API_KEY=your_key_here
# Text-to-image (basic)
{SKILL_DIR}/scripts/gpt-image-gen.sh "A beautiful sunset over the ocean"
# Text-to-image with high quality and 4K resolution
{SKILL_DIR}/scripts/gpt-image-gen.sh "Cinematic wide-angle shot of a futuristic city skyline at dusk" --size 16:9 --resolution 4K --quality high
# Text-to-image with exact pixel size
{SKILL_DIR}/scripts/gpt-image-gen.sh "Minimalist logo design" --size 1024x1024 --quality medium
# Image editing (add elements to existing image)
{SKILL_DIR}/scripts/gpt-image-gen.sh "Add a cute kitten next to her" --image "https://example.com/input.png" --size 1:1
# Batch generation (multiple variants)
{SKILL_DIR}/scripts/gpt-image-gen.sh "Pixel art cute robot" --size 1:1 --resolution 2K --quality high --count 4
# With callback URL
{SKILL_DIR}/scripts/gpt-image-gen.sh "Abstract landscape painting" --callback "https://your-domain.com/webhook/image-done"
Script Output Protocol
The script writes structured lines to stdout that you must parse and act on:
| Line format | When | Your action |
|---|---|---|
| TASK_SUBMITTED: task_id=<id> estimated=<Ns> | Right after submission | Confirm to the user that generation has started. This means the API call succeeded, so do NOT retry. |
| STATUS_UPDATE: <message> | Every ~15s during generation | Relay to the user, e.g., "Still working on your image, about 30 seconds remaining..." |
| IMAGE_URL=<url> | On success (one line per image) | Extract the URL and present the image to the user |
| ELAPSED=<Ns> | On success | Optionally mention how long it took |
| POLL_TIMEOUT: task_id=<id> dashboard=<url> | Polling exceeded 5 min | Tell user: "Your image may already be done. Check <dashboard_url> (task: <id>)" |
| ERROR: ... (stderr) | On failure | Surface the error message to the user |
Critical: Once you see TASK_SUBMITTED:, the task is queued on the server. Do NOT run the script again. Retrying wastes the user's API credits. If the script times out locally, the image may still complete — tell the user to check their dashboard at https://evolink.ai/dashboard.
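The protocol above lends itself to a line-by-line dispatcher. The sketch below is hypothetical: handle_output and its "kind:payload" output format are illustrative names, and it relies only on the line prefixes documented in the table. ERROR lines go to stderr and would need to be captured separately.

```shell
# Hypothetical handler: read the script's stdout line by line and print a
# "kind:payload" summary for each recognized protocol line.
handle_output() {
  while IFS= read -r line; do
    case $line in
      TASK_SUBMITTED:*) echo "started:${line#TASK_SUBMITTED: }" ;;
      STATUS_UPDATE:*)  echo "progress:${line#STATUS_UPDATE: }" ;;
      IMAGE_URL=*)      echo "image:${line#IMAGE_URL=}" ;;
      ELAPSED=*)        echo "elapsed:${line#ELAPSED=}" ;;
      POLL_TIMEOUT:*)   echo "timeout:${line#POLL_TIMEOUT: }" ;;
      # Any other line is ignored.
    esac
  done
}

# Example usage (not executed here):
#   "{SKILL_DIR}/scripts/gpt-image-gen.sh" "prompt" | handle_output
```

With --count greater than 1, one "image:" line is emitted per IMAGE_URL= line, matching the one-line-per-image rule.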
Error Handling
Provide friendly, actionable messages:
| Error | What to tell the user |
|---|---|
| Invalid/missing key (401) | "Your API key doesn't seem to work. You can check it at https://evolink.ai/dashboard" |
| Insufficient balance (402) | "Your account balance is low. You can add credits at https://evolink.ai/dashboard" |
| Rate limited (429) | "Too many requests — let's wait a moment and try again" |
| Content blocked (400) | "This prompt was flagged by content moderation. Try adjusting the description" |
| Image file too large (400) | "One of the images is too large. Each image must be <=50MB" |
| Service unavailable (503) | "The service is temporarily busy. Let's try again in a minute" |
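The table can be read as a lookup from HTTP status to an error category, which the agent then turns into the matching friendly message. The sketch below is hypothetical: classify_error and its category names are illustrative, and the keywords used to tell the two 400 cases apart are assumptions, since the exact error text returned by the API is not given here.

```shell
# Hypothetical classifier: map an HTTP status (and optional error text)
# to one of the categories in the table above.
classify_error() {
  status=$1
  detail=$2
  case $status in
    401) echo "invalid_key" ;;
    402) echo "insufficient_balance" ;;
    429) echo "rate_limited" ;;
    503) echo "service_unavailable" ;;
    400)
      # Assumed keywords; the real error text may differ.
      case $detail in
        *large*) echo "image_too_large" ;;
        *moderation*|*flagged*) echo "content_blocked" ;;
        *) echo "unknown" ;;
      esac ;;
    *) echo "unknown" ;;
  esac
}
```

For an "unknown" result, surface the raw error message to the user instead of guessing at a cause.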
Model Capabilities Summary
Use this when the user asks what the model can do:
- Text-to-image: Describe what you want, get an image. Supports prompts up to 32,000 characters.
- Image editing: Provide 1-16 reference images and describe the edits you want.
- Sizes: 15 ratio presets (1:1, 16:9, 9:16, 2:1, 3:2, 4:3, 5:4, etc.) + exact pixel format (WxH, 16-aligned, 16-3840px per side)
- Resolution: 1K (~1MP), 2K (~4MP), 4K (~8.3MP) — only with ratio sizes
- Quality: low (fast/cheap), medium (balanced, default), high (best, ~4x cost)
- Batch: Generate 1-10 images per request
- Output: Image URLs are valid for 24 hours, so save the images promptly
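The exact pixel format constraint can be checked locally before submitting. This is a hypothetical sketch: valid_pixel_size is an illustrative name, and it assumes "16-aligned" means each side is a multiple of 16.

```shell
# Hypothetical validator for exact pixel sizes (WxH).
# Assumption: "16-aligned" means each side is a multiple of 16.
valid_pixel_size() {
  case $1 in
    *x*) ;;
    *) return 1 ;;             # not in WxH form (e.g. a ratio such as 16:9)
  esac
  w=${1%%x*}
  h=${1#*x}
  for side in "$w" "$h"; do
    case $side in
      ''|0*|*[!0-9]*) return 1 ;;   # empty, leading zero, or non-numeric
    esac
    [ "$side" -ge 16 ] && [ "$side" -le 3840 ] || return 1
    [ $((side % 16)) -eq 0 ] || return 1
  done
  return 0
}
```

For instance, 1024x1024 and 3840x2160 pass, while 1000x1000 fails the alignment check and 4096x1024 exceeds the per-side maximum.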
References
- references/api-params.md: Complete API parameter reference
- scripts/gpt-image-gen.sh: Generation script with automatic polling and error handling