image-generation

Installation
SKILL.md

Image Generation

Generate and edit images using AI models. The script automatically picks a backend based on which API keys are configured — you don't need to specify a model unless the user explicitly names one.

Supported models (passed via model only when the user asks for a specific one):

  • OpenAIgpt-image-2, gpt-image-1
  • Gemini Nano Banananano-banana-2, nano-banana-pro, nano-banana
  • Seedream (Volcengine Ark)seedream-5.0-lite, seedream-4.5
  • Qwen (DashScope)qwen-image-2.0, qwen-image-2.0-pro
  • MiniMaximage-01

Usage

Run scripts/generate.py with a JSON argument. The path is relative to this skill's base_dir.

python <base_dir>/scripts/generate.py '<json_args>'

Set bash timeout to at least 600 seconds, as image generation can take 30–200s per provider, and the script may try multiple providers sequentially.

Parameters

Parameter Type Required Default Description
prompt string yes Image description
image_url string / list no null Input image(s) for editing: local file path or URL. Multi-image fusion is supported (pass a list)
quality string no auto low / medium / high (only some backends honour this)
size string no auto 512 / 1K / 2K / 3K / 4K, or pixel value (1024x1024)
aspect_ratio string no null 1:1 / 3:2 / 2:3 / 16:9 / 9:16 / 21:9 (some backends also support extreme ratios like 1:4 / 8:1)

Higher quality and larger size cost more and run slower. In normal cases, when the user does not explicitly specify, low or medium is sufficient. Only use high when the user asks for it.

Example — generate

python <base_dir>/scripts/generate.py '{"prompt": "A corgi astronaut floating in space"}'

With aspect ratio:

python <base_dir>/scripts/generate.py '{"prompt": "Isometric miniature city of Shanghai at sunset", "size": "2K", "aspect_ratio": "16:9"}'

Important: Editing vs Generating

When the user asks to edit, modify, or improve an existing image, pass the original image via image_url. Prefer local file paths directly — the script handles file reading internally. Without image_url, the script generates a brand-new image instead of editing.

Example — edit (image-to-image)

python <base_dir>/scripts/generate.py '{"prompt": "Add a Santa hat to the dog", "image_url": "/path/to/dog.png"}'

Multi-image fusion — pass a list:

python <base_dir>/scripts/generate.py '{"prompt": "Combine these characters into a group photo", "image_url": ["/path/a.png", "/path/b.png"]}'

Output

Prints JSON to stdout:

{
  "model": "doubao-seedream-5-0-260128",
  "images": [
    {"url": "/path/to/output.png"}
  ]
}

After success, display the image to the user. You can either embed it in markdown (![description](/path/to/output.png)) or use the send tool.

On error:

{
  "error": "error message"
}

Setup

The script needs at least one of these API keys (set via env_config or config.json):

OPENAI_API_KEY / GEMINI_API_KEY / ARK_API_KEY / DASHSCOPE_API_KEY / MINIMAX_API_KEY / LINKAI_API_KEY

Each also has an optional *_API_BASE for custom endpoints. The script automatically picks the first configured backend and falls back to the next if it fails — no need to specify a model.

Error Handling

If the script returns an error after trying all configured backends, do NOT retry with the same parameters — the failure is almost always a configuration issue (wrong API key, unsupported API base). Tell the user to fix it via env_config, then retry.

Notes

  • HTTP timeout is 300s — high-resolution generation can take over 200s.
  • Omit quality / size to let the model pick automatically (auto).
  • Input images for editing are auto-compressed to ≤ 4MB / longest edge ≤ 4096px.
Installs
3
GitHub Stars
44.0K
First Seen
Apr 23, 2026