# GPT Image 2
A single Python entrypoint that covers every GPT Image 2 route, with strict pre-flight validation of the model's size, aspect, and feature constraints.
## Workflow
- Open `references/config.md` to pick environment variables and defaults.
- Open `references/api-surface.md` to choose between `generations`, `edits`, and `responses`.
- Prefer `OPENAI_BASE_URL=https://api.openai.com/v1` unless the user asks for a different OpenAI-compatible endpoint.
- Use `gpt-image-2` for `generations` and `edits`; use a text-capable Responses model such as `gpt-5.4` for `responses`.
- Run `scripts/gpt_image.py` with one of the three subcommands.
- Add `--dry-run` first when the payload shape is the main risk.
- Add `--save-response <path>` when the raw JSON body or SSE event stream needs to be kept for debugging.
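The base-URL default above, combined with the precedence described under Rules (CLI flags over process environment over `.env`), can be sketched roughly as follows. This is an illustration only; the function name and `dotenv` parameter are hypothetical and are not taken from `scripts/gpt_image.py`:

```python
import os

# Illustrative precedence sketch: CLI flag > process env > .env file > default.
# The real resolution logic lives in scripts/gpt_image.py and may differ.
def resolve_base_url(cli_value=None, dotenv=None):
    dotenv = dotenv or {}
    if cli_value:
        return cli_value
    if os.environ.get("OPENAI_BASE_URL"):
        return os.environ["OPENAI_BASE_URL"]
    if dotenv.get("OPENAI_BASE_URL"):
        return dotenv["OPENAI_BASE_URL"]
    return "https://api.openai.com/v1"
```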
## Commands
Text-to-image through the public Images API:
```powershell
python .\skills\gpt-image-2\scripts\gpt_image.py generations `
  --prompt "A bold product hero image for a developer tool homepage" `
  --output .\out\hero.png `
  --size 1536x1024 `
  --quality high `
  --format png
```
Multi-image batch with a filename pattern:
```powershell
python .\skills\gpt-image-2\scripts\gpt_image.py generations `
  --prompt "A cinematic city skyline at night" `
  --output .\out\skyline-{index}.webp `
  --n 3 `
  --format webp `
  --compression 90
```
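The `{index}` placeholder in `--output` can be pictured as a simple pattern expansion over the `--n` results. A minimal sketch, assuming zero-based indexing (the real script may start at 1 or name files differently):

```python
# Hypothetical sketch of --output pattern expansion; not the script's actual code.
def expand_output(pattern: str, n: int) -> list[str]:
    if n > 1 and "{index}" not in pattern:
        raise ValueError("--n > 1 requires an {index} placeholder in --output")
    # str.format leaves patterns without {index} untouched for single outputs.
    return [pattern.format(index=i) for i in range(n)]
```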
Image edits with two inputs plus a mask:
```powershell
python .\skills\gpt-image-2\scripts\gpt_image.py edits `
  --prompt "Blend the two references into one clean marketing illustration" `
  --image .\refs\subject.png `
  --image .\refs\background.png `
  --mask .\refs\mask.png `
  --output .\out\edit-{index}.png `
  --image-field-style brackets `
  --n 2
```
Responses API with streaming and partial previews:
```powershell
python .\skills\gpt-image-2\scripts\gpt_image.py responses `
  --input-text "Generate a poster for an AI developer summit" `
  --model gpt-5.4 `
  --output .\out\poster-{index}.png `
  --stream `
  --partial-images 2 `
  --save-response .\out\poster-events.json
```
Responses API edit with a local image plus a mask:
```powershell
python .\skills\gpt-image-2\scripts\gpt_image.py responses `
  --input-text "Turn this product shot into a clean studio ad" `
  --model gpt-5.4 `
  --input-image .\refs\product.png `
  --mask .\refs\mask.png `
  --output .\out\studio.png `
  --action edit
```
Inspect the built request without sending it:
```powershell
python .\skills\gpt-image-2\scripts\gpt_image.py generations `
  --prompt "A minimal cover image" `
  --output .\out\cover.png `
  --dry-run
```
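Conceptually, `--dry-run` amounts to building the request body and printing it instead of sending it. A minimal sketch of such a payload builder, assuming field names from the public Images API and deliberately omitting `response_format` per the rules in this file (the actual fields the script emits may differ):

```python
import json

# Hypothetical payload builder for illustration; not the script's actual code.
def build_generations_payload(prompt, size="1024x1024", quality="high",
                              n=1, output_format="png"):
    # response_format is deliberately omitted: GPT image models already
    # return image data as base64.
    payload = {
        "model": "gpt-image-2",
        "prompt": prompt,
        "size": size,
        "quality": quality,
        "n": n,
        "output_format": output_format,
    }
    return json.dumps(payload, indent=2)
```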
## Rules
- Use `generations` for public text-to-image calls.
- Use `edits` for multipart image edits and mask uploads.
- Use `responses` for advanced flows: streaming, mixed text + image input, `previous_response_id`, `tool_choice`, `action`, and optional `tool_model`.
- Process environment variables override `.env`; CLI flags override both.
- Never print secrets.
- `--output` takes either a single path or a pattern such as `image-{index}.png` for multi-image or streaming flows.
- `responses` uses a top-level Responses model separate from the image model; default it to `gpt-5.4` unless you need another text-capable model.
- `quality` on Responses tool flows is passed through, but final behavior still depends on the hosted image tool.
- On OpenAI GPT image models, omit `response_format`; image data already comes back as base64.
- Fail fast on unsupported `gpt-image-2` combinations: transparent background, invalid size, `partial_images` outside `0..3`, or `stream=true` with `n>1` on public Images routes.
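The fail-fast checks in the last rule can be sketched as a single pre-flight validator. The set of valid sizes below is an assumption for illustration (the document's commands use `1536x1024`), and the parameter names are hypothetical:

```python
# Illustrative pre-flight validation; consult references/api-surface.md for
# the authoritative constraints.
VALID_SIZES = {"1024x1024", "1536x1024", "1024x1536"}  # assumed set

def validate(size="1024x1024", background="auto", partial_images=0,
             stream=False, n=1, route="generations"):
    errors = []
    if size not in VALID_SIZES:
        errors.append(f"unsupported size: {size}")
    if background == "transparent":
        errors.append("transparent background is not supported on gpt-image-2")
    if not 0 <= partial_images <= 3:
        errors.append("partial_images must be in 0..3")
    if stream and n > 1 and route in {"generations", "edits"}:
        errors.append("stream=true with n>1 is not allowed on public Images routes")
    if errors:
        raise ValueError("; ".join(errors))  # fail fast before any network call
```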
## Resources
- Script: `scripts/gpt_image.py`
- Config reference: `references/config.md`
- API surface reference: `references/api-surface.md`