gpt-image-2-skill
GPT Image 2 Skill
Skill by ara.so — Daily 2026 Skills collection.
A prompt gallery, CLI, and agentic skill for OpenAI's gpt-image-2 model. Provides 162 curated prompts across categories (research figures, UI mockups, typography, photography, anime, maps, product shots), a full-featured CLI, and skill integrations for Claude Code, Codex, and other agent runtimes.
Install
CLI (fastest)
# Run without installing
uvx --from git+https://github.com/wuyoscar/gpt_image_2_skill gpt-image -p "a cat astronaut"
# Install to PATH permanently
uv tool install git+https://github.com/wuyoscar/gpt_image_2_skill
gpt-image -p "a cat astronaut"
Claude Code
/plugin marketplace add wuyoscar/gpt_image_2_skill
/plugin install gpt-image@wuyoscar-skills
Codex
$skill-installer install https://github.com/wuyoscar/gpt_image_2_skill/tree/main/skills/gpt-image
Manual agent-skill install
git clone https://github.com/wuyoscar/gpt_image_2_skill.git
cd gpt_image_2_skill
export AGENT_SKILLS_DIR="/path/to/your/agent/skills"
mkdir -p "$AGENT_SKILLS_DIR"
ln -s "$PWD/skills/gpt-image" "$AGENT_SKILLS_DIR/gpt-image"
Configuration
The CLI and skill read your OpenAI key from the environment or ~/.env:
export OPENAI_API_KEY="sk-..."
No other configuration is required.
CLI Reference
Text → Image (generation)
# Basic generation
gpt-image -p "a photorealistic convenience store at 10pm"
# With size, quality, and explicit output file
gpt-image -p "a neon-lit Tokyo alley at midnight" \
--size portrait --quality high -f tokyo-alley.png
# Square, low quality (cheap draft)
gpt-image -p "watercolor mountains at sunrise" \
--size 1k --quality low -f draft.png
# Batch: generates 4 variants, saved as out_0.png … out_3.png
gpt-image -p "product shot of a ceramic mug on white" \
--size square --quality medium -n 4 -f out.png
Text + Reference Image → Image (edit / restyle)
# Single reference restyle
gpt-image -p "Make it a winter evening with heavy snowfall" \
-i chess.png --quality high -f chess-winter.png
# Multi-reference composite: dog from image 2, scene from image 1
gpt-image -p "Place the dog from image 2 next to the woman in image 1. \
Match the same lighting, composition, and background." \
-i woman.png -i dog.png --size portrait --quality medium -f woman-with-dog.png
Mask-based Inpainting
# opaque pixels = keep, transparent pixels = regenerate
gpt-image -p "replace sky with aurora borealis" \
-i photo.jpg -m sky_mask.png -f aurora.png
Full Parameter Reference
| Flag | Values | Default | Notes |
|---|---|---|---|
-p, --prompt |
string | required | Full prompt text |
-f, --file |
path | auto-timestamped .png |
Output file path |
-i, --image |
path (repeatable) | — | Triggers /v1/images/edits; pass multiple for multi-ref |
-m, --mask |
path (PNG with alpha) | — | Requires -i; transparent = regenerate |
--size |
1k 2k 4k portrait landscape square wide tall or 1024x1024 |
1024x1024 |
Literals must be 16-px multiples, max edge 3840 |
--quality |
auto low medium high |
high |
Budget dial: low=drafts, high=final/text-heavy |
-n, --n |
int | 1 | Batch count; suffixes files _0, _1, … |
--background |
auto opaque |
API default | opaque disables transparency |
--moderation |
auto low |
low |
low for broader exploration |
--format |
png jpeg webp |
png |
Response encoding format |
--compression |
0–100 | — | JPEG/WebP only |
Exit codes: 0 success · 1 API/refusal error · 2 bad args or missing key
Python SDK Usage
Text → Image
from openai import OpenAI
client = OpenAI() # reads OPENAI_API_KEY from environment
result = client.images.generate(
model="gpt-image-2",
prompt="A photorealistic ceramic mug on a white studio background, "
"soft directional light, light shadow beneath",
size="1024x1024", # square
quality="high",
)
# Save result
import base64
from pathlib import Path
image_bytes = base64.b64decode(result.data[0].b64_json)
Path("mug.png").write_bytes(image_bytes)
print("Saved mug.png")
Portrait / Tall Generation
result = client.images.generate(
model="gpt-image-2",
prompt="Minimalist event poster: 'Boston Spring Jazz Festival · April 2026' "
"in bold serif, pastel cherry-blossom watercolor background, centered layout",
size="1024x1536", # portrait (3:4)
quality="high",
)
Image Edit (single reference)
result = client.images.edit(
model="gpt-image-2",
image=open("chess.png", "rb"),
prompt="Make it a winter evening with heavy snowfall, keep the chess pieces identical",
size="1024x1024",
quality="high",
)
Multi-Reference Edit
result = client.images.edit(
model="gpt-image-2",
image=[open("woman.png", "rb"), open("dog.png", "rb")],
prompt="Place the dog from image 2 next to the woman in image 1. "
"Match the same lighting, composition, and background. "
"Do not change anything else.",
size="1024x1536",
quality="medium",
)
Mask-Based Inpainting
result = client.images.edit(
model="gpt-image-2",
image=open("photo.jpg", "rb"),
mask=open("sky_mask.png", "rb"), # transparent = regenerate
prompt="Replace the sky with dramatic aurora borealis, keep everything below the horizon identical",
size="1024x1024",
quality="high",
)
Batch Generation with Saving
import base64
from pathlib import Path
from openai import OpenAI
def generate_batch(prompt: str, n: int = 4, size: str = "1024x1024",
quality: str = "medium", out_prefix: str = "variant") -> list[Path]:
client = OpenAI()
result = client.images.generate(
model="gpt-image-2",
prompt=prompt,
size=size,
quality=quality,
n=n,
)
paths = []
for i, item in enumerate(result.data):
path = Path(f"{out_prefix}_{i}.png")
path.write_bytes(base64.b64decode(item.b64_json))
paths.append(path)
print(f"Saved {path}")
return paths
# Usage
variants = generate_batch(
prompt="product shot of a blue glass water bottle, white background, studio lighting",
n=4,
quality="low", # cheap sweep; rerun winner at high
)
Prompt Engineering Patterns
Structure template
[background/scene] → [subject] → [key details] → [constraints/intended use]
Research paper figure
gpt-image -p "Clean scientific diagram: transformer architecture overview. \
White background, labeled encoder/decoder blocks with arrows, \
color-coded attention heads in teal and orange, \
sans-serif labels, publication-ready, 4K resolution" \
--size landscape --quality high -f transformer-diagram.png
UI mockup
gpt-image -p "Mobile app UI mockup, iOS style, dark mode. \
Fitness tracking dashboard: circular progress ring in neon green, \
daily steps '8,432', heart rate '74 bpm', \
bottom nav with 4 icons, pixel-perfect, no lorem ipsum" \
--size portrait --quality high -f fitness-app.png
Typography poster
gpt-image -p "Event poster. Text: 'SUMMER SONIC 2026' in bold condensed sans-serif. \
Subtext: 'Tokyo · August 9–10'. Vivid sunset gradient background (magenta to amber). \
Geometric grid overlay, high contrast, print-ready" \
--size portrait --quality high -f poster.png
Photorealistic product shot
gpt-image -p "Photorealistic product photo: matte black insulated coffee thermos, \
condensation droplets, placed on dark slate surface, \
single soft key light from upper-left, shallow depth of field, \
shot on Canon 5D, 85mm lens, commercial quality" \
--size square --quality high -f thermos.png
Put required text in quotes
# Any text that must appear verbatim in the image — put in straight quotes in the prompt
prompt = '''Storefront sign reading "OPEN 24/7" in red neon.
Below it: "Est. 1987" in smaller white block letters.
Realistic neon glow, night scene, rain-slicked pavement.'''
Quality / Budget Strategy
| Stage | --quality |
When to use |
|---|---|---|
| Exploration sweep | low |
Generating 8–16 variants to find direction |
| Normal iteration | medium |
Style probing, layout checks |
| Final / shipping | high |
In-image text, dense diagrams, posters, paper figures |
Rule of thumb: start every new concept at low, run 4 variants, pick the best, then rerun at high.
# Step 1: cheap sweep
gpt-image -p "minimalist logo for a coffee brand" --quality low -n 4 -f logo.png
# Step 2: pick winner (e.g. logo_2.png), rerun at high
gpt-image -p "minimalist logo for a coffee brand" --quality high -f logo-final.png
Size Reference
| Alias | Pixels | Ratio | Best for |
|---|---|---|---|
square / 1k |
1024×1024 | 1:1 | Social posts, icons, product shots |
portrait |
1024×1536 | 2:3 | Mobile UI, posters, stories |
landscape |
1536×1024 | 3:2 | Web banners, diagrams |
wide |
1792×1024 | 7:4 | Cinematic, hero sections |
tall |
1024×1792 | 4:7 | Long-form mobile content |
2k |
2048×2048 | 1:1 | High-res assets |
Common Patterns & Recipes
Virtual try-on (multi-ref edit)
# image 1 = person, image 2 = garment
result = client.images.edit(
model="gpt-image-2",
image=[open("person.png", "rb"), open("shirt.png", "rb")],
prompt="Dress the person in image 1 wearing the shirt from image 2. "
"Keep the person's face, pose, and background identical. "
"Natural fabric draping and lighting.",
size="1024x1536",
quality="high",
)
Billboard / signage mockup
result = client.images.edit(
model="gpt-image-2",
image=open("billboard_photo.jpg", "rb"),
mask=open("billboard_mask.png", "rb"),
prompt='Replace the billboard face with: "SALE ENDS SUNDAY" '
'in bold white text on solid red background. '
'Match perspective and lighting of surrounding scene.',
size="1536x1024",
quality="high",
)
Anime / manga style transfer
gpt-image -p "Anime key visual style (Studio Ghibli-inspired): \
young woman standing on a hillside overlooking a coastal town at golden hour, \
painterly backgrounds, soft cel shading, \
detailed environmental storytelling, cinematic composition" \
--size landscape --quality high -f anime-scene.png
Translation / text replacement edit
# Replace text in an existing image in a different language
result = client.images.edit(
model="gpt-image-2",
image=open("menu_english.png", "rb"),
prompt='Replace all English text with Japanese translations. '
'Keep the exact same layout, fonts, colors, and imagery. '
'Translate "Grilled Salmon" → "グリルサーモン", '
'"Caesar Salad" → "シーザーサラダ".',
size="1024x1024",
quality="high",
)
Troubleshooting
OPENAI_API_KEY not found
export OPENAI_API_KEY="sk-..."
# or add to ~/.env — the CLI reads it automatically
Refusal / content policy error (exit code 1)
- Full API response is echoed to stderr
- Try rephrasing: be more descriptive and less ambiguous about intent
- Switch
--moderation auto→--moderation lowfor broader exploration (already the CLI default)
Text in image is garbled or wrong
- Always use
--quality highfor any prompt containing required text - Wrap required text in straight quotes inside the prompt string
- Keep required text short (under ~6 words per element)
Multi-ref edit ignores one image
- Be explicit: "the subject in image 1", "the object in image 2"
- Reduce to two input images; more than two can cause ambiguity
Size rejected by API
- Dimensions must be multiples of 16
- Max edge: 3840px
- Max aspect ratio: 3:1
- Total pixels: 655,360–8,294,400
gpt-image-2 rejects --input-fidelity
input-fidelityis agpt-image-1/1.5parameter; the CLI drops it automatically forgpt-image-2
Slow generation at high quality
Expected — high quality is significantly slower. Use low for drafts, high only for finals.
Prompt Gallery Categories
The skill ships 162 prompts split across category files under skills/gpt-image/references/:
gallery-research-paper-figures.md— diagrams, charts, architecture visualsgallery-ui-ux-mockups.md— mobile/web UI, dashboards, design systemsgallery-product-and-food.md— product shots, food styling, e-commercegallery-typography.md— posters, signage, letteringgallery-photography.md— portrait, landscape, macro, streetgallery-anime-manga.md— key visuals, character design, backgroundsgallery-maps.md— illustrated maps, infographic cartography- Start with
gallery.mdas a routing index to pick the right category file