skills/cdeistopened/skill-stack/image-prompt-generator

image-prompt-generator

SKILL.md

Image Prompt Generator

Generate professional, non-generic images using Google's Gemini API for image generation.

Prerequisites & Setup

Getting Your Gemini API Key

  1. Go to Google AI Studio
  2. Sign in with your Google account
  3. Click "Create API Key"
  4. Copy the generated key

Configuring the API Key

Option 1: Environment file (recommended)

Create a .env file in your project root:

GEMINI_API_KEY=your_api_key_here

Option 2: Direct environment variable

export GEMINI_API_KEY=your_api_key_here

Install Dependencies

pip install google-generativeai python-dotenv pillow

Available Models

Model API Name Best For
Flash gemini-2.5-flash-image Speed, drafts, iteration
Pro gemini-3-pro-image-preview Final assets, 16:9 aspect ratio, quality

Image Size (CRITICAL for quality)

Size Output Use For
1K ~400KB Drafts, iteration
2K ~2MB Thumbnails, web assets (default)
4K ~7MB Print, high-resolution needs

CRITICAL: Always specify --size 2K or --size 4K for final assets. Without this, output defaults to 1K which looks low-quality.

Skill Stack Thumbnails - MANDATORY SETTINGS

For ANY Skill Stack thumbnail, you MUST use:

python scripts/images/generate_image.py "YOUR PROMPT" \
  --model pro \
  --size 2K \
  --aspect 16:9 \
  --output public/images/thumbnails \
  --name your-slug

And your prompt MUST include risograph style instructions:

STYLE: Risograph / screen print aesthetic. Visible halftone dots throughout. 
Slight misregistration between color layers. Indie printmaking quality - warm, tactile, handmade.

COLOR: Warm cream base. Charcoal and gray tones. Terracotta (#D4654A) accent on ONE key element.

TEXTURE: Visible halftone pattern. Paper grain. NOT smooth digital gradients.

AVOID: Empty white borders, photorealism, glossy renders, smooth gradients, clip art aesthetic.

WHY: Without --size 2K, output is ~400KB garbage that looks like it was made with Flash. Without risograph style, output looks generic and AI-generated.


Workflow Overview

  1. Brainstorm Concepts - Generate 4-6 high-level visual ideas
  2. Select Direction - User picks the concept they like
  3. Optimize Prompt - Refine into a strong, detailed prompt
  4. Style Variations - Adapt to 2-3 different visual styles
  5. Generate Images - Run via Gemini API

Step 1: Brainstorm Concepts

When the user provides a topic or use case, generate 3 high-level visual concepts and indicate your pick. Each concept should be:

  • One sentence describing the visual idea
  • Concrete and immediate - you can picture it instantly
  • Conceptual but not abstract - a clear object/scene with meaning
  • Non-generic - avoid cliches (no lightbulbs for ideas, no handshakes for partnership)

Format:

**Concept A:** One sentence description of the visual concept and why it works.

**Concept B:** One sentence description...

**Concept C:** One sentence description...

**My pick: Concept X** — Brief reasoning why this one is strongest.

Example for "newsletter about personal productivity":

**Concept A:** A vintage compass where the needle points toward a coffee ring stain on a map—direction emerges from daily rituals.

**Concept B:** A clock where the 12 hours show seasonal changes—time management over long arcs, not just hours.

**Concept C:** One small key attached to dozens of decorative keychains—we overcomplicate simple solutions.

**My pick: Concept A** — The coffee stain is unexpected and personal, the compass is concrete without being cliché.

Wait for user to confirm or redirect before proceeding.

Step 2: Optimize the Prompt

Once the user selects a concept, develop it into a full prompt. Structure:

Create a [style type] illustration of [subject].

CONCEPT: [Expand the one-sentence idea into a clear visual description]

STYLE: [Artistic approach - load from references/styles/ if brand-specific]

COMPOSITION: [Framing, focal point, negative space, balance]

COLORS: [Palette - describe by name, not hex codes which may render as text]

TEXTURE: [Surface qualities, analog/digital feel]

AVOID: [What should NOT appear - be specific]

FORMAT: [Aspect ratio]

Key principles:

  • Natural language, full sentences - no tag soup
  • Describe colors by name (burnt orange, sky blue, near-black) not hex codes
  • Maximum 2-3 elements - if it feels busy, remove something
  • Favor metaphor over literal depiction

Step 3: Style Variations

MANDATORY for Skill Stack: Risograph style. Every thumbnail must include risograph style instructions in the prompt. See references/styles/risograph.md.

Key risograph elements to ALWAYS include in prompt:

  • "Visible halftone dots throughout"
  • "Slight misregistration between color layers"
  • "Indie printmaking quality - warm, tactile, handmade"
  • "NOT smooth digital gradients"

Available styles in references/styles/ (for non-Skill-Stack projects):

  • risograph.md - DEFAULT. Halftone dots, misregistration, indie printmaking aesthetic.
  • minimalist-ink.md - High-contrast black and white, crosshatching.
  • watercolor-line.md - Ink linework with watercolor washes.
  • editorial-conceptual.md - Conceptual, sophisticated, editorial wit.

Step 4: Generate via API

Running the Script

# Generate 2K thumbnail (recommended for web)
python scripts/images/generate_image.py "prompt here" --model pro --aspect 16:9 --size 2K

# Generate 4K for print/high-res
python scripts/images/generate_image.py "prompt here" --model pro --size 4K

# Save to specific folder
python scripts/images/generate_image.py "prompt" --output "./images" --name "my_image" --size 2K

Options:

  • --model pro (default, higher quality) or --model flash (faster)
  • --size 2K (default), 1K, or 4K - always use 2K or 4K for final assets
  • --aspect 16:9 (default), 1:1, 9:16, 3:4, 4:3
  • --variations N - generate N versions
  • --output ./path - save location
  • --name prefix - filename prefix

Output location: Save images alongside the content they belong to - not a generic images dump.

Step 5: Iterate

After user reviews generated images:

  • 80% good? Request specific edits conversationally rather than regenerating
  • Composition off? Adjust framing or element placement in prompt
  • Wrong style? Try a different style reference
  • Too busy? Simplify to fewer elements
  • Colors wrong? Be more explicit about palette

Prompting Principles

Write Like a Creative Director

Brief the model like a human artist. Use proper grammar, full sentences, and descriptive adjectives.

Don't Do
"Cool car, neon, city, night, 8k" "A cinematic wide shot of a futuristic sports car speeding through a rainy Tokyo street at night. The neon signs reflect off the wet pavement and the car's metallic chassis."

Be specific about:

  • Subject: Instead of "a woman," say "a sophisticated elderly woman wearing a vintage chanel-style suit"
  • Materiality: Describe textures - "matte finish," "brushed steel," "soft velvet," "crumpled paper"
  • Setting: Define location, time of day, weather
  • Lighting: Specify mood and light source
  • Mood: Emotional tone of the image

Provide Context

Context helps the model make logical artistic decisions. Include the "why" or "for whom."

Example: "Create an image of a sandwich for a Brazilian high-end gourmet cookbook." (Model infers: professional plating, shallow depth of field, perfect lighting)

Keep It Simple

  • One clear focal point
  • Maximum 2-3 elements total
  • Generous negative space
  • If it feels busy, remove something

Avoid the Generic

Hard rules:

  • NEVER include text in images - Text renders poorly and looks amateur
  • No lightbulbs for "ideas"
  • No handshakes for "partnership"
  • No audio waveforms for "podcasts" or "audio"
  • No gears/cogs for "systems" or "process"
  • No happy stock photo poses
  • No glossy AI aesthetic
  • No puzzle pieces for "connection" or "integration"

Metaphors That Work

Strong concepts from past generations:

  • Orange cut open revealing neural network - organic exterior, systematic interior (knowledge systems, RAG)
  • Mail slot with origami bird emerging - delivery transforms into something alive (newsletters)
  • Exercise club crossed with quill pen - physical culture meets intellectual work (body + mind)
  • Typewriter key extreme close-up with radiating impact - decisive action, command (books, authority)
  • Medicine dropper with mandala bloom - clinical precision meets transcendence (therapy, transformation)
  • Straight razor on leather strop - precision tool, polishing, refinement (editing, production)
  • Prism dispersing light into distinct beams - one input, multiple outputs (frameworks)
  • Compass with coffee stain on map - direction from daily rituals (productivity)

Resources

references/styles/

Aesthetic style definitions:

  • risograph.md - DEFAULT - Halftone, misregistration, indie printmaking
  • minimalist-ink.md - Black and white ink illustration
  • watercolor-line.md - Ink with watercolor washes
  • editorial-conceptual.md - Conceptual editorial style

scripts/

  • generate_image.py - Gemini API image generation

Prompt Modifiers Reference

Category Examples
Lighting golden hour, dramatic shadows, soft diffused light, neon glow, overcast
Style cinematic, editorial, technical diagram, hand-drawn, photorealistic
Texture matte finish, brushed steel, soft velvet, crumpled paper, weathered wood
Composition wide shot, close-up, bird's eye view, dutch angle, symmetrical
Mood energetic, serene, dramatic, playful, sophisticated
Quality 4K, high-fidelity, pixel-perfect, professional grade

Advanced Capabilities

Text Rendering & Infographics

Put exact text in quotes. Specify style: "polished editorial," "technical diagram," or "hand-drawn whiteboard."

Example prompts:

Earnings Report Infographic:
"Generate a clean, modern infographic summarizing the key financial highlights from this earnings report. Include charts for 'Revenue Growth' and 'Net Income', and highlight the CEO's key quote in a stylized pull-quote box."
Whiteboard Summary:
"Summarize the concept of 'Transformer Neural Network Architecture' as a hand-drawn whiteboard diagram suitable for a university lecture. Use different colored markers for the Encoder and Decoder blocks, and include legible labels for 'Self-Attention' and 'Feed Forward'."

Character Consistency & Thumbnails

Use reference images and state "Keep the person's facial features exactly the same as Image 1." Describe expression/action changes while maintaining identity.

Example prompt:

Viral Thumbnail:
"Design a viral video thumbnail using the person from Image 1.
Face Consistency: Keep the person's facial features exactly the same as Image 1, but change their expression to look excited and surprised.
Action: Pose the person on the left side, pointing their finger towards the right side of the frame.
Subject: On the right side, place a high-quality image of a delicious avocado toast.
Graphics: Add a bold yellow arrow connecting the person's finger to the toast.
Text: Overlay massive, pop-style text in the middle: 'Done in 3 mins!'. Use a thick white outline and drop shadow.
Background: A blurred, bright kitchen background. High saturation and contrast."

Image Reworking (Edit Existing Images)

The --input flag enables "rework mode" - pass an existing image to Gemini and describe the changes you want.

Key use cases:

  • Small tweaks - Adjust colors, add/remove elements, change lighting
  • Style transfer - Keep composition but change artistic style
  • Object manipulation - Remove, add, or modify specific objects
  • Seasonal/temporal changes - Same scene, different time/season

Running in rework mode:

# Basic edit - add something
python scripts/generate_image.py "Add snow to the roof and yard" \
  --input ./house.png \
  --model pro

# Color adjustment
python scripts/generate_image.py "Change the accent color from red to teal, keep everything else identical" \
  --input ./thumbnail.png \
  --model pro

# Style transfer - keep composition, change aesthetic
python scripts/generate_image.py "Convert this to risograph style with halftone dots and slight color misregistration" \
  --input ./photo.png \
  --model pro

# Generate variations of an edit
python scripts/generate_image.py "Make the lighting warmer, like golden hour" \
  --input ./portrait.png \
  --variations 3 \
  --model pro

Prompting tips for rework mode:

  1. Be specific about what to preserve:

    • "Keep the person's facial features exactly the same"
    • "Maintain the composition and framing"
    • "Don't change the background"
  2. Be explicit about what to change:

    • "Change ONLY the color of the shirt from blue to red"
    • "Add snow to the roof and nothing else"
    • "Remove the text overlay"
  3. Use comparative language:

    • "Make the colors more vibrant"
    • "Increase the contrast slightly"
    • "Make the lighting softer and more diffused"

Output naming: Files from rework mode are named {prefix}_{timestamp}_edit_{model}.png to distinguish from generated images (_gen_).

Advanced Editing Examples

Object Removal:

python scripts/generate_image.py \
  "Remove the tourists from the background and fill with matching cobblestones and storefronts" \
  --input ./street-photo.png \
  --model pro

Seasonal Control:

python scripts/generate_image.py \
  "Turn this into winter. Add snow to the roof and yard. Change lighting to cold, overcast afternoon. Keep architecture identical." \
  --input ./house-summer.png \
  --model pro

Character Consistency (thumbnail series):

python scripts/generate_image.py \
  "Keep the person's face exactly the same. Change expression to surprised. Add a pointing gesture toward the right side of the frame." \
  --input ./person-reference.png \
  --model pro

Related Skills

  • youtube-title-creator - Pair generated images with optimized titles
  • social-content-creation - Use images in platform-optimized posts

For custom brand styles, create new style files in references/styles/ following the existing format

Weekly Installs
13
GitHub Stars
4
First Seen
Jan 27, 2026
Installed on
opencode12
github-copilot12
codex12
gemini-cli12
kimi-cli11
amp11