ASO Screenshot Generator — iOS & Android

Generate 6-8 polished, high-converting App Store and Google Play screenshots with device frames, benefit-driven headlines, and multi-language support. Produces pixel-perfect images for both platforms simultaneously.

Output dimensions:

iOS (Apple App Store): 1320 × 2868 px (iPhone 6.9" / 16 Pro Max)
Android (Google Play): 1080 × 1920 px (standard phone)

Phase 1 — Configuration & Recall

Version check

Before anything else, check if a newer version of this skill is available:

python3 update.py --check

If a new version is available, inform the user and suggest updating:

python3 update.py --update

Check for saved progress

Check Claude Code memory for any saved progress on this project:

Look for memory entries related to "aso-screenshots", "store-screenshots", or the project name
If found, present status of completed phases and ask whether to resume or start fresh

Gather inputs from user

Ask the user for the following configuration. Present as a clean checklist:

Target platforms — Which platforms? (default: phone only — iOS and Android)
- iOS phone (Apple App Store) — 1320×2868
- Android phone (Google Play Store) — 1080×1920
- iPad Pro 13" (App Store Connect) — 2064×2752
- Android tablet (Google Play Large Screen) — 1600×2560
- Phone only (iOS + Android)
- All platforms (phone + tablet, both iOS and Android)
Number of screenshots — How many per platform? (default: 6)
- 6 screenshots (recommended minimum)
- 8 screenshots (maximum coverage)
Language — What language for headline text? (default: English)
- Recommended: Generate in English first, then use Gemini to translate to additional languages (see Phase 6)
- Or specify a single language directly (English, Turkish, Spanish, German, Japanese, etc.)
- English-first approach is faster and produces more consistent results across languages
Screenshot source — Where are the app screenshots coming from?
- Auto-capture: I'll guide you through taking simulator screenshots
- Manual input: User provides screenshot image files directly (PNG/JPG paths)
Brand colour — Background hex colour (e.g., #E31837)
- If not provided, attempt to auto-detect from codebase (look for theme colors, brand constants, primary colors in the project)
- If no codebase available, ask the user
Gemini AI Model (Required) — All screenshots are enhanced with Gemini AI
- Model: Which Gemini model to use (must pick one)
  - nano-banana-pro — Higher quality, slower (recommended for final output)
  - nano-banana-2 — Faster, good quality (recommended for iteration)
- API key: Gemini API key for authentication (required)
  - Can be provided directly, via GEMINI_API_KEY environment variable, or saved to .gemini_config.json
  - To save the key for future use: python3 gemini_enhance.py --save-key "YOUR_KEY"
- IMPORTANT: The workflow will not proceed without a valid Gemini API key and model selection

Save all configuration to memory:

memory_key: aso-screenshots-config
data: { platforms, count, language, source_mode, brand_colour, project_name, gemini_model, gemini_api_key_configured }

Phase 2 — Benefit Discovery

Identify the app's core value propositions that will become screenshot headlines.

If codebase is available:

Analyze the app's codebase to identify 6-8 core benefits (matching requested screenshot count)
Focus on features that are visually demonstrable in screenshots
IMPORTANT: Skip any paywall, subscription, or in-app purchase screens — these must NOT appear in store screenshots

If no codebase (manual mode):

Ask the user to describe their app's key features and benefits
Collaboratively refine the list

Headline format

Each benefit needs two parts:

Verb — A bold action word (1-2 words max). Examples: TRACK, DISCOVER, ORGANIZE, CONNECT, CREATE, MONITOR, PLAN, SHARE
Descriptor — What the user achieves (2-5 words). Examples: YOUR DAILY MOOD, NEW CONNECTIONS, WEEKLY GOALS

Language adaptation: If the configured language is not English, translate both verb and descriptor naturally. Don't do literal translations — use idiomatic expressions that sound compelling in the target language.

CRITICAL — Language-specific characters: When using non-English languages, pay strict attention to special characters and their correct uppercase/lowercase forms. Headlines are rendered in ALL CAPS, so correct uppercasing is essential:

Turkish: İ/i (not I/i), Ç/ç, Ö/ö, Ü/ü, Ğ/ğ, Ş/ş — e.g., "İZLE" not "IZLE", "GÜÇLÜ" not "GUCLU"
German: Ä/ä, Ö/ö, Ü/ü, ß/SS — e.g., "ÜBERBLICK" not "UBERBLICK"
Spanish/French: accented characters (É, Ñ, À, etc.) must be preserved in uppercase
All languages: Never strip, replace, or ASCII-ify special characters. They are required for correct spelling and readability.
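
As a rough illustration of why these rules matter in code: Python's built-in str.upper() is not locale-aware, so it turns a Turkish "i" into "I" rather than "İ". The sketch below shows one way to uppercase a headline safely. upper_headline is a hypothetical helper for illustration and is not part of compose.py.

```python
def upper_headline(text: str, lang_code: str) -> str:
    """Uppercase a headline, handling the Turkish dotted and dotless i."""
    if lang_code == "tr":
        # str.upper() maps both "i" and "ı" to "I"; Turkish needs "i" -> "İ".
        text = text.replace("i", "İ").replace("ı", "I")
    # For other languages str.upper() keeps accents and expands "ß" to "SS".
    return text.upper()


print(upper_headline("izle", "tr"))       # İZLE
print(upper_headline("überblick", "de"))  # ÜBERBLICK
```

The same approach extends to any other language whose casing rules differ from the Unicode defaults.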

Present to user

Show the complete benefit list as a numbered table:

#	Verb	Descriptor	Combined Headline
1	TRACK	YOUR DAILY MOOD	TRACK YOUR DAILY MOOD
2	...	...	...

Ask user to approve, modify, or reorder. Iterate until approved.

Save to memory:

memory_key: aso-screenshots-benefits
data: { benefits: [{ verb, descriptor, order }], language }

Phase 3 — Screenshot Pairing

If Auto-capture mode:

Guide user through taking simulator screenshots for each benefit
For each screenshot provided, assess quality honestly:
- Great — Perfect, ready to use
- Usable — Minor issues but workable
- Retake — Recommend retaking (explain why)
Match screenshots to benefits based on relevance and visual impact

If Manual input mode:

Ask user to provide paths to their screenshot files
Accept any of: absolute paths, relative paths, glob patterns
Display thumbnails/filenames and let user assign each to a benefit
If fewer screenshots than benefits, ask user to provide more or reduce benefit count
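
A minimal sketch of how the manual inputs could be expanded, assuming the user supplies a mix of plain paths and glob patterns. resolve_screenshots and missing_count are hypothetical helper names, not functions shipped with this skill.

```python
import glob
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def resolve_screenshots(patterns):
    """Expand paths and glob patterns into a sorted, de-duplicated list of image files."""
    found = set()
    for pattern in patterns:
        expanded = str(Path(pattern).expanduser())
        # glob.glob returns the path itself when it has no wildcard and exists.
        for match in glob.glob(expanded):
            path = Path(match).resolve()
            if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
                found.add(path)
    return sorted(found)


def missing_count(screenshots, benefit_count):
    """Return how many more screenshots are needed (0 if there are enough)."""
    return max(0, benefit_count - len(screenshots))
```

If missing_count returns a non-zero value, that is the point to ask the user for more files or a shorter benefit list.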

Screenshot rules

NO paywall screenshots — Skip any screen showing subscription plans, pricing, or IAP items
NO purchase confirmation screens — Skip any checkout or payment success screens
Prefer screens that show the app's core functionality in action
Prefer screens with visible content/data rather than empty states

Quality check

For each screenshot, verify:

Resolution is sufficient (warn if below 1320px wide for iOS or 1080px for Android)
Content is appropriate (no placeholder/debug data)
Screen shows a compelling state of the feature
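
The resolution check can be sketched with the standard library alone, assuming PNG input and the two phone minimums quoted above. png_size and resolution_warning are hypothetical helpers. For JPG files, Pillow's Image.open(path).size gives the same information.

```python
import struct

# Minimum source widths taken from the quality check above (phones only).
MIN_WIDTH = {"ios": 1320, "android": 1080}

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(header: bytes):
    """Read (width, height) from the first 24 bytes of a PNG file."""
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # Width and height are big-endian 32-bit integers at the start of IHDR.
    return struct.unpack(">II", header[16:24])


def resolution_warning(path: str, platform: str):
    """Return a warning string if the PNG is narrower than the platform minimum, else None."""
    with open(path, "rb") as f:
        width, _height = png_size(f.read(24))
    minimum = MIN_WIDTH[platform]
    if width < minimum:
        return f"{path}: {width}px wide, below the {minimum}px recommended for {platform}"
    return None
```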

Save pairings to memory:

memory_key: aso-screenshots-pairings
data: { pairings: [{ benefit_index, screenshot_path, assessment }] }

Phase 4 — Generation

Prerequisites

Ensure Python dependencies are available:

pip install Pillow

Generate device frames (if not already present)

python generate_frame.py

This creates:

assets/iphone_frame.png (iPhone 16 Pro Max)
assets/android_frame.png (Pixel-style phone)
assets/ipad_frame.png (iPad Pro 13")
assets/android_tablet_frame.png (Android 10" tablet)

Generate screenshots

Output filenames follow the format {LANG_CODE}_{SS_NUMBER} (e.g., en_01.png, en_02.png).

CRITICAL — pairing rule: Screenshot number N (filename en_0N.png) ALWAYS uses the Nth pairing (1st pairing for en_01, 2nd pairing for en_02, etc.). There is NO "cover" screenshot without a headline — every single compose.py call must have --verb and --desc from a real pairing. If you pass empty --verb or --desc, compose.py will error.

Concretely, for N screenshots, run EXACTLY N compose.py calls:

en_01.png → pairing 1 (first benefit, first screenshot)
en_02.png → pairing 2 (second benefit, second screenshot)
en_03.png → pairing 3
… and so on through en_0N.png → pairing N

# Screenshot 1 — first pairing
python compose.py \
  --platform ios \
  --bg "{brand_colour}" \
  --verb "{verb_1}" \
  --desc "{descriptor_1}" \
  --screenshot "{screenshot_1}" \
  --output "output/en/ios/en_01.png"

# Screenshot 2 — second pairing
python compose.py \
  --platform ios \
  --bg "{brand_colour}" \
  --verb "{verb_2}" \
  --desc "{descriptor_2}" \
  --screenshot "{screenshot_2}" \
  --output "output/en/ios/en_02.png"

# … continue through all N pairings in order …

# Screenshot N — Nth pairing
python compose.py \
  --platform ios \
  --bg "{brand_colour}" \
  --verb "{verb_N}" \
  --desc "{descriptor_N}" \
  --screenshot "{screenshot_N}" \
  --output "output/en/ios/en_0N.png"

Run the same set of commands for each additional platform (android, ipad, android_tablet) using the same verbs/descs and the appropriate screenshot paths.
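
One way to keep the pairing rule mechanical is to build the commands from the approved pairings in order. The sketch below only uses the compose.py flags shown above. compose_commands and the pairing dictionary keys ("verb", "desc", "screenshot") are assumptions made for illustration.

```python
def compose_commands(pairings, platform, brand_colour, lang_code="en"):
    """Build one compose.py argument list per pairing, in pairing order."""
    commands = []
    for n, pairing in enumerate(pairings, start=1):
        verb = pairing["verb"].strip()
        desc = pairing["desc"].strip()
        # Every screenshot needs a real headline; fail early instead of in compose.py.
        if not verb or not desc:
            raise ValueError(f"pairing {n} is missing a verb or descriptor")
        commands.append([
            "python", "compose.py",
            "--platform", platform,
            "--bg", brand_colour,
            "--verb", verb,
            "--desc", desc,
            "--screenshot", pairing["screenshot"],
            "--output", f"output/{lang_code}/{platform}/{lang_code}_{n:02d}.png",
        ])
    return commands
```

Each returned list can be passed to subprocess.run(cmd, check=True). Calling the function once per platform yields exactly N commands for N pairings.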

Note on tablet screenshots: Tablet app screenshots should ideally be taken from the iPad/tablet simulator. If the user only has phone screenshots, they can still be used — compose.py will scale them to fit the device screen area.

Output directory structure

output/
  en/                          ← default language (English)
    ios/
      en_01.png  (1320×2868)
      en_02.png
      ...
    android/
      en_01.png  (1080×1920)
      en_02.png
      ...
    ipad/                      ← only if tablet selected
      en_01.png  (2064×2752)
      en_02.png
      ...
    android_tablet/            ← only if tablet selected
      en_01.png  (1600×2560)
      en_02.png
      ...

Composition review — REQUIRED before enhancement

STOP HERE. Do not proceed to Gemini enhancement until the user has reviewed and approved.

After all compose.py commands have run, open each composed screenshot and present them to the user:

Display every composed output file inline so the user can see them
Run a quick automated check on each:
- Text is readable and properly centered
- Device frame is correctly positioned
- Screenshot content is visible and well-framed
- No clipping of important content
Present a clear summary table:

#	File	Text	Frame	Content	Status
1	en_01.png	✓	✓	✓	Ready
2	en_02.png	✓	✓	✓	Ready
…	…	…	…	…	…

Ask the user explicitly:

"These are your composed screenshots before AI enhancement. Everything look good? Any text, layout, or content issues to fix before I send them to Gemini?"
Wait for approval. If the user requests changes:
- Adjust the relevant compose.py command (verb, desc, bg colour, screenshot path)
- Regenerate only the affected screenshots
- Show the updated files and ask again
- Repeat until the user confirms all screenshots are approved
Only after explicit approval, proceed to AI enhancement below.

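Why this step matters: Gemini enhancement consumes API quota and overwrites the composed files in place (the commands below use the same path for input and output), so it cannot be undone without recomposing. Catching layout or text issues at the compose stage (free, instant) is always better than discovering them after enhancement.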

Consistency

The first approved screenshot sets the visual tone for the set
All screenshots must use the same brand colour, font sizing pattern, and layout style
Maintain visual rhythm across the set — they should look like a cohesive collection

AI Enhancement with Gemini (Required — only after user approval above)

After the user has approved all composed screenshots, run gemini_enhance.py to produce the final polished output. This is the critical step that transforms plain mockups into premium, high-converting visuals.

# Enhance a single screenshot with app context
python3 gemini_enhance.py \
  --input "output/en/ios/en_{n}.png" \
  --output "output/en/ios/en_{n}.png" \
  --model "{gemini_model}" \
  --app-desc "{short app description}" \
  --bg-color "{brand_colour}" \
  --lang-code "en" \
  --index {n}

# Batch enhance an entire directory (with --lang-code for proper naming)
python3 gemini_enhance.py \
  --input-dir output/en/ios/ \
  --output-dir output/en/ios/ \
  --model "{gemini_model}" \
  --app-desc "{short app description}" \
  --bg-color "{brand_colour}" \
  --lang-code "en"

# Tablet platforms use the same command with different directories
python3 gemini_enhance.py \
  --input-dir output/en/ipad/ \
  --output-dir output/en/ipad/ \
  --model "{gemini_model}" \
  --app-desc "{short app description}" \
  --bg-color "{brand_colour}" \
  --lang-code "en"

Available models:

nano-banana-pro — Higher quality enhancement (recommended for final output)
nano-banana-2 — Faster enhancement, good quality (recommended for iteration)

What Gemini enhancement does:

Text readability: Ensures headline text has maximum contrast (white on dark, dark on light) with glows and shadows
Floating decorative elements: Adds app-related icons, abstract shapes, particles, and sparkles around the device
3D perspective: Alternates between straight-on and tilted device angles for visual variety
Rich backgrounds: Enhances flat backgrounds with gradients, organic shapes, ambient lighting, and depth
Breakout effects: Elements that extend beyond the device frame for dynamic compositions
Professional polish: Shadows, reflections, lens flares, and lighting that make each screenshot feel designer-crafted
Always enforces exact store-required dimensions after AI processing

The script alternates between two built-in prompt styles (even/odd screenshots) to create visual variety across the set while maintaining consistency.

API key resolution order:

--api-key CLI argument
GEMINI_API_KEY environment variable
.gemini_config.json file (saved via --save-key)
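
The resolution order above can be sketched as follows. This is not the actual gemini_enhance.py code: resolve_api_key is a hypothetical helper, and the "api_key" field name inside .gemini_config.json is an assumption, since the file layout is not documented here.

```python
import json
import os
from pathlib import Path


def resolve_api_key(cli_key=None, env=None, config_path=".gemini_config.json"):
    """Return the first key found: CLI argument, environment variable, then config file."""
    env = os.environ if env is None else env
    if cli_key:
        return cli_key
    if env.get("GEMINI_API_KEY"):
        return env["GEMINI_API_KEY"]
    config = Path(config_path)
    if config.is_file():
        # "api_key" is an assumed field name for illustration.
        return json.loads(config.read_text(encoding="utf-8")).get("api_key")
    return None
```

A None result corresponds to the case where the workflow should stop and ask the user for a key.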

Save generation state to memory:

memory_key: aso-screenshots-generation
data: { generated: [{ benefit_index, platform, output_path, status }] }

Phase 5 — Showcase & Delivery

Generate showcase preview

python showcase.py \
  --ios output/en/ios/en_01.png output/en/ios/en_02.png output/en/ios/en_03.png \
  --android output/en/android/en_01.png output/en/android/en_02.png output/en/android/en_03.png \
  --output showcase.png

Present results

Show the showcase image to the user
List all generated files with their dimensions
Provide upload instructions:

For Apple App Store Connect:

Navigate to your app > App Store tab > Screenshots section
Upload each iOS screenshot in order (en_01.png through en_0N.png)
Select "iPhone 6.9-inch Display" as the device type

For Google Play Console:

Navigate to your app > Store listing > Phone screenshots
Upload each Android screenshot in order
Ensure "Phone" device type is selected

Final checklist

All screenshots are the correct dimensions
Headlines are in the correct language
No paywall or IAP screens included
Screenshots show compelling app states
Visual style is consistent across all screenshots
Both platforms have matching content (same benefits, same order)
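
Part of this checklist can be automated. The sketch below only checks that each platform directory holds the same screenshot numbers. It cannot confirm that the benefits themselves match, so a visual review is still needed. screenshot_numbers and platforms_match are hypothetical helpers.

```python
from pathlib import Path


def screenshot_numbers(directory):
    """Return the sorted numeric suffixes of files named like en_01.png."""
    return sorted(p.stem.split("_")[-1] for p in Path(directory).glob("*.png"))


def platforms_match(root, lang_code="en", platforms=("ios", "android")):
    """True if every platform directory is non-empty and has the same screenshot numbers."""
    sets = [screenshot_numbers(Path(root) / lang_code / name) for name in platforms]
    return bool(sets[0]) and all(s == sets[0] for s in sets)
```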

Phase 6 — Multi-Language Translation (Optional)

Use Gemini to generate translated versions of your English screenshots. This is faster than re-running the full pipeline and works for any number of languages.

Prerequisites

English screenshots must already be generated and enhanced (Phase 4 output)
Same Gemini model and API key from Phase 4

Pre-translation questions — ask before every language

Before running any translate command, ask the user these two questions:

Question 1 — Style matching:

"Should the translated screenshots keep the same visual style as the English originals (same background colour, gradients, decorative elements)? [Y/n, default Yes]"

Yes (default) → omit --no-style-match (style is preserved)
No → add --no-style-match (Gemini may adapt gradients and decorative elements for the locale)

Question 2 — Cultural elements:

"Should culturally specific visual elements be added for this language (e.g., Ottoman tiles for Turkish, cherry blossoms for Japanese)? [y/N, default No]"

No (default) → omit --cultural-elements
Yes → add --cultural-elements (Gemini weaves culturally resonant motifs into decorative areas)

Save answers to memory and apply them to every gemini_enhance.py translate call for this language run.
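
The mapping from the two answers to command-line flags is small enough to sketch directly. translate_flags is a hypothetical helper; the flag names are the ones described above.

```python
def translate_flags(keep_style: bool = True, cultural_elements: bool = False):
    """Turn the two pre-translation answers into optional gemini_enhance.py flags."""
    flags = []
    if not keep_style:
        # Question 1 answered "No": let Gemini adapt the style for the locale.
        flags.append("--no-style-match")
    if cultural_elements:
        # Question 2 answered "Yes": opt in to cultural motifs.
        flags.append("--cultural-elements")
    return flags
```

With both defaults the list is empty, which matches the plain translate command with no extra flags.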

Translate a single language

Always pass --lang-code so output files are named correctly (tr_01.png, tr_02.png, etc.). Add --no-style-match and/or --cultural-elements based on the user's answers above.

# Translate all iOS phone screenshots to Turkish
# (example with style preserved, no cultural elements — the defaults)
python3 gemini_enhance.py \
  --input-dir output/en/ios/ \
  --output-dir output/tr/ios/ \
  --model "{gemini_model}" \
  --translate-to "Turkish" \
  --lang-code "tr"

# Same, but with style adaptation and cultural elements enabled
python3 gemini_enhance.py \
  --input-dir output/en/ios/ \
  --output-dir output/tr/ios/ \
  --model "{gemini_model}" \
  --translate-to "Turkish" \
  --lang-code "tr" \
  --no-style-match \
  --cultural-elements

# Translate all Android phone screenshots to German
# (append --no-style-match and/or --cultural-elements per user answers)
python3 gemini_enhance.py \
  --input-dir output/en/android/ \
  --output-dir output/de/android/ \
  --model "{gemini_model}" \
  --translate-to "German" \
  --lang-code "de"

# Translate iPad screenshots (same pattern, same optional flags)
python3 gemini_enhance.py \
  --input-dir output/en/ipad/ \
  --output-dir output/tr/ipad/ \
  --model "{gemini_model}" \
  --translate-to "Turkish" \
  --lang-code "tr"

# Translate Android tablet screenshots
python3 gemini_enhance.py \
  --input-dir output/en/android_tablet/ \
  --output-dir output/tr/android_tablet/ \
  --model "{gemini_model}" \
  --translate-to "Turkish" \
  --lang-code "tr"

Translate a single screenshot

python3 gemini_enhance.py \
  --input output/en/ios/en_01.png \
  --output output/tr/ios/tr_01.png \
  --model "{gemini_model}" \
  --translate-to "Turkish" \
  --lang-code "tr"
  # add --no-style-match and/or --cultural-elements if the user chose those options

Batch translate multiple languages

Run once per target language. Always source from output/en/. Output structure:

output/
  en/                          ← default English (source for all translations)
    ios/             (en_01.png … en_0N.png)
    android/
    ipad/
    android_tablet/
  tr/                          ← Turkish
    ios/             (tr_01.png … tr_0N.png)
    android/
    ipad/
    android_tablet/
  de/                          ← German
    ios/
    android/
    ...
  ja/                          ← Japanese
    ios/
    android/
    ...
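
A hedged sketch of the batch loop, assuming the flags used in the translate commands earlier in this phase. translate_commands is a hypothetical helper; it only builds the argument lists and always reads from output/en/.

```python
def translate_commands(languages, platforms, model, extra_flags=()):
    """Build one gemini_enhance.py call per (language, platform) pair.

    languages maps a language code to its full name, e.g. {"tr": "Turkish"}.
    """
    commands = []
    for lang_code, lang_name in languages.items():
        for platform in platforms:
            commands.append([
                "python3", "gemini_enhance.py",
                "--input-dir", f"output/en/{platform}/",
                "--output-dir", f"output/{lang_code}/{platform}/",
                "--model", model,
                "--translate-to", lang_name,
                "--lang-code", lang_code,
                *extra_flags,
            ])
    return commands


for cmd in translate_commands({"tr": "Turkish", "de": "German"}, ["ios", "android"], "nano-banana-2"):
    print(" ".join(cmd))
```

Each list can be run with subprocess.run(cmd, check=True). Pass the user's optional flags through extra_flags so every call in a language run uses the same answers.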

Supported languages

Any language Gemini understands. Use full names for best results:

European: Turkish, German, French, Spanish, Italian, Portuguese, Dutch, Polish, Ukrainian
Asian: Japanese, Korean, Simplified Chinese, Traditional Chinese, Hindi, Thai, Indonesian
Others: Arabic, Russian, etc.

Cultural touches

When --cultural-elements is passed, Gemini uses the --lang-code value to choose culturally resonant visual elements for the screenshot:

Code	Cultural aesthetic added
`tr`	Ottoman geometric tiles, tulip motifs, terracotta & turquoise
`de`	Bauhaus geometry, cool steel blues, structured minimalism
`fr`	Art Nouveau curves, gold accents, Parisian elegance
`ja`	Cherry blossoms, wave motifs, wabi-sabi, ink-wash texture, generous negative space
`ko`	Hanji paper texture, bojagi color geometry
`zh`	Cloud ruyi patterns, bamboo, red & gold accents
`ar`	Arabesque geometry, crescent motifs, lapis blue & gold
`ru`	Constructivist geometry, deep reds & golds
`es`	Moorish tile hints, sienna tones, Mediterranean vitality
`it`	Renaissance composition, terracotta & cobalt
`pt`	Azulejo tile patterns, deep indigo blues
`hi`	Mandala patterns, saffron & marigold, Mughal geometry
`uk`	Petrykivka folk florals, sunflowers, blue & gold

These are woven into decorative elements and backgrounds — never stereotypical, always tasteful.

Quality notes

Idiomatic translation: Gemini uses natural, compelling language — not literal word-for-word translation
Special characters: The prompt enforces correct character preservation (İ, Ğ, Ü, Ä, etc.)
Cultural consistency: With --cultural-elements enabled, each language version gets its own cultural visual identity while keeping the brand colour
Review recommended: Spot-check a few screenshots per language, especially for languages with non-Latin scripts

Final checklist for translated sets

Headline text is in the correct language and reads naturally
Special characters are correct (no ASCII substitutions)
Text contrast and readability are preserved
Device frame and app content are unchanged
All screenshots per language are the correct platform dimensions

Design Guidelines

Text Safety

Headlines must stay within the center 70% of canvas width
Text should never touch or overlap the device frame
Minimum 40px gap between text bottom and device top
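
As a worked example of these rules, the safe text band and the gap check can be computed as below. The function names are hypothetical, and how compose.py measures text is not documented here, so the text width and positions are treated as given inputs.

```python
def text_safe_bounds(canvas_width: int, percent: int = 70):
    """Return (left, right) x-coordinates of the centered safe band."""
    margin = canvas_width * (100 - percent) // 200
    return margin, canvas_width - margin


def text_fits(text_width, text_bottom, device_top, canvas_width, min_gap=40):
    """True if the headline fits the safe band and clears the device by min_gap pixels."""
    left, right = text_safe_bounds(canvas_width)
    return text_width <= right - left and device_top - text_bottom >= min_gap


print(text_safe_bounds(1320))  # (198, 1122)
print(text_safe_bounds(1080))  # (162, 918)
```

On the 1320 px iOS canvas this leaves 924 px of usable headline width; on the 1080 px Android canvas, 756 px.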

Device Placement

Device frames are positioned in the upper-middle area of the canvas
The bottom of the device intentionally extends beyond/crops at canvas edge
This creates a professional, dynamic composition

Colour & Contrast

Background colour should be bold and brand-aligned
Text automatically adjusts to white or dark based on background luminance
Subtle gradient overlay adds depth (can be disabled with --no-gradient)
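
The luminance-based text colour choice can be sketched with the WCAG relative-luminance formula. This is an assumption about the general technique: the formula, threshold, and dark text colour that compose.py actually uses are not stated in this document, and both function names are hypothetical.

```python
def relative_luminance(hex_colour: str) -> float:
    """WCAG relative luminance of an sRGB colour given as #RRGGBB."""
    value = hex_colour.lstrip("#")
    channels = [int(value[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Undo the sRGB gamma curve before weighting the channels.
    linear = [c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4 for c in channels]
    r, g, b = linear
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def text_colour(bg_hex: str, threshold: float = 0.179) -> str:
    """Pick white text on dark backgrounds and dark text on light ones."""
    # 0.179 is the luminance at which white and black text have equal contrast.
    return "#FFFFFF" if relative_luminance(bg_hex) < threshold else "#111111"
```

Brand colours close to the threshold are worth checking by eye at the composition review.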

Typography

Action verbs: Maximum impact, largest possible size (150-260pt for iOS, 120-200pt for Android)
Descriptors: Supporting text, slightly smaller (128pt iOS, 96pt Android)
All text is uppercase for maximum store shelf impact
Font: SF Pro Display Black (macOS) with fallbacks for other systems

What Makes High-CTR Screenshots

Benefit-first messaging — Lead with what the user gets, not feature names
Action verbs — Create urgency and engagement (TRACK, DISCOVER, BUILD)
Visual hierarchy — Eye flows: headline -> device -> app content
Consistent branding — Same colour, same style, professional set
Real app content — Show the app in action with realistic data, not empty states
No clutter — Clean backgrounds, focused messaging, one benefit per screenshot
