nano-banana-pro
Nano Banana Pro Image Generation, Editing & Prompting
Generate new images, edit existing ones, or craft precise prompts for Google's Nano Banana Pro API (Gemini 3 Pro Image), including style transfer, color-grade transfer, locked-variable edits, text-heavy layouts, and subject replacement.
Usage
Run the script using its absolute path (do NOT cd to the skill directory first):
Generate new image:
uv run ~/skills/nano-banana-pro/scripts/generate_image.py --prompt "your image description" --filename "output-name.png" [--resolution 1K|2K|4K] [--api-key KEY]
Edit existing image:
uv run ~/skills/nano-banana-pro/scripts/generate_image.py --prompt "editing instructions" --filename "output-name.png" --input-image "path/to/input.png" [--resolution 1K|2K|4K] [--api-key KEY]
Important: Always run from the user's current working directory so images are saved where the user is working, not in the skill directory.
Default Workflow (draft → iterate → final)
Goal: fast iteration without burning time on 4K until the prompt is correct.
- Draft (1K): quick feedback loop
uv run ~/skills/nano-banana-pro/scripts/generate_image.py --prompt "<draft prompt>" --filename "yyyy-mm-dd-hh-mm-ss-draft.png" --resolution 1K
- Iterate: adjust prompt in small diffs; keep filename new per run
  - If editing: keep the same --input-image for every iteration until you’re happy.
- Final (4K): only when prompt is locked
uv run ~/.codex/skills/nano-banana-pro/scripts/generate_image.py --prompt "<final prompt>" --filename "yyyy-mm-dd-hh-mm-ss-final.png" --resolution 4K
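The draft and final runs differ only in prompt, filename, and resolution, so the command line can be assembled by one helper. The following is a minimal sketch, not part of the skill: `build_command` is a hypothetical name, and the script path is passed in as a parameter because the commands in this document show it under both `~/skills` and `~/.codex/skills`. Only flags that appear in the usage lines above are used.

```python
from typing import List, Optional

# Resolutions listed in this document (uppercase K required).
VALID_RESOLUTIONS = ("1K", "2K", "4K")


def build_command(
    script: str,
    prompt: str,
    filename: str,
    resolution: str = "1K",
    input_image: Optional[str] = None,
) -> List[str]:
    """Return the argv list for one generate/edit run (hypothetical helper)."""
    if resolution not in VALID_RESOLUTIONS:
        raise ValueError(f"resolution must be one of {VALID_RESOLUTIONS}")
    cmd = [
        "uv", "run", script,
        "--prompt", prompt,
        "--filename", filename,
        "--resolution", resolution,
    ]
    # Editing runs reuse the same input image on every iteration.
    if input_image is not None:
        cmd += ["--input-image", input_image]
    return cmd
```

The returned list can be handed to `subprocess.run(cmd, check=True)` from the user's working directory, first with `resolution="1K"` for drafts and then with `resolution="4K"` once the prompt is locked.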
Resolution Options
The Gemini 3 Pro Image API supports three resolutions (uppercase K required):
- 1K (default) - ~1024px resolution
- 2K - ~2048px resolution
- 4K - ~4096px resolution
Map user requests to API parameters:
- No mention of resolution → 1K
- "low resolution", "1080", "1080p", "1K" → 1K
- "2K", "2048", "normal", "medium resolution" → 2K
- "high resolution", "high-res", "hi-res", "4K", "ultra" → 4K
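The mapping above can be sketched as a small lookup. This is an illustration only: `map_resolution` is a hypothetical helper, it uses naive substring matching (so a word like "ultramarine" would also trigger 4K), and checking the 4K keywords before the 2K ones is an assumption the document does not state.

```python
def map_resolution(request: str) -> str:
    """Map free-form user wording to 1K, 2K, or 4K (hypothetical helper)."""
    text = request.lower()
    high = ("high resolution", "high-res", "hi-res", "4k", "ultra")
    medium = ("2k", "2048", "normal", "medium resolution")
    if any(keyword in text for keyword in high):
        return "4K"
    if any(keyword in text for keyword in medium):
        return "2K"
    # Covers "low resolution", "1080", "1080p", "1K", and no mention at all.
    return "1K"
```

For example, `map_resolution("a hi-res poster")` selects 4K, while a request with no resolution wording falls through to the 1K default.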
API Key
The script checks for an API key in this order:
- --api-key argument (use if the user provided a key in chat)
- GEMINI_API_KEY environment variable
If neither is available, the script exits with an error message.
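The lookup order can be pictured with a short sketch. This is not the script's actual code: `resolve_api_key` is a hypothetical name, and the `env` parameter exists only so the logic can be exercised without touching the real environment.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    cli_key: Optional[str],
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Prefer the --api-key value, then GEMINI_API_KEY (hypothetical helper)."""
    env = os.environ if env is None else env
    key = cli_key or env.get("GEMINI_API_KEY")
    if not key:
        # Mirrors the documented behaviour: exit with an error message.
        raise SystemExit("Error: No API key provided.")
    return key
```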
Preflight + Common Failures (fast fixes)
- Preflight:
  - command -v uv (must exist)
  - test -n "$GEMINI_API_KEY" (or pass --api-key)
  - If editing: test -f "path/to/input.png"
- Common failures:
  - Error: No API key provided. → set GEMINI_API_KEY or pass --api-key
  - Error loading input image: → wrong path / unreadable file; verify --input-image points to a real image
  - “quota/permission/403” style API errors → wrong key, no access, or quota exceeded; try a different key/account
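The three preflight checks can also be run from Python before invoking the script. The sketch below is an assumption-laden illustration: `preflight` is a hypothetical helper and its messages are not the script's own error strings.

```python
import os
import shutil
from typing import List, Mapping, Optional


def preflight(
    api_key: Optional[str] = None,
    input_image: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Return a list of problems; an empty list means ready to run."""
    env = os.environ if env is None else env
    problems = []
    # Equivalent of `command -v uv`.
    if shutil.which("uv") is None:
        problems.append("uv not found on PATH")
    # Equivalent of `test -n "$GEMINI_API_KEY"`, unless a key is passed explicitly.
    if not (api_key or env.get("GEMINI_API_KEY")):
        problems.append("no API key: set GEMINI_API_KEY or pass --api-key")
    # Equivalent of `test -f path/to/input.png`, only relevant when editing.
    if input_image is not None and not os.path.isfile(input_image):
        problems.append(f"input image not found: {input_image}")
    return problems
```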
Filename Generation
Generate filenames with the pattern: yyyy-mm-dd-hh-mm-ss-name.png
Format: {timestamp}-{descriptive-name}.png
- Timestamp: Current date/time in format yyyy-mm-dd-hh-mm-ss (24-hour format)
- Name: Descriptive lowercase text with hyphens
- Keep the descriptive part concise (1-5 words typically)
- Use context from user's prompt or conversation
- If unclear, use random identifier (e.g., x9k2, a7b3)
Examples:
- Prompt "A serene Japanese garden" → 2025-11-23-14-23-05-japanese-garden.png
- Prompt "sunset over mountains" → 2025-11-23-15-30-12-sunset-mountains.png
- Prompt "create an image of a robot" → 2025-11-23-16-45-33-robot.png
- Unclear context → 2025-11-23-17-12-48-x9k2.png
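The naming rules can be sketched as a helper that takes an already chosen descriptive name (picking the words from the prompt remains a judgment call) and adds the timestamp. `make_filename` is a hypothetical name, and the four-character fallback length is an assumption taken from the examples above.

```python
import random
import re
import string
from datetime import datetime
from typing import Optional


def make_filename(name: Optional[str], now: Optional[datetime] = None) -> str:
    """Build yyyy-mm-dd-hh-mm-ss-name.png from a short descriptive name."""
    now = now or datetime.now()
    stamp = now.strftime("%Y-%m-%d-%H-%M-%S")  # 24-hour clock
    # Lowercase, hyphen-separated, capped at five words.
    words = re.findall(r"[a-z0-9]+", (name or "").lower())[:5]
    slug = "-".join(words)
    if not slug:
        # Unclear context: fall back to a short random identifier.
        slug = "".join(random.choices(string.ascii_lowercase + string.digits, k=4))
    return f"{stamp}-{slug}.png"
```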
Image Editing
When the user wants to modify an existing image:
- Check if they provide an image path or reference an image in the current directory
- Use --input-image parameter with the path to the image
- The prompt should contain editing instructions (e.g., "make the sky more dramatic", "remove the person", "change to cartoon style")
- Common editing tasks: add/remove elements, change style, adjust colors, blur background, etc.
Prompt Handling
For simple generation: Pass the user's image description as-is to --prompt. Only rework if clearly insufficient.
For editing: Pass editing instructions in --prompt (e.g., "add a rainbow in the sky", "make it look like a watercolor painting").
For prompt-writing requests: Do not run generation unless the user asks for an image. Deliver a ready-to-run prompt as the primary output.
For complex prompts: Use labeled sections and explicit rules when the task includes text, counts, layouts, typography, multiple inputs, diagrams, or exact constraints.
Preserve the user's creative intent in all cases.
Prompt Engineering Workflow
Use this when the user asks for a Nano Banana prompt, wants prompt optimization, or the generation requires precise structure.
- Clarify goal and constraints from the request: subject, medium, layout, text, counts, input images, output format.
- Choose the smallest pattern that enforces the constraints.
- Draft the prompt with labeled sections and explicit rules.
- Add validation anchors for exact counts, placement, readable text, and forbidden changes.
- Provide 1-2 variants only if useful or requested.
Prompt Skeleton
GOAL: <what to generate>
INPUTS: <image refs or none>
LAYOUT: <spatial regions, hierarchy, placement>
SUBJECTS: <entities, counts, poses>
TEXT: <exact strings, fonts, placement>
STYLE: <medium, rendering, aesthetic>
LIGHTING/CAMERA: <angle, lens, lighting>
CONSTRAINTS: <must/never rules, exact counts>
VALIDATION: <"ensure exactly X", "no extra text">
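The skeleton above can be filled programmatically when prompts are assembled from structured data. The sketch below is one possible approach, not part of the skill: `render_prompt` is a hypothetical helper, and dropping sections that have no value is an assumption made here to keep prompts short.

```python
from typing import Mapping

# Section labels in the order given by the skeleton.
SECTIONS = (
    "GOAL", "INPUTS", "LAYOUT", "SUBJECTS", "TEXT",
    "STYLE", "LIGHTING/CAMERA", "CONSTRAINTS", "VALIDATION",
)


def render_prompt(fields: Mapping[str, str]) -> str:
    """Render labeled sections in skeleton order, skipping empty ones."""
    unknown = set(fields) - set(SECTIONS)
    if unknown:
        raise KeyError(f"unknown sections: {sorted(unknown)}")
    lines = [f"{name}: {fields[name]}" for name in SECTIONS if fields.get(name)]
    return "\n".join(lines)
```

Passing `{"STYLE": "flat vector", "GOAL": "event poster"}` yields the GOAL line first and the STYLE line second, regardless of the order in which the fields were supplied.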
Prompt Pattern Selection
- Layout-heavy, text-heavy, or multi-panel: use structured layout anchors.
- Multi-input synthesis or system-level control: use JSON prompt.
- Cinematic scene with people/props: use scene composition.
- Character identity transfer: use character pipeline.
When to Load References
- references/prompt-patterns.md: detailed templates, checklists, and pattern selection.
- references/prompt-bank.md: full example prompts and use-case blueprints.
Prompt Templates (high hit-rate)
Use templates when the user is vague or when edits must be precise.
- Generation template:
  - “Create an image of: . Style: . Composition: <camera/shot>. Lighting: . Background: . Color palette: . Avoid: .”
- Editing template (preserve everything else):
  - “Change ONLY: . Keep identical: subject, composition/crop, pose, lighting, color palette, background, text, and overall style. Do not add new objects. If text exists, keep it unchanged.”
Motion-Ready Start Frames
Use this when the still image is meant to become the starting frame for an animated AI video shot.
Rule
Treat the frame like a keyframe pulled from motion, not like a neutral photo.
What to emphasize
- implied movement already underway
- active posture and directional energy
- asymmetry, momentum, and continuation-ready gesture
- hair/fabric/environmental motion where appropriate
Helpful motion language
kinetic, dynamic, mid-motion, caught in movement, in-action, directional energy, wind-swept, hair in motion, fabric in motion
Avoid
Static portrait energy when the next step is animation.
Advanced Edit Patterns
Use these when the job is tighter than a generic edit.
Locked-Variables Edit Pattern
Use this when only one element should change and everything else must stay fixed.
Change ONLY: [single variable].
Keep locked: subject identity, pose, framing/crop, camera angle/lens feel, lighting direction, color grade, background, wardrobe, and overall style.
Do not change facial structure, expression, proportions, or any unmentioned element.
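Because only the first line of this pattern varies, it is easy to generate. The following is a minimal sketch with a hypothetical helper name, `locked_edit_prompt`; the locked-variable wording is copied from the pattern above.

```python
def locked_edit_prompt(change: str) -> str:
    """Fill the locked-variables pattern with a single change."""
    change = change.strip().rstrip(".")
    if not change:
        raise ValueError("describe exactly one variable to change")
    return "\n".join([
        f"Change ONLY: {change}.",
        "Keep locked: subject identity, pose, framing/crop, camera angle/lens feel, "
        "lighting direction, color grade, background, wardrobe, and overall style.",
        "Do not change facial structure, expression, proportions, or any unmentioned element.",
    ])
```

The result is meant to be passed as `--prompt` together with `--input-image` pointing at the image being edited.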
Color-Grade Transfer Pattern
Use this when the target shot should keep its composition but inherit the look of a reference image.
Transfer the exact color grade and tonal treatment from the reference style image onto the target image. Keep the target image's composition, framing, focus, lighting direction, pose, depth of field, and camera perspective exactly the same. Do not alter subject position or lens behavior. Apply only the tonal palette, contrast behavior, highlight rolloff, shadow density, and overall cinematic color treatment from the style reference.
Subject Replacement Pattern
Use this when the composition/style should stay the same but the main person/object must be swapped.
Replace the original subject with the person from the reference images, seamlessly integrated into the exact same pose, body positioning, framing, camera angle, and environment. Preserve natural biomechanics, perspective, scale, and shadow consistency. Match the original lighting conditions exactly. Maintain realistic skin texture, visible pores, fine facial detail, and natural tonal variation. No smoothing, no distortion, no artificial blending artifacts.
Upscale / Enhancement Templates
Use these when the user wants to enhance an existing image to cinematic quality without changing its content. Always use with --input-image. Resolution should be 4K for final upscales, 2K for draft review.
Portrait / Person Upscale
Enhance the uploaded image to a flawless, ultra-high-quality cinematic version while preserving the subject with absolute precision. The person's identity, facial anatomy, expression, body pose, clothing, accessories, surroundings, framing, and overall composition must remain completely unchanged. Do not alter, reinterpret, replace, or introduce any new visual elements. Restore and refine micro-level details including precise facial contours, authentic skin texture with naturally visible pores, individually rendered hair strands, sharp and vivid eyes, and clean, well-defined edges throughout the image. Enhance dynamic range, contrast, and dimensionality with balanced, studio-quality cinematic lighting.
Automotive / Car Upscale
Enhance the uploaded image to a flawless, ultra-high-quality cinematic version while preserving the vehicle with absolute precision. The car's make, model, body shape, paint color, finish type, livery, badges, wheels, aerodynamic elements, surroundings, framing, and overall composition must remain completely unchanged. Do not alter, reinterpret, replace, or introduce any new visual elements. Restore and refine micro-level details including precise panel contours, authentic paint surface with visible metallic flake or matte grain, individually resolved mesh and spoke geometry in the wheels, sharp and legible badges and emblems, clean brake caliper detail, and well-defined edges throughout the image. Enhance dynamic range, contrast, and dimensionality with balanced, cinematic lighting that preserves the original light direction and color temperature.
Universal Upscale (any image)
Enhance the uploaded image to a flawless, ultra-high-quality cinematic version while preserving every subject and element with absolute precision. All subjects, objects, materials, colors, textures, surroundings, framing, and overall composition must remain completely unchanged. Do not alter, reinterpret, replace, or introduce any new visual elements. Restore and refine micro-level details including precise contours, authentic surface textures, individually resolved fine structures, sharp focal elements, and clean, well-defined edges throughout the image. Enhance dynamic range, contrast, and dimensionality with balanced, cinematic lighting that preserves the original light direction and color temperature.
Choosing the right template
| Image contains | Use |
|---|---|
| People, faces, portraits | Portrait / Person |
| Cars, vehicles, automotive | Automotive / Car |
| Anything else or mixed subjects | Universal |
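The table can be read as a simple decision rule. The sketch below is one interpretation: `choose_upscale_template` is a hypothetical helper, and treating an image that contains both people and vehicles as "mixed subjects" is an assumption.

```python
def choose_upscale_template(has_people: bool, has_vehicles: bool) -> str:
    """Pick an upscale template name from what the image contains."""
    if has_people and has_vehicles:
        return "Universal"  # mixed subjects (assumed reading of the table)
    if has_people:
        return "Portrait / Person"
    if has_vehicles:
        return "Automotive / Car"
    return "Universal"  # anything else
```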
Output
- Saves PNG to current directory (or specified path if filename includes directory)
- Script outputs the full path to the generated image
- Do not read the image back - just inform the user of the saved path
Examples
Generate new image:
uv run ~/.codex/skills/nano-banana-pro/scripts/generate_image.py --prompt "A serene Japanese garden with cherry blossoms" --filename "2025-11-23-14-23-05-japanese-garden.png" --resolution 4K
Edit existing image:
uv run ~/.codex/skills/nano-banana-pro/scripts/generate_image.py --prompt "make the sky more dramatic with storm clouds" --filename "2025-11-23-14-25-30-dramatic-sky.png" --input-image "original-photo.jpg" --resolution 2K