image
Nano Banana - Image Generation
Overview
Generate and edit images using state-of-the-art AI models via the Google GenAI SDK. Perfect for creating visual assets, concept art, illustrations, and editing existing images.
Key Features:
- Multiple Models: Nano Banana 2 (fast), Nano Banana Pro (quality)
- Image Editing: Modify existing images with natural language
- Aspect Ratio & Resolution: Control output dimensions (16:9, 4K, etc.)
- Simple API: One command to generate or edit
- Automatic Saving: Handles file formats automatically
When to Use This Skill
Use this skill when you need:
- Visual Assets: Icons, illustrations, backgrounds
- Concept Art: Ideas and visual explorations
- Marketing Materials: Product mockups, social media images
- Photo Editing: Modify existing images with AI
- Creative Content: Artistic images, abstract visuals
- Presentation Graphics: Visuals for slides and documents
Note: For technical diagrams (architecture, flowcharts, ERD), use the diagram skill instead -- it includes quality review and iteration.
Quick Start
# Generate a new image
python3 ${CLAUDE_SKILL_DIR}/scripts/generate_image.py "A beautiful sunset over mountains with orange and purple sky" -o sunset.png
# Edit an existing image
python3 ${CLAUDE_SKILL_DIR}/scripts/generate_image.py "Make the sky more dramatic with storm clouds" --input sunset.png -o dramatic_sunset.png
# Use Nano Banana Pro for highest quality
python3 ${CLAUDE_SKILL_DIR}/scripts/generate_image.py "Abstract geometric art in blue and gold" -m gemini-3-pro-image-preview -o abstract.png
Aspect Ratio & Resolution
# Wide cinematic landscape
python3 ${CLAUDE_SKILL_DIR}/scripts/generate_image.py "Cinematic mountain vista" -o landscape.png --aspect-ratio 16:9
# High-resolution square image
python3 ${CLAUDE_SKILL_DIR}/scripts/generate_image.py "Product photo" -o product.png --aspect-ratio 1:1 --resolution 4K
# Ultra-wide banner
python3 ${CLAUDE_SKILL_DIR}/scripts/generate_image.py "Website hero banner" -o banner.png --aspect-ratio 21:9 --resolution 2K
# Use Nano Banana Pro for highest quality
python3 ${CLAUDE_SKILL_DIR}/scripts/generate_image.py "Professional headshot" -o headshot.png -m gemini-3-pro-image-preview
Supported aspect ratios: 1:1, 1:4, 1:8, 2:3, 3:2, 3:4, 4:1, 4:3, 4:5, 5:4, 8:1, 9:16, 16:9, 21:9 Supported resolutions: 512, 1K, 2K, 4K
Editing Existing Images
Use /nano-banana:edit to modify an existing image, or call the script directly:
# Edit via command (recommended)
/nano-banana:edit sunset.png "Add dramatic storm clouds and lightning"
# Edit via script directly
python3 ${CLAUDE_SKILL_DIR}/scripts/generate_image.py "Add dramatic storm clouds" --input sunset.png -o sunset_edit1.png
When to edit vs. regenerate:
- Edit when the base image is good but needs specific changes (add/remove elements, change colors, modify style)
- Regenerate when the image fundamentally doesn't match what you need
Available Models
| Model | ID | Speed | Capabilities | Best For |
|---|---|---|---|---|
| Nano Banana 2 | gemini-3.1-flash-image-preview |
Flash (fastest) | Generation + Editing | Default -- high-volume, general use |
| Nano Banana Pro | gemini-3-pro-image-preview |
Pro | Generation + Editing | Professional assets, best quality |
Usage Examples
Generate New Images
# Photorealistic
python3 ${CLAUDE_SKILL_DIR}/scripts/generate_image.py "Professional headshot of a business executive in modern office setting" -o headshot.png
# Artistic
python3 ${CLAUDE_SKILL_DIR}/scripts/generate_image.py "Watercolor painting of a cozy coffee shop on a rainy day" -o coffee_shop.png
# Abstract
python3 ${CLAUDE_SKILL_DIR}/scripts/generate_image.py "Abstract visualization of data flowing through neural networks, blue and cyan colors" -o neural_flow.png
# Product
python3 ${CLAUDE_SKILL_DIR}/scripts/generate_image.py "Modern minimalist logo for a tech startup called 'Nexus', clean geometric design" -o logo.png
Edit Existing Images
# Change colors
python3 ${CLAUDE_SKILL_DIR}/scripts/generate_image.py "Change the car color to red" --input car.jpg -o red_car.png
# Add elements
python3 ${CLAUDE_SKILL_DIR}/scripts/generate_image.py "Add a rainbow in the sky" --input landscape.jpg -o rainbow_landscape.png
# Remove elements
python3 ${CLAUDE_SKILL_DIR}/scripts/generate_image.py "Remove the person from the background" --input photo.jpg -o clean_photo.png
# Style transfer
python3 ${CLAUDE_SKILL_DIR}/scripts/generate_image.py "Make this look like a watercolor painting" --input photo.jpg -o watercolor.png
Specify Output Format
# PNG (default, best for graphics with transparency)
python3 ${CLAUDE_SKILL_DIR}/scripts/generate_image.py "Icon of a rocket ship" -o rocket.png
# Output to specific directory
python3 ${CLAUDE_SKILL_DIR}/scripts/generate_image.py "Banner image" -o assets/images/banner.png
Configuration
Google Gemini API Key (Required)
export GEMINI_API_KEY='your_gemini_key_here'
Get a key at https://aistudio.google.com/apikey (free tier available).
.env File
Create a .env file in your project:
GEMINI_API_KEY=your_gemini_key_here
Python API
from skills.image.scripts.generate_image import generate_image
# Generate new image
result = generate_image(
prompt="A futuristic city at night with neon lights",
output_path="city.png",
model="gemini-3-pro-image-preview"
)
# Edit existing image
result = generate_image(
prompt="Add flying cars to the scene",
output_path="city_with_cars.png",
input_image="city.png"
)
Tips for Better Images
Be Descriptive
# Too vague
"A dog"
# Detailed
"A golden retriever puppy playing in autumn leaves, warm afternoon sunlight, shallow depth of field, professional pet photography"
Include Style
# Specify artistic style
"A mountain landscape in the style of traditional Japanese ink painting, minimalist, black and white with subtle gray tones"
Specify Composition
# Include framing
"Close-up portrait of an owl, centered composition, soft studio lighting, dark background, sharp focus on the eyes"
For Editing, Be Specific
# Vague edit
"Make it better"
# Specific edit
"Increase the contrast, make the colors more vibrant, and add a subtle vignette effect"
Comparison: image vs diagram Skills
| Aspect | image Skill |
diagram Skill |
|---|---|---|
| Use Case | Photos, art, illustrations | Technical diagrams |
| Quality Review | No | Yes (Gemini 3 Pro) |
| Iteration | Single pass | Smart iteration (1-2 passes) |
| Doc Types | N/A | 13 document types with thresholds |
| Image Editing | Yes | Yes |
| Best For | Creative visuals | Architecture, flowcharts, ERD |
Rule of thumb: If it's a technical diagram with boxes, arrows, and labels, use diagram. If it's a photo, illustration, or artistic image, use image.
Troubleshooting
"GEMINI_API_KEY not found"
Set the environment variable or create a .env file. See Configuration section.
"Image file not found" (for editing)
Make sure the input image path is correct and the file exists.
Unexpected Output
- Try a different model
- Add more detail to your prompt
- Be more specific about style, composition, and colors
Generation Timeout
Large or complex images may take up to 2 minutes. Timeout is set to 120 seconds.
Cost Considerations
- Nano Banana 2 (Flash): Cheapest -- free tier available, ~$0.01-0.05 per image
- Nano Banana Pro: ~$0.02-0.10 per image
- Image editing: Similar to generation costs
More from flight505/nano-banana
diagram
Generate publication-quality technical diagrams using Nano Banana Pro (gemini-3-pro-image-preview) with AI-powered quality review. Smart iteration only regenerates when quality is below threshold. Supports style presets (technical, visual-abstract, minimal), aspect ratio, and resolution control (512-4K).
2video
Generate videos using Veo 3.1 — text-to-video, image-to-video, frame interpolation, and video extension
2kroki
Render text-based diagrams (Mermaid, PlantUML, GraphViz, D2, and 23 more) to PNG/SVG via Kroki.io. Use ONLY when the user explicitly asks for text-based diagram rendering or a specific diagram language.
2visual-abstract
Create Nature-quality visual abstracts — scientific figures using visual metaphors, isometric depth, and physical analogies to convey complex technical systems. Use for README hero images, paper figures, blog graphics, or when the user wants diagrams that go beyond boxes and arrows. Triggers on: 'visual abstract', 'scientific figure', 'Nature-quality', 'publication graphic', 'infographic', 'visual metaphor', or requests for rich/expressive/artistic diagrams.
2