Gemini 3 Pro Image Generation (Nano Banana Pro)

Comprehensive guide for generating images with Gemini 3 Pro Image (gemini-3-pro-image-preview), also known as Nano Banana Pro. This skill focuses on IMAGE OUTPUT (generating images) - see gemini-3-multimodal for INPUT (analyzing images).

Overview

Gemini 3 Pro Image (Nano Banana Pro 🍌) is Google's image generation model featuring native 4K support, text rendering within images, grounded generation with Google Search, and conversational editing capabilities.

Key Capabilities

4K Resolution: Native 4K generation, with upscaling to 2K or 4K
Text Rendering: High-quality text within images
Grounded Generation: Fact-verified images using Google Search
Conversational Editing: Multi-turn image modification preserving context
Aspect Ratios: Supports 16:9 and custom ratios at 4K
Quality Control: Generation parameters for tuning output quality

When to Use This Skill

Generating images from text prompts
Creating 4K resolution images
Rendering text within images
Fact-verified image generation (grounded)
Conversational image editing
Multi-turn image refinement
Custom aspect ratio images

Quick Start

Prerequisites

Gemini API setup (see gemini-3-pro-api skill)
Model: gemini-3-pro-image-preview

Python Quick Start

import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

# Use the image generation model
model = genai.GenerativeModel("gemini-3-pro-image-preview")

# Generate image
response = model.generate_content("A serene mountain landscape at sunset")

# Save image
if response.parts:
    with open("generated_image.png", "wb") as f:
        f.write(response.parts[0].inline_data.data)
    print("Image saved!")
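
The quick start reads the image from the first response part. A response may also carry text parts, so a slightly more defensive reader can scan every part for image data. The sketch below is a minimal, duck-typed helper: the name first_image is hypothetical, and the base64 branch assumes that some clients hand back encoded text rather than raw bytes.

```python
import base64
from types import SimpleNamespace


def first_image(parts):
    """Return (image_bytes, mime_type) for the first part carrying image data, else None."""
    for part in parts or []:
        inline = getattr(part, "inline_data", None)
        data = getattr(inline, "data", None)
        if not data:
            continue  # text-only part or empty placeholder
        if isinstance(data, str):
            data = base64.b64decode(data)  # assumption: string payloads are base64 text
        return data, getattr(inline, "mime_type", "") or "application/octet-stream"
    return None


# Stand-in parts for illustration; a real call would pass response.parts.
demo_parts = [
    SimpleNamespace(text="Here is your image", inline_data=None),
    SimpleNamespace(inline_data=SimpleNamespace(data=b"\x89PNG", mime_type="image/png")),
]
demo_result = first_image(demo_parts)
```

With a real response, the returned bytes can be written in binary mode exactly as in the quick start.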

Node.js Quick Start

import { GoogleGenerativeAI } from "@google/generative-ai";
import fs from "fs";

const genAI = new GoogleGenerativeAI("YOUR_API_KEY");
const model = genAI.getGenerativeModel({ model: "gemini-3-pro-image-preview" });

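const result = await model.generateContent("A serene mountain landscape at sunset");

// Image bytes arrive base64-encoded in an inlineData part of the first candidate
const parts = result.response.candidates?.[0]?.content?.parts ?? [];
const imagePart = parts.find((part) => part.inlineData);

if (imagePart) {
  fs.writeFileSync("generated_image.png", Buffer.from(imagePart.inlineData.data, "base64"));
  console.log("Image saved!");
} else {
  console.log("No image generated");
}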

Core Tasks

Task 1: Generate Image from Text Prompt

Goal: Create high-quality images from text descriptions.

Python Example:

import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

model = genai.GenerativeModel(
    "gemini-3-pro-image-preview",
    generation_config={
        "thinking_level": "high",  # Best quality
        "temperature": 1.0
    }
)

# Generate image
prompt = """A futuristic cityscape at night with:
- Neon lights and holographic advertisements
- Flying vehicles
- Tall skyscrapers with unique architecture
- Rain-slicked streets reflecting the lights
- Cinematic, detailed, 4K quality"""

response = model.generate_content(prompt)

# Save image (inline_data.data is empty when the first part is text, not an image)
if response.parts and response.parts[0].inline_data.data:
    image_data = response.parts[0].inline_data.data
    with open("futuristic_city.png", "wb") as f:
        f.write(image_data)
    print("Image generated successfully!")
else:
    print("No image generated")

Tips for Better Prompts:

Be specific and detailed
Specify art style (realistic, cartoon, oil painting, etc.)
Include lighting, mood, and atmosphere
Mention quality level (4K, detailed, high-quality)
Describe colors, textures, composition

See: references/generation-guide.md for comprehensive prompting techniques
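
The tips above can be folded into a small prompt builder so that style, lighting, and quality are not forgotten between calls. This is a minimal sketch; build_image_prompt and its parameter names are hypothetical, and the bullet layout simply mirrors the example prompt in this task.

```python
def build_image_prompt(subject, style=None, lighting=None, details=(), quality=None):
    """Assemble a subject line followed by one bullet per attribute that was supplied."""
    bullets = []
    if style:
        bullets.append(f"Style: {style}")
    if lighting:
        bullets.append(f"Lighting and mood: {lighting}")
    bullets.extend(d for d in details if d)  # free-form colors, textures, composition
    if quality:
        bullets.append(f"Quality: {quality}")
    if not bullets:
        return subject.strip()
    return subject.strip() + " with:\n" + "\n".join(f"- {b}" for b in bullets)


prompt = build_image_prompt(
    "A futuristic cityscape at night",
    style="cinematic, photorealistic",
    lighting="neon glow on rain-slicked streets",
    details=["Flying vehicles", "Tall skyscrapers with unique architecture"],
    quality="4K, highly detailed",
)
```

The resulting string can be passed to model.generate_content(prompt) unchanged.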

Task 2: Generate 4K Images

Goal: Create high-resolution 4K images with upscaling.

Python Example:

# Request 4K quality in the prompt (reuses `model` from Task 1)
prompt = """A photorealistic portrait of a scientist in a modern lab:
- 4K ultra-high definition
- Sharp focus on subject
- Soft bokeh background
- Professional studio lighting
- Fine detail in textures
- Cinema-grade quality"""

response = model.generate_content(prompt)

# Resolution is requested through the prompt; confirm the saved file's dimensions if 4K is required
if response.parts:
    with open("scientist_4k.png", "wb") as f:
        f.write(response.parts[0].inline_data.data)

4K Features:

Native 4K resolution support
Upscaling to 2K/4K
16:9 aspect ratio at 4K
Enhanced detail and clarity

See: references/resolution-guide.md for resolution control
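
Because the example above requests 4K only through prompt wording, it can help to confirm the pixel dimensions of what was actually returned. The sketch below reads width and height from a PNG header using only the standard library. It assumes the output is a PNG, and it treats a long edge of 3840 pixels (4K UHD) as the threshold, which is an assumption about what "4K" means for this model.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(data):
    """Return (width, height) from a PNG's IHDR chunk, or None if data is not a PNG."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        return None
    width, height = struct.unpack(">II", data[16:24])  # big-endian 32-bit integers
    return width, height


def is_at_least_4k(data, min_long_edge=3840):
    """True when the longer edge reaches min_long_edge (assumed 4K UHD threshold)."""
    dims = png_dimensions(data)
    return dims is not None and max(dims) >= min_long_edge


# Synthetic header for illustration: signature + IHDR length/type + 3840x2160.
sample_header = (
    PNG_SIGNATURE + struct.pack(">I", 13) + b"IHDR" + struct.pack(">II", 3840, 2160)
)
```

In practice the bytes would come from the response, for example png_dimensions(response.parts[0].inline_data.data).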

Task 3: Render Text in Images

Goal: Generate images with readable, high-quality text.

Python Example:

prompt = """Create a professional business card design with:
- Company name: "TechVision AI"
- Text: "Dr. Sarah Chen"
- Text: "Chief AI Officer"
- Text: "sarah.chen@techvision.ai"
- Text: "+1 (555) 123-4567"
- Modern, clean design
- Professional fonts
- Blue and white color scheme
- All text clearly readable"""

response = model.generate_content(prompt)

if response.parts:
    with open("business_card.png", "wb") as f:
        f.write(response.parts[0].inline_data.data)

Text Rendering Best Practices:

Explicitly specify text content in quotes
Request "readable" or "clearly visible" text
Keep text short and simple
Specify font style if desired
Use high contrast backgrounds

See: references/generation-guide.md for text rendering techniques
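
The first three practices above (quote the text, ask for readability, keep strings short) can be applied mechanically. The helper below is an illustrative sketch: build_text_prompt is a hypothetical name, and the 40-character limit is an arbitrary assumption, not a documented model constraint.

```python
def build_text_prompt(description, texts, max_chars=40):
    """Quote each string to render; return (prompt, strings that may be too long)."""
    too_long = [t for t in texts if len(t) > max_chars]
    lines = [f"{description.strip()} with:"]
    lines += [f'- Text: "{t}"' for t in texts]  # explicit quotes around every string
    lines.append("- All text clearly readable, high contrast against the background")
    return "\n".join(lines), too_long


card_prompt, card_warnings = build_text_prompt(
    "Create a professional business card design",
    ["TechVision AI", "Dr. Sarah Chen", "Chief AI Officer"],
)
```

Any strings returned in the second value are candidates for shortening before the prompt is sent.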

Task 4: Grounded Generation (Fact-Verified Images)

Goal: Generate factually accurate images using Google Search grounding.

Python Example:

# Enable Google Search grounding for factual accuracy
model_grounded = genai.GenerativeModel(
    "gemini-3-pro-image-preview",
    tools=[{"google_search_retrieval": {}}]  # Enable grounding
)

prompt = """Generate an accurate image of the International Space Station
with Earth in the background. Use current ISS configuration."""

response = model_grounded.generate_content(prompt)

if response.parts:
    with open("iss_grounded.png", "wb") as f:
        f.write(response.parts[0].inline_data.data)

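    # Grounding metadata, when present, is attached to the candidate, not the response
    metadata = getattr(response.candidates[0], "grounding_metadata", None)
    if metadata and metadata.grounding_chunks:
        print(f"Grounding sources used: {len(metadata.grounding_chunks)}")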

Grounded Generation Use Cases:

Historical scenes (accurate to period)
Scientific visualizations
Current events
Famous landmarks
Product representations

Benefits:

Factual accuracy
Real-world grounding
Reduced hallucination
Up-to-date information

Note: Uses free Google Search quota (1,500 queries/day)

See: references/grounded-generation.md for comprehensive guide
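
To see which sources a grounded request drew on, the grounding chunks can be listed rather than only counted. The sketch below is duck-typed and runs against stand-in objects. It assumes the metadata sits on the first candidate and that each chunk exposes web.title and web.uri; those field names are assumptions and may differ between SDK versions.

```python
from types import SimpleNamespace


def grounding_sources(response):
    """Return [(title, uri), ...] for web sources on the first candidate, if any."""
    candidates = getattr(response, "candidates", None) or []
    if not candidates:
        return []
    metadata = getattr(candidates[0], "grounding_metadata", None)
    chunks = getattr(metadata, "grounding_chunks", None) or []
    sources = []
    for chunk in chunks:
        web = getattr(chunk, "web", None)
        uri = getattr(web, "uri", None)
        if uri:  # skip chunks without a web source
            sources.append((getattr(web, "title", "") or "", uri))
    return sources


# Stand-in response for illustration only; the URL is a placeholder.
demo_chunk = SimpleNamespace(web=SimpleNamespace(title="Example", uri="https://example.com"))
demo_response = SimpleNamespace(
    candidates=[
        SimpleNamespace(grounding_metadata=SimpleNamespace(grounding_chunks=[demo_chunk]))
    ]
)
demo_sources = grounding_sources(demo_response)
```

An empty list means either that grounding was not used or that the response shape differs from the assumption above.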

Task 5: Conversational Image Editing

Goal: Iteratively refine images through multi-turn conversation.

Python Example:

model = genai.GenerativeModel("gemini-3-pro-image-preview")

# Start a chat session for conversational editing
chat = model.start_chat()

# First generation
response1 = chat.send_message("Create a cozy coffee shop interior")

if response1.parts:
    with open("coffee_shop_v1.png", "wb") as f:
        f.write(response1.parts[0].inline_data.data)

# Refine the image
response2 = chat.send_message("Add more plants and warm lighting")

if response2.parts:
    with open("coffee_shop_v2.png", "wb") as f:
        f.write(response2.parts[0].inline_data.data)

# Further refinement
response3 = chat.send_message("Make it more minimalist, remove some decorations")

if response3.parts:
    with open("coffee_shop_v3.png", "wb") as f:
        f.write(response3.parts[0].inline_data.data)

Conversational Editing Features:

Preserves visual context across turns
Incremental modifications
Natural language instructions
Multi-turn refinement
Context-aware changes

Example Editing Commands:

"Make it darker/lighter"
"Add more [element]"
"Change the color scheme to [colors]"
"Make it more realistic/artistic"
"Remove [element]"

See: references/conversational-editing.md for advanced patterns
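
The three hand-written turns above follow one pattern: send an instruction, save the image, repeat. A loop over a list of instructions expresses the same thing more compactly. This is a sketch under assumptions: run_edit_session is a hypothetical helper, it accepts any object with a send_message method, and FakeChat exists only so the example runs without an API key.

```python
import tempfile
from pathlib import Path
from types import SimpleNamespace


def run_edit_session(chat, instructions, stem, out_dir="."):
    """Send each instruction in order; save every returned image as <stem>_v<N>.png."""
    saved = []
    for turn, instruction in enumerate(instructions, start=1):
        response = chat.send_message(instruction)
        data = None
        for part in getattr(response, "parts", None) or []:
            data = getattr(getattr(part, "inline_data", None), "data", None)
            if data:
                break
        if not data:
            continue  # this turn returned no image; keep the session going
        path = Path(out_dir) / f"{stem}_v{turn}.png"
        path.write_bytes(data)
        saved.append(path)
    return saved


class FakeChat:
    """Stand-in for a chat session; returns a tiny payload per message."""

    def __init__(self):
        self.history = []

    def send_message(self, text):
        self.history.append(text)
        payload = f"image-{len(self.history)}".encode()
        part = SimpleNamespace(inline_data=SimpleNamespace(data=payload))
        return SimpleNamespace(parts=[part])


with tempfile.TemporaryDirectory() as tmp:
    demo_chat = FakeChat()
    demo_saved = run_edit_session(
        demo_chat,
        ["Create a cozy coffee shop interior", "Add more plants and warm lighting"],
        stem="coffee_shop",
        out_dir=tmp,
    )
    demo_names = [p.name for p in demo_saved]
    demo_first = demo_saved[0].read_bytes()
```

With the real SDK, the chat argument would be the object returned by model.start_chat().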

Task 6: Custom Aspect Ratios

Goal: Generate images in specific aspect ratios.

Python Example:

# 16:9 aspect ratio (4K supported)
prompt_169 = "A cinematic landscape in 16:9 aspect ratio, 4K quality"

# Square aspect ratio
prompt_square = "A square logo design for a tech company"

# Portrait orientation
prompt_portrait = "A portrait-oriented movie poster"

response = model.generate_content(prompt_169)
# The ratio is requested through the prompt text; check the output dimensions to confirm it

Supported Ratios:

16:9 - Wide, cinematic (4K supported)
1:1 - Square
4:3 - Standard
9:16 - Vertical/portrait
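
Since the ratio in this task is requested through the prompt, a quick check on the returned dimensions shows whether the request was honored. The sketch below reduces pixel dimensions to their simplest ratio and compares against an expected one. The function names and the 1% tolerance are illustrative assumptions.

```python
from math import gcd


def reduced_ratio(width, height):
    """Reduce pixel dimensions to their simplest W:H form."""
    divisor = gcd(width, height)
    return f"{width // divisor}:{height // divisor}"


def matches_ratio(width, height, expected, tolerance=0.01):
    """True when width/height is within a relative tolerance of the expected 'W:H'."""
    w, h = (int(x) for x in expected.split(":"))
    target = w / h
    return abs(width / height - target) <= tolerance * target


uhd_ratio = reduced_ratio(3840, 2160)
```

The tolerance allows for outputs such as 1365x768, which are close to 16:9 without reducing to it exactly.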

Task 7: Optimize Image Generation Costs

Goal: Balance quality and cost for image generation.

Pricing:

Text Input: $1-2 per 1M tokens
Text Output: $6-9 per 1M tokens
Image Output: $0.134 per image (varies by resolution)

Python Cost Optimization:

def generate_with_cost_tracking(prompt):
    """Generate image and track costs"""

    response = model.generate_content(prompt)

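    # Calculate cost using the upper end of the listed price ranges
    usage = response.usage_metadata
    input_cost = (usage.prompt_token_count / 1_000_000) * 2.00  # $2 per 1M input tokens
    output_cost = (usage.candidates_token_count / 1_000_000) * 9.00  # $9 per 1M output tokens
    image_cost = 0.134  # Per image; varies by resolution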

    total_cost = input_cost + output_cost + image_cost

    print(f"Input tokens: {usage.prompt_token_count} (${input_cost:.6f})")
    print(f"Output tokens: {usage.candidates_token_count} (${output_cost:.6f})")
    print(f"Image cost: ${image_cost:.6f}")
    print(f"Total: ${total_cost:.6f}")

    return response

response = generate_with_cost_tracking("A beautiful sunset over mountains")

Cost Optimization Strategies:

Batch Requests: Generate multiple images in one session
Reuse Chat Sessions: Conversational editing is more efficient
Specific Prompts: Clear prompts reduce regeneration needs
Monitor Usage: Track costs per project
Use Appropriate Quality: Not all images need 4K

See: references/pricing-optimization.md for detailed strategies
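
For planning a batch before any request is sent, the same arithmetic can be run on estimates. The sketch below uses the upper-bound rates from the pricing list above; the function names are hypothetical, and real invoices depend on resolution and current pricing.

```python
INPUT_RATE_PER_M = 2.00   # USD per 1M input tokens (upper end of the listed range)
OUTPUT_RATE_PER_M = 9.00  # USD per 1M text output tokens (upper end of the listed range)
IMAGE_RATE = 0.134        # USD per generated image


def estimate_batch_cost(images, prompt_tokens_each, output_tokens_each=0):
    """Estimated USD cost for a batch with similar-sized prompts."""
    per_image_tokens = (
        prompt_tokens_each / 1_000_000 * INPUT_RATE_PER_M
        + output_tokens_each / 1_000_000 * OUTPUT_RATE_PER_M
    )
    return images * (per_image_tokens + IMAGE_RATE)


def images_within_budget(budget, prompt_tokens_each, output_tokens_each=0):
    """How many images fit in a budget at the estimated per-image cost."""
    per_image = estimate_batch_cost(1, prompt_tokens_each, output_tokens_each)
    return int(budget // per_image)


hundred_images = estimate_batch_cost(100, prompt_tokens_each=50)
```

At these rates the per-image charge dominates: 100 images with 50-token prompts come to roughly $13.41, of which only about one cent is text input.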

Batch Image Generation

import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-3-pro-image-preview")

prompts = [
    "A serene mountain lake at dawn",
    "A bustling market in Morocco",
    "A futuristic robot assistant",
    "An abstract geometric pattern"
]

for i, prompt in enumerate(prompts):
    print(f"Generating image {i+1}/{len(prompts)}: {prompt}")

    response = model.generate_content(prompt)

    if response.parts:
        with open(f"generated_{i+1}.png", "wb") as f:
            f.write(response.parts[0].inline_data.data)
        print(f"  Saved: generated_{i+1}.png")
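
The loop above stops at the first exception, which can waste a long batch. One option is to record failures and keep going. In this sketch, generate_batch is a hypothetical helper that takes any callable returning image bytes, and fake_generate stands in for a real model call so the example runs offline.

```python
import tempfile
from pathlib import Path


def generate_batch(generate, prompts, out_dir="."):
    """Call generate(prompt) per prompt; save results and collect failures."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved, failed = [], []
    for index, prompt in enumerate(prompts, start=1):
        try:
            data = generate(prompt)
        except Exception as exc:  # record the failure and continue with the rest
            failed.append((prompt, str(exc)))
            continue
        if not data:
            failed.append((prompt, "no image returned"))
            continue
        path = out / f"generated_{index}.png"
        path.write_bytes(data)
        saved.append(path)
    return saved, failed


def fake_generate(prompt):
    """Stand-in for a model call; fails on one prompt to show the failure path."""
    if "robot" in prompt:
        raise RuntimeError("simulated failure")
    return b"fake-image-bytes"


with tempfile.TemporaryDirectory() as tmp:
    demo_saved, demo_failed = generate_batch(
        fake_generate,
        ["A serene mountain lake at dawn", "A futuristic robot assistant", "An abstract geometric pattern"],
        out_dir=tmp,
    )
    demo_names = [p.name for p in demo_saved]
```

File numbering follows the prompt's position in the list, so a failed prompt leaves a gap that makes it easy to spot.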

Error Handling

from google.api_core import exceptions

def safe_image_generation(prompt):
    """Generate image with error handling"""

    try:
        response = model.generate_content(prompt)

        if not response.parts:
            return {"success": False, "error": "No image generated"}

        # inline_data.data is empty when the first part carries text, not an image
        inline = response.parts[0].inline_data
        if not inline.data:
            return {"success": False, "error": "Invalid response format"}

        return {
            "success": True,
            "image_data": inline.data,
            "mime_type": inline.mime_type
        }

    except exceptions.InvalidArgument as e:
        return {"success": False, "error": f"Invalid prompt: {e}"}
    except exceptions.ResourceExhausted as e:
        return {"success": False, "error": f"Rate limit exceeded: {e}"}
    except Exception as e:
        return {"success": False, "error": f"Error: {e}"}
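
The handler above reports a rate-limit error but does not retry. A common follow-up is exponential backoff around the call. This is a generic sketch, not part of the SDK: with_retries is a hypothetical helper, and the injectable sleep argument exists so the demo can record delays instead of waiting.

```python
import time


def with_retries(call, retryable=(Exception,), attempts=4, base_delay=1.0, sleep=time.sleep):
    """Run call(); on a retryable exception wait base_delay * 2**n, then try again."""
    for attempt in range(attempts):
        try:
            return call()
        except retryable:
            if attempt == attempts - 1:
                raise  # out of attempts; surface the last error
            sleep(base_delay * 2 ** attempt)


# Demo with a function that fails twice before succeeding.
recorded_delays = []
attempts_made = {"count": 0}


def flaky():
    attempts_made["count"] += 1
    if attempts_made["count"] < 3:
        raise TimeoutError("simulated transient failure")
    return "ok"


demo_result = with_retries(flaky, retryable=(TimeoutError,), sleep=recorded_delays.append)
```

With the imports from the error-handling example, a call could look like with_retries(lambda: model.generate_content(prompt), retryable=(exceptions.ResourceExhausted,)).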

References

Core Guides

Model Setup - Nano Banana Pro configuration
Generation Guide - Comprehensive prompting techniques
Grounded Generation - Fact-verified image creation
Conversational Editing - Multi-turn refinement

Optimization

Resolution Guide - 4K and quality control
Pricing Optimization - Cost management

Scripts

Generate Image Script - Production-ready generation
Grounded Generation Script - Fact-verified images
Edit Image Script - Conversational editing

Related Skills

gemini-3-pro-api - Basic setup, authentication, text generation
gemini-3-multimodal - Image INPUT (analyzing images)
gemini-3-advanced - Advanced features (caching, batch, tools)

Best Practices

Be Specific: Detailed prompts produce better results
Specify Quality: Request 4K or high quality explicitly
Use Grounding: Enable for factual accuracy
Iterate Conversationally: Use chat for refinements
Monitor Costs: Track usage, especially for 4K
Handle Errors: Implement retry logic
Save Images Properly: Use binary mode for writing

Troubleshooting

Issue: No image generated

Solution: Check that response.parts is non-empty and that a part carries non-empty inline_data

Issue: Low quality images

Solution: Add "4K", "high quality", or "detailed" to the prompt

Issue: Text in images unreadable

Solution: Specify text explicitly in quotes, request "readable text"

Issue: Images not factually accurate

Solution: Enable grounded generation with Google Search

Issue: High costs

Solution: Optimize prompts, batch requests, monitor usage

Summary

This skill provides complete image generation capabilities:

✅ Text-to-image generation
✅ Native 4K support
✅ Text rendering in images
✅ Grounded generation (fact-verified)
✅ Conversational editing
✅ Custom aspect ratios
✅ Cost optimization
✅ Production-ready examples

Ready to generate images? Start with Task 1: Generate Image from Text Prompt above!
