gemini-image

SKILL.md

Gemini Image Generation

Generate high-quality images from text prompts using Google's Gemini and Imagen models through executable scripts.

When to Use This Skill

Use this skill when you need to:

  • Create visual content from text descriptions
  • Generate multiple image variations
  • Create images at specific resolutions (1K, 2K, 4K)
  • Produce images for different aspect ratios (social media, banners, etc.)
  • Generate photorealistic images or artistic visuals
  • Create images with person generation controls
  • Batch generate multiple images at once
  • Combine with text generation for complete content creation

Available Scripts

scripts/generate_image.js

Purpose: Generate images using Gemini 3 Pro Image or Imagen 4 models

When to use:

  • Any image generation task
  • Multiple image generation (1-4 per request)
  • Custom resolution and aspect ratio needs
  • Professional asset creation
  • Photorealistic or artistic image generation

Key parameters:

Parameter Description Example
prompt Text description (required) "A futuristic city at sunset"
--model, -m Model to use gemini-3.1-flash-image-preview
--output-dir, -o Output directory for images images/
--name, -n Base name for output files artwork
--no-timestamp Disable auto timestamp Flag
--aspect, -a Aspect ratio 16:9
--size, -s Resolution 2K or 4K
--num Number of images (1-4) 4
--person Person generation policy allow_adult

Output: List of saved PNG file paths

Workflows

Workflow 1: Basic Image Generation

node scripts/generate_image.js "A futuristic city at sunset with flying cars"
  • Best for: Quick image generation, prototypes
  • Model: gemini-3.1-flash-image-preview (default, Nano Banana 2)
  • Output: images/generated_image_YYYYMMDD_HHMMSS.png

Workflow 2: Social Media (Instagram, Facebook)

node scripts/generate_image.js "Minimalist coffee shop interior" --aspect 1:1 --size 2K --name coffee-shop
  • Best for: Instagram posts, profile pictures
  • Aspect: 1:1 (square format)
  • Resolution: 2K (2048x2048)
  • Output: images/coffee-shop_YYYYMMDD_HHMMSS.png

Workflow 3: YouTube Thumbnails (16:9)

node scripts/generate_image.js "Tech gadget review thumbnail with vibrant colors" --aspect 16:9 --size 2K --name thumbnail
  • Best for: YouTube, video thumbnails
  • Aspect: 16:9 (widescreen)
  • Resolution: 2K (2752x1536)
  • Output: images/thumbnail_YYYYMMDD_HHMMSS.png

Workflow 4: Multiple Variations

node scripts/generate_image.js "Abstract geometric patterns in blue and gold" --num 4 --name abstract
  • Best for: A/B testing, design options
  • Generates: 4 distinct variations
  • Output: images/abstract_YYYYMMDD_HHMMSS_0.png, images/abstract_YYYYMMDD_HHMMSS_1.png, etc.

Workflow 5: Custom Output Directory

node scripts/generate_image.js "Detailed architectural rendering of modern museum" --aspect 16:9 --size 4K --output-dir ./professional/ --name museum
  • Best for: Print materials, high-end assets, organized projects
  • Model: gemini-3.1-flash-image-preview or gemini-3-pro-image-preview (for 4K)
  • Resolution: 4K (5504x3072 for 16:9)
  • Directory created automatically if it doesn't exist

Workflow 6: Photorealistic Images (Imagen 4)

node scripts/generate_image.js "Robot holding a red skateboard in urban setting" --model imagen-4.0-generate-001 --aspect 16:9 --size 2K --num 2 --name robot-skate
  • Best for: Realistic photos, product shots
  • Model: imagen-4.0-generate-001 (photorealistic)
  • Notes: English prompts only
  • Max 4 images per request

Workflow 7: Blog Post Featured Image

node scripts/generate_image.js "Serene mountain lake at sunrise with reflections" --aspect 16:9 --size 2K --output-dir ./blog-images/ --name featured-image
  • Best for: Blog headers, article images
  • Combines well with: gemini-text for blog content generation

Workflow 8: Content Creation Pipeline (Text + Image)

# 1. Generate content (gemini-text skill)
node skills/gemini-text/scripts/generate.js "Write a product description for smart home device"

# 2. Generate product image (this skill)
node scripts/generate_image.js "Sleek modern smart home device on white background" --aspect 4:3 --size 2K --name product

# 3. Create social media post
  • Best for: E-commerce, marketing campaigns
  • Combines with: gemini-text, gemini-batch for batch production

Workflow 9: Disable Timestamp

node scripts/generate_image.js "Fixed filename image" --name my-image --no-timestamp
  • Best for: When you want complete control over filename
  • Output: images/my-image.png (no timestamp)
  • Use when: Generating files for specific naming schemes or automated pipelines

Parameters Reference

Model Selection

Model Nickname Quality Max Size Best For
gemini-3.1-flash-image-preview Nano Banana 2 Pro-level 4K New default, fast + strong quality
gemini-3-pro-image-preview Nano Banana Pro Highest 4K Maximum quality and complex text rendering
gemini-2.5-flash-image Nano Banana Good 2K High-volume, low-latency
imagen-4.0-generate-001 Imagen 4 Photorealistic 2K Realistic photos, product shots

Aspect Ratios

Ratio Use Case 1K Size 2K Size
1:1 Instagram, avatars 1024x1024 2048x2048
16:9 YouTube, presentations 1376x768 2752x1536
9:16 Instagram Stories, TikTok 768x1376 1536x2752
4:3 Traditional displays 1024x768 2048x1536
3:4 Portrait orientation 768x1024 1536x2048
21:9 Ultrawide - 5504x2400

Note: 4K resolution is available with gemini-3.1-flash-image-preview and gemini-3-pro-image-preview

Resolution Guide

Size Use Case Best Model
1K (1024px) Web thumbnails, previews Any model
2K (2048px) Standard web, social media Any model
4K (4096px) Print, high-end assets gemini-3-pro only

Person Generation Policy

Policy Description Restrictions
dont_allow No people in images None
allow_adult Adults only Recommended default
allow_all All ages Restricted in EU, UK, CH, MENA

Output Interpretation

File Naming

  • Default format: {name}_YYYYMMDD_HHMMSS.png (auto timestamp)
  • Single image example: artwork_20260130_031643.png
  • Multiple images: {name}_YYYYMMDD_HHMMSS_0.png, {name}_YYYYMMDD_HHMMSS_1.png, etc.
  • Without timestamp (--no-timestamp): {name}.png
  • Script prints: "Saved: /path/to/file.png"

Image Quality

  • All images include SynthID watermark for authenticity
  • PNG format for lossless quality
  • Can be converted to JPEG/WEBP if needed
  • 4K images are significantly larger file sizes

Error Messages

  • "Model not available": Check model name spelling
  • "Unsupported size": Verify size/model combination
  • "Aspect ratio error": Use supported ratios for selected model

Common Issues

"google-genai or pillow not installed"

cd scripts && npm install

"Image generation failed"

  • Check prompt length (too verbose can fail)
  • Try simpler, more focused prompts
  • Verify model availability in your region
  • Check API quota limits

"Unsupported aspect ratio"

  • Check if ratio is supported by selected model
  • Imagen 4 has fewer ratio options than Gemini
  • Use 16:9 or 1:1 for best compatibility

"4K not supported"

  • 4K works best with gemini-3.1-flash-image-preview or gemini-3-pro-image-preview
  • Use --size 2K for older models
  • Try --model gemini-3.1-flash-image-preview --size 4K

"Imagen prompt language error"

  • Imagen models support English prompts only
  • Use gemini-3.1-flash-image-preview for other languages
  • Translate prompt to English for Imagen

File too large for storage

  • Use --size 1K for smaller files
  • Compress images after generation
  • Convert PNG to JPEG for web use

Best Practices

Prompt Engineering

  • Be specific and descriptive
  • Include style descriptors (e.g., "photorealistic", "digital art")
  • Mention lighting, mood, and composition
  • Use analogies for complex concepts
  • Avoid negative prompts (describe what you want, not what to avoid)

Model Selection

  • Use gemini-3.1-flash-image-preview for: Best default balance, quality, speed, 4K
  • Use gemini-3-pro-image-preview for: Maximum quality, complex text rendering
  • Use gemini-2.5-flash-image for: Speed, high volume
  • Use imagen-4.0-generate-001 for: Photorealism, product shots

Performance Optimization

  • Generate multiple images at once with --num
  • Use lower resolution for previews
  • Batch requests for high-volume needs (gemini-batch skill)
  • Cache results for repeated requests

Quality Tips

  • Use 2K resolution for most web uses
  • 4K only when maximum detail is needed
  • Combine specific prompts with style guidance
  • Test prompts with --num 1 before generating batches

Cost Management

  • Use flash models for cost efficiency
  • 4K generation costs significantly more
  • Batch multiple requests when possible
  • Generate at 1K for testing, 2K/4K for final

Related Skills

  • gemini-text: Generate text content alongside images
  • gemini-tts: Create audio for image-based content
  • gemini-batch: Process multiple image requests efficiently
  • gemini-embeddings: Generate image embeddings for similarity search

Quick Reference

# Basic
node scripts/generate_image.js "Your prompt"

# Social media (1:1)
node scripts/generate_image.js "Prompt" --aspect 1:1 --size 2K --name social-post

# YouTube thumbnail (16:9)
node scripts/generate_image.js "Prompt" --aspect 16:9 --size 2K --name thumbnail

# 4K high quality
node scripts/generate_image.js "Prompt" --aspect 16:9 --size 4K --name high-res

# Multiple variations
node scripts/generate_image.js "Prompt" --num 4 --name variations

# Custom directory
node scripts/generate_image.js "Prompt" --output-dir ./my-images/ --name custom

# Photorealistic
node scripts/generate_image.js "Prompt" --model imagen-4.0-generate-001 --aspect 16:9 --size 2K --name photo

# No timestamp
node scripts/generate_image.js "Prompt" --name fixed-name --no-timestamp

Reference

Weekly Installs
34
GitHub Stars
1
First Seen
Jan 29, 2026
Installed on
opencode32
gemini-cli32
github-copilot30
codex30
cursor30
amp29