nano-banana
Nano Banana
Generate high-quality presentation slides as images using Gemini's image generation API, review them interactively in a browser, and iteratively edit based on feedback.
When to Use This Skill
- User asks to create a presentation, slide deck, or PPT
- User wants to generate visual slides for a talk or lecture
- User has a document or outline and wants slides based on it
- User says "make me a PPT", "generate slides", "create a presentation"
- User wants to edit or refine existing generated slides
- User needs high-quality figures, diagrams, or illustrations for papers or documents
- User asks to generate research figures, architecture diagrams, or concept illustrations
Do NOT use for:
- Writing academic papers → use paper-writing
- Planning academic conference talk narrative structure → use academic-slides
Before You Start: Prerequisites
Before proceeding with any slide generation, verify these prerequisites:
- API Key: Check that a Google API key is available. Run:
  echo $GOOGLE_API_KEY
  If empty, ask the user to provide one. They can either:
  - Set it via config: EvoSci config set google_api_key <key>
  - Provide it directly (pass via the --api-key argument)
  If the user provides the key in conversation, pass it to scripts with --api-key.
- Language: Ask the user what language the slide content should be in. This affects the content you write in slides_plan.json, not the style template.
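The key check above can be sketched in Python. This is a minimal illustration, not part of the skill's scripts: the helper names `resolve_api_key` and `api_key_args` are hypothetical, and only the `GOOGLE_API_KEY` variable and the `--api-key` flag come from this document.

```python
import os
from typing import Optional


def resolve_api_key(key_from_user: Optional[str] = None) -> Optional[str]:
    """Return a key given in conversation, else GOOGLE_API_KEY, else None."""
    if key_from_user:
        return key_from_user
    # An unset variable and an empty one both count as "no key available".
    return os.environ.get("GOOGLE_API_KEY") or None


def api_key_args(key_from_user: Optional[str] = None) -> list[str]:
    """Extra CLI arguments for a script call (hypothetical helper).

    Empty when no key was given in conversation, so the script falls back
    to the environment.
    """
    if key_from_user:
        return ["--api-key", key_from_user]
    return []
```

If `resolve_api_key()` returns `None`, that is the cue to ask the user for a key before going further.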
Core Workflow
Phase 1: Content Planning Conversation ← most important phase
Phase 2: Generate slides_plan.json
Phase 3: Select Style & Generate Slides
Phase 4: Launch Review Server
Phase 5: Apply Feedback Edits ← repeat Phase 4-5 until satisfied
Phase 6: Package as PPTX
Phase 7: Cleanup
Follow these phases in order. Do NOT skip Phase 1 — the quality of generated slides depends directly on planning depth.
Phase 1: Content Planning Conversation
This is the most critical phase. Rushing to generation without proper planning produces mediocre slides. Engage the user in a structured conversation:
Step 1 — Understand the context:
- What is the topic of the presentation?
- Who is the audience? (technical peers, executives, students, general public)
- How long is the talk? (this determines page count)
- What is the occasion? (conference, internal talk, lecture, pitch)
Step 2 — Define the storyline:
- What is the opening hook? (a surprising fact, a question, a trend)
- What are the 3-5 main sections or arguments?
- What is the key takeaway the audience should remember?
- What is the closing message?
Step 3 — Outline per-page content:
- For each slide, agree on: title + 2-4 key points + visual description
- Identify which slides are cover, content, or data type
- Ensure logical flow between pages
Duration-to-page-count guidance:
| Duration | Pages | Structure |
|---|---|---|
| 5 min | 5 | Cover + 3 content + closing |
| 10-15 min | 8-12 | Cover + intro + 3-4 sections + summary + closing |
| 20-30 min | 15-20 | Cover + intro + 5-6 sections + summary + closing |
| 45-60 min | 25-30 | Cover + intro + 7-9 sections (2-3 pages each) + summary + closing |
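The table above can be encoded as a small lookup for use during planning. This is only a sketch: the function name `suggest_page_range` is hypothetical, and rounding durations that fall between rows (for example 7 or 17 minutes) up to the next row is an assumption, not something the table specifies.

```python
# (upper bound in minutes, minimum pages, maximum pages), copied from the table.
PAGE_GUIDANCE = [
    (5, 5, 5),
    (15, 8, 12),
    (30, 15, 20),
    (60, 25, 30),
]


def suggest_page_range(minutes: float) -> tuple[int, int]:
    """Return the (min, max) page count suggested for a talk of this length."""
    for upper_bound, low, high in PAGE_GUIDANCE:
        if minutes <= upper_bound:
            return (low, high)
    # Assumption: talks longer than 60 minutes reuse the largest row.
    return PAGE_GUIDANCE[-1][1:]
```

The result is a starting point for the conversation with the user, not a hard limit.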
If the user provides a document or outline, read it thoroughly, then propose a slide breakdown for approval before proceeding.
Phase 2: Generate slides_plan.json
Create a slides_plan.json file in the workspace root with this schema:
{
"title": "Presentation Title",
"total_slides": 10,
"slides": [
{
"slide_number": 1,
"page_type": "cover",
"content": "Title: My Presentation\nSubtitle: A subtitle here\nLabel: 2026 Edition"
},
{
"slide_number": 2,
"page_type": "content",
"content": "Title: First Topic\nKey points:\n- Point one\n- Point two\n- Point three"
},
{
"slide_number": 3,
"page_type": "data",
"content": "Title: Key Metrics\nMetric 1: 95% accuracy\nMetric 2: 3x faster\nMetric 3: 10k users"
}
]
}
page_type values: cover, content, data
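Before generating, it can help to sanity-check the plan against the schema above. The sketch below is a hypothetical pre-flight check. Whether generate_ppt.py enforces any of these constraints is not stated here, so treat each rule as an assumption drawn from the example.

```python
ALLOWED_PAGE_TYPES = {"cover", "content", "data"}


def validate_plan(plan: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means the plan looks fine."""
    problems = []
    slides = plan.get("slides", [])
    if not plan.get("title"):
        problems.append("missing title")
    if plan.get("total_slides") != len(slides):
        problems.append(
            f"total_slides is {plan.get('total_slides')} but {len(slides)} slides are listed"
        )
    for position, slide in enumerate(slides, start=1):
        # Assumption: slide_number counts up from 1 without gaps.
        if slide.get("slide_number") != position:
            problems.append(f"slide at position {position} has slide_number {slide.get('slide_number')}")
        if slide.get("page_type") not in ALLOWED_PAGE_TYPES:
            problems.append(f"slide {position}: unknown page_type {slide.get('page_type')!r}")
        if not str(slide.get("content", "")).strip():
            problems.append(f"slide {position}: empty content")
    return problems
```

A typical use is `validate_plan(json.load(open("slides_plan.json")))` right after writing the file, fixing anything reported before moving to Phase 3.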
Critical Content Field Rules
The content field is what gets passed to the image generation model. Follow these rules strictly:
- DO write descriptive titles and bullet points
- DO describe the visual layout you want (e.g., "left-right comparison", "4 icon cards")
- DO NOT prefix lines with "Slogan:", "Visual:", "Points:", or any meta-labels — the model will render these as visible text on the slide
- DO NOT put the same sentence in both the title area and the bottom of the content — it causes duplication
- DO NOT include footer text, page numbers, or watermark instructions
Bad example (meta-labels leak as visible text):
Title: Why AI Matters
Visual: left-right comparison chart
Points:
- Point one
- Point two
Slogan: AI changes everything
Good example (clean, no meta-labels):
Title: Why AI Matters
Visual layout: left-right comparison chart showing traditional vs AI approach
Key points:
- Point one with brief explanation
- Point two with brief explanation
Bottom tagline: AI changes everything
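The meta-label rule lends itself to a quick automated check before generation. The following is a minimal sketch with a hypothetical helper name. It flags only the three prefixes named in the rules above; any other label that might leak onto a slide is out of its scope.

```python
# Prefixes the rules above single out as leaking onto the rendered slide.
META_LABELS = ("Slogan:", "Visual:", "Points:")


def find_meta_labels(content: str) -> list[str]:
    """Return the lines of a content field that start with a known meta-label."""
    return [
        line
        for line in content.splitlines()
        if line.strip().startswith(META_LABELS)
    ]


bad = "Title: Why AI Matters\nVisual: left-right comparison chart\nPoints:\n- Point one\n- Point two\nSlogan: AI changes everything"
good = "Title: Why AI Matters\nVisual layout: left-right comparison chart showing traditional vs AI approach\nKey points:\n- Point one with brief explanation\n- Point two with brief explanation\nBottom tagline: AI changes everything"

print(find_meta_labels(bad))   # the three offending lines
print(find_meta_labels(good))  # []
```

Because the match is on the exact prefix at the start of a line, "Visual layout:" and "Key points:" from the good example pass the check.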
Phase 3: Select Style & Generate Slides
Available Styles
| Style | File | Visual Characteristics | Best For |
|---|---|---|---|
| Lineal Color | styles/lineal-color.md | White background, teal accents, flat 2D icons, info cards | Technical talks, lectures, educational |
| Gradient Glass | styles/gradient-glass.md | Light pastel background, frosted glass cards, Apple Keynote feel | Product launches, pitches, SaaS |
| Vector Illustration | styles/vector-illustration.md | Cream background, black outlines, retro colors, toy-model charm | Educational, children's content, brand stories |
Present the styles to the user and let them choose. If unsure, recommend Lineal Color as the default.
Available Models
| Model | Speed | Quality | When to Use |
|---|---|---|---|
| gemini-3-pro-image-preview | Moderate | Best | Final version, important presentations |
| gemini-3.1-flash-image-preview | Fast | Good | Drafts, rapid iteration, large decks |
| gemini-2.5-flash-image | Fastest | Basic | Quick prototypes, bulk generation |
For first-time generation, recommend gemini-3.1-flash-image-preview (fast iteration). Switch to gemini-3-pro-image-preview for the final version.
Generate Command
python /skills/nano-banana/scripts/generate_ppt.py \
--plan slides_plan.json \
--style /skills/nano-banana/styles/lineal-color.md \
--model gemini-3.1-flash-image-preview \
--output ppt_output
Arguments:
- --plan (required): Path to slides_plan.json
- --style (required): Path to style template
- --model: Image generation model (default: gemini-3-pro-image-preview)
- --resolution: 2K (default) or 4K
- --output: Output directory (default: ppt_output/TIMESTAMP)
- --api-key: Google API key (if not in environment)
- --workers: Number of parallel workers (default: 1, recommended: 3-5 for large decks)
Output structure:
ppt_output/
├── images/
│ ├── slide-01.png
│ ├── slide-02.png
│ └── ...
├── prompts.json # All prompts used (for debugging)
└── index.html # Browser viewer
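When the generate command is launched from Python rather than typed by hand, building the argument list explicitly avoids shell-quoting mistakes. The sketch below uses only the flags documented above. The helper names are hypothetical, and the two-digit zero padding in `expected_images` is inferred from the slide-01.png naming shown in the output structure.

```python
GENERATE_SCRIPT = "/skills/nano-banana/scripts/generate_ppt.py"


def build_generate_command(plan, style, *, model=None, resolution=None,
                           output=None, workers=None, api_key=None) -> list[str]:
    """Assemble an argv list for generate_ppt.py; omitted options keep the script defaults."""
    command = ["python", GENERATE_SCRIPT, "--plan", plan, "--style", style]
    optional = {
        "--model": model,
        "--resolution": resolution,
        "--output": output,
        "--workers": workers,
        "--api-key": api_key,
    }
    for flag, value in optional.items():
        if value is not None:
            command += [flag, str(value)]
    return command


def expected_images(output_dir: str, total_slides: int) -> list[str]:
    """Paths the run should produce, assuming two-digit zero-padded slide numbers."""
    return [f"{output_dir}/images/slide-{n:02d}.png" for n in range(1, total_slides + 1)]
```

The resulting list can be passed to `subprocess.run(command, check=True)`, and `expected_images` gives a quick way to confirm that no slide failed silently.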
Phase 4: Launch Review Server
Start the interactive review server so the user can review slides and write feedback:
python /skills/nano-banana/scripts/serve_viewer.py \
--dir ppt_output \
--plan slides_plan.json \
--port 8080 \
--pid-file .viewer.pid
Tell the user:
Review server is running at http://localhost:8080. Open it in your browser to review each slide. Write feedback in the text box below any slide that needs changes, then click "Save Feedback". Tell me when you're done.
The server saves feedback directly into slides_plan.json as a feedback field on each slide.
Wait for the user to confirm they have saved their feedback before proceeding.
Phase 5: Apply Feedback Edits
Read slides_plan.json and find all slides with a non-empty feedback field. For each one, run the edit script:
python /skills/nano-banana/scripts/edit_slide.py \
--input ppt_output/images/slide-{NUMBER}.png \
--instruction "{FEEDBACK_TEXT}" \
--output ppt_output/images/slide-{NUMBER}.png \
--model gemini-3.1-flash-image-preview
Arguments:
- --input (required): Path to the original slide image
- --instruction (required): The edit instruction (from feedback field)
- --output: Output path (default: overwrite input)
- --model: Image generation model
- --api-key: Google API key (if not in environment)
After editing all slides with feedback, clear the feedback fields from slides_plan.json and tell the user to refresh the browser to see updated slides.
If the user has more feedback, repeat Phase 4-5. This review-edit cycle continues until the user is satisfied.
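The feedback pass can be sketched as three small steps: collect slides with feedback, build one edit command per slide, then clear the fields. The helper names are hypothetical. The sketch assumes the image filename uses the two-digit padding seen in the output structure, and that "clearing" means removing the feedback key; setting it to an empty string may be equally valid.

```python
EDIT_SCRIPT = "/skills/nano-banana/scripts/edit_slide.py"


def pending_edits(plan: dict, image_dir: str = "ppt_output/images") -> list[tuple]:
    """Return (slide_number, feedback, image_path) for every slide with non-empty feedback."""
    edits = []
    for slide in plan["slides"]:
        feedback = (slide.get("feedback") or "").strip()
        if feedback:
            number = slide["slide_number"]
            edits.append((number, feedback, f"{image_dir}/slide-{number:02d}.png"))
    return edits


def build_edit_command(image_path: str, instruction: str,
                       model: str = "gemini-3.1-flash-image-preview") -> list[str]:
    """Argv list for edit_slide.py that overwrites the slide in place."""
    return [
        "python", EDIT_SCRIPT,
        "--input", image_path,
        "--instruction", instruction,
        "--output", image_path,
        "--model", model,
    ]


def clear_feedback(plan: dict) -> None:
    """Drop the feedback field from every slide (assumed meaning of 'clear')."""
    for slide in plan["slides"]:
        slide.pop("feedback", None)
```

Passing the instruction as a single list element keeps quotes and newlines in the user's feedback intact, which a hand-assembled shell string would not guarantee.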
Phase 6: Package as PPTX
Once the user approves all slides, ask for the desired filename and package them:
python /skills/nano-banana/scripts/package_pptx.py \
--dir ppt_output/images \
--output presentation.pptx \
--kill-server .viewer.pid
Arguments:
- --dir (required): Directory containing slide-XX.png images
- --output (required): Output .pptx file path
- --kill-server: PID file from serve_viewer.py; automatically stops the review server after packaging
Phase 7: Cleanup
- The review server is automatically stopped by package_pptx.py --kill-server
- Ask the user if they want to keep the ppt_output/ directory or clean it up
- The slides_plan.json can be kept for future re-generation
Counterintuitive Rules
- Never include meta-labels in content: Words like "Slogan:", "Visual:", "Points:" will be rendered as visible text on the slide. Describe what you want without prefixes.
- Content describes WHAT, not HOW: The style template handles visual layout. The content field should focus on text and logical structure, not colors or positioning.
- More planning = better slides: Spending 10 minutes on Phase 1 conversation saves hours of re-generation. Do not rush to Phase 3.
- Edit, don't regenerate: When a slide needs minor changes (text fix, color change, remove footer), use edit_slide.py instead of regenerating from scratch. Editing preserves visual consistency.
- Use flash model for drafts: gemini-3.1-flash-image-preview is fast enough for iteration. Only switch to gemini-3-pro-image-preview for the final version after all feedback is addressed.
- Never read generated images yourself: Not all models support multimodal input. Do NOT use read_file on generated PNG images to check quality. Always launch the review server and let the user inspect slides visually in the browser. The user's feedback is your only quality signal.
- One idea per slide: Do not pack multiple concepts into a single slide. If a slide has more than 4 bullet points, split it into two slides.
- Bottom taglines should not repeat the title: If the title says "Why AI Matters", the bottom tagline should add new insight, not restate the title.
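Two of these rules are mechanical enough to check in code: the four-bullet limit and the tagline that repeats the title. The sketch below is illustrative only. It assumes bullets are lines starting with "- " and that the content uses the "Title:" and "Bottom tagline:" lines from the good example earlier; it catches only exact repeats, not paraphrases.

```python
from typing import Optional

MAX_BULLETS = 4  # threshold from the "one idea per slide" rule


def bullet_lines(content: str) -> list[str]:
    """Lines of the content field that look like bullet points."""
    return [line for line in content.splitlines() if line.lstrip().startswith("- ")]


def needs_split(content: str) -> bool:
    """True when the slide carries more bullets than the rule allows."""
    return len(bullet_lines(content)) > MAX_BULLETS


def labelled_value(content: str, label: str) -> Optional[str]:
    """Text following a line label such as 'Title:', or None if absent."""
    for line in content.splitlines():
        if line.startswith(label):
            return line[len(label):].strip()
    return None


def tagline_repeats_title(content: str) -> bool:
    """True when the bottom tagline is the title again, ignoring case."""
    title = labelled_value(content, "Title:")
    tagline = labelled_value(content, "Bottom tagline:")
    if not title or not tagline:
        return False
    return title.casefold() == tagline.casefold()
```

Running both checks over every slide in slides_plan.json before Phase 3 catches these problems while they are still cheap to fix.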
Scripts Reference
| Script | Purpose | Key Arguments |
|---|---|---|
| scripts/generate_ppt.py | Batch generate all slides from plan | --plan, --style, --model, --output, --resolution, --api-key, --workers |
| scripts/edit_slide.py | Edit a single slide based on instruction | --input, --instruction, --output, --model, --api-key |
| scripts/serve_viewer.py | Local review server with feedback | --dir, --plan, --port, --no-open, --pid-file |
| scripts/package_pptx.py | Package slide images into .pptx | --dir, --output, --kill-server |
Style Template Format
Style templates are markdown files in styles/ with a fixed structure that generate_ppt.py parses:
| Section | Purpose | Parsed by Code |
|---|---|---|
| ## Base Prompt | Visual specifications shared by all slides | Yes, injected into every prompt |
| ## Page Templates | Layout descriptions per page type | Fallback only |
| ## Examples | Actual prompt templates with {Base Prompt} and [Content] placeholders | Yes, primary templates |
| Other sections | Documentation only | No |
To create a new style: copy an existing .md file, modify the ## Base Prompt and ## Examples sections. The code extracts ### Cover, ### Content, and ### Data code blocks from ## Examples.
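To make the template structure concrete, here is one way such a file could be parsed and a prompt assembled. This is not the parsing logic of generate_ppt.py, which is not shown here. It is a sketch under stated assumptions: the Base Prompt section body is used as plain text, each ### subsection under ## Examples holds one fenced code block, and all function names are hypothetical.

```python
import re

FENCE = "`" * 3  # built at runtime so this example can sit inside a fenced block


def split_sections(markdown: str, level: int) -> dict[str, str]:
    """Map heading text to section body for headings with exactly `level` hashes."""
    pattern = re.compile(rf"^{'#' * level} (.+)$", re.MULTILINE)
    matches = list(pattern.finditer(markdown))
    sections = {}
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(markdown)
        sections[match.group(1).strip()] = markdown[match.end():end].strip()
    return sections


def first_code_block(text: str):
    """Return the body of the first fenced code block in text, or None."""
    match = re.search(rf"{FENCE}[^\n]*\n(.*?){FENCE}", text, re.DOTALL)
    return match.group(1).strip() if match else None


def parse_style(markdown: str):
    """Return (base_prompt, {page_type: template}) from a style template."""
    top = split_sections(markdown, 2)
    base = top.get("Base Prompt", "")
    examples = split_sections(top.get("Examples", ""), 3)
    templates = {}
    for name in ("Cover", "Content", "Data"):
        block = first_code_block(examples.get(name, ""))
        if block is not None:
            templates[name.lower()] = block
    return base, templates


def render_prompt(template: str, base: str, content: str) -> str:
    """Fill the two placeholders named in the table above."""
    return template.replace("{Base Prompt}", base).replace("[Content]", content)
```

With this shape, a new style only needs a ## Base Prompt section and the three ### blocks under ## Examples. Everything else in the file is documentation, matching the table above.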