slide-builder
Slide Builder
Transform essay-to-speech output into complete, presentation-ready slide decks with multiple output format support (HTML, Remotion video, PowerPoint).
When to Use
Invoke when user:
- Has essay-to-speech output and wants slides
- Says "create slides from this talk track"
- Needs to "build a presentation" from spoken content
- Wants to convert a talk track to video format
- Uses the `/slide-builder` command
- Asks for "presentation slides" from transformed essay content
Prerequisites
Input required: Output from the essay-to-speech skill containing:
- `### Original` sections (verbatim essay text)
- `### Talk Track` sections with semantic tags
- `### Images` sections with ratings (USE/ADAPT/RECREATE/SKIP)
- `### Slide Ideas` suggestions
Core Process
1. Parse Essay-to-Speech Output
Extract structured data from each section:
Section → {
  title: string,
  original: string,
  talkTrack: TaggedContent[],
  images: ImageAssessment[],
  slideIdeas: string[]
}
Semantic tags to identify:
- `[HOOK]` - Opening attention-grabber → Title/hook slide
- `[KEY_POINT]` - Core argument → Statement slide
- `[EVIDENCE]` - Data/proof → Data visualization slide
- `[STORY]` - Narrative → Story/quote slide
- `[TRANSITION]` - Bridge → Section divider or no slide
- `[CALLBACK]` - Reference → Recap element
- `[LANDING]` - Conclusion → Summary slide
- `[CTA]` - Call to action → Action slide
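As a rough illustration of the parsing step, the sketch below splits a talk track into tag/text segments. The `TaggedContent` tuple and `split_tagged` function are hypothetical names chosen for this example, and the regex assumes tags appear exactly as written in the list above; the skill's real parser may work differently.

```python
from __future__ import annotations

import re
from typing import NamedTuple

# Tags taken from the list above; the regex itself is an illustrative assumption.
KNOWN_TAGS = ("HOOK", "KEY_POINT", "EVIDENCE", "STORY",
              "TRANSITION", "CALLBACK", "LANDING", "CTA")
TAG_PATTERN = re.compile(r"\[(" + "|".join(KNOWN_TAGS) + r")\]")


class TaggedContent(NamedTuple):
    """One semantic tag and the spoken text that follows it (hypothetical shape)."""
    tag: str
    text: str


def split_tagged(talk_track: str) -> list[TaggedContent]:
    """Split a talk track into (tag, text) segments, one per semantic tag."""
    # With one capture group, re.split yields [preamble, tag1, text1, tag2, text2, ...].
    parts = TAG_PATTERN.split(talk_track)
    segments = []
    for tag, text in zip(parts[1::2], parts[2::2]):
        # Collapse internal whitespace so line breaks do not leak into the text.
        segments.append(TaggedContent(tag, " ".join(text.split())))
    return segments
```

Text that appears before the first tag is dropped in this sketch; whether untagged lead-in text should be kept is a design decision the source does not specify.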
2. Plan Slide Deck
Map semantic tags to slides:
| Tag | Slide Type | Typical Visual |
|---|---|---|
| `[HOOK]` | Title/Opening | Bold statement, striking image |
| `[KEY_POINT]` | Statement | Single phrase, minimal graphic |
| `[EVIDENCE]` | Data | Chart, statistic callout, comparison |
| `[STORY]` | Story | Photo, quote attribution, timeline |
| `[TRANSITION]` | Divider (optional) | Section title, progress indicator |
| `[CALLBACK]` | Recap | Reference to earlier slide |
| `[LANDING]` | Summary | Key takeaways, visual recap |
| `[CTA]` | Action | Contact info, next steps, QR code |
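The tag-to-slide mapping can be expressed as a simple lookup. The sketch below is one possible encoding, not the skill's implementation; `SLIDE_TYPES` and `plan_slides` are hypothetical names, and the `include_dividers` switch is an assumed way to model the optional divider.

```python
# Slide types copied from the mapping table; the dict layout is an assumption.
SLIDE_TYPES = {
    "HOOK": "Title/Opening",
    "KEY_POINT": "Statement",
    "EVIDENCE": "Data",
    "STORY": "Story",
    "TRANSITION": "Divider",
    "CALLBACK": "Recap",
    "LANDING": "Summary",
    "CTA": "Action",
}


def plan_slides(tags, include_dividers=True):
    """Map an ordered sequence of semantic tags to slide types.

    TRANSITION tags are skipped when include_dividers is False,
    since dividers are optional. Unknown tags raise KeyError.
    """
    plan = []
    for tag in tags:
        if tag == "TRANSITION" and not include_dividers:
            continue
        plan.append(SLIDE_TYPES[tag])
    return plan
```

For example, `plan_slides(["HOOK", "TRANSITION", "KEY_POINT"], include_dividers=False)` yields a two-slide plan without the divider.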
Slide count heuristic:
- 1-2 slides per `[KEY_POINT]`
- 1 slide per `[EVIDENCE]` block
- Section dividers are optional (skip for tight decks)
- Target: 1 slide per 45-60 seconds of speaking
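To make the pacing target concrete, the following sketch turns a talk length into a slide-count range using the 45-60 seconds-per-slide guideline. The function name `target_slide_range` is hypothetical, and the rounding choices are assumptions.

```python
import math


def target_slide_range(target_minutes: float) -> tuple:
    """Return (fewest, most) slides for one slide per 45-60 seconds of speaking."""
    total_seconds = target_minutes * 60
    fewest = math.ceil(total_seconds / 60)   # slowest pace: one slide per 60 s
    most = math.floor(total_seconds / 45)    # fastest pace: one slide per 45 s
    # For very short talks the rounded bounds can cross; keep the range valid.
    most = max(most, fewest)
    return fewest, most
```

With `target_minutes: 15`, as in the frontmatter example in this document, the range works out to 15-20 slides.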
3. Handle Images
Process image assessments from essay-to-speech:
| Rating | Action |
|---|---|
| USE | Include directly in slide |
| ADAPT | Note modifications needed (enlarge labels, crop, simplify) |
| RECREATE | Generate Nano Banana prompt for new visual |
| SKIP | Do not include |
For RECREATE images: Generate a Nano Banana prompt following these guidelines:
- 16:9 aspect ratio for slides
- Clear, simple compositions
- Large readable text/labels
- Brand colors if specified
Example RECREATE prompt:
"Clean horizontal bar chart comparing 5 items, minimal style,
white background, teal (#557373) bars, large bold labels,
no gridlines, presentation-ready, 16:9 aspect ratio"
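The rating table and prompt guidelines could be combined roughly as follows. This is a sketch only: it builds a prompt string and does not call Nano Banana, and the names `ACTIONS`, `recreate_prompt`, and `handle_image` plus the action labels are all hypothetical.

```python
# Ratings come from the table above; the action labels are illustrative assumptions.
ACTIONS = {
    "USE": "include",
    "ADAPT": "note_modifications",
    "RECREATE": "generate_prompt",
    "SKIP": "omit",
}


def recreate_prompt(subject, brand_color=None):
    """Assemble a prompt string that follows the RECREATE guidelines."""
    parts = [subject, "clear, simple composition", "large readable text and labels"]
    if brand_color:
        parts.append(f"brand color {brand_color}")
    parts.append("16:9 aspect ratio")
    return ", ".join(parts)


def handle_image(rating, subject, brand_color=None):
    """Return the action for an image rating, with a prompt only for RECREATE."""
    action = ACTIONS[rating.upper()]
    prompt = None
    if action == "generate_prompt":
        prompt = recreate_prompt(subject, brand_color)
    return {"action": action, "prompt": prompt}
```

Keeping prompt assembly separate from the rating dispatch means the guideline wording can change without touching how ratings are handled.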
4. Generate Output
Output Format: Talk Track v5
The primary output format for presentations.
Structure
---
version: 5
title: "Presentation Title"
subtitle: "Optional Subtitle"
author: "Presenter Name"
date: "2025-01-15"
target_minutes: 15
audio_voice: "af_heart"
brand:
  primary: "#557373"
  background: "#F2EFEA"
  text: "#0D0D0D"
sections:
  - id: opening
    name: "Opening"
    color: "#557373"
  - id: problem
    name: "The Problem"
    color: "#6B8E6B"
  - id: solution
    name: "The Solution"
    color: "#C4785A"
  - id: closing
    name: "Closing"
    color: "#557373"
---
## Slides
| # | Slug | Title | Image | Section |
|---|------|-------|-------|---------|
| 1 | hook | The Question | hook.png | opening |
| 2 | problem-1 | What's Broken | problem-chart.png | problem |
| 3 | evidence | The Data | evidence.png | problem |
| 4 | solution | A New Approach | solution.png | solution |
| 5 | action | Your Next Step | cta.png | closing |
---
## [hook] The Question

<!-- AUDIO -->
[HOOK] Let me ask you something that might change how you think about this entire problem...
What if everything you believed was based on outdated assumptions?
<!-- /AUDIO -->
**Speaker Notes:**
- Pause after the question
- Make eye contact with audience
- Let the tension build
---
## [problem-1] What's Broken

<!-- AUDIO -->
[KEY_POINT] The current approach fails in three critical ways.
[EVIDENCE] First, efficiency drops by 40% when teams scale past 10 people. Second, communication overhead grows exponentially. Third, institutional knowledge gets siloed.
<!-- /AUDIO -->
**Speaker Notes:**
- Point to chart as you mention each stat
- Emphasize "exponentially"
---
Format Rules
- YAML Frontmatter - Metadata, timing, voice, sections
- Slide Index Table - Quick reference for all slides
- Individual Slides - Each with:
  - H2 header: `## [slug] Title`
  - Image reference (if applicable)
  - `<!-- AUDIO -->` block with talk track
  - `**Speaker Notes:**` for presenter context
Audio Block Format
Content between `<!-- AUDIO -->` and `<!-- /AUDIO -->`:
- Is read aloud by TTS engines
- Preserves semantic tags for timing hints
- Excludes speaker notes and visual descriptions
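One way to pull the spoken text out of a slide is a small regex over the audio markers. The sketch below assumes a single audio block per slide; `extract_audio` and its `keep_tags` option are hypothetical, and whether tags are stripped before synthesis depends on the TTS setup.

```python
import re

# Non-greedy match so only the text between one marker pair is captured.
AUDIO_BLOCK = re.compile(r"<!-- AUDIO -->(.*?)<!-- /AUDIO -->", re.DOTALL)
SEMANTIC_TAG = re.compile(r"\[[A-Z_]+\]\s*")


def extract_audio(slide_markdown: str, keep_tags: bool = True) -> str:
    """Return the text inside the first audio block, or "" if there is none."""
    match = AUDIO_BLOCK.search(slide_markdown)
    if match is None:
        return ""
    text = match.group(1).strip()
    if not keep_tags:
        # Optional: drop markers such as [HOOK] for engines that would read them aloud.
        text = SEMANTIC_TAG.sub("", text)
    return text
```

Speaker notes sit outside the markers, so they never reach the returned text.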
Timing Calculation
Estimate duration based on word count:
- Speaking rate: 130-150 words/minute
- Add 2-3 seconds per slide transition
- Add pause time for `[PAUSE]` markers
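The timing rules above can be sketched as a per-slide estimate. The defaults use the midpoints of the stated ranges (140 words/minute, 2.5 seconds per transition); the one-second pause length is an assumption, since the source gives no value for `[PAUSE]`, and `estimate_seconds` is a hypothetical name.

```python
import re


def estimate_seconds(audio_text, words_per_minute=140,
                     transition_seconds=2.5, pause_seconds=1.0):
    """Estimate one slide's duration from the word count of its audio block."""
    pauses = audio_text.count("[PAUSE]")
    # Remove bracketed markers so tags are not counted as spoken words.
    spoken = re.sub(r"\[[A-Z_]+\]", " ", audio_text)
    words = len(spoken.split())
    speaking = words / words_per_minute * 60
    return speaking + transition_seconds + pauses * pause_seconds
```

Summing this value over all slides gives a figure to compare against `target_minutes` in the frontmatter.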
Voice Options
Development: Kokoro TTS (via claude-speak)
- Local, free, fast iteration
- Invoke: `/claude-speak` skill with audio block text
- Voice: `af_heart` (default) or specify in frontmatter
- Output: Local audio file per slide or full presentation
Production: ElevenLabs v3
- Word-level timestamps for precise video sync
- Higher quality for final distribution
- Requires ElevenLabs API key
- Output: Audio + JSON timing data
See references/voice-options.md for full configuration.
Alternative Output Formats
HTML Slide Engine
Static HTML presentation with:
- Keyboard navigation (arrows, space)
- Speaker notes toggle (N key)
- Timer display
- Print to PDF support
See references/html-engine.md for template.
Remotion Video
Export to React-based video for:
- YouTube/social publishing
- Embedded animations
- Precise audio sync with timestamps
See references/remotion-video.md for project setup.
Workflow
Standard Flow
essay-to-speech output
↓
[slide-builder]
↓
Talk Track v5 (.md)
↓
┌─────┼─────┐
↓ ↓ ↓
HTML Video Audio
Quick Start
- Input: Provide essay-to-speech output
- Review plan: Claude proposes slide structure
- Confirm or adjust: Modify slide count, sections, visuals
- Generate: Claude outputs Talk Track v5 markdown
- Images: Generate RECREATE images via Nano Banana
- Audio: Generate voice via claude-speak or ElevenLabs
- Render: Export to HTML, video, or both
Best Practices
Slide Design Principles
- One idea per slide - Split dense content
- 6 words or fewer on screen - The rest is spoken
- High contrast - Readable from back row
- Consistent visual language - Same fonts, colors, style
- Images > bullet points - Visual storytelling wins
Talk Track Integration
- Audio is king - Slides support speech; they do not replace it
- Match pacing - Visual changes align with spoken transitions
- Build reveals - Don't show everything at once
- Breathing room - Not every sentence needs a slide change
Image Guidance
For RECREATE images, always specify:
- Aspect ratio (16:9 for slides)
- Style (clean, minimal, professional)
- Key data to visualize
- What to AVOID (clutter, small text, decorative elements)
What This Skill Does NOT Do
- Edit or create original essay content (that's essay-to-speech)
- Design custom graphics (use Nano Banana for that)
- Record actual audio (use claude-speak or ElevenLabs)
- Render final video (use Remotion or video editor)
- Create PowerPoint/Keynote files directly (exports markdown)
Integration
Upstream:
- `essay-to-speech` - Provides structured input
Downstream:
- `nano-banana` - Generates RECREATE images
- `claude-speak` - Generates audio narration
- `veo3-prompter` - Creates video segments (if needed)
References
- `references/talk-track-v5.md` - Complete format specification
- `references/html-engine.md` - Static HTML slide player
- `references/remotion-video.md` - React video export setup
- `references/voice-options.md` - TTS configuration and comparison
- `references/image-handling.md` - Full image processing workflow
- `references/examples.md` - Complete input→output examples