Paper Slide Deck Generator

Transform academic papers and content into professional slide deck images with automatic figure extraction.

Usage

/paper-slide-deck path/to/paper.pdf
/paper-slide-deck path/to/paper.pdf --style academic-paper
/paper-slide-deck path/to/content.md --style sketch-notes
/paper-slide-deck path/to/content.md --audience executives
/paper-slide-deck path/to/content.md --lang zh
/paper-slide-deck path/to/content.md --slides 10
/paper-slide-deck path/to/content.md --outline-only
/paper-slide-deck  # Then paste content

Script Directory

Important: All scripts are located in the scripts/ subdirectory of this skill.

Agent Execution Instructions:

  1. Determine this SKILL.md file's directory path as SKILL_DIR
  2. Script path = ${SKILL_DIR}/scripts/<script-name> (most scripts are .ts; generate-slides.py is Python)
  3. Replace all ${SKILL_DIR} in this document with the actual path
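The path resolution described above can be sketched as follows. This is a minimal illustration, not part of the skill: the helper names `resolve_script` and `expand_skill_dir` are hypothetical, and it assumes SKILL_DIR is simply the directory that contains SKILL.md.

```python
from pathlib import Path


def resolve_script(skill_md: str, script_name: str) -> Path:
    """Return the path of a script in the scripts/ directory next to SKILL.md."""
    skill_dir = Path(skill_md).resolve().parent  # SKILL_DIR
    return skill_dir / "scripts" / script_name


def expand_skill_dir(text: str, skill_md: str) -> str:
    """Replace every ${SKILL_DIR} placeholder in text with the actual directory."""
    skill_dir = Path(skill_md).resolve().parent
    return text.replace("${SKILL_DIR}", str(skill_dir))
```

For example, `expand_skill_dir("npx -y bun ${SKILL_DIR}/scripts/detect-figures.ts", "/path/to/SKILL.md")` yields a command line with the placeholder replaced by the skill's directory.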

Script Reference:

| Script | Purpose |
|--------|---------|
| scripts/generate-slides.py | Generate AI slides via Gemini API (Python) |
| scripts/merge-to-pptx.ts | Merge slides into PowerPoint |
| scripts/merge-to-pdf.ts | Merge slides into PDF |
| scripts/detect-figures.ts | Auto-detect figures/tables in PDF |
| scripts/extract-figure.ts | Extract figure from PDF page (uses PyMuPDF fallback) |
| scripts/apply-template.ts | Apply figure container template |

Options

| Option | Description |
|--------|-------------|
| --style <name> | Visual style (see Style Gallery) |
| --audience <type> | Target audience: beginners, intermediate, experts, executives, general |
| --lang <code> | Output language (en, zh, ja, etc.) |
| --slides <number> | Target slide count |
| --outline-only | Generate outline only, skip image generation |
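One way to model these options in code is an argparse parser. This is only a sketch: the skill is invoked as a slash command by an agent, so the parser, the `build_parser` name, and the positional `source` argument are assumptions made for illustration.

```python
import argparse

# Audience values taken from the Options table.
AUDIENCES = ["beginners", "intermediate", "experts", "executives", "general"]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="paper-slide-deck")
    # Optional: when omitted, content is pasted afterwards.
    parser.add_argument("source", nargs="?", help="path to a PDF or Markdown file")
    parser.add_argument("--style", help="visual style name")
    parser.add_argument("--audience", choices=AUDIENCES)
    parser.add_argument("--lang", help="output language code, e.g. en, zh, ja")
    parser.add_argument("--slides", type=int, help="target slide count")
    parser.add_argument("--outline-only", action="store_true")
    return parser
```

Parsing `["paper.pdf", "--style", "academic-paper", "--slides", "10"]` gives `style == "academic-paper"` and `slides == 10`, with `outline_only` left at `False`.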

Style Gallery

| Style | Description | Best For |
|-------|-------------|----------|
| academic-paper | Clean professional, precise charts | Conference talks, thesis defense |
| blueprint (Default) | Technical schematics, grid texture | Architecture, system design |
| chalkboard | Black chalkboard, colorful chalk | Education, tutorials, classroom |
| notion | SaaS dashboard, card-based layouts | Product demos, SaaS, B2B |
| bold-editorial | Magazine cover, bold typography, dark | Product launches, keynotes |
| corporate | Navy/gold, structured layouts | Investor decks, proposals |
| dark-atmospheric | Cinematic dark mode, glowing accents | Entertainment, gaming |
| editorial-infographic | Magazine explainers, flat illustrations | Tech explainers, research |
| fantasy-animation | Ghibli/Disney style, hand-drawn | Educational, storytelling |
| intuition-machine | Technical briefing, bilingual labels | Technical docs, academic |
| minimal | Ultra-clean, maximum whitespace | Executive briefings, premium |
| pixel-art | Retro 8-bit, chunky pixels | Gaming, developer talks |
| scientific | Academic diagrams, precise labeling | Biology, chemistry, medical |
| sketch-notes | Hand-drawn, warm & friendly | Educational, tutorials |
| vector-illustration | Flat vector, retro & cute | Creative, children's content |
| vintage | Aged-paper, historical styling | Historical, heritage, biography |
| watercolor | Hand-painted textures, natural warmth | Lifestyle, wellness, travel |

Auto Style Selection

| Content Signals | Selected Style |
|-----------------|----------------|
| paper, thesis, defense, conference, ieee, acm, icml, neurips, cvpr, acl, aaai, iclr | academic-paper |
| tutorial, learn, education, guide, intro, beginner | sketch-notes |
| classroom, teaching, school, chalkboard, blackboard | chalkboard |
| architecture, system, data, analysis, technical | blueprint |
| creative, children, kids, cute, illustration | vector-illustration |
| briefing, bilingual, infographic, concept | intuition-machine |
| executive, minimal, clean, simple, elegant | minimal |
| saas, product, dashboard, metrics, productivity | notion |
| investor, quarterly, business, corporate, proposal | corporate |
| launch, marketing, keynote, bold, impact, magazine | bold-editorial |
| entertainment, music, gaming, creative, atmospheric | dark-atmospheric |
| explainer, journalism, science communication | editorial-infographic |
| story, fantasy, animation, magical, whimsical | fantasy-animation |
| gaming, retro, pixel, developer, nostalgia | pixel-art |
| biology, chemistry, medical, pathway, scientific | scientific |
| history, heritage, vintage, expedition, historical | vintage |
| lifestyle, wellness, travel, artistic, natural | watercolor |
| Default | blueprint |
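The signal table above could be applied roughly as follows. This is a sketch under stated assumptions: the source does not say how overlapping signals (for example "creative" or "gaming", which appear in two rows) are resolved, so this version counts whole-word hits per style and keeps the earlier row on ties. The names `STYLE_SIGNALS`, `count_hits`, and `select_style` are hypothetical.

```python
import re

# Rows copied from the Auto Style Selection table, in table order.
STYLE_SIGNALS = [
    ("academic-paper", "paper thesis defense conference ieee acm icml neurips cvpr acl aaai iclr".split()),
    ("sketch-notes", "tutorial learn education guide intro beginner".split()),
    ("chalkboard", "classroom teaching school chalkboard blackboard".split()),
    ("blueprint", "architecture system data analysis technical".split()),
    ("vector-illustration", "creative children kids cute illustration".split()),
    ("intuition-machine", "briefing bilingual infographic concept".split()),
    ("minimal", "executive minimal clean simple elegant".split()),
    ("notion", "saas product dashboard metrics productivity".split()),
    ("corporate", "investor quarterly business corporate proposal".split()),
    ("bold-editorial", "launch marketing keynote bold impact magazine".split()),
    ("dark-atmospheric", "entertainment music gaming creative atmospheric".split()),
    ("editorial-infographic", ["explainer", "journalism", "science communication"]),
    ("fantasy-animation", "story fantasy animation magical whimsical".split()),
    ("pixel-art", "gaming retro pixel developer nostalgia".split()),
    ("scientific", "biology chemistry medical pathway scientific".split()),
    ("vintage", "history heritage vintage expedition historical".split()),
    ("watercolor", "lifestyle wellness travel artistic natural".split()),
]
DEFAULT_STYLE = "blueprint"


def count_hits(text: str, signals: list) -> int:
    """Count how many signals occur in text as whole words or phrases."""
    return sum(1 for s in signals if re.search(rf"\b{re.escape(s)}\b", text))


def select_style(content: str) -> str:
    """Pick the style with the most signal hits; fall back to the default."""
    text = content.lower()
    best_style, best_hits = DEFAULT_STYLE, 0
    for style, signals in STYLE_SIGNALS:
        hits = count_hits(text, signals)
        if hits > best_hits:  # strict ">" keeps the earlier row on ties
            best_style, best_hits = style, hits
    return best_style
```

An explicit `--style` value would bypass this selection entirely.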

Layout Gallery

Optional layout hints for individual slides. Specify them in the outline's // LAYOUT section.

Slide-Specific Layouts

| Layout | Description | Best For |
|--------|-------------|----------|
| title-hero | Large centered title + subtitle | Cover slides, section breaks |
| quote-callout | Featured quote with attribution | Testimonials, key insights |
| key-stat | Single large number as focal point | Impact statistics, metrics |
| split-screen | Half image, half text | Feature highlights, comparisons |
| icon-grid | Grid of icons with labels | Features, capabilities, benefits |
| two-columns | Content in balanced columns | Paired information, dual points |
| three-columns | Content in three columns | Triple comparisons, categories |
| image-caption | Full-bleed image + text overlay | Visual storytelling, emotional |
| agenda | Numbered list with highlights | Session overview, roadmap |
| bullet-list | Structured bullet points | Simple content, lists |

Infographic-Derived Layouts

| Layout | Description | Best For |
|--------|-------------|----------|
| linear-progression | Sequential flow left-to-right | Timelines, step-by-step |
| binary-comparison | Side-by-side A vs B | Before/after, pros-cons |
| comparison-matrix | Multi-factor grid | Feature comparisons |
| hierarchical-layers | Pyramid or stacked levels | Priority, importance |
| hub-spoke | Central node with radiating items | Concept maps, ecosystems |
| bento-grid | Varied-size tiles | Overview, summary |
| funnel | Narrowing stages | Conversion, filtering |
| dashboard | Metrics with charts/numbers | KPIs, data display |
| venn-diagram | Overlapping circles | Relationships, intersections |
| circular-flow | Continuous cycle | Recurring processes |
| winding-roadmap | Curved path with milestones | Journey, timeline |
| tree-branching | Parent-child hierarchy | Org charts, taxonomies |
| iceberg | Visible vs hidden layers | Surface vs depth |
| bridge | Gap with connection | Problem-solution |

Academic-Specific Layouts

| Layout | Description | Best For |
|--------|-------------|----------|
| paper-title | Title, authors, affiliations, venue | Conference paper cover |
| outline-agenda | Numbered section list with highlights | Talk structure overview |
| methods-diagram | Central architecture/pipeline diagram | Methods, system design |
| results-chart | Chart area + data annotations | Quantitative results |
| equation-focus | Centered equation + variable definitions | Mathematical derivations |
| qualitative-grid | 2x2 or 3x2 image comparison grid | Visual results, ablations |
| references-list | Numbered citation list | Key references slide |
| contributions | Numbered contribution points | Contributions summary |

Usage: Add Layout: <name> to the slide's // LAYOUT section to guide visual composition.

Design Philosophy

This deck is designed for reading and sharing, not live presentation:

  • Each slide must be self-explanatory without verbal commentary
  • Structure content for logical flow when scrolling
  • Include all necessary context within each slide
  • Optimize for social media sharing and offline reading

File Management

Output Directory

Each session creates an independent directory named after the content slug:

slide-deck/{topic-slug}/
├── source-{slug}.{ext}    # Source files (text, images, etc.)
├── outline.md
├── outline-{style}.md     # Style variant outlines
├── prompts/
│   └── 01-slide-cover.md, 02-slide-{slug}.md, ...
├── 01-slide-cover.png, 02-slide-{slug}.png, ...
├── {topic-slug}.pptx
└── {topic-slug}.pdf

Slug Generation:

  1. Extract main topic from content (2-4 words, kebab-case)
  2. Example: "Introduction to Machine Learning" → intro-machine-learning
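The kebab-case step can be sketched as below. Extracting the "main topic" is a judgment call left to the agent; this helper only normalizes a topic string that has already been chosen. The stopword list and the abbreviation map (which is what turns "Introduction" into "intro") are assumptions made so the documented example comes out as shown, and `make_slug` is a hypothetical name.

```python
import re

# Assumption: small filler words are dropped from the slug.
STOPWORDS = {"a", "an", "the", "to", "of", "for", "and", "in", "on"}
# Assumption: common long words are shortened; only this one is implied by the example.
ABBREVIATIONS = {"introduction": "intro"}


def make_slug(topic: str, max_words: int = 4) -> str:
    """Turn a topic phrase into a kebab-case slug of at most max_words words."""
    words = re.findall(r"[a-z0-9]+", topic.lower())
    kept = [ABBREVIATIONS.get(w, w) for w in words if w not in STOPWORDS]
    return "-".join(kept[:max_words])


print(make_slug("Introduction to Machine Learning"))  # intro-machine-learning
```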

Conflict Resolution

If slide-deck/{topic-slug}/ already exists:

  • Append timestamp: {topic-slug}-YYYYMMDD-HHMMSS
  • Example: intro-ml exists → intro-ml-20260118-143052
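A minimal sketch of this conflict rule, assuming the output root is passed in as a path. The function name `resolve_output_dir` is hypothetical; the timestamp format follows the YYYYMMDD-HHMMSS pattern given above.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def resolve_output_dir(root: Path, slug: str, now: Optional[datetime] = None) -> Path:
    """Return root/slug, or root/slug-YYYYMMDD-HHMMSS if root/slug already exists."""
    target = root / slug
    if not target.exists():
        return target
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    return root / f"{slug}-{stamp}"
```

The `now` parameter exists only to make the behavior reproducible in tests; callers would normally omit it.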

Source Files

Copy all sources using the naming pattern source-{slug}.{ext}:

  • source-article.md (main text content)
  • source-diagram.png (image from conversation)
  • source-data.xlsx (additional file)

Multiple sources supported: text, images, files from conversation.

Workflow

Step 1: Analyze Content

  1. Save source content (if pasted, save as source.md)
  2. Follow references/analysis-framework.md for deep content analysis
  3. Determine style (use --style or auto-select from signals)
  4. Detect languages (source vs. user preference)
  5. Plan slide count (--slides or dynamic)
  6. For academic papers (PDF with figures): Run automatic figure detection:
    npx -y bun ${SKILL_DIR}/scripts/detect-figures.ts --pdf source-paper.pdf --output figures.json
    
    This outputs a JSON file with all detected figures/tables, their page numbers, and captions.

Step 2: Generate Outline Variants

  1. Generate 3 style variant outlines based on content analysis
  2. Follow references/outline-template.md for structure
  3. Auto-populate IMAGE_SOURCE for academic papers:
    • Read figures.json from Step 1
    • Map figures to slides using rules in references/analysis-framework.md Section 8
    • Automatically add // IMAGE_SOURCE blocks to appropriate slides:
      • Architecture/pipeline figures → Methods slides (Source: extract)
      • Results tables → Quantitative results slides (Source: extract)
      • Comparison images → Qualitative results slides (Source: extract)
      • Conceptual/simple diagrams → Leave for AI generation (Source: generate or omit)
  4. Save as outline-{style}.md for each variant
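The mapping rules in item 3 can be sketched as follows. Everything about the data shape here is an assumption: the real schema of figures.json and the exact layout of an // IMAGE_SOURCE block are defined by detect-figures.ts and references/outline-template.md, which are not reproduced in this document. The `kind`, `number`, `page`, and `caption` keys, the slide labels, and both function names are hypothetical.

```python
from typing import Optional

# Assumed figure kinds mapped to the slide types named in the rules above.
SLIDE_FOR_KIND = {
    "architecture": "methods",
    "pipeline": "methods",
    "results-table": "quantitative-results",
    "comparison": "qualitative-results",
}


def target_slide(figure: dict) -> Optional[str]:
    """Return the slide type a figure should be attached to, or None."""
    return SLIDE_FOR_KIND.get(figure.get("kind", ""))


def image_source_block(figure: dict) -> Optional[str]:
    """Render an IMAGE_SOURCE block, or None to leave the slide for AI generation."""
    if target_slide(figure) is None:
        return None  # conceptual/simple diagrams: Source: generate or omit
    return "\n".join([
        "// IMAGE_SOURCE",
        "Source: extract",
        f"Figure: {figure['number']}",
        f"Page: {figure['page']}",
        f"Caption: {figure['caption']}",
    ])
```

A figure with any unrecognized kind falls through to AI generation, which matches the last rule in the list.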

Step 3: User Confirmation

Single AskUserQuestion with all applicable options:

| Question | When to Ask |
|----------|-------------|
| Style variant | Always (3 options + custom) |
| Language | Only if source ≠ user language |

After selection:

  • Copy selected outline-{style}.md to outline.md
  • Regenerate in different language if requested
  • User may edit outline.md for fine-tuning

If --outline-only, stop here.

Step 4: Generate Prompts

  1. Read references/base-prompt.md
  2. Combine with style instructions from outline
  3. Add slide-specific content
  4. If Layout: specified in outline, include layout guidance in prompt:
    • Reference layout characteristics for image composition
    • Example: Layout: hub-spoke → "Central concept in middle with related items radiating outward"
  5. Save to prompts/ directory
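The assembly order described in these steps might look like this. It is a sketch only: the actual prompt format comes from references/base-prompt.md, and the `build_prompt` function, its parameters, and the way sections are joined are assumptions. The single layout hint is the hub-spoke example quoted above; hints for other layouts would need to be written separately.

```python
from typing import Optional

# Only the hub-spoke hint comes from this document; other layouts have no hint here.
LAYOUT_HINTS = {
    "hub-spoke": "Central concept in middle with related items radiating outward",
}


def build_prompt(base_prompt: str, style_instructions: str,
                 slide_content: str, layout: Optional[str] = None) -> str:
    """Join base prompt, style instructions, slide content, and layout guidance."""
    sections = [base_prompt.strip(), style_instructions.strip(), slide_content.strip()]
    if layout:
        hint = LAYOUT_HINTS.get(layout)
        guidance = f"Layout: {layout}" + (f" ({hint})" if hint else "")
        sections.append(guidance)
    # Blank lines between sections; empty sections are skipped.
    return "\n\n".join(s for s in sections if s)
```

The returned string is what would be saved as one file under prompts/.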

Step 5: Image Generation Method Selection

Before generating images, ask user to choose generation method:

Use AskUserQuestion with options:

| Option | Label | Description |
|--------|-------|-------------|
| 1 | Gemini API (Recommended) | Official Google API via Python. Requires the GOOGLE_API_KEY or GEMINI_API_KEY env var. |
| 2 | Gemini Web (Browser-based) | ⚠️ Uses reverse-engineered web API. No API key needed but may break. |

Based on selection:

Option 1: Gemini API (Python)

  1. Verify API key: Check GOOGLE_API_KEY or GEMINI_API_KEY environment variable
  2. Run generation script:
    python ${SKILL_DIR}/scripts/generate-slides.py <slide-deck-dir> --model gemini-3-pro-image-preview
    

Script Features:

  • Auto-installs google-genai package if missing
  • Retry logic with exponential backoff (3 retries)
  • Skips slides whose output image already exists and is larger than 10 KB
  • Supports custom model via --model flag
  • Outputs to slides/ subdirectory

Troubleshooting:

  • If server disconnection errors occur, script auto-retries
  • For persistent failures, re-run the script (it skips completed slides)
  • Check API quota if many failures occur
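The skip and retry behavior listed under Script Features can be illustrated like this. It is not the code of generate-slides.py: the helper names, the 1-second base delay, and the reading of "3 retries" as three additional attempts after the first are all assumptions.

```python
import time
from pathlib import Path
from typing import Callable

MIN_BYTES = 10 * 1024  # slides larger than 10 KB count as already generated


def already_generated(path: Path) -> bool:
    """True if the slide image exists and is larger than the size threshold."""
    return path.exists() and path.stat().st_size > MIN_BYTES


def with_retries(task: Callable[[], bytes], retries: int = 3,
                 base_delay: float = 1.0,
                 sleep: Callable[[float], None] = time.sleep) -> bytes:
    """Run task; on failure wait base_delay * 2**attempt and retry, up to retries times."""
    attempt = 0
    while True:
        try:
            return task()
        except Exception:
            if attempt >= retries:
                raise  # retries exhausted: surface the last error
            sleep(base_delay * 2 ** attempt)
            attempt += 1
```

Because completed slides are skipped by the size check, re-running after a persistent failure only repeats the slides that are still missing.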

Option 2: Gemini Web Skill

  1. Consent Check: Read consent file at:

    • Windows: $APPDATA/baoyu-skills/gemini-web/consent.json
    • macOS: ~/Library/Application Support/baoyu-skills/gemini-web/consent.json
    • Linux: ~/.local/share/baoyu-skills/gemini-web/consent.json
  2. If no consent or version mismatch, display disclaimer and ask:

    ⚠️ DISCLAIMER: This uses a reverse-engineered Gemini Web API (NOT official).
    Risks: May break anytime, no support, possible account risk.
    
  3. For each slide, run:

    npx -y bun ${GEMINI_WEB_SKILL_DIR}/scripts/main.ts \
      --promptfiles prompts/01-slide-cover.md \
      --image 01-slide-cover.png \
      --sessionId slides-{topic-slug}-{timestamp}
    

    Where GEMINI_WEB_SKILL_DIR = path to baoyu-danger-gemini-web skill directory.

  4. Proxy support: If the user is on a restricted network, prepend the proxy variables to the command:

    HTTP_PROXY=http://127.0.0.1:7890 HTTPS_PROXY=http://127.0.0.1:7890
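The per-platform consent file lookup from step 1 can be sketched as below. The three locations are the ones listed above; the function name `consent_path` and its parameters are hypothetical, and the contents of consent.json (including how a version mismatch is detected) are not covered because this document does not specify them.

```python
import os
import sys
from pathlib import Path
from typing import Mapping, Optional


def consent_path(platform: str = sys.platform,
                 env: Mapping[str, str] = os.environ,
                 home: Optional[Path] = None) -> Path:
    """Return the consent.json location for the given platform."""
    home = home or Path.home()
    rel = Path("baoyu-skills") / "gemini-web" / "consent.json"
    if platform.startswith("win"):
        return Path(env["APPDATA"]) / rel  # Windows: $APPDATA/...
    if platform == "darwin":
        return home / "Library" / "Application Support" / rel  # macOS
    return home / ".local" / "share" / rel  # Linux and other platforms
```

A caller would then check `consent_path().exists()` before deciding whether to show the disclaimer.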
    

Step 5.5: Process IMAGE_SOURCE (Automatic Figure Extraction)

For academic presentations, IMAGE_SOURCE metadata was auto-populated in Step 2 based on figure detection from Step 1.

Automatic Execution:

  1. Parse outline to identify slides with Source: extract

  2. Create figures directory: mkdir -p figures

  3. For each extract slide, automatically:

    • Read the Figure number, Page, and Caption from metadata
    • Run figure extraction script:
      npx -y bun ${SKILL_DIR}/scripts/extract-figure.ts \
        --pdf source-paper.pdf \
        --page <page-number> \
        --output figures/figure-<N>.png
      
    • Run template application script:
      npx -y bun ${SKILL_DIR}/scripts/apply-template.ts \
        --figure figures/figure-<N>.png \
        --title "<slide-headline>" \
        --caption "Figure <N>: <caption-text>" \
        --output <NN>-slide-<slug>.png
      
    • Report: "Extracted: Figure N → slide NN"
  4. For slides with Source: generate (or no IMAGE_SOURCE):

    • Proceed to Step 6 for AI generation

Note: Source PDF must be saved as source-paper.pdf in output directory.
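For one slide marked Source: extract, the two commands above can be assembled as argument lists. This is a sketch: the flags are exactly those shown in this step, while the function name `extraction_commands` and its parameters are hypothetical. Building lists rather than a shell string avoids quoting problems with titles and captions.

```python
from typing import List


def extraction_commands(skill_dir: str, figure_number: int, page: int,
                        headline: str, caption: str, slide_file: str,
                        pdf: str = "source-paper.pdf") -> List[List[str]]:
    """Return the extract-figure and apply-template commands for one slide."""
    figure_png = f"figures/figure-{figure_number}.png"
    extract = [
        "npx", "-y", "bun", f"{skill_dir}/scripts/extract-figure.ts",
        "--pdf", pdf,
        "--page", str(page),  # 1-indexed PDF page
        "--output", figure_png,
    ]
    template = [
        "npx", "-y", "bun", f"{skill_dir}/scripts/apply-template.ts",
        "--figure", figure_png,
        "--title", headline,
        "--caption", f"Figure {figure_number}: {caption}",
        "--output", slide_file,
    ]
    return [extract, template]
```

Each list can be passed to `subprocess.run(cmd, check=True)` from the output directory, in order, after the figures directory has been created.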

Troubleshooting:

  • If figure detection missed a figure: manually add // IMAGE_SOURCE block to outline
  • If wrong figure mapped: edit the Figure: and Page: values in outline
  • If extraction fails: check PDF page number (1-indexed)

PyMuPDF Fallback for Page Extraction: If extract-figure.ts fails with "Image or Canvas expected" error (common with complex PDFs), use PyMuPDF:

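import fitz  # PyMuPDF
from pathlib import Path

page_num = 3  # 1-indexed PDF page to extract (example value)
Path("extracted").mkdir(exist_ok=True)  # the output directory must exist before saving
doc = fitz.open("source-paper.pdf")
page = doc[page_num - 1]  # PyMuPDF pages are 0-indexed
mat = fitz.Matrix(3, 3)  # 3x scale for high resolution
pix = page.get_pixmap(matrix=mat)
pix.save(f"extracted/page-{page_num}.png")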

Then apply template using apply-template.ts.

Step 6: Generate Images

  1. Use selected method from Step 5
  2. Skip slides already processed in Step 5.5 (those with Source: extract)
  3. Generate session ID: slides-{topic-slug}-{timestamp}
  4. Generate each remaining slide with same session ID
  5. Report progress: "Generated X/N"
  6. Auto-retry once on generation failure
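The loop described in this step can be sketched as follows, with the actual image call injected as a function so either generation method fits. The slide dictionaries, their `name` and `source` keys, and the `generate_remaining` function are hypothetical; the skip rule, the shared session ID, the progress message, and the single auto-retry come from the steps above.

```python
from typing import Callable, Iterable, List


def generate_remaining(slides: Iterable[dict],
                       generate: Callable[[dict, str], None],
                       session_id: str,
                       report: Callable[[str], None] = print) -> List[str]:
    """Generate every slide not handled by extraction; return names that failed."""
    pending = [s for s in slides if s.get("source") != "extract"]
    failed: List[str] = []
    done = 0
    for slide in pending:
        for _attempt in range(2):  # first try plus one automatic retry
            try:
                generate(slide, session_id)  # same session ID for every slide
            except Exception:
                continue
            done += 1
            report(f"Generated {done}/{len(pending)}")
            break
        else:
            failed.append(slide["name"])  # both attempts failed
    return failed
```

Passing the same `session_id` to every call is what keeps the style consistent across slides.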

Step 7: Merge to PPTX and PDF

npx -y bun ${SKILL_DIR}/scripts/merge-to-pptx.ts <slide-deck-dir>
npx -y bun ${SKILL_DIR}/scripts/merge-to-pdf.ts <slide-deck-dir>

Step 8: Output Summary

Slide Deck Complete!

Topic: [topic]
Style: [style name]
Location: [directory path]
Slides: N total

- 01-slide-cover.png ✓ Cover
- 02-slide-intro.png ✓ Content
- ...
- {NN}-slide-back-cover.png ✓ Back Cover

Outline: outline.md
PPTX: {topic-slug}.pptx
PDF: {topic-slug}.pdf

Slide Modification

See references/modification-guide.md for:

  • Edit single slide workflow
  • Add new slide (with renumbering)
  • Delete slide (with renumbering)
  • File naming conventions

Image Generation Dependencies

Gemini API (Option 1 - Recommended)

Requires:

  • GOOGLE_API_KEY or GEMINI_API_KEY environment variable
  • Python 3.8+ with pip
  • google-genai package (auto-installed by script)

Model: gemini-3-pro-image-preview (default)

Gemini Web Skill (Option 2)

Requires:

  • baoyu-danger-gemini-web skill installed at .claude/skills/baoyu-danger-gemini-web
  • Google Chrome browser with logged-in Google account
  • User consent for reverse-engineered API disclaimer

PDF Figure Extraction

Requires:

  • Primary: pdfjs-dist npm package (use legacy build for Node.js)
  • Fallback: pymupdf Python package (more reliable for complex PDFs)
  • canvas npm package for apply-template.ts

References

| File | Content |
|------|---------|
| references/analysis-framework.md | Deep content analysis for presentations |
| references/outline-template.md | Outline structure and STYLE_INSTRUCTIONS format |
| references/modification-guide.md | Edit, add, delete slide workflows |
| references/content-rules.md | Content and style guidelines |
| references/base-prompt.md | Base prompt for image generation |
| references/figure-container-template.md | Visual specs for extracted figure containers |
| references/styles/<style>.md | Full style specifications |

Notes

Image Generation

  • Gemini API (Nano Banana Pro model, gemini-3-pro-image-preview): Recommended. Stable and reliable; requires an API key
  • Gemini Web: No API key needed, but uses reverse-engineered API with account risk
  • Generation time: 10-30 seconds per slide
  • Auto-retry once on generation failure
  • Maintain style consistency via session ID

Content Guidelines

  • Use stylized alternatives for sensitive public figures
  • Both methods use the same underlying Gemini model for image generation

Extension Support

Custom styles and configurations via EXTEND.md.

Check paths (priority order):

  1. .paper-skills/paper-slide-deck/EXTEND.md (project)
  2. ~/.paper-skills/paper-slide-deck/EXTEND.md (user)

If found, load before Step 1. Extension content overrides defaults.
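The priority lookup can be sketched as below. The two candidate paths are the ones listed above; the function name `find_extension` and passing the project root and home directory as parameters are assumptions made to keep the example testable.

```python
from pathlib import Path
from typing import Optional


def find_extension(project_root: Path, home: Path) -> Optional[Path]:
    """Return the first EXTEND.md found, checking project before user scope."""
    rel = Path(".paper-skills") / "paper-slide-deck" / "EXTEND.md"
    for base in (project_root, home):  # priority order: project, then user
        candidate = base / rel
        if candidate.is_file():
            return candidate
    return None
```

In normal use this would be called as `find_extension(Path.cwd(), Path.home())` before Step 1.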
