/academic-ppt — Academic Deck Engine

Generate defense-ready .pptx presentations from academic papers with publication-quality figures.

Preamble (run first)

# Check dependencies
_MISSING=""
python3 -c "import pptx" 2>/dev/null || _MISSING="$_MISSING python-pptx"
python3 -c "import matplotlib" 2>/dev/null || _MISSING="$_MISSING matplotlib"
python3 -c "import seaborn" 2>/dev/null || _MISSING="$_MISSING seaborn"
python3 -c "import fitz" 2>/dev/null || _MISSING="$_MISSING pymupdf"
python3 -c "from PIL import Image" 2>/dev/null || _MISSING="$_MISSING Pillow"
python3 -c "import pygments" 2>/dev/null || _MISSING="$_MISSING Pygments"
_GEMINI=""
python3 -c "import google.genai" 2>/dev/null && _GEMINI="available" || _GEMINI="unavailable"
_MERMAID=""
command -v mmdc >/dev/null 2>&1 && _MERMAID="available" || _MERMAID="unavailable"
echo "MISSING: ${_MISSING:-none}"
echo "GEMINI_SDK: $_GEMINI"
echo "MERMAID_CLI: $_MERMAID"

If MISSING is not "none": install missing packages:

pip install python-pptx matplotlib seaborn pymupdf Pillow Pygments

If GEMINI_SDK is "unavailable": concept diagrams fall back to Mermaid. If MERMAID_CLI is also "unavailable": concept diagrams are skipped entirely. In either case, inform the user.

Input Detection

Parse the user's command to determine the input source and options.

Supported inputs (in priority order):

LaTeX source (--latex path/to/main.tex or auto-detected .tex file) — PRIMARY input, highest quality
Paper PDF (path/to/paper.pdf or auto-detected) — Best-effort with warnings
Code repo (--repo . or --repo path/) — Supplementary data extraction
Text description (quoted string) — Direct content, no extraction needed

Options:

Flag	Default	Values
`--scene`	`defense`	defense, conference, seminar
`--time`	`15min` (defense), `12min` (conference), `30min` (seminar)	Talk length in minutes, written as `<N>min`
`--lang`	`en`	en, zh, zh-en (bilingual)
`--output`	`./output.pptx`	Output file path
`--no-gemini`	false	Force Matplotlib-only mode

If no flags are provided, use smart defaults: detect the input type from the first argument, set scene=defense, and auto-detect the language.
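The option table above could be parsed roughly as follows. This is a minimal sketch using `argparse`; the names `parse_minutes`, `build_parser`, `parse_options`, and `DEFAULT_MINUTES` are illustrative and not part of the tool, and input-type and language auto-detection are left out.

```python
import argparse

# Per-scene default talk lengths, copied from the options table.
DEFAULT_MINUTES = {"defense": 15, "conference": 12, "seminar": 30}


def parse_minutes(value: str) -> int:
    """Accept '15min' or '15' and return the number of minutes."""
    text = value.strip().lower()
    if text.endswith("min"):
        text = text[:-3]
    minutes = int(text)  # a ValueError here is reported by argparse
    if minutes <= 0:
        raise argparse.ArgumentTypeError("time must be positive")
    return minutes


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="academic-ppt")
    parser.add_argument("source", nargs="?", help="file path or quoted description")
    parser.add_argument("--latex")
    parser.add_argument("--repo")
    parser.add_argument("--scene", choices=list(DEFAULT_MINUTES), default="defense")
    parser.add_argument("--time", type=parse_minutes, default=None)
    parser.add_argument("--lang", choices=["en", "zh", "zh-en"], default="en")
    parser.add_argument("--output", default="./output.pptx")
    parser.add_argument("--no-gemini", action="store_true")
    return parser


def parse_options(argv: list) -> argparse.Namespace:
    args = build_parser().parse_args(argv)
    if args.time is None:
        # No explicit --time: fall back to the default for the chosen scene.
        args.time = DEFAULT_MINUTES[args.scene]
    return args
```

Resolving the `--time` default after parsing keeps the scene-dependent value in one place instead of hard-coding a single default in the parser.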

Phase 1: Content Extraction

LaTeX Source (Primary Path)

Resolve \input{} and \include{} directives — Read the main .tex file, find all \input{...} and \include{...} directives, read referenced files, and inline them. Handle nested includes up to 3 levels. Warn if a referenced file is missing.

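import os
import re

# Matches \input{file} and \include{file}; the group captures the file name.
INCLUDE_RE = re.compile(r'\\(?:input|include)\{([^}]+)\}')

def resolve_latex_includes(main_tex_path: str, depth: int = 0, base_dir: str = "") -> str:
    r"""Resolve \input{} and \include{} up to 3 levels deep."""
    if depth > 3:
        # Also stops circular includes; leave a marker instead of failing silently.
        print(f"WARNING: include depth exceeded at {main_tex_path}")
        return f"% WARNING: include depth exceeded at {main_tex_path}\n"
    # LaTeX resolves include paths relative to the root document,
    # so nested calls reuse the base_dir of the first call.
    base_dir = base_dir or os.path.dirname(os.path.abspath(main_tex_path))
    with open(main_tex_path, 'r', encoding='utf-8', errors='replace') as f:
        content = f.read()

    def replace_include(match):
        filename = match.group(1).strip()
        if not filename.endswith('.tex'):
            filename += '.tex'
        filepath = os.path.join(base_dir, filename)
        if os.path.exists(filepath):
            return resolve_latex_includes(filepath, depth + 1, base_dir)
        print(f"WARNING: could not find {filename}")
        return f"% WARNING: Could not find {filename}\n"

    # One pass over both directives, so already-inlined text is not rescanned.
    return INCLUDE_RE.sub(replace_include, content)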

Extract sections — Parse the resolved LaTeX for \section{}, \subsection{}, \chapter{}, \begin{abstract}, \title{}, \author{}. Extract text content between section markers.
LLM Extraction Validation — After extraction, verify the content is coherent. Check:
- Total word count (expect >500 for a paper, >2000 for a thesis)
- Number of sections found (expect >3)
- If validation fails: warn user and ask if they want to proceed or provide manual input
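The section split and the two validation checks above can be sketched as below. This is a rough illustration, not the tool's actual parser: `split_sections` and `validate_extraction` are hypothetical names, the regex does not handle titles with nested braces such as `\section{A \textbf{b}}`, and LaTeX markup is counted as words.

```python
import re

# \chapter{...}, \section{...}, \subsection{...}, with optional star.
SECTION_RE = re.compile(r"\\(chapter|section|subsection)\*?\{([^}]*)\}")


def split_sections(tex: str) -> list:
    """Split resolved LaTeX into {level, title, body} records."""
    matches = list(SECTION_RE.finditer(tex))
    sections = []
    for i, m in enumerate(matches):
        # The body runs until the next heading, or to the end of the file.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(tex)
        sections.append({
            "level": m.group(1),
            "title": m.group(2).strip(),
            "body": tex[m.end():end].strip(),
        })
    return sections


def validate_extraction(sections: list, is_thesis: bool = False) -> list:
    """Return human-readable problems; an empty list means the checks passed."""
    problems = []
    words = sum(len(s["body"].split()) for s in sections)
    min_words = 2000 if is_thesis else 500
    if words <= min_words:
        problems.append(f"Only {words} words extracted (expected >{min_words})")
    if len(sections) <= 3:
        problems.append(f"Only {len(sections)} sections found (expected >3)")
    return problems
```

If `validate_extraction` returns any problems, show them to the user and ask whether to proceed or supply manual input.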

PDF Source (Best-Effort Path)

import fitz  # PyMuPDF

def extract_pdf_sections(pdf_path: str) -> str:
    """Return the full text of the PDF (not yet split into sections)."""
    with fitz.open(pdf_path) as doc:
        if doc.is_encrypted:
            raise ValueError("PDF is encrypted. Please provide the password or use LaTeX source instead.")
        # Concatenate the text layer of every page.
        full_text = "".join(page.get_text() for page in doc)

    if len(full_text.strip()) < 100:
        raise ValueError("PDF appears to be scanned (no text layer). Please use LaTeX source or OCR the PDF first.")

    return full_text

IMPORTANT: When using PDF input, always display this warning:

"PDF extraction is best-effort. Two-column layouts, math notation, and tables may not extract correctly. For highest quality, use LaTeX source (--latex main.tex)."

Code Repo (Supplementary)

If --repo is provided, scan for:

results/ or output/ directories → JSON/CSV files with experiment data
src/ entry point (main.py, app.py) → extract docstrings for architecture summary
figures/ or plots/ → existing PNG/PDF figures to embed directly

If no recognizable structure found: skip repo integration and inform user.
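One way to implement the scan described above is sketched here with `pathlib`. The function name `scan_repo` and the shape of the returned dictionary are assumptions made for illustration; docstring extraction from the entry point is not shown.

```python
from pathlib import Path

DATA_SUFFIXES = {".json", ".csv"}
FIGURE_SUFFIXES = {".png", ".pdf"}


def _files_with_suffix(directory: Path, suffixes: set) -> list:
    """Recursively list files under `directory` whose suffix is in `suffixes`."""
    return sorted(
        p for p in directory.rglob("*")
        if p.is_file() and p.suffix.lower() in suffixes
    )


def scan_repo(repo_root: str) -> dict:
    """Collect experiment data, entry points, and existing figures."""
    root = Path(repo_root)
    found = {"data": [], "entry_points": [], "figures": []}
    for name in ("results", "output"):
        if (root / name).is_dir():
            found["data"] += _files_with_suffix(root / name, DATA_SUFFIXES)
    for name in ("main.py", "app.py"):
        if (root / "src" / name).is_file():
            found["entry_points"].append(root / "src" / name)
    for name in ("figures", "plots"):
        if (root / name).is_dir():
            found["figures"] += _files_with_suffix(root / name, FIGURE_SUFFIXES)
    return found
```

When `any(found.values())` is false, nothing recognizable was found and repo integration should be skipped.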

Text Description

If the input is a quoted string with no file path: use it directly as the content source. Claude generates slide content from the description.

Phase 2: Content Structuring

Read content_guidelines.md for the full Pyramid Principle framework and slide design rules.

Scene Template Selection

# Import the appropriate template
from templates.defense import DefenseTemplate
from templates.conference import ConferenceTemplate
from templates.seminar import SeminarTemplate

TEMPLATES = {
    "defense": DefenseTemplate,
    "conference": ConferenceTemplate,
    "seminar": SeminarTemplate,
}

template = TEMPLATES[scene]()
slide_budget = template.get_slide_budget(time_minutes)
structure = template.get_structure()
style = template.get_style()

Slide Content Generation

For each slide in the structure:

Map extracted sections to slide slots (e.g., abstract → Motivation slide, results → Results slides)
Generate action titles — complete sentences stating the takeaway, NOT topic labels
- GOOD: "UTG-augmented agents achieve 50% task completion vs 25.6% baseline"
- BAD: "Experimental Results"
Generate speaker notes for each slide
Identify which slides need figures (mark with figure type: chart, diagram, code, table)
Generate predicted Q&A questions for the Q&A preparation slide

Ghost Deck Test

After generating all titles, read them in sequence. They should tell the complete narrative of the paper. If the sequence doesn't flow, revise titles.

Phase 3: Figure Generation

Read figure_patterns.md for code patterns and engine selection logic.

Figure Decision Matrix

Data Type	Engine	Output
Bar/line/scatter/heatmap	Matplotlib/Seaborn	PNG (300 DPI)
Confusion matrices	Matplotlib/Seaborn	PNG (300 DPI)
Architecture diagrams	Gemini image gen (or Mermaid fallback)	PNG
Method flowcharts	Gemini image gen (or Mermaid fallback)	PNG
Code snippets	Pygments → PIL → PNG	PNG
Tables	python-pptx native table	In-slide

Engine 1: Matplotlib (Data Charts)

For each chart-type figure:

Generate Python code that creates the chart using matplotlib/seaborn

Import whitelist check — before execution, verify only allowed imports:

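import ast

ALLOWED_IMPORTS = {'matplotlib', 'seaborn', 'numpy', 'pandas', 'math'}

def validate_imports(code: str) -> bool:
    """Return True only if every import statement in `code` is whitelisted.

    Only import statements are checked; calls such as __import__() are not inspected.
    """
    try:
        tree = ast.parse(code)
    except SyntaxError:
        return False  # unparseable code cannot be executed anyway
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split('.')[0] not in ALLOWED_IMPORTS:
                    return False
        elif isinstance(node, ast.ImportFrom):
            # Relative imports (level > 0) never name a whitelisted package.
            if node.level > 0 or not node.module:
                return False
            if node.module.split('.')[0] not in ALLOWED_IMPORTS:
                return False
    return True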

If validation fails: regenerate with explicit instruction to use only allowed imports
Execute the code in a subprocess with 30s timeout
If execution fails: retry once with simplified chart, skip figure if still fails
Output: PNG at 300 DPI, colorblind-friendly palette (use seaborn.color_palette("colorblind"))

Parallel execution: Generate all Matplotlib code first, then execute all scripts in parallel using subprocess (or concurrent.futures.ProcessPoolExecutor). If one fails, others still succeed.
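The timeout and parallel-execution rules above could look like the following sketch. It uses `ThreadPoolExecutor` rather than `ProcessPoolExecutor`, since each script already runs in its own subprocess and the threads only wait on it; `run_script` and `run_all` are illustrative names, and the retry-with-simplified-chart step is left to the caller.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_script(script_path: str, timeout_s: int = 30) -> tuple:
    """Run one generated script; return (path, succeeded, error_text)."""
    try:
        proc = subprocess.run(
            [sys.executable, script_path],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return script_path, False, f"timed out after {timeout_s}s"
    return script_path, proc.returncode == 0, proc.stderr.strip()


def run_all(script_paths: list, max_workers: int = 4) -> list:
    """Run all scripts concurrently; one failure does not affect the others."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with script_paths.
        return list(pool.map(run_script, script_paths))
```

Because `run_script` reports failures as return values instead of raising, a crashed or timed-out chart never prevents the remaining results from being collected.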

Engine 2: Gemini Image Generation (Concept Diagrams)

Only available if google-genai SDK is installed AND GOOGLE_GENAI_API_KEY env var is set.

import os

from google import genai

client = genai.Client(api_key=os.environ.get("GOOGLE_GENAI_API_KEY"))

def generate_concept_diagram(prompt: str) -> bytes | None:
    """Generate a concept diagram using Nano Banana 2 (Gemini 3.1 Flash Image)."""
    try:
        response = client.models.generate_content(
            model="gemini-3.1-flash-image-preview",
            contents=prompt,
            config=genai.types.GenerateContentConfig(
                response_modalities=["Image", "Text"],
                image_config=genai.types.ImageConfig(
                    aspect_ratio="16:9",
                    image_size="1K",
                ),
            ),
        )
        for part in response.candidates[0].content.parts:
            if part.inline_data and part.inline_data.mime_type.startswith("image/"):
                return part.inline_data.data
        return None
    except Exception as e:
        print(f"Gemini error: {e}")
        return None

Rate limiting: Call sequentially with 2s delay between calls. On 429 error: exponential backoff (30s, 60s). On content policy rejection: fall back to Mermaid for that specific diagram.
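The rate-limiting policy above can be sketched as a small wrapper. This assumes the caller translates an HTTP 429 into `RateLimited` and a content policy rejection into `PolicyRejected`; both exception classes are hypothetical placeholders, not part of the google-genai SDK, and `generate` stands in for any function that returns image bytes.

```python
import time


class RateLimited(Exception):
    """Hypothetical marker for an HTTP 429 response."""


class PolicyRejected(Exception):
    """Hypothetical marker for a content policy rejection."""


def call_with_backoff(generate, prompt, delays=(30, 60), sleep=time.sleep):
    """Return image bytes, or None to signal that Mermaid should be used."""
    for wait in (0, *delays):
        if wait:
            sleep(wait)  # back off before each retry
        try:
            return generate(prompt)
        except RateLimited:
            continue  # try again after the next delay
        except PolicyRejected:
            return None  # retrying will not help; fall back for this diagram
    return None  # still rate limited after all retries


def generate_all(generate, prompts, gap_s=2, sleep=time.sleep):
    """Call sequentially with a fixed gap between consecutive prompts."""
    results = []
    for i, prompt in enumerate(prompts):
        if i:
            sleep(gap_s)
        results.append(call_with_backoff(generate, prompt, sleep=sleep))
    return results
```

Passing `sleep` as a parameter keeps the delays testable without actually waiting.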

Mermaid Fallback

If Gemini is unavailable or fails, and mmdc CLI is installed:

printf '%s\n' "$MERMAID_CODE" > /tmp/diagram.mmd
mmdc -i /tmp/diagram.mmd -o /tmp/diagram.png -w 1200 -H 800

If mmdc is not installed: skip concept diagrams entirely. Inform user.
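Taken together, the availability rules for concept diagrams reduce to a short decision function. The sketch below is one possible reading; `choose_diagram_engine` and its boolean parameters are hypothetical names that would be filled in from the preamble checks and the `--no-gemini` flag.

```python
from typing import Optional


def choose_diagram_engine(
    gemini_sdk: bool,
    api_key_set: bool,
    mermaid_cli: bool,
    no_gemini: bool = False,
) -> Optional[str]:
    """Return 'gemini', 'mermaid', or None when diagrams must be skipped."""
    if gemini_sdk and api_key_set and not no_gemini:
        return "gemini"
    if mermaid_cli:
        return "mermaid"
    return None  # neither engine is usable: skip diagrams and inform the user
```

A per-diagram Gemini failure can reuse the same function by calling it again with `no_gemini=True`.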

Code Snippet Figures (Pygments)

from pygments import highlight
from pygments.lexers import get_lexer_by_name
from pygments.formatters import ImageFormatter

def code_to_png(code: str, output_path: str, language: str = "python"):
    """Render `code` as a syntax-highlighted PNG at `output_path`."""
    # Raises pygments.util.ClassNotFound for an unknown language name.
    lexer = get_lexer_by_name(language)
    formatter = ImageFormatter(
        font_size=14,
        line_numbers=False,
        style="monokai",
        image_pad=20,
    )
    # With ImageFormatter, highlight() returns the image as bytes.
    with open(output_path, 'wb') as f:
        f.write(highlight(code, lexer, formatter))

Phase 4: PPTX Assembly

Read templates/base_style.py for the BaseTemplate interface and shared styling.

Assembly Steps

Create a new Presentation object with the template's slide dimensions
For each slide in the structure:
a. Add a slide with the template's layout
b. Set the action title
c. Insert content (text, bullets, figures, tables)
d. Add speaker notes
e. Apply the template's styling (fonts, colors, spacing)
Add title slide (first) and Q&A slide (last)
Embed all figures at 300 DPI
Save as .pptx

Structural QA (post-assembly)

After generating the .pptx, run structural checks via python-pptx object model:

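def boxes_overlap(s1, s2) -> bool:
    """True if the bounding boxes of two shapes intersect (positions in EMU)."""
    boxes = [(s.left, s.top, s.width, s.height) for s in (s1, s2)]
    if any(value is None for box in boxes for value in box):
        return False  # position unknown; cannot compare
    (l1, t1, w1, h1), (l2, t2, w2, h2) = boxes
    return l1 < l2 + w2 and l2 < l1 + w1 and t1 < t2 + h2 and t2 < t1 + h1

def structural_qa(prs, expected_max: int, time_minutes: int) -> list[str]:
    """Check for structural issues in the presentation.

    expected_max is the slide budget for a talk of time_minutes.
    """
    warnings = []
    for i, slide in enumerate(prs.slides):
        shapes = list(slide.shapes)
        # Check text overflow
        for shape in shapes:
            if shape.has_text_frame:
                for para in shape.text_frame.paragraphs:
                    text_len = sum(len(run.text) for run in para.runs)
                    if text_len > 500:
                        warnings.append(f"Slide {i+1}: Text may overflow ({text_len} chars)")
        # Check figure overlap
        for j, s1 in enumerate(shapes):
            for s2 in shapes[j+1:]:
                if boxes_overlap(s1, s2):
                    warnings.append(f"Slide {i+1}: Shapes may overlap")
    # Check slide count vs time budget
    slide_count = len(prs.slides)
    if slide_count > expected_max:
        warnings.append(f"Too many slides ({slide_count}) for {time_minutes}min talk")
    return warnings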

Auto-fix where possible (resize text boxes, reposition overlapping shapes). Display remaining warnings to user.

Content Cross-Reference Verification

After assembly, verify key claims on slides match the source material:

Extract all numbers/percentages from slide content
Compare against numbers found in the source text
Flag any number on a slide that doesn't appear in the source (potential hallucination)
Display flagged items to user for manual verification
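The four verification steps above can be approximated with a regex and a set difference. This is a simplified sketch under stated assumptions: `extract_numbers` and `flag_unsupported_numbers` are illustrative names, numbers are compared as literal strings (so `50` and `50.0` differ), and thousands separators are not handled.

```python
import re

# Integers and decimals, with an optional trailing percent sign.
NUMBER_RE = re.compile(r"\d+(?:\.\d+)?%?")


def extract_numbers(text: str) -> set:
    """Return the distinct numbers in `text`, with any '%' stripped."""
    return {m.group(0).rstrip("%") for m in NUMBER_RE.finditer(text)}


def flag_unsupported_numbers(slide_texts: list, source_text: str) -> list:
    """Return (slide_number, number) pairs that are absent from the source."""
    source_numbers = extract_numbers(source_text)
    flagged = []
    for index, text in enumerate(slide_texts, start=1):
        for number in sorted(extract_numbers(text) - source_numbers):
            flagged.append((index, number))
    return flagged
```

A flagged number is not necessarily wrong; it may be a derived value such as a difference or ratio, which is why the result goes to the user for manual verification rather than being corrected automatically.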

Phase 5: Output & Progress

Progress Reporting

Throughout the pipeline, display status messages:

[1/6] Extracting content from main.tex...
[2/6] Structuring slides (defense, 15 slides)...
[3/6] Generating data figures (4 charts)...
[4/6] Generating concept diagrams (3 diagrams)...
[5/6] Assembling presentation...
[6/6] Running quality checks...

✓ Output: ./defense_presentation.pptx (15 slides, 7 figures)
⚠ 2 warnings: [list warnings]

Save Artifacts

Save alongside the .pptx:

{name}_figures/ — All generated figure PNGs (for manual editing)
{name}_matplotlib_code/ — Generated Matplotlib scripts (for tweaking)
{name}_log.txt — Run log (which figures succeeded/failed, fallbacks triggered)
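The artifact locations can be derived from the output path, as in the sketch below. It assumes `{name}` is the output file name without its `.pptx` extension, which the list above does not state explicitly; `artifact_paths` is an illustrative helper name.

```python
from pathlib import Path


def artifact_paths(output_pptx: str) -> dict:
    """Map each artifact kind to its path next to the .pptx file."""
    out = Path(output_pptx)
    name = out.stem  # assumption: {name} is the file name without extension
    return {
        "figures": out.parent / f"{name}_figures",
        "matplotlib_code": out.parent / f"{name}_matplotlib_code",
        "log": out.parent / f"{name}_log.txt",
    }
```

For `./defense_presentation.pptx`, this yields `defense_presentation_figures/`, `defense_presentation_matplotlib_code/`, and `defense_presentation_log.txt` in the same directory.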

Error Handling Summary

Error	Detection	Action
Encrypted PDF	`doc.is_encrypted`	Raise error, suggest LaTeX input
Scanned PDF (no text)	`len(text) < 100`	Raise error, suggest LaTeX or OCR
Two-column garbled	LLM validation	Warn user, continue with best-effort
Missing \input{} file	`os.path.exists()` check	Warn, skip that include
Circular \input{}	Depth > 3	Stop recursion, warn
Matplotlib import violation	AST whitelist check	Regenerate code
Matplotlib execution fail	subprocess timeout/error	Retry once simplified, skip
Gemini timeout	30s timeout per call	Retry once, Mermaid fallback
Gemini rate limit 429	HTTP status	Backoff 30s/60s, Mermaid fallback
Gemini content policy	API response	Mermaid fallback for that diagram
Shape overflow	Bounding box math	Auto-resize text box
Font not available	N/A at gen time	Use Arial (widely available)
.pptx too large (>50MB)	File size check	Warn user, suggest reducing figures
Wrong numbers on slides	Cross-reference check	Flag to user for manual review