docs-pptx
PPTX to Markdown Conversion
Convert PowerPoint presentations to markdown format with inline slide images and AI-generated alt-text descriptions. The conversion produces a complete markdown document suitable for documentation, archival, version control, or AI analysis.
When to Use This Skill
Use this skill for PowerPoint presentations that need markdown conversion for:
- Documentation and archival of presentation content
- Making presentations accessible to AI analysis workflows
- Creating searchable, version-controllable presentation archives
- Generating accessible presentations with detailed alt-text
- Preserving presentation content in a portable text format
Conversion Workflow
The conversion has two main phases:
Deterministic (scripted):
1. PPTX → PDF conversion using LibreOffice
2. PDF → JPEG slide images using poppler-utils
3. Text extraction using markitdown
4. Markdown assembly with inline images
Non-deterministic (AI-assisted):
5. Alt-text generation using multimodal image understanding
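As a rough illustration of the deterministic steps, the sketch below builds the external commands such a pipeline might run. It is not taken from the provided script: the function name `build_commands` and the exact arguments are assumptions, and the paths simply mirror the output structure described in this document. Step 4 (markdown assembly) is plain file writing and is omitted.

```python
from pathlib import Path


def build_commands(pptx: Path, out_dir: Path, dpi: int = 150) -> list[list[str]]:
    """Return the external commands for steps 1-3, in order (hypothetical helper)."""
    pdf_dir = out_dir / "pdf"
    pdf_path = pdf_dir / f"{pptx.stem}.pdf"
    image_prefix = out_dir / "slide-images" / "slide"
    text_path = out_dir / f"{pptx.stem}-text.md"
    return [
        # Step 1: PPTX -> PDF with LibreOffice in headless mode.
        ["soffice", "--headless", "--convert-to", "pdf",
         "--outdir", str(pdf_dir), str(pptx)],
        # Step 2: PDF -> one JPEG per page at the requested resolution.
        ["pdftoppm", "-jpeg", "-r", str(dpi), str(pdf_path), str(image_prefix)],
        # Step 3: extract the slide text as markdown.
        ["markitdown", str(pptx), "-o", str(text_path)],
    ]


# Print the commands instead of running them, so the sketch has no side effects.
for cmd in build_commands(Path("source/presentation.pptx"), Path("output")):
    print(" ".join(cmd))
```

Each command list can be passed to `subprocess.run(cmd, check=True)` so that a failing tool stops the pipeline at the step that failed.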
Step 1: Run the Conversion Script
Execute the provided Python script to handle all deterministic conversion steps:
./scripts/convert_pptx_to_markdown.py <input.pptx> <output_dir>
Example:
./scripts/convert_pptx_to_markdown.py source/presentation.pptx output/
Optional flags:
- --dpi 150 - Image resolution (default: 150; higher = better quality, larger files)
- --title "Custom Title" - Override auto-detected presentation title
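For readers who want to see how this command-line interface could be declared, here is a minimal argparse sketch. It only mirrors the two positional arguments and two flags documented above; the help strings and the `build_parser` name are assumptions, and the real script may be organized differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented arguments (hypothetical reconstruction)."""
    parser = argparse.ArgumentParser(
        prog="convert_pptx_to_markdown.py",
        description="Convert a PPTX file to markdown with inline slide images.",
    )
    parser.add_argument("input", help="path to the source .pptx file")
    parser.add_argument("output_dir", help="directory that receives the output")
    parser.add_argument("--dpi", type=int, default=150,
                        help="slide image resolution")
    parser.add_argument("--title", default=None,
                        help="override the auto-detected presentation title")
    return parser


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(
    ["source/presentation.pptx", "output/", "--dpi", "200", "--title", "Custom Title"]
)
print(args.input, args.output_dir, args.dpi, args.title)
```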
Output structure:
output/
├── presentation.md # Final markdown with inline images
├── presentation-text.md # Extracted text (intermediate)
├── pdf/
│ └── presentation.pdf # Generated PDF (intermediate)
└── slide-images/
    └── slide-01.jpg ... slide-N.jpg # Slide images at specified DPI
Requirements:
- LibreOffice (soffice command)
- poppler-utils (pdftoppm command)
- Python 3.11+ with uv
See references/troubleshooting.md for installation instructions for your operating system.
Progress output: The script shows progress for each step and reports any errors with clear messages.
Step 2: Generate Alt-Text Descriptions
After the script completes, create detailed image descriptions for accessibility:
- Read all slide images in output/slide-images/
- Analyze each slide's visual content:
  - Chart types and structure (bars, lines, tables, dual-panels)
  - Color schemes and what they represent
  - Key data points and trends
  - Layout and visual hierarchy
  - Business insights from visualizations
- Write comprehensive descriptions to output/alt-texts.md
Format for alt-texts.md:
# Alt-Text Descriptions
## Slide 1
[Detailed description of title slide, branding, layout]
## Slide 2
[Description of chart type, colors, data structure, insights]
...
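The format above is regular enough to write and read back programmatically. The following sketch is an assumption about how one might do that, not part of the skill itself: `render_alt_texts` and `parse_alt_texts` are hypothetical helper names, and the sample descriptions are placeholders.

```python
import re


def render_alt_texts(descriptions: list[str]) -> str:
    """Render one '## Slide N' section per description, numbered from 1."""
    parts = ["# Alt-Text Descriptions", ""]
    for number, text in enumerate(descriptions, start=1):
        parts += [f"## Slide {number}", text.strip(), ""]
    return "\n".join(parts)


def parse_alt_texts(markdown: str) -> dict[int, str]:
    """Map each slide number to the description text under its heading."""
    sections: dict[int, str] = {}
    current = None
    for line in markdown.splitlines():
        match = re.fullmatch(r"##\s+Slide\s+(\d+)\s*", line)
        if match:
            current = int(match.group(1))
            sections[current] = ""
        elif current is not None:
            sections[current] += line + "\n"
    return {number: body.strip() for number, body in sections.items()}


document = render_alt_texts(["Title slide with logo.", "Stacked bar chart."])
print(document)
print(sorted(parse_alt_texts(document)))
```

Parsing the file back into a dictionary makes it easy to check later that every slide has a description.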
Alt-text best practices:
- Describe chart types explicitly (stacked bars, line graphs, tables)
- Explain color coding and what colors represent
- Note key data points and trends visible
- Include business context where relevant
- Make descriptions useful for both screen readers and AI analysis
Step 3: Validate Output
Verify conversion succeeded:
- All slides converted to images (count matches slide count in presentation)
- Final markdown created with inline image references
- Images display correctly when viewing markdown
- alt-texts.md created with descriptions for all slides
- No error messages in script output
Quick validation:
# Count slides in output
ls output/slide-images/slide-*.jpg | wc -l
# View the final markdown
cat output/presentation.md
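The same checks can be scripted. The sketch below is one possible approach under stated assumptions: image links in the final markdown are taken to be relative to the output directory, the alt-text file is taken to use `## Slide N` headings, and `validate_output` is a hypothetical helper rather than something the skill ships.

```python
import re
from pathlib import Path


def validate_output(out_dir: Path, name: str = "presentation") -> list[str]:
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    images = sorted((out_dir / "slide-images").glob("slide-*.jpg"))
    if not images:
        problems.append("no slide images found")

    md_path = out_dir / f"{name}.md"
    if md_path.is_file():
        # Every markdown image reference should resolve to a file on disk.
        markdown = md_path.read_text(encoding="utf-8")
        for target in re.findall(r"!\[[^\]]*\]\(([^)\s]+)", markdown):
            if not (out_dir / target).is_file():
                problems.append(f"broken image link: {target}")
    else:
        problems.append(f"missing {md_path.name}")

    alt_path = out_dir / "alt-texts.md"
    if alt_path.is_file():
        # Expect one alt-text section per slide image.
        text = alt_path.read_text(encoding="utf-8")
        headings = re.findall(r"^## Slide \d+", text, re.M)
        if len(headings) != len(images):
            problems.append(
                f"{len(headings)} alt-text sections for {len(images)} images"
            )
    else:
        problems.append("missing alt-texts.md")
    return problems
```

Calling `validate_output(Path("output"))` returns an empty list when the image count, image links, and alt-text sections are consistent. It cannot confirm that the image count matches the slide count of the original presentation, so that comparison still needs a manual check.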
Troubleshooting
If the conversion script fails, see references/troubleshooting.md for solutions to common issues:
- LibreOffice not found or conversion errors
- Image export failures with poppler-utils
- System library errors with markitdown
- Missing slide markers in extracted text
- Broken image links in markdown
The troubleshooting reference includes detailed solutions, alternative approaches, and debugging strategies.
Integration with CSF Workflows
This skill is particularly useful for CSF presentation processing:
- Convert Stage 7 (Presentation) deliverables to markdown for archival
- Generate alt-texts that capture chart types and business metrics
- Enable searching across presentations for specific analyses
- Version control presentation content alongside code and data
- Make analytical insights accessible to AI workflows
When working with CSF presentations, pay attention to:
- Color schemes (typically red/orange/blue for Price/CSF/MCV)
- Dual-panel comparison layouts (channel vs channel analysis)
- Stacked bar charts showing component breakdown
- Performance tables and metric comparisons
Example
See examples/uk_bbq_presentation/ for a complete reference conversion:
- Source: 15-slide CSF study presentation
- Output: Markdown with inline images and detailed alt-texts
- Demonstrates: Business analytics charts, dual-panel layouts, color-coded metrics
View the example to understand expected output format and alt-text patterns.
Best Practices
- Preserve originals: Never modify source files in shared folders; work with copies
- Validate immediately: Check each conversion before moving to alt-text generation
- Descriptive alt-texts: Make descriptions valuable for both accessibility and AI understanding
- Consistent DPI: Use 150 DPI for good quality/size balance (adjust if needed)
- Version control: Commit markdown outputs to track presentation changes
- Document process: Note any special handling needed for specific presentations
Key Files
- scripts/convert_pptx_to_markdown.py - Main conversion script (Python with uv)
- references/troubleshooting.md - Common issues and solutions
- examples/uk_bbq_presentation/ - Complete reference example
Quick Start
# 1. Run conversion script
./scripts/convert_pptx_to_markdown.py source/presentation.pptx output/
# 2. Generate alt-texts (AI-assisted)
# Read slide images and write detailed descriptions to output/alt-texts.md
# 3. Validate results
ls output/slide-images/ | wc -l # Should match slide count
cat output/presentation.md # View final markdown
The script handles all deterministic steps with proper error handling. Focus on generating high-quality alt-text descriptions that preserve the analytical and visual context of the presentation.