pandoc-convert
📄 Pandoc Convert (Integrated)
Universal document converter combining unified Python tools with modular bash utilities.
The pandoc-convert skill provides intelligent workflows for converting documents between 40+ formats using pandoc. This integrated version combines:
- Unified Python converter (convert.py) - Single powerful tool for most conversions
- Modular bash utilities (batch_convert.sh, validate.sh) - Specialized workflows
- Comprehensive templates - Both LaTeX academic and modern CSS styles
- Professional documentation - Complete guides, troubleshooting, and references
✨ Key Features
- 40+ Format Support: Markdown, Word, PDF, HTML, LaTeX, EPUB, RST, AsciiDoc, Org-mode, and more
- Dual Toolset: Python for smart conversions + bash for validation/batch processing
- Professional Templates: 12 templates covering academic, business, and web use cases
- Comprehensive Documentation: Format guides, troubleshooting, templates, and quick reference
- Smart Defaults: Optimized settings for each conversion path
- Metadata Preservation: Keep titles, authors, dates across formats
- Error Recovery: Validation and helpful error messages
🔧 Prerequisites
Required
- pandoc (v2.19+ recommended)
- Python 3.8+ (for convert.py helper)
Optional (for extended formats)
- LaTeX (TeX Live, MiKTeX) - Required for PDF output with pandoc's default engine (pdflatex)
- wkhtmltopdf - Alternative HTML to PDF converter
- librsvg (provides rsvg-convert) - SVG images in PDF output
- epubcheck - EPUB validation
See INSTALL.md for detailed installation instructions per platform.
📚 Quick Start
Using Python Helper (Recommended)
# Single file conversion
python scripts/convert.py input.md output.pdf
# With custom template
python scripts/convert.py report.md report.pdf --template business --toc
# Batch convert
python scripts/convert.py --batch *.md --format pdf --output-dir ./pdfs
Using Bash Utilities
# Batch convert with validation
./scripts/batch_convert.sh input/ pdf output/
# Validate output
./scripts/validate.sh output/document.pdf
./scripts/validate.sh output/book.epub
Direct Pandoc
# Markdown → PDF
pandoc input.md -o output.pdf
# Markdown → Word
pandoc input.md -o output.docx
# Word → Markdown
pandoc input.docx -o output.md --extract-media=./media
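The direct pandoc calls above can also be assembled from Python. This is a minimal sketch, not the code in convert.py: the names `pandoc_command` and `run_pandoc` and their parameters are hypothetical, while the flags themselves (`-o`, `--toc`, `--number-sections`, `--extract-media`) are standard pandoc options.

```python
import subprocess
from typing import List, Optional


def pandoc_command(source: str, target: str, *, toc: bool = False,
                   number_sections: bool = False,
                   extract_media: Optional[str] = None) -> List[str]:
    """Build a pandoc argument list (hypothetical helper).

    pandoc infers the input and output formats from the file extensions.
    """
    cmd = ["pandoc", source, "-o", target]
    if toc:
        cmd.append("--toc")
    if number_sections:
        cmd.append("--number-sections")
    if extract_media:
        # Useful for Word → Markdown so embedded images are written to disk
        cmd.append(f"--extract-media={extract_media}")
    return cmd


def run_pandoc(cmd: List[str]) -> None:
    """Run the command; raises CalledProcessError if pandoc exits non-zero."""
    subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a shell string avoids quoting problems with file names that contain spaces.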
🎯 Common Workflows
See references/conversion-guides.md for detailed step-by-step guides:
- Markdown → Professional PDF (business reports, academic papers)
- Word → Markdown (version control friendly)
- Markdown → EPUB (eBooks with validation)
- Multi-file → Single PDF (book compilation)
- Markdown → HTML5 (standalone with CSS)
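As one illustration of the multi-file workflow, pandoc accepts several input files and concatenates them before converting. The sketch below only builds the command; `book_command` is a hypothetical helper, and ordering chapters by file name is an assumption rather than something the guides prescribe.

```python
from pathlib import Path
from typing import List


def book_command(chapter_dir: str, target: str) -> List[str]:
    """Combine every .md file in chapter_dir, in name order, into one output."""
    chapters = sorted(str(p) for p in Path(chapter_dir).glob("*.md"))
    if not chapters:
        raise ValueError(f"no Markdown files found in {chapter_dir}")
    # pandoc concatenates multiple inputs before converting them
    return ["pandoc", *chapters, "-o", target, "--toc"]
```

Numeric prefixes such as 01-intro.md and 02-body.md keep the chapter order predictable under this scheme.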
🎨 Templates
LaTeX Templates (Academic/Professional)
- academic-paper.tex - Manuscript style
- business-letter.tex - Professional correspondence
- technical-report.tex - Technical documentation
- resume.tex - CV/resume formatting
- professional.tex - General-purpose professional
- report-template.tex - Report structure
CSS Templates (Web/Modern)
- github.css - GitHub markdown style
- blog-style.css - Clean blog format
- epub-style.css - eBook styling
- presentation.html - HTML presentations
- ebook.css - Enhanced eBook layout
Reference Documents
- reference-styles.docx - Word style reference
All templates live in the templates/ directory.
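Each template type is passed to pandoc through a different option. The sketch below shows one plausible mapping; `template_args` is a hypothetical helper, and treating presentation.html as a pandoc template (rather than a stylesheet) is an assumption. The options themselves (`--template`, `--css`, `--standalone`, `--reference-doc`) are standard pandoc flags.

```python
from pathlib import Path
from typing import List


def template_args(template: str) -> List[str]:
    """Return the pandoc options for a template file, chosen by extension."""
    suffix = Path(template).suffix.lower()
    if suffix in (".tex", ".html"):
        # LaTeX and HTML files are used as pandoc templates (assumption for .html)
        return [f"--template={template}"]
    if suffix == ".css":
        # A stylesheet only takes effect in a standalone HTML or EPUB document
        return ["--standalone", f"--css={template}"]
    if suffix == ".docx":
        # Word output takes its styles from a reference document
        return [f"--reference-doc={template}"]
    raise ValueError(f"unsupported template type: {template}")
```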
🔧 Tool Reference
convert.py (Python)
Unified conversion tool with smart defaults:
python scripts/convert.py [OPTIONS] INPUT OUTPUT
Options:
  --format FORMAT       Force output format
  --template TEMPLATE   Use named template
  --toc                 Include table of contents
  --number-sections     Number headings
  --title TITLE         Document title
  --author AUTHOR       Document author
  --batch               Batch mode
  --validate            Validate output
  --verbose             Detailed output
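For readers who want to script against this interface, the option list can be mirrored with argparse. This is a sketch of the documented options only, not the actual parser in convert.py; `build_parser` is a hypothetical name, and how `--batch` changes the positional arguments is not covered here.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Mirror the documented convert.py options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="convert.py")
    parser.add_argument("input", help="Source document")
    parser.add_argument("output", help="Destination file")
    parser.add_argument("--format", help="Force output format")
    parser.add_argument("--template", help="Use named template")
    parser.add_argument("--toc", action="store_true", help="Include table of contents")
    parser.add_argument("--number-sections", action="store_true", help="Number headings")
    parser.add_argument("--title", help="Document title")
    parser.add_argument("--author", help="Document author")
    parser.add_argument("--batch", action="store_true", help="Batch mode")
    parser.add_argument("--validate", action="store_true", help="Validate output")
    parser.add_argument("--verbose", action="store_true", help="Detailed output")
    return parser
```

Parsing `["report.md", "report.pdf", "--template", "business", "--toc"]` with this parser yields `template="business"` and `toc=True`, matching the Quick Start example.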
batch_convert.sh (Bash)
Batch processing with progress tracking:
./scripts/batch_convert.sh INPUT_DIR FORMAT OUTPUT_DIR [OPTIONS]
# Example
./scripts/batch_convert.sh ./docs/ pdf ./output/ --toc --number-sections
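The INPUT_DIR / FORMAT / OUTPUT_DIR convention amounts to pairing each source file with a target path. The sketch below shows that pairing in Python; `plan_batch` is a hypothetical helper, and restricting the batch to `*.md` files is an assumption about what batch_convert.sh picks up.

```python
from pathlib import Path
from typing import List, Tuple


def plan_batch(input_dir: str, fmt: str, output_dir: str) -> List[Tuple[Path, Path]]:
    """Pair each Markdown file in input_dir with its target path in output_dir."""
    out = Path(output_dir)
    return [
        (src, out / f"{src.stem}.{fmt}")  # e.g. docs/guide.md → output/guide.pdf
        for src in sorted(Path(input_dir).glob("*.md"))
    ]
```

Computing the plan before converting makes it easy to print a dry run or to skip files whose targets are already up to date.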
validate.sh (Bash)
Post-conversion validation:
./scripts/validate.sh FILE
# Validates:
# - PDF structure and readability
# - EPUB spec compliance (requires epubcheck)
# - HTML validity
# - File integrity
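A basic integrity check of the kind listed above can be sketched in a few lines of Python. This is not what validate.sh does internally (its implementation is not shown here); the function names are hypothetical. The checks rely on two well-known facts: a PDF file starts with the bytes `%PDF-`, and an EPUB is a ZIP archive containing a `mimetype` entry with the value `application/epub+zip`.

```python
import zipfile


def looks_like_pdf(path: str) -> bool:
    """Check the PDF magic bytes at the start of the file."""
    with open(path, "rb") as fh:
        return fh.read(5) == b"%PDF-"


def looks_like_epub(path: str) -> bool:
    """Check that the file is a ZIP archive with the EPUB mimetype entry."""
    if not zipfile.is_zipfile(path):
        return False
    with zipfile.ZipFile(path) as archive:
        try:
            return archive.read("mimetype").strip() == b"application/epub+zip"
        except KeyError:  # no mimetype entry in the archive
            return False
```

These checks catch truncated or mislabeled files quickly; full EPUB spec compliance still needs epubcheck.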
📊 Format Support
Input Formats
Markdown: markdown, gfm, markdown_mmd
Word: docx, odt, rtf
Web: html, html5
LaTeX: latex, tex
Plain Text: txt, rst, textile, asciidoc
Academic: jats, docbook
Presentation: pptx
eBooks: epub
Other: json, csv, org, mediawiki, man
Output Formats
All input formats plus: PDF, EPUB, RevealJS, Beamer
Complete format matrix: references/format-matrix.md
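When a file extension is ambiguous or missing, the format can be named explicitly with pandoc's `--from` and `--to` options. The sketch below maps a few common extensions to pandoc format names; the table is a partial, illustrative subset and `guess_format` is a hypothetical helper.

```python
from pathlib import Path

# Partial, illustrative mapping from file extension to pandoc format name
EXTENSION_FORMATS = {
    ".md": "markdown",
    ".markdown": "markdown",
    ".docx": "docx",
    ".odt": "odt",
    ".rtf": "rtf",
    ".html": "html",
    ".tex": "latex",
    ".rst": "rst",
    ".org": "org",
    ".epub": "epub",
    ".json": "json",
    ".csv": "csv",
}


def guess_format(path: str) -> str:
    """Return the pandoc format name for a path, based on its extension."""
    suffix = Path(path).suffix.lower()
    try:
        return EXTENSION_FORMATS[suffix]
    except KeyError:
        raise ValueError(f"no known pandoc format for {suffix!r}") from None
```

The result can be passed straight to pandoc, for example `["pandoc", "--from", guess_format(src), src, "-o", dst]`.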
🗂️ Directory Structure
pandoc-convert-integrated/
├── SKILL.md # This file
├── INSTALL.md # Detailed installation guide
├── README.md # Quick start guide
├── scripts/
│ ├── convert.py # Unified Python converter
│ ├── batch_convert.sh # Bash batch processor
│ └── validate.sh # Validation utility
├── templates/
│ ├── *.tex # LaTeX templates (6)
│ ├── *.css # CSS templates (4)
│ ├── *.html # HTML templates (1)
│ └── *.docx # Word reference (1)
└── references/
  ├── format-guide.md # Format details
  ├── format-matrix.md # Compatibility matrix
  ├── conversion-guides.md # Step-by-step guides
  ├── format-support.md # Supported features
  ├── quick-reference.md # Cheat sheet
  ├── templates.md # Template documentation
  └── troubleshooting.md # Problem solving
🐛 Troubleshooting
Common Issues
- "pandoc: command not found" → Install pandoc (see INSTALL.md)
- "pdflatex not found" → Install LaTeX distribution
- Unicode broken in PDF → Use --pdf-engine=xelatex
- Images missing → Check paths and use --resource-path
- EPUB validation fails → Run epubcheck for details
See references/troubleshooting.md for comprehensive solutions.
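The two "not found" errors can be caught before a conversion starts by checking that the executables are on PATH. A minimal sketch using the standard library follows; `missing_tools` and the `HINTS` table are hypothetical, with the hint text adapted from the list above.

```python
import shutil
from typing import Callable, Dict, Iterable, Optional

# Hints adapted from the common issues listed in this section
HINTS = {
    "pandoc": "Install pandoc (see INSTALL.md)",
    "pdflatex": "Install a LaTeX distribution",
}


def missing_tools(tools: Iterable[str],
                  which: Callable[[str], Optional[str]] = shutil.which) -> Dict[str, str]:
    """Return {tool: hint} for every tool that is not found on PATH."""
    return {
        tool: HINTS.get(tool, "Install it and make sure it is on PATH")
        for tool in tools
        if which(tool) is None
    }
```

Calling `missing_tools(["pandoc", "pdflatex"])` before a PDF conversion gives an empty dict when both are installed. The `which` parameter exists only so the lookup can be replaced in tests.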
📖 References
- INSTALL.md - Platform-specific installation
- references/format-guide.md - Format capabilities and limitations
- references/conversion-guides.md - Step-by-step workflows
- references/quick-reference.md - One-page cheat sheet
- references/templates.md - Template usage and customization
- references/troubleshooting.md - Extended problem solving
🎯 Best Practices
- Use YAML frontmatter for metadata (title, author, date)
- Validate outputs before sharing (especially EPUB/PDF)
- Version control source (Markdown), not outputs
- Test templates first before batch processing
- Back up before batch operations
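To illustrate the first practice, pandoc reads a YAML metadata block delimited by `---` lines at the top of a Markdown file. The sketch below prepends such a block to a document body; `with_front_matter` is a hypothetical helper, and quoting values with `json.dumps` is a design choice that relies on JSON strings being valid YAML double-quoted strings.

```python
import json


def with_front_matter(body: str, *, title: str, author: str, date: str) -> str:
    """Prepend a YAML metadata block (title, author, date) to Markdown text."""
    fields = {"title": title, "author": author, "date": date}
    # json.dumps quotes and escapes each value so colons or quotes stay valid YAML
    lines = [f"{key}: {json.dumps(value)}" for key, value in fields.items()]
    return "---\n" + "\n".join(lines) + "\n---\n\n" + body
```

With the metadata in the source file, the title, author, and date carry over to PDF, Word, and EPUB output without repeating them on the command line.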
🚀 Performance
- Use batch_convert.sh for parallel processing of multiple files
- Cache templates in ~/.pandoc/templates/
- Use incremental builds (only reconvert changed files)
- For very large docs (>10MB), increase memory limits
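An incremental build can be as simple as comparing modification times, the same rule make uses. A minimal sketch follows, assuming one output file per source file; `needs_rebuild` is a hypothetical helper and is not part of the shipped scripts.

```python
import os


def needs_rebuild(source: str, target: str) -> bool:
    """Return True if target is missing or older than source."""
    if not os.path.exists(target):
        return True
    return os.path.getmtime(source) > os.path.getmtime(target)
```

This check ignores changes to templates or included images, so a full rebuild is still needed after editing those.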
📜 License
This skill is part of OpenClaw. Pandoc itself is GPL-licensed.
Quick Start: python scripts/convert.py input.md output.pdf
Batch Convert: ./scripts/batch_convert.sh ./docs/ pdf ./output/
Validate: ./scripts/validate.sh output.pdf
Help: See README.md and references/ directory