pdf-conversion

PDF Conversion Skill

Overview

Converts PDF files to clean, readable Markdown, preserving heading structure and tables and extracting embedded images.

When to Use

Use this skill for requests such as:

  • "Convert this PDF to markdown"
  • "Extract text from PDF"
  • "Transform PDF to MD format"
  • "Process PDFs in folder X"

Related Resources

  • SOP: Operator Team OS/1. SOPs/convert_pdfs_to_markdown.md

Scripts

Script          Purpose
pdf_to_md.py    Main PDF to Markdown converter

Usage

python3 scripts/pdf_to_md.py /path/to/input.pdf /path/to/output.md
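
For the "Process PDFs in folder X" case, the single-file command above can be wrapped in a small loop. The following is a minimal sketch, not part of the skill itself: the helper names plan_conversions and convert_folder are hypothetical, and it assumes that pdf_to_md.py takes exactly the two positional arguments shown above and exits with a non-zero status on failure.

```python
import subprocess
import sys
from pathlib import Path


def plan_conversions(input_dir, output_dir):
    """Pair every PDF directly inside input_dir with a .md path in output_dir."""
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    pdfs = sorted(
        p for p in input_dir.iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )
    return [(pdf, output_dir / (pdf.stem + ".md")) for pdf in pdfs]


def convert_folder(input_dir, output_dir, script="scripts/pdf_to_md.py"):
    """Run the converter once per PDF and return the inputs that failed."""
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    failed = []
    for pdf, md in plan_conversions(input_dir, output_dir):
        result = subprocess.run([sys.executable, script, str(pdf), str(md)])
        # Assumption: the script signals failure through a non-zero exit code.
        if result.returncode != 0:
            failed.append(pdf)
    return failed
```

Calling convert_folder("/path/to/pdfs", "/path/to/markdown") would then convert each PDF in turn and return the list of files that could not be converted, so one bad file (for example an encrypted one) does not stop the rest of the batch.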

Features

  • Text extraction with OCR fallback
  • Image extraction and embedding
  • Table detection and formatting
  • Heading structure preservation
  • Automatic cleanup of formatting artifacts
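
How the script decides when to fall back to OCR is not documented here. One common approach, sketched below purely as an illustration, is to treat a page as scanned when its text layer is nearly empty. The function names, the min_chars threshold, and the ocr callable are all assumptions made for this example, not the actual behavior of pdf_to_md.py.

```python
def needs_ocr(page_text, min_chars=25):
    """Heuristic: a page whose text layer is almost empty is probably a scan."""
    visible = "".join(page_text.split())  # drop all whitespace before counting
    return len(visible) < min_chars


def extract_with_fallback(pages, ocr):
    """Return one string per page, using OCR only where the text layer is thin.

    pages: list of text-layer strings, one per page.
    ocr:   hypothetical callable taking a page index and returning OCR text.
    """
    out = []
    for index, text in enumerate(pages):
        out.append(ocr(index) if needs_ocr(text) else text)
    return out
```

Deciding per page rather than per document keeps the text layer for ordinary pages and limits the slower OCR step to pages that need it, which matters for mixed PDFs that contain only a few scanned pages.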

Edge Cases

  • Scanned PDFs: Falls back to OCR
  • Complex tables: Best-effort conversion
  • Encrypted PDFs: Conversion fails, because opening the file requires a password
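
Because encrypted PDFs fail, it can be useful to screen files before conversion. The sketch below is a rough heuristic using only the standard library and is not part of the skill: it relies on the fact that an encrypted PDF declares an /Encrypt entry in its trailer dictionary. The function name looks_encrypted is hypothetical, and the check can report a false positive if the same byte sequence happens to appear elsewhere in the file.

```python
from pathlib import Path


def looks_encrypted(pdf_path):
    """Heuristic: encrypted PDFs carry an /Encrypt entry in the trailer dictionary."""
    data = Path(pdf_path).read_bytes()
    # The whole file is searched because linearized PDFs keep a trailer near the start.
    return b"/Encrypt" in data
```

A file flagged this way can be skipped or reported to the user up front instead of being discovered through a failed conversion.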