markitdown - Document to Markdown
Convert local documents to clean Markdown. One tool for PDF, Word, Excel, PowerPoint, images, and more.
When to Use markitdown
| Use Case | Recommendation |
|---|---|
| Local files (PDF, Word, Excel) | ✅ Use markitdown - unique capability |
| Web pages | ❌ Use Jina (r.jina.ai/) - 5x faster |
| Blocked/anti-bot sites | ❌ Use Firecrawl |
| OCR on images | ✅ Use markitdown |
| Audio transcription | ✅ Use markitdown |
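The table above can be read as a small routing rule. A minimal sketch, assuming a hypothetical helper named `pick_tool` (not part of markitdown) and a made-up second argument `blocked` to flag anti-bot sites, since that cannot be detected from the URL alone:

```shell
# Hypothetical helper, not part of markitdown: print the tool the table
# above recommends for a given input.
#   $1 = file path or URL
#   $2 = "blocked" if the site is known to block scrapers (assumed flag)
pick_tool() {
  case "$1" in
    http://*|https://*)
      if [ "${2:-}" = "blocked" ]; then
        echo "firecrawl"
      else
        echo "jina"
      fi
      ;;
    *)
      # Local documents, images (OCR), and audio all go to markitdown.
      echo "markitdown"
      ;;
  esac
}
```

For example, `pick_tool report.pdf` prints `markitdown`, while `pick_tool https://example.com` prints `jina`.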
Basic Usage
# Local files (primary use case)
markitdown document.pdf
markitdown report.docx
markitdown data.xlsx
markitdown slides.pptx
markitdown screenshot.png # OCR
# URLs (works, but Jina is faster)
markitdown https://example.com
# Save output
markitdown document.pdf > document.md
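The redirect pattern above extends to several files at once. A minimal sketch, assuming two hypothetical helpers (`md_name` and `convert_all`, neither part of markitdown) and using only the `markitdown FILE > OUT` form shown above:

```shell
# Hypothetical helper: derive the output name by swapping the last
# extension for .md (report.pdf -> report.md).
md_name() {
  printf '%s.md\n' "${1%.*}"
}

# Hypothetical helper: convert each argument with the markitdown CLI,
# writing the Markdown next to the source file.
convert_all() {
  for src in "$@"; do
    out=$(md_name "$src")
    # Skip inputs that are already .md so the redirect cannot truncate them.
    if [ "$out" = "$src" ]; then
      echo "skipped: $src" >&2
      continue
    fi
    if markitdown "$src" > "$out"; then
      echo "wrote $out"
    else
      echo "failed: $src" >&2
      rm -f "$out"   # drop the empty or partial output
    fi
  done
}
```

A call such as `convert_all report.pdf slides.pptx data.xlsx` would then leave `report.md`, `slides.md`, and `data.md` beside the originals, provided each conversion succeeds.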
Supported Formats
| Format | Extensions | Notes |
|---|---|---|
| PDF | .pdf | Text extraction, tables |
| Word | .docx | Formatting preserved |
| Excel | .xlsx | Tables to markdown |
| PowerPoint | .pptx | Slides as sections |
| Images | .jpg, .png | OCR text extraction |
| HTML | .html | Clean conversion |
| Audio | .mp3, .wav | Speech-to-text |
| Text | .txt, .csv, .json, .xml | Pass-through/structure |
| URLs | https://... | Works but slower than Jina |
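A script can check a file against the extensions listed above before calling the CLI. A minimal sketch, assuming a hypothetical helper named `is_supported` that knows only the extensions in this table; it is not part of markitdown, does not handle URLs, and the tool itself may accept formats not listed here:

```shell
# Hypothetical pre-check, not part of markitdown: succeed only if the
# file extension appears in the table above (case-insensitive).
is_supported() {
  ext=$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]')
  case "$ext" in
    pdf|docx|xlsx|pptx|jpg|png|html|mp3|wav|txt|csv|json|xml) return 0 ;;
    *) return 1 ;;
  esac
}
```

It could be used as a guard, for example `is_supported "$f" && markitdown "$f" > "$f.md"`.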
Benchmarked Performance (URLs)
| Tool | Avg Time | Success Rate |
|---|---|---|
| Jina | 0.5s | 10/10 |
| markitdown | 2.5s | 9/10 |
| Firecrawl | 4.5s | 10/10 |
Verdict: For URLs, use Jina. For local files, use markitdown; of the three tools benchmarked here, it is the only one that handles them.
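The "5x faster" figure quoted for Jina follows directly from the averages in the table. A small worked check, assuming a hypothetical helper named `speed_ratio` and using only the numbers reported above:

```shell
# Hypothetical helper: how many times longer $1 seconds is than $2 seconds.
speed_ratio() {
  awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.1f\n", slow / fast }'
}

speed_ratio 2.5 0.5   # markitdown vs Jina  -> 5.0
speed_ratio 4.5 0.5   # Firecrawl vs Jina   -> 9.0
```

These ratios only restate the ten-URL benchmark above; they say nothing about local-file conversion times.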
Examples
# PDF to markdown (primary use case)
markitdown report.pdf > report.md
# Excel spreadsheet
markitdown financials.xlsx
# Image with text (OCR)
markitdown screenshot.png
# PowerPoint deck
markitdown presentation.pptx > slides.md
# Audio transcription
markitdown meeting.mp3 > transcript.md
Comparison with Alternatives
| Task | markitdown | Alternative |
|---|---|---|
| PDF text | markitdown file.pdf | PyMuPDF, pdfplumber |
| Word docs | markitdown file.docx | python-docx |
| Excel | markitdown file.xlsx | pandas, openpyxl |
| OCR | markitdown image.png | Tesseract |
| Web pages | Use Jina instead | r.jina.ai/URL (5x faster) |
markitdown's advantage: One CLI for all local document formats. No code needed.