markitdown-skill

MarkItDown Skill

Documentation and utilities for converting documents to Markdown using Microsoft's MarkItDown library.

Note: This skill provides documentation and a batch script. The actual conversion is done by the markitdown CLI/library installed via pip.

When to Use

Use markitdown for:

  • 📄 Fetching documentation (README, API docs)
  • 🌐 Converting web pages to markdown
  • 📝 Document analysis (PDFs, Word, PowerPoint)
  • 🎬 YouTube transcripts
  • 🖼️ Image text extraction (OCR)
  • 🎤 Audio transcription

Quick Start

# Convert file to markdown
markitdown document.pdf -o output.md

# Convert URL
markitdown https://example.com/docs -o docs.md

Supported Formats

Format         Features
PDF            Text extraction, structure
Word (.docx)   Headings, lists, tables
PowerPoint     Slides, text
Excel          Tables, sheets
Images         OCR + EXIF metadata
Audio          Speech transcription
HTML           Structure preservation
YouTube        Video transcripts

Installation

The skill requires Microsoft's markitdown CLI:

pip install 'markitdown[all]'

Or install specific formats only:

pip install 'markitdown[pdf,docx,pptx]'

Common Patterns

Fetch Documentation

markitdown https://github.com/user/repo/blob/main/README.md -o readme.md

Convert PDF

markitdown document.pdf -o document.md

Batch Convert

# Using included script
python ~/.openclaw/skills/markitdown/scripts/batch_convert.py docs/*.pdf -o markdown/ -v

# Or a shell loop (writes each .md next to its source PDF)
for file in docs/*.pdf; do
  markitdown "$file" -o "${file%.pdf}.md"
done
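
The shell loop writes each Markdown file beside its source PDF. When the output should go to a separate directory instead, the same idea can be sketched in Python. This is a minimal sketch, not the skill's batch_convert.py: it assumes only the `markitdown <input> -o <output>` form shown in this document, and `output_path_for`, `build_command`, and `convert_all` are hypothetical helper names.

```python
import subprocess
from pathlib import Path


def output_path_for(src: Path, out_dir: Path) -> Path:
    """Map docs/report.pdf to <out_dir>/report.md (hypothetical helper)."""
    return out_dir / (src.stem + ".md")


def build_command(src: Path, dst: Path) -> list[str]:
    """Build the CLI call using only the -o flag shown in this document."""
    return ["markitdown", str(src), "-o", str(dst)]


def convert_all(src_dir: Path, out_dir: Path, pattern: str = "*.pdf") -> list[Path]:
    """Convert every matching file and return the sources that failed."""
    out_dir.mkdir(parents=True, exist_ok=True)
    failed = []
    for src in sorted(src_dir.glob(pattern)):
        cmd = build_command(src, output_path_for(src, out_dir))
        # check=False so that one bad file does not abort the whole batch
        if subprocess.run(cmd, check=False).returncode != 0:
            failed.append(src)
    return failed
```

Calling `convert_all(Path("docs"), Path("markdown"))` would mirror the loop above while collecting the files that failed. It still requires the markitdown CLI to be on PATH.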

Python API

from markitdown import MarkItDown

md = MarkItDown()
result = md.convert("document.pdf")
print(result.text_content)
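
To save the result instead of printing it, the snippet above can be wrapped in a small helper. This is a sketch under stated assumptions: it relies only on `MarkItDown().convert(...)` and `result.text_content` as shown above, and `convert_to_file` and its `converter` parameter are hypothetical names introduced here for illustration.

```python
from pathlib import Path


def convert_to_file(src, dst, converter=None):
    """Convert `src` to Markdown and write the text to `dst`.

    `converter` is any object with a `convert(path)` method whose result
    has a `text_content` attribute, as MarkItDown does in the snippet above.
    Returns the number of characters written.
    """
    if converter is None:
        # Imported lazily so the helper loads even without markitdown installed
        from markitdown import MarkItDown
        converter = MarkItDown()

    result = converter.convert(str(src))
    dst = Path(dst)
    # Create the output directory if it does not exist yet
    dst.parent.mkdir(parents=True, exist_ok=True)
    dst.write_text(result.text_content, encoding="utf-8")
    return len(result.text_content)
```

Passing the converter in lets one `MarkItDown()` instance be reused across many files, and makes the helper easy to exercise with a stand-in object.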

Troubleshooting

"markitdown not found"

pip install 'markitdown[all]'

OCR Not Working

# Ubuntu/Debian
sudo apt-get install tesseract-ocr

# macOS
brew install tesseract
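
Both problems above come down to an executable that cannot be found. A quick check of PATH can tell which one is missing before reinstalling anything. This is a minimal sketch: `missing_tools` is a hypothetical helper, and it assumes the executables are named `markitdown` and `tesseract`, as the commands in this document suggest.

```python
import shutil


def missing_tools(names=("markitdown", "tesseract")):
    """Return the names from `names` that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    for name in missing_tools():
        print(f"not on PATH: {name}")
```

If a tool is installed but still reported, the directory holding it is probably not on PATH, and reinstalling alone will not fix that.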

What This Skill Provides

Component                  Source
markitdown CLI             Microsoft's pip package
markitdown Python API      Microsoft's pip package
scripts/batch_convert.py   This skill (utility)
Documentation              This skill

First Seen
Mar 10, 2026