pdf-extractor

SKILL.md

PDF Extractor

Extract text, metadata, and content from PDF documents.

Features

  • text: Extract all text content from PDF
  • metadata: Extract document properties (title, author, creation date, etc.)
  • pages: Get total page count
  • selective: Extract specific page ranges
  • info: Get PDF version and encryption status

Usage

# Extract all text
./scripts/pdf-extract.js --file ./document.pdf

# Extract specific pages
./scripts/pdf-extract.js --file ./document.pdf --pages "1-5"

# Extract metadata only
./scripts/pdf-extract.js --file ./document.pdf --metadata

# Get page count and info
./scripts/pdf-extract.js --file ./document.pdf --info

# Output as JSON
./scripts/pdf-extract.js --file ./document.pdf --format json

Examples

Task Command Output
Extract text pdf-extract.js --file doc.pdf Full text content
Pages 1-3 pdf-extract.js --file doc.pdf --pages 1-3 Text from pages 1-3
Metadata pdf-extract.js --file doc.pdf --metadata Document properties
Info pdf-extract.js --file doc.pdf --info Page count, version
JSON output pdf-extract.js --file doc.pdf --json Structured JSON

Notes

  • Supports most PDF formats (PDF 1.0-1.7)
  • Handles encrypted PDFs (prompts for password)
  • Memory efficient for large documents
  • Preserves text layout where possible
Weekly Installs
3
First Seen
Mar 1, 2026
Installed on
opencode3
gemini-cli3
claude-code3
github-copilot3
codex3
kimi-cli3