local-ocr

Installation

SKILL.md

Local OCR Pipeline Skill

Robust Optical Character Recognition (OCR) pipeline driven by ocrmypdf and tesseract. Handles scanned PDFs, rotated image inputs, and raw text extraction securely and locally without external APIs.

Why not GPU via PyTorch/EasyOCR? The ocrmypdf tool is the industry standard for producing searchable PDFs. It leverages tesseract for pixel-accurate text placement. A pure-CPU pipeline is leaner (avoids a 1.5GB PyTorch payload) and reliably embeds text exactly where it appears in the scanned image.

Capabilities

Searchable PDF Generation: Converts rasterized/scanned PDFs or raw images (.jpg, .png, etc.) into PDFs with a selectable, searchable text layer.
Auto-Rotation & Deskew: Automatically detects incorrectly rotated text and straightens crooked scans.
Idempotent In-Place Processing: Safely processes files in-place using --skip-text, preventing double-processing of a PDF that already has embedded text.
Structured JSON Output: All commands output structured JSON, making failure states (like missing dependencies) parseable by agents.
Raw Text Extraction: Raw string extraction fallback for when agents need text directly in-memory instead of a PDF file.

Setup

# Installs system dependencies (tesseract, ocrmypdf, ghostscript) and sets up isolated venv
bash skills/ocr/scripts/setup.sh

Usage

uv run --project ~/.local-ocr scripts/ocr.py <command>

1. Generate a Searchable PDF (`pdf`)

Produces a standard, layered PDF. If you give it an image, it wraps it in a PDF. If you give it a scanned PDF, it adds the invisible text layer.

# Overwrites the file in-place, skipping it safely if it already contains text
uv run --project ~/.local-ocr scripts/ocr.py pdf ./scanned_invoice.pdf

# Output to a different file
uv run --project ~/.local-ocr scripts/ocr.py pdf ./scan_001.png -o ./contract.pdf

# Force reprocessing (ignore existing text layer)
uv run --project ~/.local-ocr scripts/ocr.py pdf ./scanned_invoice.pdf --force

Note: By default, auto-rotate and deskew are enabled. Disable with --no-rotate or --no-deskew.

2. Batch Process a Directory (`batch`)

Recursively scans a directory for images and PDFs, applying OCR.

# Process all files. Skips already-OCRed PDFs.
uv run --project ~/.local-ocr scripts/ocr.py batch ./archives/

3. Extract Raw Text (`text`)

Does not create a PDF. Just reads the words off the page and returns them as a JSON string. Good for agents reading documents on the fly.

uv run --project ~/.local-ocr scripts/ocr.py text ./han_solo_invoice.png

Franchise Examples (Star Wars)

Process the Death Star blueprints: uv run --project ~/.local-ocr scripts/ocr.py pdf ./ds-1_schematics.pdf
Extract raw orders: uv run --project ~/.local-ocr scripts/ocr.py text ./order_66_memo.jpg
Archive run: uv run --project ~/.local-ocr scripts/ocr.py batch /archives/jedi_temple

Troubleshooting

File already contains text: This is the most common "error", but it isn't an error. ocrmypdf returns exit code 6 when it skips a file that already has text. The wrapper script catches this and reports a JSON "status": "success" with a message noting the side-step.
Dependencies Missing: Run the setup.sh script again if the agent complains about missing tesseract or Python modules.

Related skills

More from baphomet480/claude-skills

Installs

Repository

baphomet480/cla…e-skills

First Seen

Feb 28, 2026

Security Audits

Gen Agent Trust HubPass

SocketPass

SnykPass

local-ocr

Local OCR Pipeline Skill

Capabilities

Setup

Usage

1. Generate a Searchable PDF (`pdf`)

2. Batch Process a Directory (`batch`)

3. Extract Raw Text (`text`)

Franchise Examples (Star Wars)

Troubleshooting

More from baphomet480/claude-skills

kitchen-sink-design-system

deep-research

design-lookup

nextjs-tinacms

cloudflare-pages

hugo-sveltia-cms

local-ocr

Local OCR Pipeline Skill

Capabilities

Setup

Usage

1. Generate a Searchable PDF (pdf)

2. Batch Process a Directory (batch)

3. Extract Raw Text (text)

Franchise Examples (Star Wars)

Troubleshooting

More from baphomet480/claude-skills

kitchen-sink-design-system

deep-research

design-lookup

nextjs-tinacms

cloudflare-pages

hugo-sveltia-cms

1. Generate a Searchable PDF (`pdf`)

2. Batch Process a Directory (`batch`)

3. Extract Raw Text (`text`)