ocr

case.dev OCR

Production-grade document OCR with table extraction and word-level positional data. Processes PDFs and images (PNG, JPG, TIFF, BMP, WEBP) up to 500 MB.

Requires the casedev CLI. See the setup skill for installation and authentication.

Process a Document

casedev ocr process --document-url "https://example.com/contract.pdf" --json

Flags: --document-url (required), --document-id (optional tag for the document), --engine (overrides the default OCR engine).

Returns a job ID and initial status.

Check Job Status

casedev ocr status JOB_ID --json

Statuses: queued -> processing -> completed or failed.
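
The status lifecycle above lends itself to a small polling helper for scripts that cannot use casedev ocr watch. What follows is a minimal sketch, not documented CLI behavior: it assumes the --json output contains a top-level string field named "status" (the field name is a guess), and json_field and poll_ocr_job are hypothetical helper names. The sed extraction only handles flat string fields; a JSON-aware tool is more robust if one is available.

```shell
#!/bin/sh
# Crude extractor for a flat string field in JSON read from stdin.
# Usage: json_field FIELD_NAME
json_field() {
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Poll a job until it reaches a terminal status (completed or failed).
# Usage: poll_ocr_job JOB_ID [INTERVAL_SECONDS] [MAX_ATTEMPTS]
poll_ocr_job() {
  job_id=$1
  interval=${2:-3}
  attempts=${3:-300}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # ASSUMPTION: the status command prints JSON with a "status" field.
    status=$(casedev ocr status "$job_id" --json | json_field status)
    case "$status" in
      completed) echo "completed"; return 0 ;;
      failed)    echo "failed"; return 1 ;;
      queued|processing) sleep "$interval" ;;
      *) echo "unexpected status: $status" >&2; return 2 ;;
    esac
    i=$((i + 1))
  done
  echo "gave up after $attempts attempts" >&2
  return 3
}
```

Calling poll_ocr_job JOB_ID 5 120 would check every 5 seconds for up to 120 attempts. Distinct return codes let a calling script tell a failed job apart from a timeout.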

Watch Until Complete

casedev ocr watch JOB_ID --json

Flags: --interval (polling interval, default: 3s), --timeout (maximum wait, default: 900s).

Word-Level Data

casedev ocr words --vault VAULT_ID --object OBJECT_ID --json

Requires the document to be in a vault with completed OCR ingestion.

Flags: --page (specific page), --word-start, --word-end (index range).

Returns per-page word arrays with text, word index, and confidence scores.
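
To illustrate how the flags above combine, the sketch below wraps casedev ocr words so that the page and word-range filters are only passed when supplied. fetch_words is a hypothetical helper name. Whether pages and word indices are zero- or one-based, and whether --word-end is inclusive, is not stated here, so the sample values are placeholders.

```shell
#!/bin/sh
# Fetch word-level OCR data, optionally limited to a page and index range.
# Usage: fetch_words VAULT_ID OBJECT_ID [PAGE] [WORD_START] [WORD_END]
fetch_words() {
  vault_id=$1
  object_id=$2
  page=${3:-}
  word_start=${4:-}
  word_end=${5:-}

  # Start from the required arguments, then append optional filters.
  set -- ocr words --vault "$vault_id" --object "$object_id"
  if [ -n "$page" ]; then
    set -- "$@" --page "$page"
  fi
  if [ -n "$word_start" ]; then
    set -- "$@" --word-start "$word_start"
  fi
  if [ -n "$word_end" ]; then
    set -- "$@" --word-end "$word_end"
  fi

  casedev "$@" --json
}
```

For example, fetch_words VAULT_ID OBJECT_ID 2 0 50 would request a slice of words from page 2, while fetch_words VAULT_ID OBJECT_ID returns every page.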

Common Workflows

OCR a vault document

# 1. Upload (triggers automatic ingestion + OCR)
casedev vault object upload ./scanned-contract.pdf --vault VAULT_ID --json

# 2. Check ingestion status
casedev vault object list --vault VAULT_ID --json

# 3. Get word-level data
casedev ocr words --vault VAULT_ID --object OBJECT_ID --json

OCR an external document

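# 1. Submit the document (returns a job ID)
casedev ocr process --document-url "https://storage.example.com/doc.pdf" --json

# 2. Wait for the job to finish, using the job ID from step 1
casedev ocr watch JOB_ID --json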
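
The two commands above can be chained in a script so the job ID does not have to be copied by hand. This is a sketch under stated assumptions: the process command's JSON is assumed to carry the job ID in a top-level string field named "id" (the actual field name may differ), and ocr_external is a hypothetical helper name.

```shell
#!/bin/sh
# Submit a document URL for OCR and block until the job finishes.
# Usage: ocr_external DOCUMENT_URL [TIMEOUT_SECONDS]
ocr_external() {
  url=$1
  timeout=${2:-900}

  response=$(casedev ocr process --document-url "$url" --json) || return 1

  # ASSUMPTION: the job ID is a top-level string field named "id".
  job_id=$(printf '%s\n' "$response" |
    sed -n 's/.*"id"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1)
  if [ -z "$job_id" ]; then
    echo "could not find a job ID in: $response" >&2
    return 1
  fi

  # Hand the job over to the built-in watcher with an explicit timeout.
  casedev ocr watch "$job_id" --timeout "$timeout" --json
}
```

Running ocr_external "https://storage.example.com/doc.pdf" 1800 would submit the document and wait up to 1800 seconds, which suits large documents.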

Troubleshooting

"Invalid file type for OCR": Only PDFs and images (PNG, JPG, TIFF, BMP, WEBP) are supported. Check the object's content type with casedev vault object list.

Job stuck in "processing": Large documents (100+ pages) take longer. Raise the watch timeout, for example casedev ocr watch JOB_ID --timeout 1800.

"OCR job failed": Document may be corrupted or unsupported. Re-upload and retry.

Repository
casemark/skills