OpenOCR Skill
Overview
This skill extracts text, parses documents, and performs universal recognition using OpenOCR, an accurate and efficient general OCR system. It provides a unified interface for text detection, text recognition, end-to-end OCR, VLM-based universal recognition (text, formulas, and tables), and document parsing with layout analysis. It supports Chinese, English, and other languages.
How to Use
- Provide the image, scanned document, or PDF
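- Optionally specify the task type: det (text detection), rec (text recognition), ocr (end-to-end OCR), unirec (universal recognition), or doc (document parsing)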
- I'll extract text, formulas, tables, or full document structure
Example prompts:
- "Extract all text from this image"
- "Detect text regions in this photo"
- "Recognize the formula in this screenshot"
- "Parse this PDF document with layout analysis"
- "Convert this scanned page to Markdown"
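The example prompts above each imply one of the task types (det/rec/ocr/unirec/doc). As a rough illustration of how a prompt might be routed to a task when none is specified, here is a minimal sketch. The names `TASKS`, `KEYWORD_RULES`, and `pick_task`, along with the keyword lists, are hypothetical and are not part of OpenOCR; the actual skill may choose the task quite differently.

```python
# Hypothetical routing helper: none of these names come from OpenOCR itself.
# Task abbreviations and descriptions follow the Overview section.
TASKS = {
    "det": "text detection",
    "rec": "text recognition",
    "ocr": "end-to-end OCR",
    "unirec": "VLM-based universal recognition (text/formulas/tables)",
    "doc": "document parsing with layout analysis",
}

# Keyword rules are checked in order; the first rule that matches wins.
# The keywords are assumptions chosen to fit the example prompts.
KEYWORD_RULES = [
    (("detect", "region"), "det"),
    (("formula", "table"), "unirec"),
    (("parse", "layout", "markdown", "pdf"), "doc"),
]


def pick_task(prompt: str, default: str = "ocr") -> str:
    """Return a task abbreviation for a free-form prompt.

    Falls back to end-to-end OCR when no keyword matches. The "rec" task
    is never inferred here; in this sketch it must be requested explicitly.
    """
    text = prompt.lower()
    for keywords, task in KEYWORD_RULES:
        if any(keyword in text for keyword in keywords):
            return task
    return default


if __name__ == "__main__":
    for prompt in (
        "Extract all text from this image",
        "Detect text regions in this photo",
        "Recognize the formula in this screenshot",
        "Parse this PDF document with layout analysis",
        "Convert this scanned page to Markdown",
    ):
        task = pick_task(prompt)
        print(f"{prompt!r} -> {task} ({TASKS[task]})")
```

Naming the task type explicitly in the prompt avoids relying on this kind of guesswork.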