PDF-Text-Extractor - Extract Text from PDFs

Vernox Utility Skill - Perfect for document digitization.

Overview

PDF-Text-Extractor is a zero-dependency tool for extracting text content from PDF files. It supports both embedded-text extraction (for text-based PDFs) and OCR (for scanned documents).

Features

✅ Text Extraction

  • Extract text from PDFs without external tools
  • Support for both text-based and scanned PDFs
  • Preserve document structure and formatting
  • Fast extraction (milliseconds for text-based PDFs)
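
Since the skill handles both text-based and scanned PDFs, something has to decide which path a given page takes. The sketch below shows one plausible heuristic, not the skill's actual logic: if the embedded text layer yields too few visible characters, fall back to OCR. The function name `chooseExtractionMode` and the `minChars` threshold are assumptions made for illustration.

```typescript
// Hypothetical helper (not part of the skill's documented API):
// pick an extraction path based on how much embedded text a page yields.
type ExtractionMode = "embedded" | "ocr";

function chooseExtractionMode(embeddedText: string, minChars = 20): ExtractionMode {
  // Ignore whitespace so pages containing only line breaks count as empty.
  const visible = embeddedText.replace(/\s+/g, "");
  // Assumed threshold: fewer than `minChars` visible characters suggests a scanned page.
  return visible.length >= minChars ? "embedded" : "ocr";
}
```

A page with a real text layer would take the fast embedded path, while an image-only scan (empty or whitespace-only text) would be routed to OCR. The threshold is a tuning knob; a suitable value depends on the documents being processed.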

✅ OCR Support

  • Use Tesseract.js for scanned documents
  • Support multiple languages (English, Spanish, French, German)
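
Tesseract identifies languages by three-letter codes and accepts several at once joined with `+` (for example `eng+deu`). As a rough sketch of how the four listed languages might be mapped, the helper below converts human-readable names into such a string. The names `TESSERACT_LANG_CODES` and `toTesseractLangs` are hypothetical and not taken from the skill; only the language codes themselves are standard Tesseract identifiers.

```typescript
// Hypothetical mapping from the languages listed above to Tesseract language codes.
const TESSERACT_LANG_CODES: Record<string, string> = {
  english: "eng",
  spanish: "spa",
  french: "fra",
  german: "deu",
};

// Build a "+"-joined language string such as "eng+deu" for a Tesseract-based OCR call.
function toTesseractLangs(languages: string[]): string {
  if (languages.length === 0) {
    throw new Error("At least one OCR language is required");
  }
  const codes = languages.map((name) => {
    const code = TESSERACT_LANG_CODES[name.trim().toLowerCase()];
    if (code === undefined) {
      throw new Error(`Unsupported OCR language: ${name}`);
    }
    return code;
  });
  // Drop duplicates while keeping the caller's order.
  return Array.from(new Set(codes)).join("+");
}
```

The resulting string is what one would pass as the language argument when invoking Tesseract.js on a scanned page. Rejecting unknown names early avoids a slower failure when the OCR engine tries to load missing language data.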

Installs: 21 (first seen Mar 28, 2026)