doc-to-text

Document to Text Reborn (Digital Archaeologist)

Overview

This skill uses a three-layer extraction model to "excavate" both meaning and aesthetics from a range of document formats. It separates pure content from design and metadata, so each layer can be analyzed and reused on its own.

3-Layer Extraction Model

  1. Content Layer (Soul): High-fidelity text extraction maintaining structural elements like headings and tables (Markdown output).
  2. Aesthetic Layer (Mask): Extraction of design parameters, colors, fonts, and layout grid information.
  3. Metadata Layer (Context): File properties, authorship, and contextual markers.
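
As a rough illustration of how the three layers might sit side by side in one result object, here is a minimal sketch. The function name `buildResult` and every field name (`fonts`, `colors`, `grid`, `title`, `author`) are assumptions made for this example; the tool's actual JSON schema is not documented here.

```javascript
// Hypothetical result shape: one key per layer. Field names are illustrative
// assumptions, not the tool's documented schema.
function buildResult(mode, layers) {
  const all = {
    content: layers.content,     // Content Layer (Soul): Markdown text
    aesthetic: layers.aesthetic, // Aesthetic Layer (Mask): design parameters
    metadata: layers.metadata,   // Metadata Layer (Context): file properties
  };
  if (mode === 'all') return all;
  if (!Object.hasOwn(all, mode)) throw new Error(`Unknown mode: ${mode}`);
  // Return only the requested layer.
  return { [mode]: all[mode] };
}

const example = buildResult('content', {
  content: '# Quarterly Report\n\n| Region | Sales |\n| --- | --- |\n| EMEA | 120 |',
  aesthetic: { fonts: ['Helvetica'], colors: ['#1a1a1a'], grid: { columns: 12 } },
  metadata: { title: 'Quarterly Report', author: 'Unknown' },
});
console.log(Object.keys(example)); // [ 'content' ]
```

Keeping each layer under its own key means a caller interested only in the text can ignore the design and metadata fields entirely.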

Supported Formats

  • PDF: Text and metadata. (Aesthetic: Coordinate-based analysis)
  • Word (.docx): Structural Markdown conversion. (Aesthetic: Style extraction)
  • Excel (.xlsx): Multi-sheet CSV extraction.
  • PowerPoint (.pptx): Slide-based content extraction.
  • Images: OCR supporting English and Japanese.
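
One plausible way to route a file to the right extractor is a lookup by file extension. The sketch below is an assumption about how such routing could work, not the skill's actual implementation: the `detectFormat` helper and the format labels are hypothetical, and the image extensions are guesses, since the list above names only "Images".

```javascript
const path = require('node:path');

// Extension-to-format table based on the list above.
// The image extensions are assumptions; the source does not enumerate them.
const FORMAT_BY_EXTENSION = {
  '.pdf': 'pdf',
  '.docx': 'word',
  '.xlsx': 'excel',
  '.pptx': 'powerpoint',
  '.png': 'image',
  '.jpg': 'image',
  '.jpeg': 'image',
};

// Hypothetical helper: map a file path to one of the supported formats.
function detectFormat(filePath) {
  const ext = path.extname(filePath).toLowerCase();
  const format = FORMAT_BY_EXTENSION[ext];
  if (!format) {
    throw new Error(`Unsupported file type: ${ext || '(none)'}`);
  }
  return format;
}

console.log(detectFormat('report.pdf'));    // pdf
console.log(detectFormat('Brochure.DOCX')); // word
```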

Usage

node dist/index.js <file_path> [options]

Options

  • --mode, -m: Extraction mode. Choices: content, aesthetic, metadata, all (default).
  • --out, -o: Save the structured JSON result to a file.
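
For readers who want to see how these two options could be parsed, the following sketch uses Node's built-in `util.parseArgs`. It illustrates the documented flags only; the `parseCli` function and its error messages are assumptions and may differ from what `dist/index.js` actually does.

```javascript
const { parseArgs } = require('node:util');

// Mode choices as listed in the Options section.
const MODES = ['content', 'aesthetic', 'metadata', 'all'];

// Hypothetical parser for: <file_path> [--mode|-m <mode>] [--out|-o <file>]
function parseCli(argv) {
  const { values, positionals } = parseArgs({
    args: argv,
    options: {
      mode: { type: 'string', short: 'm', default: 'all' },
      out: { type: 'string', short: 'o' },
    },
    allowPositionals: true,
  });
  if (positionals.length !== 1) {
    throw new Error('Expected exactly one <file_path>');
  }
  if (!MODES.includes(values.mode)) {
    throw new Error(`Invalid mode "${values.mode}". Choices: ${MODES.join(', ')}`);
  }
  // `out` stays undefined when the flag is not given.
  return { filePath: positionals[0], mode: values.mode, out: values.out };
}

console.log(parseCli(['report.pdf', '-m', 'content']));
```

In a real entry point, the argument list would typically come from `process.argv.slice(2)`.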

Examples

Extract only text (soul) as Markdown:

node dist/index.js report.pdf --mode content

Extract design/layout DNA (mask):

node dist/index.js brochure.docx --mode aesthetic

Dependencies

  • pdf-parse: Basic PDF text.
  • mammoth: Word-to-Markdown conversion.
  • xlsx: Excel data parsing.
  • tesseract.js: Image OCR.

First seen: Feb 13, 2026