doc-to-text
Document to Text Reborn (Digital Archaeologist)
Overview
This skill uses a 3-layer extraction model to "excavate" meaning and aesthetics from PDF, Office, and image files. It separates pure content from design and metadata, so each layer can be analyzed and reused with high fidelity.
3-Layer Extraction Model
- Content Layer (Soul): High-fidelity text extraction maintaining structural elements like headings and tables (Markdown output).
- Aesthetic Layer (Mask): Extraction of design parameters, colors, fonts, and layout grid information.
- Metadata Layer (Context): File properties, authorship, and contextual markers.
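As a rough illustration of the three layers, the sketch below models an extraction result as one object with a key per layer and a helper that keeps only the requested layer. The function names (`makeResult`, `selectLayers`) and the field contents are assumptions made for this example; the skill's actual JSON structure is not documented here and may differ.

```javascript
// Hypothetical result shape: one key per layer. Names are illustrative only.
const LAYERS = ["content", "aesthetic", "metadata"];

function makeResult({ content = "", aesthetic = {}, metadata = {} } = {}) {
  return { content, aesthetic, metadata };
}

// Return only the layer named by `mode`, or a copy of all layers for "all".
function selectLayers(result, mode = "all") {
  if (mode === "all") return { ...result };
  if (!LAYERS.includes(mode)) {
    throw new Error(`Unknown mode: ${mode}`);
  }
  return { [mode]: result[mode] };
}

// Example with made-up values.
const example = makeResult({
  content: "# Quarterly Report\n\nRevenue grew.",
  aesthetic: { fonts: ["Helvetica"], colors: ["#1a1a1a"] },
  metadata: { author: "Unknown" },
});
console.log(Object.keys(selectLayers(example, "aesthetic")));
```

Keeping the layers under separate keys means a consumer that only needs the Markdown text never has to touch design or file-property data.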
Supported Formats
- PDF: Text and metadata. (Aesthetic: Coordinate-based analysis)
- Word (.docx): Structural Markdown conversion. (Aesthetic: Style extraction)
- Excel (.xlsx): Multi-sheet CSV extraction.
- PowerPoint (.pptx): Slide-based content extraction.
- Images: OCR supporting English and Japanese.
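One plausible way to route a file to the right extractor is a lookup keyed on the file extension. The sketch below is an assumption about how such dispatch could look, not the skill's actual code: `detectFormat` and the format labels are hypothetical, and the image extensions are guesses, since the list above only says "Images".

```javascript
// Hypothetical mapping from extension to format family.
// The image extensions are assumed; the source does not list them.
const FORMAT_BY_EXTENSION = {
  ".pdf": "pdf",
  ".docx": "word",
  ".xlsx": "excel",
  ".pptx": "powerpoint",
  ".png": "image",
  ".jpg": "image",
  ".jpeg": "image",
};

// Look up the format family for a path, case-insensitively.
function detectFormat(filePath) {
  const dot = filePath.lastIndexOf(".");
  const ext = dot === -1 ? "" : filePath.slice(dot).toLowerCase();
  const format = FORMAT_BY_EXTENSION[ext];
  if (!format) {
    throw new Error(`Unsupported file type: ${ext || "(none)"}`);
  }
  return format;
}

console.log(detectFormat("report.pdf"));
console.log(detectFormat("Brochure.DOCX"));
```

Failing fast on an unknown extension keeps unsupported files from reaching a parser that would produce confusing errors.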
Usage
node dist/index.js <file_path> [options]
Options
- --mode, -m: Extraction mode. Choices: content, aesthetic, metadata, all (default).
- --out, -o: Save the structural JSON result to a file.
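To make the option semantics concrete, here is a minimal hand-rolled parser for the flags listed above. It is a sketch under assumptions: the real CLI may use an argument-parsing library and behave differently on edge cases, and `parseArgs` is a hypothetical name. Only the flags, their short forms, the mode choices, and the `all` default come from the documentation.

```javascript
const MODES = ["content", "aesthetic", "metadata", "all"];

// Parse the arguments that follow `node dist/index.js`.
function parseArgs(argv) {
  const opts = { file: undefined, mode: "all", out: undefined };
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (arg === "--mode" || arg === "-m") {
      opts.mode = argv[++i]; // the value is the next token
    } else if (arg === "--out" || arg === "-o") {
      opts.out = argv[++i];
    } else if (opts.file === undefined) {
      opts.file = arg; // first bare argument is the input file
    }
  }
  if (opts.file === undefined) {
    throw new Error("Missing <file_path>");
  }
  if (!MODES.includes(opts.mode)) {
    throw new Error(`Invalid mode "${opts.mode}". Choices: ${MODES.join(", ")}`);
  }
  return opts;
}

console.log(parseArgs(["report.pdf", "-m", "content", "-o", "result.json"]));
```

In a real entry point this would be called as `parseArgs(process.argv.slice(2))`.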
Examples
Extract only text (soul) as Markdown:
node dist/index.js report.pdf --mode content
Extract design/layout DNA (mask):
node dist/index.js brochure.docx --mode aesthetic
Dependencies
- pdf-parse: Basic PDF text.
- mammoth: Word-to-Markdown conversion.
- xlsx: Excel data parsing.
- tesseract.js: Image OCR.