docx-to-md
Word to Markdown Conversion
Ingest Word documents into your Markdown workflow
Convert .docx files into clean, linted Markdown with extracted images, normalized headings, and cleaned table formatting. This is the reverse of md-to-word, intended for ingesting external documents.
When to Use
- Importing Word documents from stakeholders into a Markdown-based workflow
- Converting legacy documentation to Markdown for version control
- Extracting content from .docx for further processing (presentations, email, web)
- Onboarding external resources (SOWs, RFPs, specs) into project repositories
Key Features
| Feature | Details |
|---|---|
| Image extraction | Embedded images saved to images/ folder with sequential naming |
| Pandoc cleanup | Removes escaped brackets, span classes, trailing backslashes |
| Table normalization | Aligns columns, adds proper separators |
| Heading fix | Normalizes hierarchy to start at H1 |
| Frontmatter | Optional YAML frontmatter with title and date |
| Comment stripping | Removes Word review comments |
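To make the "Pandoc cleanup" row concrete, here is a minimal sketch of what such a pass could look like. The function name `cleanPandocQuirks` and the exact regular expressions are assumptions for illustration; the actual rules in docx-to-md.cjs may differ.

```javascript
// Hypothetical helper: illustrates the three cleanups named in the table.
// The real docx-to-md.cjs may implement them differently.
function cleanPandocQuirks(markdown) {
  return markdown
    // Unwrap bracketed spans such as [text]{.underline}, keeping only the text
    .replace(/\[([^\]]*)\]\{[^}]*\}/g, '$1')
    // Unescape brackets that Pandoc escaped: \[ and \] become [ and ]
    .replace(/\\([\[\]])/g, '$1')
    // Drop a trailing backslash (Pandoc's hard line break) at the end of a line
    .replace(/\\$/gm, '');
}

const raw = 'See \\[1\\] and [important]{.underline} text\\\nnext';
console.log(cleanPandocQuirks(raw));
```

Spans are unwrapped before brackets are unescaped, so that escaped literal brackets in the source are not mistaken for span syntax.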
Usage
# Basic conversion
node .github/muscles/docx-to-md.cjs report.docx
# With frontmatter and heading normalization
node .github/muscles/docx-to-md.cjs spec.docx --add-frontmatter --fix-headings
# Strip review comments
node .github/muscles/docx-to-md.cjs reviewed.docx --strip-comments
# Custom output path
node .github/muscles/docx-to-md.cjs input.docx output/document.md
# Debug mode (keeps raw pandoc output)
node .github/muscles/docx-to-md.cjs input.docx --debug
Post-Processing Pipeline
- pandoc converts .docx to raw Markdown
- Pandoc quirks cleaned (escaped brackets, attributes, etc.)
- Word comments stripped (optional)
- Heading hierarchy normalized (optional)
- Table formatting cleaned
- Images extracted to images/ folder
- YAML frontmatter generated (optional)
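Two of the optional steps above, heading normalization and frontmatter generation, can be sketched as pure string functions. This is an illustration under assumptions: the helper names, the fence handling, and passing the title in as a parameter are hypothetical, and the script's own logic is not shown in this document.

```javascript
// Hypothetical helpers: names and behavior are assumptions for illustration.

// Shift ATX headings so the shallowest level in the document becomes H1.
function normalizeHeadings(markdown) {
  const lines = markdown.split('\n');
  const headingPattern = /^(#{1,6})(\s+.*)$/;
  const isFence = (line) => /^\s*(```|~~~)/.test(line);
  let inFence = false;
  let minLevel = Infinity;

  // First pass: find the shallowest heading level outside fenced code
  for (const line of lines) {
    if (isFence(line)) { inFence = !inFence; continue; }
    const match = !inFence && headingPattern.exec(line);
    if (match) minLevel = Math.min(minLevel, match[1].length);
  }
  if (!Number.isFinite(minLevel) || minLevel === 1) return markdown;

  // Second pass: remove the surplus '#' characters from every heading
  const shift = minLevel - 1;
  inFence = false;
  return lines.map((line) => {
    if (isFence(line)) { inFence = !inFence; return line; }
    if (inFence) return line;
    const match = headingPattern.exec(line);
    return match ? '#'.repeat(match[1].length - shift) + match[2] : line;
  }).join('\n');
}

// Prepend YAML frontmatter with a title and an ISO date (YYYY-MM-DD, UTC).
function addFrontmatter(markdown, title, date = new Date()) {
  const isoDate = date.toISOString().slice(0, 10);
  const safeTitle = JSON.stringify(title); // double-quoted YAML scalar
  return `---\ntitle: ${safeTitle}\ndate: ${isoDate}\n---\n\n${markdown}`;
}

const body = normalizeHeadings('## Intro\n\ntext\n\n### Detail');
console.log(addFrontmatter(body, 'Spec', new Date('2024-01-15T00:00:00Z')));
```

Shifting every heading by the same amount preserves the relative hierarchy, which is one plausible reading of "start at H1".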
Requirements
- Node.js 18+
- pandoc (winget install pandoc)
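Before running a conversion, it can help to confirm that both requirements are met. The following is a small sketch, not part of the documented script; the function names `hasNodeVersion` and `hasPandoc` are hypothetical.

```javascript
const { spawnSync } = require('node:child_process');

// Returns true when the running Node.js major version is at least minMajor.
function hasNodeVersion(minMajor = 18) {
  const major = Number(process.versions.node.split('.')[0]);
  return major >= minMajor;
}

// Returns true when `pandoc --version` can be started and exits with status 0.
function hasPandoc() {
  const result = spawnSync('pandoc', ['--version'], { encoding: 'utf8' });
  return !result.error && result.status === 0;
}

console.log(`Node.js 18+: ${hasNodeVersion() ? 'ok' : 'missing'}`);
console.log(`pandoc: ${hasPandoc() ? 'ok' : 'missing'}`);
```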
Muscle Script
.github/muscles/docx-to-md.cjs
Related Skills
- md-to-word -- Reverse direction (Markdown to Word)
- lint-clean-markdown -- Post-validate converted Markdown
- md-scaffold -- Template for structuring imported content