docx-to-markdown
DOCX to Markdown Skill
Convert DOCX files to well-formatted markdown files with images extracted.
Usage
python3 docx_to_markdown.py "<docx_path>" -o "<output_path>"
Features
- Metadata extraction: title, author, subject, keywords
- Heading preservation: Maintains H1-H6 hierarchy from Word styles
- Inline formatting: Bold, italic conversion
- List support: Ordered and unordered lists
- Table extraction: Tables converted to markdown format
- Image extraction: Extracts images to the _files_/ folder with a document prefix
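As a rough illustration of the heading and inline-formatting features above, the following minimal sketch maps Word style names to markdown heading prefixes and wraps bold or italic runs in emphasis markers. The function names `heading_prefix` and `render_run` are hypothetical and not taken from docx_to_markdown.py, and the sketch assumes headings use the default English style names "Heading 1" through "Heading 6".

```python
import re


def heading_prefix(style_name: str) -> str:
    """Return the markdown prefix for a Word style name, or '' for non-headings.

    Assumption: headings carry the default style names "Heading 1".."Heading 6".
    """
    match = re.fullmatch(r"Heading ([1-6])", style_name)
    return "#" * int(match.group(1)) + " " if match else ""


def render_run(text: str, bold: bool = False, italic: bool = False) -> str:
    """Wrap one text run in markdown emphasis markers."""
    if not text.strip():
        return text  # do not emit markers around whitespace-only runs
    if bold and italic:
        return f"***{text}***"
    if bold:
        return f"**{text}**"
    if italic:
        return f"*{text}*"
    return text


if __name__ == "__main__":
    # A level-2 heading whose only run is bold.
    print(heading_prefix("Heading 2") + render_run("Overview", bold=True))
```

With python-docx, the style name and run flags would typically come from `paragraph.style.name`, `run.bold`, and `run.italic`; whether the script reads them this way is not stated here.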
Output Structure
output_dir/
├── document.docx # Original file
├── document.md # Extracted markdown
└── _files_/ # Images folder
    ├── DocTitle_image1.png
    ├── DocTitle_figure2.jpg
    └── ...
Image Prefix
Images are extracted with a document prefix derived from the title:
- Title: "Die Empty: Unleash Your Best Work Every Day"
- Prefix: DieEmptyUnleash (first 3 words, special chars removed)
- Image: _files_/DieEmptyUnleash_image1.png
This prevents filename collisions when extracting multiple documents to the same folder.
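The prefix rule described above can be sketched as follows. This is a minimal illustration, not the script's actual code: the name `image_prefix` is hypothetical, and it assumes "special chars" means anything other than ASCII letters and digits.

```python
import re


def image_prefix(title: str, max_words: int = 3) -> str:
    """Build a filename prefix from the first few words of a document title."""
    words = []
    for raw in title.split():
        # Assumption: only ASCII letters and digits are kept.
        cleaned = re.sub(r"[^A-Za-z0-9]", "", raw)
        if cleaned:  # skip tokens that were pure punctuation
            words.append(cleaned)
    return "".join(words[:max_words])


if __name__ == "__main__":
    title = "Die Empty: Unleash Your Best Work Every Day"
    print(f"_files_/{image_prefix(title)}_image1.png")
```

For the example title, this produces the prefix DieEmptyUnleash, matching the image path shown above.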
Output Format
---
title: {from metadata}
author: {from metadata}
source_file: original.docx
source_type: docx
extracted: YYYY-MM-DD HH:MM:SS
status: extracted
---
# {Document Title}
{content with image links}
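To make the template above concrete, here is a minimal sketch that renders the front matter block from already-extracted metadata. The function name `front_matter` and its parameters are hypothetical; values are written verbatim as in the template, so a title containing a colon would need quoting to remain valid YAML.

```python
from datetime import datetime


def front_matter(title: str, author: str, source_file: str,
                 extracted: datetime) -> str:
    """Render the YAML front matter block shown in the Output Format template."""
    lines = [
        "---",
        f"title: {title}",
        f"author: {author}",
        f"source_file: {source_file}",
        "source_type: docx",
        # Timestamp in the YYYY-MM-DD HH:MM:SS layout from the template.
        f"extracted: {extracted:%Y-%m-%d %H:%M:%S}",
        "status: extracted",
        "---",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Illustrative values only; real ones would come from the DOCX metadata.
    print(front_matter("Sample Title", "Sample Author", "original.docx",
                       datetime.now()))
```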
Dependencies
pip install python-docx
Or use requirements.txt:
pip install -r requirements.txt
Options
| Flag | Description |
|---|---|
| -o, --output | Output markdown file path (default: same as docx with .md) |
| -q, --quiet | Suppress progress messages |
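One plausible way to wire up these flags with argparse is sketched below. This is an assumption about how such a CLI could be built, not the script's actual parser; the function name `parse_args` and the help strings are illustrative.

```python
import argparse
from pathlib import Path


def parse_args(argv=None) -> argparse.Namespace:
    """Parse the command line described in the Usage and Options sections."""
    parser = argparse.ArgumentParser(
        description="Convert a DOCX file to markdown.")
    parser.add_argument("docx_path", type=Path, help="input DOCX file")
    parser.add_argument("-o", "--output", type=Path, default=None,
                        help="output markdown file path")
    parser.add_argument("-q", "--quiet", action="store_true",
                        help="suppress progress messages")
    args = parser.parse_args(argv)
    if args.output is None:
        # Default: same path as the DOCX, with a .md extension.
        args.output = args.docx_path.with_suffix(".md")
    return args


if __name__ == "__main__":
    args = parse_args()
    if not args.quiet:
        print(f"{args.docx_path} -> {args.output}")
```

Resolving the default after parsing keeps the `-o` default tied to whatever input path the user supplied.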
Limitations
- Password-protected DOCX: Cannot be opened (will fail with error)
- Complex layouts: May not preserve exact positioning
- Embedded objects: Non-image objects may not be extracted