docx
DOCX creation, editing, and analysis
Overview
A .docx file is a ZIP archive containing XML files.
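Because the container is an ordinary ZIP archive, its parts can be listed with the Python standard library alone. The following is a minimal sketch; the helper name `list_docx_parts` is hypothetical and not part of this skill's scripts.

```python
import zipfile


def list_docx_parts(path):
    """Return the names of the parts stored inside a .docx (ZIP) file."""
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()
```

For a typical Word document the result should include entries such as `[Content_Types].xml` and `word/document.xml`.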
Quick Reference
| Task | Approach |
|---|---|
| Read/analyze content | pandoc or unpack for raw XML |
| Create new document | Use python-docx (docx-js for advanced features) - see Creating New Documents below |
| Edit existing document | Unpack → edit XML → repack - see Editing Existing Documents below |
Converting .doc to .docx
Legacy .doc files must be converted before editing:
python scripts/office/soffice.py --headless --convert-to docx document.doc
Reading Content
# Text extraction with tracked changes
pandoc --track-changes=all document.docx -o output.md
# Raw XML access
python scripts/office/unpack.py document.docx unpacked/
Converting to Images
python scripts/office/soffice.py --headless --convert-to pdf document.docx
pdftoppm -jpeg -r 150 document.pdf page
Accepting Tracked Changes
To produce a clean document with all tracked changes accepted (requires LibreOffice):
python scripts/accept_changes.py input.docx output.docx
Creating New Documents
Use python-docx to create .docx files (auto-installed, no extra setup needed).
Basic Document
from docx import Document
from docx.shared import Pt, Inches, Cm, RGBColor
from docx.enum.text import WD_ALIGN_PARAGRAPH
doc = Document()
# Set default font
style = doc.styles['Normal']
font = style.font
font.name = 'Arial'
font.size = Pt(11)
# Add content
doc.add_heading('Document Title', level=0)
doc.add_paragraph('This is a paragraph of text.')
doc.add_heading('Section 1', level=1)
doc.add_paragraph('Section content here.')
doc.save('output.docx')
Paragraphs & Text Formatting
from docx.shared import Pt, RGBColor
# Bold, italic, underline
p = doc.add_paragraph()
run = p.add_run('Bold text')
run.bold = True
run = p.add_run(' and ')
run = p.add_run('italic text')
run.italic = True
# Font color and size
run = p.add_run('Colored text')
run.font.color.rgb = RGBColor(0xFF, 0x00, 0x00) # Red
run.font.size = Pt(14)
# Alignment
p.alignment = WD_ALIGN_PARAGRAPH.CENTER
Lists
# Bullet list
doc.add_paragraph('First item', style='List Bullet')
doc.add_paragraph('Second item', style='List Bullet')
# Numbered list
doc.add_paragraph('Step one', style='List Number')
doc.add_paragraph('Step two', style='List Number')
Tables
from docx.shared import Inches, Pt, RGBColor
from docx.oxml.ns import qn
table = doc.add_table(rows=3, cols=3, style='Table Grid')
# Header row
header = table.rows[0]
for i, text in enumerate(['Name', 'Role', 'Email']):
    cell = header.cells[i]
    cell.text = text
    # Bold header
    for paragraph in cell.paragraphs:
        for run in paragraph.runs:
            run.bold = True
# Data rows
data = [['Alice', 'Engineer', 'alice@example.com'],
        ['Bob', 'Designer', 'bob@example.com']]
for i, row_data in enumerate(data):
    for j, text in enumerate(row_data):
        table.rows[i+1].cells[j].text = text
# Set column widths
for row in table.rows:
    row.cells[0].width = Inches(2)
    row.cells[1].width = Inches(2)
    row.cells[2].width = Inches(3)
Images
doc.add_picture('image.png', width=Inches(4))
# Centered image
paragraph = doc.paragraphs[-1]
paragraph.alignment = WD_ALIGN_PARAGRAPH.CENTER
Headers & Footers
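from docx.shared import Pt
from docx.enum.text import WD_ALIGN_PARAGRAPH
from docx.oxml import OxmlElement
from docx.oxml.ns import qn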
section = doc.sections[0]
# Header
header = section.header
header_para = header.paragraphs[0]
header_para.text = 'Company Name'
header_para.alignment = WD_ALIGN_PARAGRAPH.RIGHT
# Footer with page numbers
footer = section.footer
footer_para = footer.paragraphs[0]
footer_para.alignment = WD_ALIGN_PARAGRAPH.CENTER
# Add page number field
run = footer_para.add_run()
fldChar = OxmlElement('w:fldChar')
fldChar.set(qn('w:fldCharType'), 'begin')
run._r.append(fldChar)
run2 = footer_para.add_run()
instrText = OxmlElement('w:instrText')
instrText.text = ' PAGE '
run2._r.append(instrText)
run3 = footer_para.add_run()
fldChar2 = OxmlElement('w:fldChar')
fldChar2.set(qn('w:fldCharType'), 'end')
run3._r.append(fldChar2)
Page Setup
from docx.shared import Inches, Cm
from docx.enum.section import WD_ORIENT
section = doc.sections[0]
# US Letter size
section.page_width = Inches(8.5)
section.page_height = Inches(11)
# Margins
section.top_margin = Inches(1)
section.bottom_margin = Inches(1)
section.left_margin = Inches(1)
section.right_margin = Inches(1)
# Landscape
section.orientation = WD_ORIENT.LANDSCAPE
section.page_width, section.page_height = section.page_height, section.page_width
Advanced: Creating with docx-js (requires Node.js)
For advanced features like Table of Contents with auto-generation, complex numbering, or features not supported by python-docx, use docx-js:
npm install -g docx # One-time setup
const fs = require('fs');
const { Document, Packer, Paragraph, TextRun, HeadingLevel, TableOfContents } = require('docx');
const doc = new Document({
  sections: [{
    properties: {
      page: {
        size: { width: 12240, height: 15840 }, // US Letter in DXA
        margin: { top: 1440, right: 1440, bottom: 1440, left: 1440 }
      }
    },
    children: [
      new TableOfContents("Table of Contents", { hyperlink: true, headingStyleRange: "1-3" }),
      new Paragraph({ heading: HeadingLevel.HEADING_1, children: [new TextRun("Title")] }),
    ]
  }]
});
// Serialize the document and write it to disk
Packer.toBuffer(doc).then((buffer) => fs.writeFileSync('doc.docx', buffer));
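The page size and margins in the docx-js example are given in DXA (twentieths of a point, so 1440 per inch). A small sketch of the conversion, with the hypothetical helper name `inches_to_dxa`, shows where the US Letter numbers come from:

```python
DXA_PER_INCH = 1440  # 1 DXA = 1/20 point, and there are 72 points per inch


def inches_to_dxa(inches):
    """Convert a length in inches to DXA, rounded to the nearest unit."""
    return round(inches * DXA_PER_INCH)


# US Letter (8.5 x 11 in) with 1-inch margins
print(inches_to_dxa(8.5), inches_to_dxa(11), inches_to_dxa(1))  # 12240 15840 1440
```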
Validation
After creating the file, validate it. If validation fails, unpack, fix the XML, and repack.
python scripts/office/validate.py doc.docx
Editing Existing Documents
Follow all 3 steps in order.
Step 1: Unpack
python scripts/office/unpack.py document.docx unpacked/
Extracts XML, pretty-prints, merges adjacent runs, and converts smart quotes to XML entities (&#x201C; etc.) so they survive editing. Use --merge-runs false to skip run merging.
Step 2: Edit XML
Edit files in unpacked/word/. See XML Reference below for patterns.
Use "Claude" as the author for tracked changes and comments, unless the user explicitly requests use of a different name.
Use the Edit tool directly for string replacement. Do not write Python scripts. Scripts introduce unnecessary complexity. The Edit tool shows exactly what is being replaced.
CRITICAL: Use smart quotes for new content. When adding text with apostrophes or quotes, use XML entities to produce smart quotes:
<!-- Use these entities for professional typography -->
<w:t>Here&#x2019;s a quote: &#x201C;Hello&#x201D;</w:t>
| Entity | Character |
|---|---|
| &#x2018; | ‘ (left single) |
| &#x2019; | ’ (right single / apostrophe) |
| &#x201C; | “ (left double) |
| &#x201D; | ” (right double) |
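The entity mapping in the table can be expressed as a simple character translation. This is an illustrative sketch only (the name `smart_quotes_to_entities` is hypothetical); it is meant to show the mapping, not to replace direct editing of the XML.

```python
# Smart-quote characters and the numeric XML entities that produce them
SMART_QUOTE_ENTITIES = {
    "\u2018": "&#x2018;",  # left single
    "\u2019": "&#x2019;",  # right single / apostrophe
    "\u201C": "&#x201C;",  # left double
    "\u201D": "&#x201D;",  # right double
}


def smart_quotes_to_entities(text):
    """Replace each smart-quote character with its numeric XML entity."""
    return text.translate(str.maketrans(SMART_QUOTE_ENTITIES))
```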
Adding comments: Use comment.py to handle boilerplate across multiple XML files (text must be pre-escaped XML):
python scripts/comment.py unpacked/ 0 "Comment text with &amp; and &#x2019;"
python scripts/comment.py unpacked/ 1 "Reply text" --parent 0 # reply to comment 0
python scripts/comment.py unpacked/ 0 "Text" --author "Custom Author" # custom author name
Then add markers to document.xml (see Comments in XML Reference).
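Since comment.py expects pre-escaped XML, raw comment text has to be escaped before it is passed on the command line. One possible way to do that with the standard library is sketched below; `pre_escape` is a hypothetical helper, and it emits decimal character references (for example `&#8217;`), which are equivalent to the hex forms shown above.

```python
from xml.sax.saxutils import escape


def pre_escape(text):
    """Escape &, <, > and turn non-ASCII characters into numeric references."""
    escaped = escape(text)  # handles & < >
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")
```

Apply it to raw text only; running it on text that is already escaped would double-escape the ampersands.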
Step 3: Pack
python scripts/office/pack.py unpacked/ output.docx --original document.docx
Validates with auto-repair, condenses XML, and creates DOCX. Use --validate false to skip.
Auto-repair will fix:
- durableId >= 0x7FFFFFFF (regenerates valid ID)
- Missing xml:space="preserve" on <w:t> with whitespace
Auto-repair won't fix:
- Malformed XML, invalid element nesting, missing relationships, schema violations
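The durableId rule from the auto-repair list can be illustrated with a short check. This is a sketch under assumptions, not the logic pack.py actually uses: it assumes IDs are stored as hex strings, and both function names are hypothetical.

```python
import random

DURABLE_ID_LIMIT = 0x7FFFFFFF  # IDs at or above this value are treated as invalid


def is_valid_durable_id(hex_id):
    """Return True when the hex ID is below the limit."""
    return int(hex_id, 16) < DURABLE_ID_LIMIT


def new_durable_id(rng=random):
    """Generate a random 8-digit hex ID below the limit."""
    return f"{rng.randrange(1, DURABLE_ID_LIMIT):08X}"
```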
Common Pitfalls
- Replace entire <w:r> elements: When adding tracked changes, replace the whole <w:r>...</w:r> block with <w:del>...<w:ins>... as siblings. Don't inject tracked change tags inside a run.
- Preserve <w:rPr> formatting: Copy the original run's <w:rPr> block into your tracked change runs to maintain bold, font size, etc.
XML Reference
Schema Compliance
- Element order in <w:pPr>: <w:pStyle>, <w:numPr>, <w:spacing>, <w:ind>, <w:jc>, <w:rPr> last
- Whitespace: Add xml:space="preserve" to <w:t> with leading/trailing spaces
- RSIDs: Must be 8-digit hex (e.g., 00AB1234)
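Two of the rules above (whitespace preservation and RSID format) are easy to check mechanically. The sketch below uses the standard library's ElementTree for illustration; the helper names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def make_text_element(text):
    """Build a w:t element, adding xml:space="preserve" when the text has edge whitespace."""
    element = ET.Element(f"{{{W_NS}}}t")
    element.text = text
    if text != text.strip():
        element.set(f"{{{XML_NS}}}space", "preserve")
    return element


def is_valid_rsid(value):
    """Return True when the value is exactly 8 hexadecimal digits."""
    return re.fullmatch(r"[0-9A-Fa-f]{8}", value) is not None
```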
Tracked Changes
Insertion:
<w:ins w:id="1" w:author="Claude" w:date="2025-01-01T00:00:00Z">
<w:r><w:t>inserted text</w:t></w:r>
</w:ins>
Deletion:
<w:del w:id="2" w:author="Claude" w:date="2025-01-01T00:00:00Z">
<w:r><w:delText>deleted text</w:delText></w:r>
</w:del>
Inside <w:del>: Use <w:delText> instead of <w:t>, and <w:delInstrText> instead of <w:instrText>.
Minimal edits - only mark what changes:
<!-- Change "30 days" to "60 days" -->
<w:r><w:t>The term is </w:t></w:r>
<w:del w:id="1" w:author="Claude" w:date="...">
<w:r><w:delText>30</w:delText></w:r>
</w:del>
<w:ins w:id="2" w:author="Claude" w:date="...">
<w:r><w:t>60</w:t></w:r>
</w:ins>
<w:r><w:t> days.</w:t></w:r>
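To see which part of a sentence actually needs `<w:del>`/`<w:ins>` markup, it can help to strip the words the old and new text share at the start and end. The following sketch does this at word granularity; `split_change` is a hypothetical helper intended only to illustrate the reasoning, and the edit itself is still made directly in the XML.

```python
import re


def split_change(old, new):
    """Split old/new text into (unchanged prefix, deleted, inserted, unchanged suffix)."""
    a = re.findall(r"\s+|\S+", old)  # alternate runs of whitespace and non-whitespace
    b = re.findall(r"\s+|\S+", new)
    shortest = min(len(a), len(b))

    # Count matching tokens from the start
    i = 0
    while i < shortest and a[i] == b[i]:
        i += 1

    # Count matching tokens from the end, without overlapping the prefix
    j = 0
    while j < shortest - i and a[-1 - j] == b[-1 - j]:
        j += 1

    return (
        "".join(a[:i]),
        "".join(a[i:len(a) - j]),
        "".join(b[i:len(b) - j]),
        "".join(a[len(a) - j:]),
    )


print(split_change("The term is 30 days.", "The term is 60 days."))
# ('The term is ', '30', '60', ' days.')
```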
Deleting entire paragraphs/list items - when removing ALL content from a paragraph, also mark the paragraph mark as deleted so it merges with the next paragraph. Add <w:del/> inside <w:pPr><w:rPr>:
<w:p>
<w:pPr>
<w:numPr>...</w:numPr> <!-- list numbering if present -->
<w:rPr>
<w:del w:id="1" w:author="Claude" w:date="2025-01-01T00:00:00Z"/>
</w:rPr>
</w:pPr>
<w:del w:id="2" w:author="Claude" w:date="2025-01-01T00:00:00Z">
<w:r><w:delText>Entire paragraph content being deleted...</w:delText></w:r>
</w:del>
</w:p>
Without the <w:del/> in <w:pPr><w:rPr>, accepting changes leaves an empty paragraph/list item.
Rejecting another author's insertion - nest deletion inside their insertion:
<w:ins w:author="Jane" w:id="5">
<w:del w:author="Claude" w:id="10">
<w:r><w:delText>their inserted text</w:delText></w:r>
</w:del>
</w:ins>
Restoring another author's deletion - add insertion after (don't modify their deletion):
<w:del w:author="Jane" w:id="5">
<w:r><w:delText>deleted text</w:delText></w:r>
</w:del>
<w:ins w:author="Claude" w:id="10">
<w:r><w:t>deleted text</w:t></w:r>
</w:ins>
Comments
After running comment.py (see Step 2), add markers to document.xml. For replies, use --parent flag and nest markers inside the parent's.
CRITICAL: <w:commentRangeStart> and <w:commentRangeEnd> are siblings of <w:r>, never inside <w:r>.
<!-- Comment markers are direct children of w:p, never inside w:r -->
<w:commentRangeStart w:id="0"/>
<w:del w:id="1" w:author="Claude" w:date="2025-01-01T00:00:00Z">
<w:r><w:delText>deleted</w:delText></w:r>
</w:del>
<w:r><w:t> more text</w:t></w:r>
<w:commentRangeEnd w:id="0"/>
<w:r><w:rPr><w:rStyle w:val="CommentReference"/></w:rPr><w:commentReference w:id="0"/></w:r>
<!-- Comment 0 with reply 1 nested inside -->
<w:commentRangeStart w:id="0"/>
<w:commentRangeStart w:id="1"/>
<w:r><w:t>text</w:t></w:r>
<w:commentRangeEnd w:id="1"/>
<w:commentRangeEnd w:id="0"/>
<w:r><w:rPr><w:rStyle w:val="CommentReference"/></w:rPr><w:commentReference w:id="0"/></w:r>
<w:r><w:rPr><w:rStyle w:val="CommentReference"/></w:rPr><w:commentReference w:id="1"/></w:r>
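The sibling rule for comment markers can be checked after editing. Below is a minimal sketch that reports markers placed directly inside a run; it uses the standard library parser for brevity (defusedxml would be the safer choice for untrusted files), and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"
MARKERS = {W + "commentRangeStart", W + "commentRangeEnd"}


def misplaced_comment_markers(xml_text):
    """Return the w:id of every comment range marker that is a child of a w:r."""
    root = ET.fromstring(xml_text)
    misplaced = []
    for run in root.iter(W + "r"):
        for child in run:
            if child.tag in MARKERS:
                misplaced.append(child.get(W + "id"))
    return misplaced
```

An empty list means no marker was found inside a run.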
Images
- Add image file to word/media/
- Add relationship to word/_rels/document.xml.rels:
<Relationship Id="rId5" Type=".../image" Target="media/image1.png"/>
- Add content type to [Content_Types].xml:
<Default Extension="png" ContentType="image/png"/>
- Reference in document.xml:
<w:drawing>
<wp:inline>
<wp:extent cx="914400" cy="914400"/> <!-- EMUs: 914400 = 1 inch -->
<a:graphic>
<a:graphicData uri=".../picture">
<pic:pic>
<pic:blipFill><a:blip r:embed="rId5"/></pic:blipFill>
</pic:pic>
</a:graphicData>
</a:graphic>
</wp:inline>
</w:drawing>
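The `cx` and `cy` values in `<wp:extent>` are in EMUs (914400 per inch, as noted in the comment above). A sketch for deriving them from an image's pixel dimensions and a target width, keeping the aspect ratio, might look like this; `image_extent` is a hypothetical helper.

```python
EMU_PER_INCH = 914400


def image_extent(pixel_width, pixel_height, target_width_inches):
    """Return (cx, cy) in EMUs for the target width, preserving aspect ratio."""
    cx = round(target_width_inches * EMU_PER_INCH)
    cy = round(cx * pixel_height / pixel_width)
    return cx, cy


print(image_extent(800, 400, 2.0))  # (1828800, 914400)
```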
Dependencies
Auto-installed (via metadata.dependencies):
- python-docx: Creating and editing documents (default)
- lxml: XML processing
- defusedxml: Secure XML parsing
Manual install (for advanced features):
- docx (npm): advanced document creation with TOC support. Install: npm install -g docx
- pandoc: text extraction. Install: brew install pandoc (macOS) / apt install pandoc (Linux)
- LibreOffice: PDF conversion, accepting tracked changes. Install: brew install --cask libreoffice (macOS) / apt install libreoffice (Linux)
- Poppler: pdftoppm for images. Install: brew install poppler (macOS) / apt install poppler-utils (Linux)