word-document
Word Document Generation Skill
Produces publication-ready Word (.docx) documents from the markdown manuscript sections in this project using pandoc.
Prerequisites
- pandoc (>= 3.0): installed at ~/.local/bin/pandoc
- If pandoc is missing, download from https://github.com/jgm/pandoc/releases
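The prerequisite can be checked before running the workflow. The following is a minimal sketch, assuming the first line of `pandoc --version` has the form `pandoc X.Y.Z`; the `pandoc_version_ok` helper and the `PANDOC` override variable are hypothetical names introduced here for illustration.

```shell
# Hypothetical helper: succeed when a "pandoc X.Y.Z" banner reports major version >= 3.
pandoc_version_ok() {
  case "$1" in
    pandoc\ [0-9]*) ;;          # expected banner shape, e.g. "pandoc 3.1.9"
    *) return 1 ;;
  esac
  major=${1#pandoc }            # drop the leading "pandoc "
  major=${major%%.*}            # keep only the digits before the first dot
  [ "$major" -ge 3 ] 2>/dev/null
}

# PANDOC is an assumed override; the default is the path given in Prerequisites.
PANDOC="${PANDOC:-$HOME/.local/bin/pandoc}"
if [ ! -x "$PANDOC" ]; then
  echo "pandoc not found at $PANDOC; download from https://github.com/jgm/pandoc/releases" >&2
elif pandoc_version_ok "$("$PANDOC" --version | head -n 1)"; then
  echo "pandoc OK: $PANDOC"
else
  echo "pandoc at $PANDOC is older than 3.0" >&2
fi
```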
Manuscript Location
The manuscript lives at:
meta_analysis/bulk3/manuscript/
├── 05-manuscript/
│ ├── Abstract.md
│ ├── Introduction.md
│ ├── Methods.md
│ ├── Results.md
│ ├── Discussion.md
│ └── supplementary_methods.md
├── 03-figures/
│ ├── main/ # Figure_1 through Figure_5
│ ├── supplementary/ # Figure_S1 through Figure_S10
│ └── legends.md
├── 04-tables/
│ ├── main/Table_1.md
│ ├── supplementary/
│ └── legends.md
└── 06-references/
└── References.md
Target journal: Journal of Investigative Dermatology (JID)
- Word limit: 3,500 words (excluding abstract, references, figure legends)
- Max 6 figures/tables combined
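The combined figure/table limit can be checked from the directory layout. This is a rough sketch, assuming main figures are stored as `.png` files and main tables as `.md` files in the directories listed above; `count_display_items` is a hypothetical helper, not part of the project.

```shell
# Hypothetical helper: count main figures (*.png) plus main tables (*.md).
# The body runs in a subshell so its variables do not leak to the caller.
count_display_items() (
  dir="$1"
  figs=$(find "$dir/03-figures/main" -type f -name '*.png' 2>/dev/null | wc -l | tr -d ' ')
  tabs=$(find "$dir/04-tables/main" -type f -name '*.md' 2>/dev/null | wc -l | tr -d ' ')
  echo $((figs + tabs))
)

MANUSCRIPT_DIR="meta_analysis/bulk3/manuscript"
total=$(count_display_items "$MANUSCRIPT_DIR")
if [ "$total" -gt 6 ]; then
  echo "Warning: $total main figures/tables (limit is 6 combined)" >&2
else
  echo "Display items: $total of 6"
fi
```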
Workflow
Step 1: Assemble a Combined Markdown File
Concatenate the manuscript sections into a single markdown file in the correct order. Use the Bash tool:
MANUSCRIPT_DIR="meta_analysis/bulk3/manuscript"
OUTPUT_DIR="${MANUSCRIPT_DIR}/output"
mkdir -p "$OUTPUT_DIR"
# Combine sections with \newpage markers between them.
# printf is used because echo -e "\\newpage" expands the leading \n and prints "ewpage".
page_break() { printf '\n\n\\newpage\n\n'; }
{
  cat "${MANUSCRIPT_DIR}/05-manuscript/Abstract.md"
  page_break
  cat "${MANUSCRIPT_DIR}/05-manuscript/Introduction.md"
  page_break
  cat "${MANUSCRIPT_DIR}/05-manuscript/Results.md"
  page_break
  cat "${MANUSCRIPT_DIR}/05-manuscript/Discussion.md"
  page_break
  cat "${MANUSCRIPT_DIR}/05-manuscript/Methods.md"
  page_break
  cat "${MANUSCRIPT_DIR}/06-references/References.md"
  page_break
  echo "# Figure Legends"
  echo ""
  cat "${MANUSCRIPT_DIR}/03-figures/legends.md"
  page_break
  echo "# Table Legends"
  echo ""
  cat "${MANUSCRIPT_DIR}/04-tables/legends.md"
} > "${OUTPUT_DIR}/manuscript_combined.md"
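If the `\newpage` markers do not produce page breaks in the .docx (pandoc normally passes raw LaTeX only to LaTeX-based outputs unless a filter handles it), one alternative is a raw OpenXML block. The sketch below assumes pandoc's `raw_attribute` extension, which is enabled by default for pandoc markdown; `docx_page_break` is a hypothetical helper name. The backticks of the raw block's fence are written as octal escapes (`\140`).

```shell
# Hypothetical helper: print a raw OpenXML block that the docx writer passes
# through as a page break. \140 is the octal escape for a backtick, so the
# first and last lines of the printed block are three-backtick fences.
docx_page_break() {
  printf '\n\n\140\140\140{=openxml}\n<w:p><w:r><w:br w:type="page"/></w:r></w:p>\n\140\140\140\n\n'
}

# Usage: same pattern as Step 1, with the helper in place of the \newpage line.
demo=$(mktemp)
{
  printf '# Introduction\n\nFirst section.\n'
  docx_page_break
  printf '# Results\n\nSecond section.\n'
} > "$demo"
cat "$demo"
rm -f "$demo"
```

Because the block is tagged `{=openxml}`, it only affects docx output and is ignored when the same markdown is converted to other formats.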
Step 2: Embed Figures (Optional)
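If the user wants figures embedded in the document (rather than at the end), insert markdown image references at the appropriate locations. Use pandoc's image syntax with a width attribute, giving the path relative to --resource-path:
![Figure 1](03-figures/main/Figure_1.png){ width=100% }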
For figures at the end (typical for journal submission), append a figures section:
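{
  echo "# Figures"
  echo ""
  for fig in "${MANUSCRIPT_DIR}/03-figures/main/"*.png; do
    [ -e "$fig" ] || continue  # skip when the glob matches nothing
    name=$(basename "$fig" .png)
    # Path is relative to --resource-path (the manuscript directory)
    echo "![${name}](03-figures/main/${name}.png){ width=100% }"
    echo ""
    echo "\\newpage"
    echo ""
  done
} >> "${OUTPUT_DIR}/manuscript_combined.md"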
Step 3: Convert to Word with Pandoc
Run pandoc to produce the .docx file:
~/.local/bin/pandoc "${OUTPUT_DIR}/manuscript_combined.md" \
-o "${OUTPUT_DIR}/manuscript.docx" \
--from markdown \
--to docx \
--reference-doc="${MANUSCRIPT_DIR}/reference.docx" \
--resource-path="${MANUSCRIPT_DIR}" \
--wrap=none \
--standalone
If no reference.docx template exists, omit --reference-doc and pandoc will use its default styles:
~/.local/bin/pandoc "${OUTPUT_DIR}/manuscript_combined.md" \
-o "${OUTPUT_DIR}/manuscript.docx" \
--from markdown \
--to docx \
--resource-path="${MANUSCRIPT_DIR}" \
--wrap=none \
--standalone
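The two invocations above differ only in `--reference-doc`, so the choice can be made at run time. This is a minimal sketch under that assumption; `run_pandoc_docx` and the `PANDOC` override variable are hypothetical names, and the options are the ones already shown in this step.

```shell
# Hypothetical wrapper: run the Step 3 conversion, adding --reference-doc
# only when reference.docx exists in the manuscript directory.
# The body runs in a subshell, so "set --" does not affect the caller.
run_pandoc_docx() (
  input="$1"; output="$2"; manuscript_dir="$3"
  set -- --from markdown --to docx \
         --resource-path="$manuscript_dir" --wrap=none --standalone
  if [ -f "$manuscript_dir/reference.docx" ]; then
    set -- "$@" --reference-doc="$manuscript_dir/reference.docx"
  fi
  # PANDOC is an assumed override; set PANDOC=echo for a dry run.
  "${PANDOC:-$HOME/.local/bin/pandoc}" "$input" -o "$output" "$@"
)

# Dry run: print the command line instead of invoking pandoc.
PANDOC=echo run_pandoc_docx manuscript_combined.md manuscript.docx "meta_analysis/bulk3/manuscript"
```

Building the option list with `set --` keeps each option a separate word, so directory names containing spaces survive intact.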
Step 4: Verify Output
ls -lh "${OUTPUT_DIR}/manuscript.docx"
# Check word count (approximate)
~/.local/bin/pandoc "${OUTPUT_DIR}/manuscript_combined.md" --to plain | wc -w
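The count above covers the whole combined file, including the abstract, references, and legends that the journal limit excludes. A closer estimate counts only the body sections. The sketch below assumes that Introduction, Results, Discussion, and Methods are the sections that count; `count_body_words` is a hypothetical helper, and it counts raw markdown, so markup tokens slightly inflate the figure.

```shell
# Hypothetical helper: total words in the sections assumed to count toward the limit.
# The body runs in a subshell so its variables do not leak to the caller.
count_body_words() (
  dir="$1/05-manuscript"
  cat "$dir/Introduction.md" "$dir/Results.md" \
      "$dir/Discussion.md" "$dir/Methods.md" | wc -w | tr -d ' '
)

MANUSCRIPT_DIR="meta_analysis/bulk3/manuscript"
if [ -d "$MANUSCRIPT_DIR/05-manuscript" ]; then
  words=$(count_body_words "$MANUSCRIPT_DIR")
  if [ "$words" -gt 3500 ]; then
    echo "Over the limit: $words words (limit 3,500)" >&2
  else
    echo "Within the limit: $words words"
  fi
fi
```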
Customization Options
Using a Reference Template
To apply custom styles (fonts, spacing, heading formats), create a reference document:
~/.local/bin/pandoc -o reference.docx --print-default-data-file reference.docx
Then edit reference.docx in Word to set styles (Normal, Heading 1-3, etc.) and pass it via --reference-doc.
Generating Supplementary Materials
For a separate supplementary document:
MANUSCRIPT_DIR="meta_analysis/bulk3/manuscript"
OUTPUT_DIR="${MANUSCRIPT_DIR}/output"
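{
  echo "# Supplementary Methods"
  echo ""
  cat "${MANUSCRIPT_DIR}/05-manuscript/supplementary_methods.md"
  # printf, not echo -e: echo -e "\\newpage" expands the leading \n and prints "ewpage"
  printf '\n\n\\newpage\n\n'
  echo "# Supplementary Figures"
  echo ""
  for fig in "${MANUSCRIPT_DIR}/03-figures/supplementary/"*.png; do
    [ -e "$fig" ] || continue  # skip when the glob matches nothing
    name=$(basename "$fig" .png)
    # Path is relative to --resource-path (the manuscript directory)
    echo "![${name}](03-figures/supplementary/${name}.png){ width=100% }"
    echo ""
    echo "\\newpage"
    echo ""
  done
} > "${OUTPUT_DIR}/supplementary_combined.md"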
~/.local/bin/pandoc "${OUTPUT_DIR}/supplementary_combined.md" \
-o "${OUTPUT_DIR}/supplementary.docx" \
--from markdown \
--to docx \
--resource-path="${MANUSCRIPT_DIR}" \
--wrap=none \
--standalone
Line Numbering and Double Spacing
Many journals (including JID) require line numbers and double spacing. Double spacing is best set in the reference.docx template, through the paragraph formatting of its styles. Line numbers can be added in Word after export (Layout > Line Numbers).
Tables from CSV
To include supplementary tables from CSV data, convert them to markdown pipe tables first, for example with R's knitr package:
# Convert the first 30 rows of a CSV to a pipe table with knitr::kable
Rscript -e '
library(knitr)
df <- read.csv("meta_analysis/bulk3/results/tables/04_rra_combined_stat_6datasets.csv")
cat(kable(head(df, 30), format = "pipe"), sep = "\n")
' > "${OUTPUT_DIR}/table_s2.md"
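Where R is not available, a simple script can do the same conversion. The following is a rough sketch with awk, assuming a plain CSV with a header row and no quoted fields or embedded commas; `csv_to_pipe_table` is a hypothetical helper, and unlike the R example it converts every row rather than the first 30.

```shell
# Hypothetical helper: convert a simple CSV (header row, no quoted fields)
# to a markdown pipe table on standard output.
csv_to_pipe_table() {
  awk -F',' '
    {
      # Emit every row, header included, as "| a | b |"
      line = "|"
      for (i = 1; i <= NF; i++) line = line " " $i " |"
      print line
    }
    NR == 1 {
      # After the header row, emit the "|---|---|" separator
      sep = "|"
      for (i = 1; i <= NF; i++) sep = sep "---|"
      print sep
    }
  ' "$1"
}

# Usage with a small illustrative file (placeholder values, not project data).
demo=$(mktemp)
printf 'gene,score\nG1,0.5\nG2,0.7\n' > "$demo"
csv_to_pipe_table "$demo"
rm -f "$demo"
```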
Common Invocations
| User Request | Action |
|---|---|
| "Generate Word document" | Full manuscript assembly (Steps 1-4) |
| "Export manuscript to docx" | Same as above |
| "Create supplementary docx" | Supplementary materials only |
| "Compile manuscript with figures" | Embed figures inline |
| "Make submission-ready Word file" | Main + supplementary as separate files |
Troubleshooting
- Images not embedding: Ensure --resource-path points to the manuscript directory
- Pandoc not found: Check ~/.local/bin/pandoc or install via the README
- Table formatting issues: Use pipe tables (| col1 | col2 |) for best pandoc compatibility
- Unicode issues: Add --metadata lang=en if special characters cause problems
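When images fail to embed, it can help to list which referenced files are missing under the resource path. This is a minimal sketch, assuming plain `![alt](path)` references without link titles and paths written relative to the manuscript directory; `missing_images` is a hypothetical helper.

```shell
# Hypothetical helper: print each image path referenced in a markdown file
# that does not exist under the given root directory.
missing_images() (
  md="$1"; root="$2"
  # Extract ![alt](path) references, then keep only the path part.
  grep -o '!\[[^]]*]([^)]*)' "$md" |
    sed 's/^!\[[^]]*](\([^)]*\))$/\1/' |
    while IFS= read -r path; do
      [ -e "$root/$path" ] || printf '%s\n' "$path"
    done
)

MANUSCRIPT_DIR="meta_analysis/bulk3/manuscript"
combined="$MANUSCRIPT_DIR/output/manuscript_combined.md"
if [ -f "$combined" ]; then
  missing_images "$combined" "$MANUSCRIPT_DIR"
fi
```

An empty result means every referenced image was found relative to the directory passed as the second argument.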