academic-latex-pipeline
Academic LaTeX Pipeline
Converts academic survey Markdown (often from Obsidian) into polished LaTeX PDFs. The pipeline has five phases, each with a decision gate before proceeding.
When to Use
- User has a .md survey/paper and wants a PDF
- User wants to fix formatting in an existing LaTeX-compiled PDF
- User needs Korean font support in LaTeX (XeLaTeX + Noto Sans CJK KR)
- User wants to replace Mermaid diagrams with TikZ figures
- User mentions build_latex.py or survey compilation
- User wants to restructure a LaTeX project into section-based folders
Phase Overview
Phase 1: Content Quality → iterative-academic-writing skill, Critical=0 to pass
Phase 2: LaTeX Build → MD→TEX→PDF pipeline with Korean fonts
Phase 3: Format Review → Page-by-page visual inspection, fix overflows
Phase 4: Figure Validation → TikZ rendering, captions, sizing
Phase 5: Git Management → latex-project-manager skill for structure + push
Phase 1: Content Writing Loop
Invoke iterative-academic-writing skill on the source .md file. The skill applies 14 academic writing principles with FactBase verification and hallucination detection.
Gate: Critical issues = 0 → proceed to Phase 2.
This phase ensures content quality before expensive LaTeX processing. Don't skip it — fixing content errors after PDF generation wastes time.
Phase 2: LaTeX Build Pipeline
2.1 Project Structure
Projects use section-based folder organization for maintainability:
ProjectName/
├── main.tex # Shared preamble + project selector switch
├── <project>/
│ ├── content.tex # \input orchestrator for all sections
│ ├── refs.bib # BibTeX bibliography (NOT inline thebibliography)
│ ├── figures/ # Images and generated figures
│ │ └── .gitkeep
│ └── sections/
│ ├── 00_frontmatter.tex # \title, \author, \maketitle, \abstract
│ ├── 01_background.tex # Each \section in its own file
│ ├── ...
│ └── NN_bibliography.tex # \bibliographystyle + \bibliography
├── .gitignore
└── build_and_compile.sh # Optional: shell wrapper for compilation
For MD→TEX projects (Obsidian source), also include:
├── SourceDocument.md # Obsidian source (excluded from git)
└── build_latex.py # Python build script (MD → TEX → PDF)
Multi-project layout: Use \newcommand{\professor}{project_name} in main.tex to switch between projects sharing the same preamble. See latex-project-manager skill for details.
2.2 Bibliography Management (CRITICAL)
Always use BibTeX .bib files. Never use inline \begin{thebibliography}.
- Create refs.bib in the project folder root
- Use \bibliographystyle{plainnat} + \bibliography{<project>/refs}
- All references must use \citep{} or \citet{} (both provided by the natbib package); no plain text "(Author, Year)"
- 3-pass compilation: pdflatex → bibtex → pdflatex → pdflatex
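The citation rule above can be checked before compiling. The following is a minimal sketch, not part of the original build script: the function names (cited_keys, bib_keys, missing_keys) are hypothetical, and the regular expressions assume plain \cite, \citep and \citet commands and conventional @type{key, entries.

```python
import re

# Matches \cite, \citep, \citet (optional star, optional [..] arguments).
CITE_RE = re.compile(r"\\cite[tp]?\*?(?:\[[^\]]*\])*\{([^}]*)\}")
# Matches the key of a BibTeX entry such as "@article{smith2020,".
BIB_KEY_RE = re.compile(r"@\w+\s*\{\s*([^,\s]+)\s*,")


def cited_keys(tex: str) -> set[str]:
    """Collect every key used in a cite-style command."""
    keys = set()
    for group in CITE_RE.findall(tex):
        # One command may cite several comma-separated keys.
        keys.update(k.strip() for k in group.split(",") if k.strip())
    return keys


def bib_keys(bib: str) -> set[str]:
    """Collect every entry key defined in the text of a .bib file."""
    return set(BIB_KEY_RE.findall(bib))


def missing_keys(tex: str, bib: str) -> set[str]:
    """Keys cited in the TeX source but absent from the bibliography."""
    return cited_keys(tex) - bib_keys(bib)
```

Running missing_keys on the concatenated section files and the contents of refs.bib gives a list of keys to add before the bibtex pass.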
2.3 Build Script (build_latex.py)
The build script handles the full MD→TEX transformation:
- Preprocess MD: Strip wikilinks [[...]], remove Obsidian YAML frontmatter, clean tags
- Pandoc conversion: pandoc input.md -f markdown -t latex
- Inject preamble with Korean font support:
  \usepackage{fontspec}
  \usepackage{ucharclasses}
  \setmainfont{Noto Sans CJK KR}
  \newfontfamily\hangulfont{Noto Sans CJK KR}
  \setTransitionsForCJK{\hangulfont}{}{}
  Why ucharclasses instead of xeCJK? The xeCJK package requires ctexhook.sty which is missing from many LaTeX distributions. ucharclasses is more portable.
- Replace Mermaid blocks with TikZ figures
- Wrap examples in tcolorbox environments
- Inject citations: Match PaperName (Year) → \cite{key_year}
- Fix tables: Use p{Xcm} columns instead of l/c/r to prevent overflow
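The preprocessing step can be sketched as follows. This is an illustrative assumption about how such a step might look, not the actual contents of build_latex.py: the function name preprocess_markdown and all three patterns are hypothetical, and the tag pattern does not try to skip code blocks.

```python
import re

# YAML frontmatter: a leading block delimited by '---' lines.
FRONTMATTER_RE = re.compile(r"\A---\s*\n.*?\n---\s*\n", re.DOTALL)
# [[target|alias]] -> alias, [[target]] -> target
WIKILINK_RE = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]")
# Obsidian tags such as #survey or #topic/llm. A letter must follow '#'
# and whitespace (or line start) must precede it, so Markdown headings
# like "# Title" and anchors like "Page#Heading" are left alone.
TAG_RE = re.compile(r"(?<!\S)#[^\W\d_][\w/-]*")


def preprocess_markdown(text: str) -> str:
    """Strip frontmatter, unwrap wikilinks, and drop inline tags."""
    text = FRONTMATTER_RE.sub("", text, count=1)
    text = WIKILINK_RE.sub(lambda m: m.group(2) or m.group(1), text)
    text = TAG_RE.sub("", text)
    return text
```

Keeping the alias (or the target when no alias exists) preserves the visible link text, so sentences still read correctly after the brackets are removed.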
2.4 Font Installation
mkdir -p ~/.local/share/fonts
# Download Noto Sans CJK KR from github.com/googlei18n/noto-cjk/releases
fc-cache -fv ~/.local/share/fonts/
2.5 Compilation (3-pass)
pdflatex -interaction=nonstopmode main.tex # Pass 1 (or xelatex for Korean)
bibtex main # Citations
pdflatex -interaction=nonstopmode main.tex # Pass 2 (resolve refs)
pdflatex -interaction=nonstopmode main.tex # Pass 3 (final)
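The same sequence can be driven from Python, for example from inside the build script. This is a sketch under assumptions: the helper names build_commands and compile_pdf are hypothetical, and the engines are expected to be on PATH. Passing engine="xelatex" matters when the fontspec-based Korean preamble is loaded, since fontspec does not work under pdflatex.

```python
import subprocess


def build_commands(jobname: str = "main", engine: str = "pdflatex") -> list[list[str]]:
    """Return the engine -> bibtex -> engine -> engine command sequence."""
    if engine not in {"pdflatex", "xelatex"}:
        raise ValueError(f"unsupported engine: {engine}")
    latex = [engine, "-interaction=nonstopmode", f"{jobname}.tex"]
    # Copy the LaTeX command so each step is an independent list.
    return [list(latex), ["bibtex", jobname], list(latex), list(latex)]


def compile_pdf(jobname: str = "main", engine: str = "pdflatex") -> None:
    """Run each step in order; check=True raises CalledProcessError
    as soon as a command exits with a non-zero status."""
    for cmd in build_commands(jobname, engine):
        subprocess.run(cmd, check=True)
```

Separating the command list from its execution makes the sequence easy to inspect or log without invoking LaTeX.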
2.6 Overfull Hbox Prevention (Preamble)
\tolerance=1000
\emergencystretch=3em
\hfuzz=2pt
Gate: Compilation succeeds without errors → proceed to Phase 3.
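A successful compilation can still leave overfull boxes in the log. The sketch below lists them from the text of main.log; the function name overfull_boxes is hypothetical, and the pattern assumes the standard "Overfull \hbox (Npt too wide)" message format. Boxes narrower than the \hfuzz setting above are not reported by TeX at all, so min_pt is only useful for a stricter threshold.

```python
import re
from typing import Optional

OVERFULL_RE = re.compile(
    r"Overfull \\hbox \((\d+(?:\.\d+)?)pt too wide\)"
    r"(?: in paragraph at lines (\d+)--(\d+))?"
)


def overfull_boxes(log_text: str, min_pt: float = 0.0) -> list[tuple[float, Optional[int]]]:
    """Return (excess width in pt, first source line or None),
    widest first, for every overfull hbox wider than min_pt."""
    found = []
    for m in OVERFULL_RE.finditer(log_text):
        width = float(m.group(1))
        if width > min_pt:
            # Only the "in paragraph" form is parsed for a line number.
            line = int(m.group(2)) if m.group(2) else None
            found.append((width, line))
    return sorted(found, key=lambda item: item[0], reverse=True)
```

Sorting by excess width puts the most visible overflows first, which are usually the tables and long unbreakable strings worth fixing by hand.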
Phase 3: Format Review Loop
Review the PDF page by page. Check for:
Critical (must fix, loop back):
- Table overflow beyond margins
- Missing or blank figures
- Unreadable/clipped text
- [[wikilink]] artifacts surviving preprocessing
- Undefined citations
Minor (can defer):
- Spacing tweaks, caption capitalization, color preferences
For each critical issue:
- Table overflow → adjust column widths, use tabularx with X columns
- Missing figures → test TikZ in standalone mode, simplify
- Wikilinks → fix regex in build script's preprocessing step
- Undefined citations → add entries to refs.bib
Recompile after fixes. Gate: no Critical issues → Phase 4.
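Wikilink artifacts are easier to catch in the generated .tex than by eye in the PDF. The following sketch reports each leftover with its line number. The name find_wikilink_artifacts is hypothetical, and it assumes artifacts appear either as literal [[...]] or in the brace-escaped form {[}{[}...{]}{]} that Pandoc may emit for square brackets. Nested brackets in math, such as [[1,2],[3,4]], will show up as false positives.

```python
import re

# Literal [[...]] or the brace-escaped form {[}{[}...{]}{]}.
ARTIFACT_RE = re.compile(r"\[\[.+?\]\]|\{\[\}\{\[\}.+?\{\]\}\{\]\}")


def find_wikilink_artifacts(tex: str) -> list[tuple[int, str]]:
    """Return (1-based line number, matched text) for each leftover wikilink."""
    hits = []
    for lineno, line in enumerate(tex.splitlines(), start=1):
        for m in ARTIFACT_RE.finditer(line):
            hits.append((lineno, m.group(0)))
    return hits
```

An empty result for every section file is a cheap way to confirm this Critical item is cleared before moving on.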
Phase 4: Figure/Image Review
For each TikZ figure:
- Does it render correctly?
- Is the caption present and descriptive?
- Is sizing appropriate (\resizebox{\textwidth}{!}{...})?
- Is placement correct ([H] float specifier)?
Test problematic TikZ in isolation:
\documentclass[tikz]{standalone}
\usepackage{tikz}
\begin{document}
% TikZ code here
\end{document}
Gate: all figures correct → Phase 5.
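Isolating figures can be automated by pulling each tikzpicture out of a section file and wrapping it in the standalone template shown above. This is a minimal sketch: extract_tikz and wrap_tikz are hypothetical helper names, and the extraction assumes tikzpicture environments are not nested.

```python
import re

TIKZ_RE = re.compile(r"\\begin\{tikzpicture\}.*?\\end\{tikzpicture\}", re.DOTALL)

STANDALONE_TEMPLATE = r"""\documentclass[tikz]{standalone}
\usepackage{tikz}
\begin{document}
<<TIKZ>>
\end{document}
"""


def extract_tikz(tex: str) -> list[str]:
    """Return every tikzpicture environment found in the source, in order."""
    return TIKZ_RE.findall(tex)


def wrap_tikz(tikz_code: str) -> str:
    """Place one tikzpicture inside a standalone test document."""
    return STANDALONE_TEMPLATE.replace("<<TIKZ>>", tikz_code.strip())
```

Writing each wrapped figure to its own file and compiling it separately narrows a failing build down to a single picture. Any TikZ libraries the figure depends on still need to be added to the template by hand.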
Phase 5: Git Management
Use latex-project-manager push for structured git operations.
Files to include in repo:
- main.tex, <project>/content.tex, <project>/sections/*.tex
- <project>/refs.bib
- <project>/figures/*
- .gitignore
- build_latex.py, build_and_compile.sh (if applicable)
Files to exclude:
- Original .md Obsidian source (stays in Obsidian vault only)
- .obsidian/ directory
- LaTeX build artifacts (.aux, .log, .out, .toc, .bbl, .blg)
- PDF files (compiled on-demand)
.gitignore template:
*.aux
*.log
*.out
*.toc
*.bbl
*.blg
*.synctex.gz
*.fls
*.fdb_latexmk
*.pdf
.DS_Store
Git authentication:
- GitHub: $GITHUB_TOKEN env var or user-provided token
- Overleaf: $OVERLEAF_TOKEN env var (requires Premium plan for git access)
Common Issues & Fixes
| Issue | Fix |
|---|---|
| Korean text missing | Verify Noto Sans CJK KR is installed, check fc-list \| grep Noto |
| Overfull hbox | Increase \tolerance, \emergencystretch, reword long lines |
| Table overflow | Use p{2cm} or X columns, reduce content |
| Broken tcolorbox | Check \tcbuselibrary{most} is loaded |
| Undefined citations | Add missing keys to refs.bib, rerun bibtex |
| Mermaid not replaced | Check regex pattern in build script |
| pgfplots \\ in labels | Use {Label Text} with align=center instead |
| $\to$ in \legend | Use \textrightarrow{} instead |
| Inline thebibliography | Convert to refs.bib + \bibliography{} |
English Version Generation
For bilingual projects, create a separate English build:
- Translate MD content (keep same structure)
- Use English-specific preamble (no CJK fonts needed, use standard \usepackage[T1]{fontenc})
- Generate survey_main_EN.tex → survey_main_EN.pdf
- Both versions share refs.bib
Related Skills
- iterative-academic-writing: Phase 1 content evaluation
- latex-project-manager: Phase 5 project structuring and git push
- pdf: General PDF manipulation (merge, split, forms)