paper-creator

SKILL.md

Scientific Paper Creator

Write publication-quality scientific papers in LaTeX with a test-driven workflow: every section is compiled and verified before moving to the next. The paper is never in a broken state.

Bundled Resources

This skill includes ready-to-use scripts and templates. Paths are relative to the skill directory (find it via glob("**/KORTIX-paper-creator/")):

Resource Path Purpose
LaTeX template assets/template.tex Minimal IMRaD template -- copy to main.tex
BibTeX generator scripts/openalex_to_bibtex.py Convert OpenAlex API JSON to .bib entries
Compiler scripts/compile.sh pdflatex + bibtex pipeline with error reporting
Verifier scripts/verify.sh TDD verification suite (12 checks)

The TDD Rule

After writing every section:

1. WRITE section → sections/{name}.tex
2. COMPILE: bash compile.sh paper/{slug}/main.tex
3. VERIFY:  bash verify.sh paper/{slug}/
4. If FAIL → FIX → go to 2
5. If PASS → move to next section

The paper must compile and pass verification at every step. Never batch errors. Never skip verification. This is non-negotiable.

Pipeline Overview

Phase 1: SCAFFOLD  →  Create project, copy template, initialize empty sections
                       Verify: compiles to valid (empty) PDF ✓

Phase 2: CITE      →  Search literature (OpenAlex), build references.bib
                       Verify: bibliography builds without errors ✓

Phase 3: WRITE     →  Per-section composition in writing order
                       Verify: compile + verify after EACH section ✓

Phase 4: POLISH    →  Self-reflection pass, strict verification, final build
                       Verify: verify.sh --strict passes with zero warnings ✓

Filesystem Architecture

paper/{paper-slug}/
├── main.tex                    # Master file (\input sections)
├── references.bib              # BibTeX bibliography
├── sections/
│   ├── abstract.tex
│   ├── introduction.tex
│   ├── related-work.tex
│   ├── methods.tex
│   ├── results.tex
│   ├── discussion.tex
│   └── conclusion.tex
├── figures/                    # .pdf/.png figures (+ generation scripts)
├── data/                       # Supporting data, CSVs (optional)
└── build/                      # Compilation output
    └── main.pdf

Phase 1: Scaffold

1a. Create project structure

SLUG="paper-slug-here"
mkdir -p "paper/$SLUG"/{sections,figures,data,build}

1b. Copy and adapt the template

SKILL_DIR=$(find . -path "*/KORTIX-paper-creator/assets/template.tex" -exec dirname {} \; | head -1 | sed 's|/assets||')
cp "$SKILL_DIR/assets/template.tex" "paper/$SLUG/main.tex"

Edit main.tex: set \title{}, \author{}, and adapt \documentclass if targeting a specific venue:

Venue Document class Notes
General / arXiv \documentclass[11pt,a4paper]{article} Default in template
IEEE conference \documentclass[conference]{IEEEtran} Remove geometry package
ACM conference \documentclass[sigconf]{acmart} Remove geometry, use acmart bib style
Springer LNCS \documentclass{llncs} Remove geometry, use splncs04 bib style

1c. Initialize section stubs

Create each section file with a section header and TODO marker:

for sec in abstract introduction related-work methods results discussion conclusion; do
    SECTION_TITLE=$(echo "$sec" | sed 's/-/ /g; s/\b\(.\)/\u\1/g')
    if [ "$sec" = "abstract" ]; then
        echo '\begin{abstract}' > "paper/$SLUG/sections/$sec.tex"
        echo "% TODO: Write abstract" >> "paper/$SLUG/sections/$sec.tex"
        echo '\end{abstract}' >> "paper/$SLUG/sections/$sec.tex"
    else
        echo "\\section{$SECTION_TITLE}" > "paper/$SLUG/sections/$sec.tex"
        echo "\\label{sec:$sec}" >> "paper/$SLUG/sections/$sec.tex"
        echo "% TODO: Write $sec" >> "paper/$SLUG/sections/$sec.tex"
    fi
done

1d. Initialize empty bibliography

echo "% References for $SLUG" > "paper/$SLUG/references.bib"
echo "% Generated by openalex_to_bibtex.py and manual entries" >> "paper/$SLUG/references.bib"

1e. VERIFY: First green state

bash "$SKILL_DIR/scripts/compile.sh" "paper/$SLUG/main.tex"
bash "$SKILL_DIR/scripts/verify.sh" "paper/$SLUG/"

The empty paper must compile cleanly. This is your baseline. Every subsequent change maintains this green state.

Phase 2: Literature & Citations

2a. Search for papers

Load the openalex-paper-search skill for OpenAlex API reference. Search strategy:

# Seminal/highly-cited papers
curl -s "https://api.openalex.org/works?search=YOUR+TOPIC&filter=cited_by_count:>50,type:article,has_abstract:true&sort=cited_by_count:desc&per_page=15&select=id,display_name,publication_year,cited_by_count,doi,authorships,abstract_inverted_index&mailto=agent@kortix.ai"

# Recent work (last 2-3 years)
curl -s "https://api.openalex.org/works?search=YOUR+TOPIC&filter=publication_year:>2023,type:article&sort=publication_date:desc&per_page=15&mailto=agent@kortix.ai"

# Review/survey papers
curl -s "https://api.openalex.org/works?search=YOUR+TOPIC&filter=type:review&sort=cited_by_count:desc&per_page=10&mailto=agent@kortix.ai"

2b. Generate BibTeX entries

Pipe OpenAlex results through the converter:

curl -s "https://api.openalex.org/works?search=YOUR+TOPIC&per_page=20&mailto=agent@kortix.ai" | \
    python3 "$SKILL_DIR/scripts/openalex_to_bibtex.py" >> "paper/$SLUG/references.bib"

For non-OpenAlex sources (web pages, books, reports), add manual BibTeX entries:

@misc{authorYYYYtitle,
  title = {Page Title},
  author = {Author Name},
  year = {2024},
  url = {https://example.com/page},
  note = {Accessed: 2024-01-15}
}

@book{authorYYYYbook,
  title = {Book Title},
  author = {First Author and Second Author},
  year = {2024},
  publisher = {Publisher Name},
  edition = {2nd}
}

2c. VERIFY: Bibliography builds

bash "$SKILL_DIR/scripts/compile.sh" "paper/$SLUG/main.tex"
bash "$SKILL_DIR/scripts/verify.sh" "paper/$SLUG/"

Check: zero BibTeX errors, all .bib entries parse correctly.

Phase 3: Per-Section Writing

Writing Order

Write sections in this order (each one builds on context from the previous):

  1. Abstract (draft) -- establishes scope, will be revised last
  2. Introduction -- context, problem, gap, contribution
  3. Methods -- what you did and how
  4. Results -- what you found
  5. Discussion -- what it means, limitations
  6. Related Work -- needs full paper context to position contribution
  7. Conclusion -- recap, future work
  8. Abstract (final) -- revise to match actual paper content

The Writing Loop (for each section)

1. READ all previously written sections to build context
2. WRITE the section following the guidance below
3. COMPILE + VERIFY
4. SELF-REFLECT: re-read critically
   - Every factual claim has a \cite{}?
   - No filler words or empty hedging?
   - Logical flow between paragraphs?
   - Consistent with other sections?
   - Quantitative where possible?
5. REVISE if needed → COMPILE + VERIFY again
6. Section DONE → next

Section-Specific Guidance

Abstract (150-300 words)

Structure: Context (1-2 sentences) → ProblemApproachKey results (quantitative!) → Implications.

\begin{abstract}
[Context sentence setting the broad area.]
[Problem: what gap or challenge exists.]
[Approach: what this paper does -- "We propose/present/introduce..."]
[Results: key quantitative findings -- "Our method achieves X\% on Y, outperforming Z by N\%."]
[Implications: why it matters.]
\end{abstract}

Rules: No citations in abstract. No undefined acronyms. Every number must appear in the actual results.

Introduction

Funnel structure: broad context → specific problem → gap in existing work → your contribution → paper outline.

\section{Introduction}
\label{sec:introduction}

% Paragraph 1: Broad context (2-3 sentences)
% Paragraph 2: Narrow to specific problem
% Paragraph 3: What's been tried, what's missing (the gap)
% Paragraph 4: Your contribution -- be specific and concrete
%   "Our contributions are as follows:
%   \begin{itemize}
%     \item We propose...
%     \item We demonstrate...
%     \item We release...
%   \end{itemize}"
% Paragraph 5: Paper outline
%   "The remainder of this paper is organized as follows. \secref{sec:related-work} reviews..."

Rules: End with explicit contribution list. Every contribution claimed here must be backed by evidence in Results/Discussion.

Methods

Reproducible detail. Another researcher should be able to replicate your work from this section alone.

\section{Methods}
\label{sec:methods}

% Define notation before using it:
% "Let $\mathcal{D} = \{(x_i, y_i)\}_{i=1}^{N}$ denote the training set..."

% Algorithms as pseudocode:
% \begin{algorithm}[t]
%   \caption{Algorithm Name}
%   \label{alg:name}
%   \begin{algorithmic}[1]
%     \REQUIRE Input description
%     \ENSURE Output description
%     \STATE Step 1
%     \FOR{condition}
%       \STATE Step in loop
%     \ENDFOR
%     \RETURN result
%   \end{algorithmic}
% \end{algorithm}

Rules: Define all notation. Be precise about dimensions, loss functions, optimization details. Use consistent math notation throughout (define \newcommand macros in main.tex preamble).

Results

Lead with the main finding. Then support with details.

\section{Results}
\label{sec:results}

% Paragraph 1: Main result (the headline number)
% "Our method achieves X\% accuracy on Y benchmark, representing a Z\% improvement
%  over the previous state of the art~\citep{baseline2024}."

% Table with main results:
% \begin{table}[t]
%   \centering
%   \caption{Main results on [benchmark]. Bold = best, underline = second best.}
%   \label{tab:main-results}
%   \begin{tabular}{lcc}
%     \toprule
%     Method & Metric 1 & Metric 2 \\
%     \midrule
%     Baseline A~\citep{ref} & 82.3 & 91.1 \\
%     Baseline B~\citep{ref} & 84.7 & 92.4 \\
%     \textbf{Ours} & \textbf{87.2} & \textbf{94.1} \\
%     \bottomrule
%   \end{tabular}
% \end{table}

% Subsequent paragraphs: ablation studies, analysis, per-dataset breakdown
% Every figure and table MUST be referenced: "As shown in \figref{fig:x}..."

Rules: Every figure/table referenced in text at least once. Report statistical significance or confidence intervals where applicable. No cherry-picking -- report all metrics, including where you don't win.

Discussion

Interpretation, not repetition. Don't restate results -- explain what they mean.

\section{Discussion}
\label{sec:discussion}

% Paragraph 1: Interpret main findings -- what do the results tell us?
% Paragraph 2: Compare with prior work -- how do results relate to existing literature?
% Paragraph 3: Limitations -- be honest and specific
%   "Our approach has several limitations. First, ... Second, ..."
% Paragraph 4: Broader implications -- what does this enable?
% Paragraph 5: Future work -- concrete directions, not vague hand-waving

Rules: Limitations must be genuine, not token gestures. Future work should be specific enough to be actionable.

Related Work

Written after Methods/Results/Discussion so you can precisely position your contribution against the literature.

\section{Related Work}
\label{sec:related-work}

% Organize THEMATICALLY, not chronologically:
% \subsection{Theme A}
% "Several works have addressed ... \citep{a,b,c}. Specifically, A does X, B does Y.
%  In contrast, our approach differs by Z."
%
% \subsection{Theme B}
% "Another line of work focuses on ... \citep{d,e,f}.
%  While these methods ..., they do not address ..., which is the focus of our work."

Rules: For each group of related work, explicitly state how your work differs. Don't just list papers -- synthesize and contrast. Every paper cited here should be in references.bib.

Conclusion

No new information. Brief recap of contribution and results, then future directions.

\section{Conclusion}
\label{sec:conclusion}

% Paragraph 1: What we did and what we found (3-4 sentences max)
% "In this paper, we presented [method] for [problem].
%  Our approach [key insight]. Experiments on [benchmarks] demonstrate [main result]."
%
% Paragraph 2: Future directions (2-3 sentences)
% "In future work, we plan to explore ... and extend our approach to ..."

Rules: Claims here must match what's in Results. Don't overclaim. 0.5-1 page max.

LaTeX Conventions

Enforce these throughout all sections:

Cross-references (defined as macros in template):

\figref{fig:architecture}    % → "Figure 1"  (via cleveref)
\tabref{tab:main-results}    % → "Table 2"
\eqnref{eq:loss}             % → "Equation 3"
\secref{sec:methods}         % → "Section 4"

Citations:

\citet{smith2024}            % → "Smith et al. (2024)"     (textual -- subject of sentence)
\citep{smith2024}            % → "(Smith et al., 2024)"    (parenthetical)
\citep{smith2024,jones2023}  % → "(Smith et al., 2024; Jones et al., 2023)"

Use \citet when the author is the grammatical subject: "\citet{smith2024} showed that...". Use \citep for parenthetical: "Recent work has shown significant gains \citep{smith2024}."

Figures:

\begin{figure}[t]
  \centering
  \includegraphics[width=\linewidth]{figure-name}
  \caption{Descriptive caption. Best viewed in color.}
  \label{fig:figure-name}
\end{figure}

Tables:

\begin{table}[t]
  \centering
  \caption{Caption ABOVE the table (convention).}
  \label{tab:table-name}
  \begin{tabular}{lcc}
    \toprule
    Column 1 & Column 2 & Column 3 \\
    \midrule
    Row 1 & data & data \\
    Row 2 & data & data \\
    \bottomrule
  \end{tabular}
\end{table}

Math notation: Define reusable macros in the preamble of main.tex:

\newcommand{\loss}{\mathcal{L}}
\newcommand{\model}{\mathcal{M}}
\newcommand{\dataset}{\mathcal{D}}
\newcommand{\R}{\mathbb{R}}
\newcommand{\E}{\mathbb{E}}
\newcommand{\argmin}{\operatorname{argmin}}

Phase 4: Polish & Final Verification

4a. Self-Reflection Pass

After all sections are written, re-read the entire paper and check:

  1. Consistency: Same terminology throughout? Abstract claims match Results numbers?
  2. Redundancy: Any paragraph repeated across sections? Cut ruthlessly.
  3. Flow: Smooth transitions between sections? Each section starts by connecting to the previous?
  4. Claims: Every claim in Intro/Abstract backed by evidence in Results?
  5. Notation: Math notation consistent? All symbols defined before first use?

4b. Strict Verification

bash "$SKILL_DIR/scripts/verify.sh" "paper/$SLUG/" --strict

The --strict flag treats warnings as failures. All 12 checks must pass:

  • PDF exists and is non-trivial size
  • Zero LaTeX errors
  • Zero undefined \ref{} references
  • Zero undefined \cite{} citations
  • Zero BibTeX errors
  • Zero overfull boxes > 10pt
  • All referenced figure files exist
  • All sections have content (no empty stubs)
  • All .bib entries cited at least once (no dead refs)
  • Zero TODO/FIXME comments remaining
  • Abstract word count 100-300
  • All figures/tables referenced in text

4c. Final Build

bash "$SKILL_DIR/scripts/compile.sh" "paper/$SLUG/main.tex" --clean

The --clean flag removes auxiliary files, leaving only the PDF. Deliverable: paper/{slug}/build/main.pdf.

Figure Generation

When the paper needs figures, generate them as PDF or high-res PNG and save to figures/. Keep the generation script alongside for reproducibility.

matplotlib (recommended for most plots)

import matplotlib.pyplot as plt
import matplotlib
matplotlib.rcParams.update({
    'font.size': 12,
    'font.family': 'serif',
    'figure.figsize': (6, 4),
    'figure.dpi': 150,
    'savefig.dpi': 300,
    'savefig.bbox': 'tight',
    'axes.grid': True,
    'grid.alpha': 0.3,
})

fig, ax = plt.subplots()
# ... plot data ...
ax.set_xlabel('X Label')
ax.set_ylabel('Y Label')
ax.legend()
fig.savefig('paper/{slug}/figures/plot-name.pdf')
plt.close()

Rules:

  • Always save as PDF for vector quality (PNG only for raster images like photos/heatmaps)
  • Font size >= 10pt (readable when scaled to column width)
  • Save the plotting script to figures/plot_name.py for reproducibility
  • Use consistent color palettes across all figures
  • Include axis labels, legends, and titles

Diagrams and Architecture Figures

For architecture/flow diagrams, either:

  • Generate with matplotlib/tikz and save as PDF
  • Use image-gen tool and save to figures/
  • Hand-draw in any tool and export as PDF/PNG

LaTeX Troubleshooting

Common errors and fixes:

Error Cause Fix
Missing $ inserted Math symbol outside math mode Wrap in $...$
Undefined control sequence Typo in command or missing package Check spelling, add \usepackage
File not found Wrong path in \includegraphics Check \graphicspath and filename
Citation undefined Key mismatch between \cite and .bib Check exact key spelling
Missing \begin{document} Error in preamble Check for typos above \begin{document}
Overfull \hbox Line too wide Rephrase, add \linebreak, or use \sloppy locally
Package clash Two packages conflict Check compatibility, load order matters

When compile fails: read the first error in the log (subsequent errors often cascade from the first). Fix it, recompile. Don't try to fix multiple errors at once.

Scientific Writing Standards

Quick-reference checklist for quality prose:

  • Active voice: "We propose X" not "X is proposed"
  • Strong verbs: "achieves" not "is able to achieve", "reduces" not "leads to a reduction in"
  • Quantitative: "improves by 12%" not "significantly improves"
  • Define before use: Every acronym, symbol, and technical term defined on first use
  • Parallel structure: Lists and comparisons use the same grammatical form
  • No orphan pronouns: "This approach" not "This" alone at the start of a sentence
  • Hedging (appropriate): "suggests" / "indicates" for uncertain claims; "demonstrates" / "establishes" for proven ones
  • Person: "We" for authors' actions; passive voice sparingly for general facts; never "I" in multi-author papers
  • Tense: Present for established facts and paper structure ("Section 3 presents..."); past for what you did ("We trained the model on...")
  • Citations as nouns: "\citet{smith2024} showed..." (not "It was shown in \citep{smith2024}...")
Weekly Installs
3
GitHub Stars
2
First Seen
13 days ago
Installed on
cline3
github-copilot3
codex3
kimi-cli3
gemini-cli3
cursor3