citation-diversifier

SKILL.md

Citation Diversifier (budget-as-constraints) [NO NEW FACTS]

Purpose: fix a common survey failure mode:

  • the draft reads under-cited (or reuses the same few citations everywhere)
  • the pipeline fails the global unique-citation gate

This skill does not change prose by itself. It produces a constraint sheet: output/CITATION_BUDGET_REPORT.md.

Inputs

  • output/DRAFT.md
  • outline/outline.yml (H3 ids/titles; used to allocate budgets per subsection)
  • outline/writer_context_packs.jsonl (source of allowed_bibkeys_{selected,mapped,chapter,global} per H3)
  • citations/ref.bib

Output

  • output/CITATION_BUDGET_REPORT.md

Non-negotiables (NO NEW FACTS)

  • Only propose citation keys that exist in citations/ref.bib.
  • Only propose keys that are in-scope for the target H3 (prefer subsection-first scope; use chapter/global only when truly cross-cutting).
  • Do not propose “padding citations” that would require adding new claims or new numbers.

What a good budget report looks like (contract)

The report should feel like a constraint sheet, not a random list:

  • It states the blocking policy target and the gap-to-target (how many unique keys are missing; policy default is recommended).
  • For each H3, it proposes a scope-safe budget sized to actually close the gap:
    • small gaps: 3-6 keys / H3 is often enough
    • A150++ gaps: plan for ~6-12 keys / H3 (and avoid duplicates across H3 budgets)
  • It gives placement guidance (where in the subsection those keys can be embedded without adding new facts).

Canonical (parseable) lines required (downstream validators depend on these):

  • The target is derived from queries.md:citation_target (recommended by default for A150++).
  • - Global target (policy; blocking): >= <N> ...
  • - Gap: <K> (gap-to-target; if 0, injection can be a no-op PASS)

Optional (always reported; may be blocking depending on citation_target):

  • - Global recommended target: >= <N> ...
  • - Gap to recommended: <K>

Recommended prioritization (scope-safe):

  • allowed_bibkeys_selectedallowed_bibkeys_mappedallowed_bibkeys_chapter
  • Use allowed_bibkeys_global only for:
    • benchmarks/protocol papers
    • widely-used datasets/suites
    • cross-cutting surveys/method papers referenced across chapters

How this connects to writing (LLM-first)

After you generate the budget report:

  • Apply it using citation-injector (LLM edits to output/DRAFT.md, NO NEW FACTS).
  • Then run draft-polisher to remove any “budget dump voice” while keeping citation keys unchanged.

Important: citation-injector is LLM-first. Its script is validation-only.

Workflow

  1. Diagnose the global situation
  • Read output/DRAFT.md and estimate the “unique-key gap” (or use pipeline-auditor’s FAIL reason).
  1. Allocate budgets per H3 (scope-first)
  • Use outline/outline.yml to enumerate H3s in paper order.
  • For each H3, read its allowed key sets from outline/writer_context_packs.jsonl.
  • Pick a small set of unused keys that strengthen positioning without requiring new claims.
  1. Write output/CITATION_BUDGET_REPORT.md Required structure:
  • - Status: PASS|FAIL
  • - Global target (policy; blocking): >= <N> ...
  • - Gap: <K>
  • ## Summary (gap + strategy)
  • ## Per-subsection budgets (H3 id/title → suggested keys → placement hint)

Script (optional; deterministic report generator)

If you want a deterministic first-pass budget report, run the helper script. Treat it as a baseline and refine the plan as needed.

Quick Start

  • python .codex/skills/citation-diversifier/scripts/run.py --help
  • python .codex/skills/citation-diversifier/scripts/run.py --workspace workspaces/<ws>

All Options

  • --workspace <dir>
  • --unit-id <U###> (optional)
  • --inputs <semicolon-separated> (rare override; prefer defaults)
  • --outputs <semicolon-separated> (rare override; default writes output/CITATION_BUDGET_REPORT.md)
  • --checkpoint <C#> (optional)

Examples

  • Default IO:
    • python .codex/skills/citation-diversifier/scripts/run.py --workspace workspaces/<ws>

Done criteria

  • output/CITATION_BUDGET_REPORT.md exists and has actionable, in-scope budgets.
  • After applying the plan via citation-injector, pipeline-auditor no longer FAILs on global unique citations.
Weekly Installs
25
GitHub Stars
296
First Seen
Jan 23, 2026
Installed on
claude-code21
gemini-cli20
cursor18
opencode18
codex18
antigravity15