migrate-content-ia
Migrate content IA (redirects + links + anchors)
Use this skill when pages move or rename under content/ and you must
preserve old public URLs and/or fix cross-references. Work in phases;
choose PR-scoped vs full-site mode per run.
Read first: CLAUDE.md / AGENTS.md (URL rules, vendored areas, external
links, special cases) and hugo.yaml (permalinks, refLinksErrorLevel,
disablePathToLower). For prose and link text, follow STYLE.md; for
components, front matter, and link examples, follow COMPONENTS.md.
Related skills: research helps map moves and find inbound links; write commits minimal edits. Run this skill’s phases after the move is identified (or in parallel with research for large IA work).
Agent: required procedure (do not skip)
Common mistake (wrong): use git diff main...HEAD (or the PR’s file
list) as the full set of places to fix links for a migration. That set shows
what moved; it is not the list of every page that points to a
moved page. Inbound stragglers are often in files the PR never touched. You
must still sweep the repo for every string in the old path and published-URL set
for this run, not only for “files in the diff.”
Definition of done (when the migration is finished): Both of the following (unless the user or AGENTS.md explicitly defers a List 2 item in Phase 3; document the deferral):
docker buildx bake validatepasses for the branch, with no new build/link errors from this work.- A sweep of the old path and published-URL set for this run (see
Sweep commands below) finds no remaining
migration-relevant inbound reference—including:
- links to an old source path (plain
.mdand equivalentrefforms), - links that use the old path and a
#fragment, - and, where your mapping includes them, old published-style
link:/url:/ full-site URL strings,
except intentional entries to keep: for examplealiaseson the new canonical page, or redirects.yml sources you must not edit per policy. (A hit on a source that is only analiasline on the new page is expected—do not “fix” that away; distinguish alias rows from straggler links in body or nav config.)
- links to an old source path (plain
Chaining (policy): when this branch’s content work is ready for handoff,
run the review-changes skill on
main...HEAD (or merge-base…HEAD for a different target branch) so
the whole branch is re-read for cross-page issues before opening a PR. Do
not treat phases 0–3 alone as the final check.
Run in order (mandatory for agents):
- Scope the moves (mapping input): set the Git range like review-changes
(for a PR to
main:git diff --name-only main...HEAD; for another target:BASE=$(git merge-base <target-branch> HEAD)thengit diff --name-only $BASE...HEAD, as in Phase 0.5). Include renames; build the old → new table (source and published) per Phase 0. - Sweep and list: for every old path/URL in that table, run
Sweep commands on the allowed trees. Record
every hit as List 1 (no
#) or List 2 (old path with#...) per Phase 0.5. - Phased edits: Phase 1 (
aliases), then Phase 2 (List 1), then Phase 3 (List 2) with no guessing—as in the sections below. - Re-sweep the same old-path set, then run
docker buildx bake validate. The Definition of done above is met or you have explicit defers for the remainder. - review-changes: run review-changes
on the branch vs
main…HEAD(or the correct base) before a PR.
Sweep commands
Use a repository search (e.g. rg / your IDE) so nothing in the
allowed scope is only eyeballed.
Trees to include (at minimum): all of content/, plus data/ and
layouts/ when a migration can appear in config, link:-like fields,
shortcodes, or hardcoded path strings. Follow Vendored / generated rules in
AGENTS.md; do not edit disallowed files.
What to search for (repeat per row in the old side of the mapping):
- Hugo / source form: path segments that identify the old file, e.g.
manuals/.../old-segment/...or../old-segment/.../page.mdas your tree uses; include variants that still appear in the repo. - Published / site form: e.g.
/admin/.../old-slug/in front matter, navurl:, orhttps://docs.docker.com/...in allowed files—match the file’s established pattern, per Conventions below. - Anchors: search for the old path string; matches that also include
#...belong on List 2 for Phase 3 unless the whole link is a pure path-only case.
scripts/scope-pr-files.sh (if present) prints
PR_SCOPE_FILES only—it does not replace this sweep. Use it to build
the old → new table, not to list where inbound links were fixed.
Progressive disclosure (optional)
The procedure below stays in this file. If a run produces a very large
old → new URL table, store that table in reference.md in this skill
directory and link it from the task summary, so the agent reads the long
mapping only when needed.
Modes
-
PR-scoped (typical for a single PR)
- What the PR “owns” (focus): use
git diff/base...HEADto know which pages and renames the branch actually moves (PR_SCOPE_FILES). The old → new mapping and List 1 / List 2 for this migration are defined from that work, not from unrelated areas. - Where to look for stale references (sweep): search broadly—typically all
of
content/(and config, shortcodes, layouts, per Conventions)—for inbound links and fields whose target is an old path or URL in this PR’s mapping. Inbound stragglers are often in files the PR never touched; finding them is in scope for this migration. - What to edit: update any file in the allowed trees that contains a
migration-relevant reference (target ∈ this PR’s old path set) according
to the phases below. Do not treat
PR_SCOPE_FILESas a hard limit on which files you may save for inbound link repairs (unless project policy for a given PR says otherwise; then follow policy and defer out-of-PR file fixes). - Out of scope (defer / ignore in this run): link and anchor problems that
are not about this PR’s old→new map—e.g. a different area’s own slug
issues, rot unrelated to the remapped path set. Example: a PR that only
remaps
content/strawberry/...should not “fix the whole site”; it should still fix a link undermango/…that points at an oldstrawberry/…path in the mapping, and should not chase mango/-only issues that do not involve those old targets.
- What the PR “owns” (focus): use
-
Full-site (complete migration after the PR)
- Update stragglers across the repo (or all inbound links to moved
sections), including config-driven
link:fields if policy allows. - Still make minimal edits; no drive-by rewrites to unrelated targets outside the run’s declared mapping and lists.
- Update stragglers across the repo (or all inbound links to moved
sections), including config-driven
No guessing
- The agent must not guess replacement paths, published URLs, or fragment
IDs (including for consolidated pages, renamed headings, or
“semantic” remaps of
#anchor→ new#…). If the user has not given an explicit new target, ask, defer, or stop per AGENTS.md; never infer, autocomplete, or substitute a plausible fragment from the target page’s heading list. That rule applies in every phase, including after validation in Phase 3.
Conventions (links, anchors, redirects)
Front matter aliases (redirects)
- Per COMPONENTS.md,
aliasesare URLs that redirect to this page. - Add or merge on the new canonical page; do not drop unrelated
entries. Match local examples: published-style paths (leading
/), and trailing/when that matches existing pages in the same area. - No speculative redirects for URLs that were never published.
- Collision check before adding: no other page or redirect may already own the same old path.
- If the site also uses
data/redirects.yml, only add entries when project policy requires it; avoid duplicating the same old URL inaliasesandredirects.ymlunless maintainers do.
Internal links in Markdown (STYLE.md + COMPONENTS.md)
- Use relative paths to source files (e.g.
../section/page.md) with.md, following COMPONENTS.md examples, unless the file already uses an established pattern (e.g. somelink:or nav fields use published paths withoutmanualsor.md— match the surrounding file). - Keep CLAUDE.md / AGENTS.md rules: internal ref targets under
content/manuals/...often use the full/manuals/...path; published URLs omit themanualssegment—do not confuse the two when fixing links. - Link text (STYLE.md): descriptive, ~5 words; no “click here” or “learn more”; no end punctuation inside the link text; no bold/italic on link text unless normal in the sentence.
- Headings (STYLE): sentence case; do not rename headings in passing unless the migration requires it (heading changes break fragments).
Shortcodes and layouts (links not only in Markdown)
- Phase 2–3 scope includes any shortcode or layout partial (under
Modes, search broadly for inbound links to the migration; edits follow
the same file-level rules as for Markdown) that emits links: e.g.
ref/relref,linkfields in shortcode args, or hardcodeddocs.docker.com/ path strings. Grep for old paths, slugs, and fragments underlayouts/shortcodes/(andlayouts/_default/if partials build nav). - Match each file’s existing pattern; do not rewrite working shortcode style just to “clean up.”
Fragments / anchors (Phase 3)
- List 1 / List 2: fragment-bearing cross-references to old paths are tracked on List 2 in Phase 0.5; do not bulk-rewrite them in the List 1 pass (Phase 2). See Phase 0.5 and Phase 2.
- Valid
#fragmentvalues: after the user supplies a new fragment, it should match the target page’s generated heading ID (Hugo slugification; see CLAUDE.md / AGENTS.md). The agent still validates (see Phase 3) and must not “pick” a different id from the page to replace a bad answer—No guessing. - Same-page:
[Text](#section-id). - Cross-page: when user-provided,
#fragmentmust still be checked against the target file. Validate fragments in shortcodes the same way as in body Markdown.
External URLs (AGENTS.md)
- Do not commit guessed replacement URLs. If a URL cannot be verified, treat as blocked or drop the fragment per AGENTS guidance. See also No guessing above; internal and external link targets are treated the same for inference: none without user input or a verified source.
Special cases (AGENTS.md)
- Engine API version pages: respect coordinated
/latest/aliasesrules—never leave two version files both owning/latest/. - Vendored / generated trees: read-only; see CLAUDE.md. Do not “fix” links there if policy forbids.
Phase 0 — Discovery (read-only; may use whole repo)
- Read hugo.yaml (permalinks,
refLinksErrorLevel,disablePathToLower). - From the branch (diff, renames), build a mapping table:
- old source path → new source path
- old published URL → new published URL (from permalink rules)
- Case: with
disablePathToLower: true, filesystem path case appears in URLs—directory and link casing must match (e.g.setupvsSetup). - When planning inbound link fixes, treat old-path references as two
categories: no fragment vs with
#fragment. That split feeds List 1 and List 2 in Phase 0.5 and drives Phase 2 ordering (see there).
Phase 0.5 — PR-scoped evaluation (required before edits in PR mode)
-
Set
PR_SCOPE_FILES(Git scope for PR mode)- When the PR targets
main, use the same triple-dot form as review-changes:
git diff --name-only main...HEAD - For a different target branch or a custom base, use the merge base:
BASE=$(git merge-base <target-branch> HEAD)
then:
git diff --name-only "$BASE"...HEAD - Those paths define what moved in the branch; they are the primary input
to the old → new path/URL table. They are not a hard cap on where
to search for inbound links (see Modes): sweeps for links to old
paths usually cover all of
content/(and other trees per Conventions). - If project policy limits edits to the diff for a given PR, follow that and defer link fixes in files outside the diff; note the exception in the task if the user relaxes that policy.
- When the PR targets
-
Build checklists (see Modes for sweep vs area-of-work):
- path/URL mapping this run must honor (old source path → new; old published → new, from the PR’s moves in PR-scoped mode, or the declared full migration in full-site mode)
- List 1 — old path, no fragment: every inbound reference, found on
the sweep surface, to a moved old path that does not include a
#...fragment (e.g.…/banana.mdin the repo’s link style for that file). - List 2 — old path with fragment: every inbound reference, found on
the same sweep, to a moved old path that includes a
#...fragment (e.g.…/banana.md#anchoror the published-style equivalent in context). The same old path string may appear on both List 1 and List 2 for different links; duplication across the two lists is OK. - Matching rules: when recording List 1 / List 2, use one consistent
path representation for comparison (e.g. relative
../path/banana.mdvs root-anchored) per the conventions in this doc and the surrounding file’s established pattern. Agents compare and skip List 2 links in the List 1 pass using the same representation rules.
-
Out of scope for the lists: only include references whose old target is in this run’s mapping. Do not build List 1/2 for unrelated mango/ (or other) problems unless those links also target an old path that this migration renames. Defer those issues separately (see Modes).
Phase 1 — aliases (old published URLs)
- On each new canonical page, add or merge
aliasesfor every real former public URL. - Do not strip existing unrelated aliases.
- PR-scoped: add aliases only where the canonical file is in scope or the project requires it; otherwise list missing alias targets for follow-up.
Phase 2 — In-repo link reference updates
- List 1 first (path only): update references that belong to List 1
(old path, no fragment). Replace old source paths or old published URLs
with the new targets; preserve each file’s link pattern (relative vs
root-anchored
.mdpaths). Do not apply the same bulk path replacement to links that appear in List 2 (old path with#...) during this sub-step—leave every List 2 link unchanged for now. - After List 1 is complete: re-scan the same sweep surface as
in Phase 0.5 (e.g. all of
content/plus config) or print a clear list of all remaining List 2 entries. Those links should still point at the old path and old fragment until Phase 3. - Full-site (extra sweep): after steps 1–2, still use AGENTS “Page
deletion checklist”-style thoroughness for config / front matter
link:and similar so nav and grids are not left on old slugs. Apply the List 1 / List 2 rules there too: path-only old references first; defer fragment-bearing rewrites in line with List 2 until Phase 3. - PR-scoped (which files to change): apply List 1 and later Phase 3 updates
to every file the sweep finds with a migration-relevant reference
(inbound to an old path in the mapping), including files not in
PR_SCOPE_FILES, per Modes. Log and defer (do not “fix”) unrelated stragglers. If policy forbids out-of-PR file edits, defer per step 1 of Phase 0.5. - Include shortcodes and layout partials (see Conventions and Modes for sweep vs focus).
Phase 3 — List 2: interactive path and fragment resolution
Prerequisites: Phase 2 has updated List 1; List 2 still lists old
path + #... (unchanged) for this migration. See Modes for which files
may be edited; No guessing applies.
- Print List 2 to the user: every remaining old path +
#anchor(in the agreed representation), so nothing is hidden before the loop. - For each distinct
old-path#oldAnchor(or process in the order the user prefers, one at a time):- Ask: What is the new path (and fragment, if any) for this content? The
user may give a new source path, published URL, and/or
#newAnchorper project conventions. - Validate the user’s answer: open the target page (or resolve the
target) and check that
#newAnchor(if any) exists as a real heading / generated id on that page, per CLAUDE.md / AGENTS.md (same rules as the rest of the site). Do not replace the user’s fragment with a “better” one from the file. - If validation fails (unknown target file, or
#newAnchornot found on the page): warn clearly (what failed: path vs missing fragment), then ask again for a corrected path and/or fragment. Repeat until validation passes or the user defers / drops the fragment (per AGENTS.md). Never guess a new fragment to fix the problem. - When validation passes: update all in-repo references that match
that same
old-path#oldAnchorto the user-approvednew-path#newAnchor(respect each file’s link style; include shortcodes/layouts on the same sweep surface as Phase 2).
- Ask: What is the new path (and fragment, if any) for this content? The
user may give a new source path, published URL, and/or
- Repeat from step 1: re-print or re-scan for List 2 until it is empty or the user defers the remainder.
- PR-scoped / full-site: the loop is the same. Edits follow Modes: migration-relevant inbound links may live in any file on the sweep; do not expand into unrelated link debt from other areas. Defer as in Modes and Phase 0.5.
Optional: scripts helper
This skill includes a small scope helper so agents do not re-derive Git
recipes. See scripts/scope-pr-files.sh — it prints
paths in PR scope for a given target branch (default main).
Verification
docker buildx bake validate
Use the Definition of done in Agent: required procedure (do not skip)
as the final bar: validate must pass, and the sweep must be clean for
plain and #fragment old-path references, or the remainder must be
explicitly deferred in Phase 3 per AGENTS.md / the user. Mid-run,
Phase 2 may still leave List 2 links unchanged until Phase 3; that
intermediate state is not the finished migration.