create-docs-list-script-and-refactor-docs by pgahq/agent-skills

Add an agent-friendly docs workflow to a repo, in a way that works across:

Ruby on Rails repos (via a rake task)
JS/TS repos (via npm/yarn/pnpm/bun scripts)
other setups (the same concept; different runner)

Important: the code snippets in this skill are examples, not drop-in guarantees. You must implement the approach in a way that fits the repo, then run it, debug it, and iterate until it works in that specific codebase.

Goal

Create a single, easy-to-run "docs inventory" command that:

lists all markdown docs (typically in docs/, including nested directories)
validates each doc's YAML frontmatter (at minimum: the frontmatter exists and summary is present and non-empty)
prints read_when hints when present, so agents can quickly decide what to read next
reminds contributors/agents to read relevant docs before changing code

Where docs live

The repo must use docs/ as the primary (and eventually only) docs location.

If docs/ does not exist, create it.
If docs exist elsewhere (common: doc/, documentation/, guides/, wiki/, scattered README* files, etc.), migrate them into docs/ safely (see required rules below).

Migration rules (required)

Preflight impact analysis (required before moving/deleting anything):
- Identify all markdown docs that are not already under docs/ (and any non-markdown docs you plan to convert).
- Check how each doc path is used in the repo before you change it:
  - Search for references to the old path(s) across the whole repo (code, configs, CI, scripts, READMEs, other docs).
  - Pay special attention to path-sensitive integrations like doc generators, API schema loaders, linters, CI steps, and internal tooling.
  - For markdown docs, also check markdown links that will break when a file moves (relative links, image refs, intra-doc anchors).
- Decide how you will keep the repo correct after the move:
  - Update references to the new docs/... paths everywhere they appear.
  - If a doc is referenced externally or from places you can't safely update (e.g., known inbound links, published URLs, other repos), consider leaving behind a small stub at the old location pointing to the new location (or keeping a copy) until the ecosystem is updated.
Consolidate: move docs from other locations into docs/ so agents have one place to look (but only after the preflight analysis above).
Convert to markdown: non-markdown docs should be converted to *.md (and updated for readability).
Add frontmatter: every docs/**/*.md must satisfy the frontmatter contract described below (summary required; read_when strongly recommended).
Organize with subdirectories when it helps: subdirectories are allowed (and encouraged) when documents are fundamentally different types. Examples:
- docs/architecture/ (system design, data flow)
- docs/runbooks/ (operations, on-call, deploys)
- docs/dev/ (local setup, lint/test, workflows)
- docs/api/ (API usage, schemas, integrations)
- docs/product/ (requirements, behavior, UX notes)

Avoid deep nesting unless it genuinely improves discoverability.
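The preflight search for references to old doc paths can be sketched in Node.js as follows. This is a minimal illustration, not part of the skill itself: `collectFiles`, `findReferences`, and `SKIPPED_DIRS` are hypothetical names, and matching is a plain substring check, so expect false positives (for example, `doc/` also matches `mydoc/`). In many repos `git grep` or `rg` will do the same job faster.

```javascript
const { readdirSync, readFileSync } = require('node:fs');
const { join, relative } = require('node:path');

// Hypothetical skip list; adjust for the repo (build output, vendored code, etc.).
const SKIPPED_DIRS = new Set(['.git', 'node_modules']);

// Recursively collect every regular file under rootDir, as paths relative to rootDir.
function collectFiles(rootDir, dir = rootDir) {
  const files = [];
  for (const entry of readdirSync(dir, { withFileTypes: true })) {
    const fullPath = join(dir, entry.name);
    if (entry.isDirectory()) {
      if (!SKIPPED_DIRS.has(entry.name)) files.push(...collectFiles(rootDir, fullPath));
    } else if (entry.isFile()) {
      files.push(relative(rootDir, fullPath));
    }
  }
  return files.sort();
}

// Return one record per (line, legacy path) pair where the line mentions that path.
// Plain substring matching: review the hits by hand before rewriting anything.
function findReferences(rootDir, legacyPaths) {
  const hits = [];
  for (const file of collectFiles(rootDir)) {
    const lines = readFileSync(join(rootDir, file), 'utf8').split('\n');
    lines.forEach((text, index) => {
      for (const legacyPath of legacyPaths) {
        if (text.includes(legacyPath)) {
          hits.push({ file, line: index + 1, legacyPath });
        }
      }
    });
  }
  return hits;
}

// Example usage (assumed legacy locations):
// console.table(findReferences(process.cwd(), ['doc/', 'documentation/']));
```

Running the same search again after the move, as the verification step requires, should return no hits other than any stubs you intentionally left behind.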

Migration completion rules (required)

After moving/converting docs, you must make the repo consistent:

Update all references: fix all repo-local links and references to moved docs (including markdown links in other docs/READMEs).
Fix any path-sensitive tooling: update configs/scripts/CI that referenced the old locations.
.gitignore hygiene (required):
- If the repo previously ignored legacy doc locations (e.g., doc/) because they were generated, verify whether that ignore is still correct after migration.
- Ensure the new canonical docs location (docs/) is not accidentally ignored by existing ignore rules.
- If you add any helper scripts (e.g. scripts/docs-list.*) or rake tasks, ensure they are tracked and not ignored.
- Only ignore generated artifacts (if any) rather than the source docs themselves.
Verify:
- Run the docs inventory command end-to-end (docs:list) and ensure it lists all expected docs and correctly validates frontmatter.
- Re-run your repo search for the old doc paths and ensure there are no remaining references (unless you intentionally kept stubs/redirect notes).
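Relative markdown links are the easiest references to break during a move. A rough way to check them after migration is sketched below. It assumes repo-relative POSIX-style paths and only understands inline links and images of the form `[text](target)`; reference-style links, HTML tags, and site-absolute targets (`/...`) are not checked. `findBrokenLinks` and `LINK_PATTERN` are hypothetical names introduced for this illustration.

```javascript
const { posix } = require('node:path');

// Matches inline links and images: [text](target) or ![alt](target "title").
const LINK_PATTERN = /!?\[[^\]]*\]\(\s*([^)\s]+)[^)]*\)/g;

// docPath: repo-relative path of the markdown file (e.g. 'docs/dev/setup.md').
// content: the file's text. existingPaths: Set of repo-relative paths that exist.
function findBrokenLinks(docPath, content, existingPaths) {
  const broken = [];
  for (const match of content.matchAll(LINK_PATTERN)) {
    const target = match[1];
    // Skip URLs with a scheme (https:, mailto:), same-page anchors, and site-absolute paths.
    if (/^[a-z][a-z0-9+.-]*:/i.test(target)) continue;
    if (target.startsWith('#') || target.startsWith('/')) continue;

    // Drop any #fragment or ?query before resolving the file path.
    const filePart = target.split('#')[0].split('?')[0];
    if (!filePart) continue;

    const resolved = posix.join(posix.dirname(docPath), filePart);
    if (!existingPaths.has(resolved)) broken.push({ target, resolved });
  }
  return broken;
}
```

The set of existing paths can come from whatever file walk the repo already uses; anything this returns is a candidate for a path update or a stub at the old location.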

Required frontmatter contract

Each *.md doc must start with YAML frontmatter containing:

summary (required): one sentence describing what the doc covers
read_when (strongly recommended): a list of short "when to read this" hints (can be empty)

Example:

---
summary: One-sentence description of what this doc covers
read_when:
  - a short hint for when this doc matters (e.g. "editing routing", "adding tests", "touching GraphQL queries")
  - another hint
  - more hints
---
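When migrating many docs that have no frontmatter yet, a small helper can prepend a stub in the shape shown above. The sketch below is one possible approach, not a required part of the skill: `withFrontMatter` is a hypothetical name, the summary is written as a double-quoted string (via `JSON.stringify`) so that colons in it do not break the YAML, and hints are assumed to be plain text with no YAML-special characters. The summary text itself still has to be written by someone who has read the doc.

```javascript
// Prepend a frontmatter block to markdown content that does not have one yet.
// Content that already starts with '---' is returned unchanged.
function withFrontMatter(content, summary, readWhen = []) {
  if (content.startsWith('---')) return content;

  const lines = ['---', `summary: ${JSON.stringify(summary)}`];
  if (readWhen.length > 0) {
    lines.push('read_when:');
    for (const hint of readWhen) lines.push(`  - ${hint}`);
  }
  // Close the block and leave one blank line before the original content.
  lines.push('---', '');
  return `${lines.join('\n')}\n${content}`;
}

// Example usage with made-up values:
console.log(withFrontMatter('# Setup\n', 'How to set up a local dev environment', ['editing routing']));
```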

Add repo-specific entrypoint

Pick an entrypoint that matches the repo's tooling, and always name it docs:list.

You must test this in the repo (run the command end-to-end and fix whatever breaks). Don't assume the examples below will work unchanged.

If the repo is JS/TS (npm/yarn/pnpm/bun)

Add a script named docs:list to package.json. The exact runner varies:

"docs:list": "node scripts/docs-list.js"

Other common variants:

"docs:list": "tsx scripts/docs-list.ts"

"docs:list": "bun scripts/docs-list.ts"

If the repo is Ruby on Rails

Add a rake task named docs:list, and run it with:

bundle exec rake docs:list (most common)
or rake docs:list (if bundler isn't used / already implied)

Then (optionally) add a convenience entry to bin/ or a Makefile, but the rake task is the reliable "native" Rails option.

Add `AGENTS.md` instruction (generic)

Add this to the repo root AGENTS.md (create AGENTS.md if it does not exist), using the repo's actual command. This instruction is required so agents consistently discover docs before editing code:

Before making changes in this repo, run the docs inventory command (`docs:list`) to understand how this repo works and which docs you should read first. If `docs:list` is missing, add it before proceeding.

Examples:
- JS/TS: `npm run docs:list` / `yarn docs:list` / `pnpm docs:list` / `bun run docs:list`
- Rails: `bundle exec rake docs:list`

Once you've finished implementing everything and explaining it to the user (with an excerpt / example of the results), offer to find and correct any issues with the docs.

================== Example implementation (plain Node.js): scripts/docs-list.js

Use this when the repo is JS but not TypeScript, or when you want to avoid requiring tsx.

This is an example implementation. After adding it, run it in the repo and adjust as needed (paths, exclusions, output format, Node version constraints).

#!/usr/bin/env node

const { readdirSync, readFileSync } = require('node:fs');
const { join, relative } = require('node:path');

process.stdout.on('error', (error) => {
  if (error && error.code === 'EPIPE') process.exit(0);
  throw error;
});

const DOCS_DIR = join(process.cwd(), 'docs');
const EXCLUDED_DIRS = new Set(['archive', 'research']);

function walkMarkdownFiles(dir, baseDir) {
  const entries = readdirSync(dir, { withFileTypes: true });
  const files = [];

  for (const entry of entries) {
    if (entry.name.startsWith('.')) continue;

    const fullPath = join(dir, entry.name);
    if (entry.isDirectory()) {
      if (EXCLUDED_DIRS.has(entry.name)) continue;
      files.push(...walkMarkdownFiles(fullPath, baseDir));
    } else if (entry.isFile() && entry.name.endsWith('.md')) {
      files.push(relative(baseDir, fullPath));
    }
  }

  return files.sort((a, b) => a.localeCompare(b));
}

function extractMetadata(fullPath) {
  const content = readFileSync(fullPath, 'utf8');

  if (!content.startsWith('---')) {
    return { summary: null, readWhen: [], error: 'missing front matter' };
  }

  const endIndex = content.indexOf('\n---', 3);
  if (endIndex === -1) {
    return { summary: null, readWhen: [], error: 'unterminated front matter' };
  }

  const frontMatter = content.slice(3, endIndex).trim();
  const lines = frontMatter.split('\n');

  let summaryLine = null;
  const readWhen = [];
  let collectingReadWhen = false;

  for (const rawLine of lines) {
    const line = rawLine.trim();

    if (line.startsWith('summary:')) {
      summaryLine = line;
      collectingReadWhen = false;
      continue;
    }

    if (line.startsWith('read_when:')) {
      collectingReadWhen = true;
      continue;
    }

    if (collectingReadWhen) {
      if (line.startsWith('- ')) {
        const hint = line.slice(2).trim();
        if (hint) readWhen.push(hint);
      } else if (line === '') {
        // allow blank lines
      } else {
        collectingReadWhen = false;
      }
    }
  }

  if (!summaryLine) {
    return { summary: null, readWhen, error: 'summary key missing' };
  }

  const summaryValue = summaryLine.slice('summary:'.length).trim();
  const normalized = summaryValue.replace(/^['"]|['"]$/g, '').replace(/\s+/g, ' ').trim();

  if (!normalized) {
    return { summary: null, readWhen, error: 'summary is empty' };
  }

  return { summary: normalized, readWhen };
}

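// Fail with a clear message (instead of a raw ENOENT stack trace) when docs/ is missing.
let markdownFiles;
try {
  markdownFiles = walkMarkdownFiles(DOCS_DIR, DOCS_DIR);
} catch (error) {
  if (error && error.code === 'ENOENT') {
    console.error('Missing required docs directory: docs/');
    process.exit(1);
  }
  throw error;
}

console.log(`Listing all markdown files in ${relative(process.cwd(), DOCS_DIR)}/:`);
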
for (const relativePath of markdownFiles) {
  const fullPath = join(DOCS_DIR, relativePath);
  const { summary, readWhen, error } = extractMetadata(fullPath);

  if (summary) {
    console.log(`${relativePath} - ${summary}`);
    if (readWhen.length > 0) console.log(`  Read when: ${readWhen.join('; ')}`);
  } else {
    const reason = error ? ` - [${error}]` : '';
    console.log(`${relativePath}${reason}`);
  }
}

console.log(
  'Next steps: when your task matches any "Read when" hint above, read that doc before doing anything else, and suggest new coverage when it is missing.'
);
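Note that the plain Node.js example above only collects block-style read_when lists (one `- hint` per line) and silently ignores the inline form `read_when: [a, b]`. If a repo uses inline lists, something like the following could be called on the text after `read_when:`. This is a simplified sketch, not a YAML parser: `parseInlineList` is a hypothetical helper, and it splits on every comma, so a quoted hint that itself contains a comma will be split in two.

```javascript
// Parse a YAML-style inline list such as `[editing routing, "adding tests"]`
// into an array of trimmed, unquoted strings. Returns [] for anything else.
function parseInlineList(value) {
  const trimmed = value.trim();
  if (!trimmed.startsWith('[') || !trimmed.endsWith(']')) return [];

  return trimmed
    .slice(1, -1) // drop the surrounding brackets
    .split(',')
    .map((item) => item.trim().replace(/^['"]|['"]$/g, '').trim())
    .filter((item) => item.length > 0);
}

// Example usage:
console.log(parseInlineList('[editing routing, "adding tests"]'));
// prints [ 'editing routing', 'adding tests' ]
```

For anything beyond simple hints, using a real YAML library that the repo already depends on is likely the safer choice.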

================== Example implementation (TypeScript / Node): scripts/docs-list.ts

This is an example implementation. After adding it, run it in the repo and adjust as needed (tsx availability, TypeScript config, module system, Node version constraints).

#!/usr/bin/env tsx

import { readdirSync, readFileSync } from 'node:fs';
import { join, relative } from 'node:path';

process.stdout.on('error', (error) => {
  if ((error as NodeJS.ErrnoException).code === 'EPIPE') {
    process.exit(0);
  }
  throw error;
});

const DOCS_DIR = join(process.cwd(), 'docs');

const EXCLUDED_DIRS = new Set(['archive', 'research']);

function compactStrings(values: unknown[]): string[] {
  const result: string[] = [];
  for (const value of values) {
    if (value === null || value === undefined) {
      continue;
    }
    const normalized = String(value).trim();
    if (normalized.length > 0) {
      result.push(normalized);
    }
  }
  return result;
}

function walkMarkdownFiles(dir: string, base: string = dir): string[] {
  const entries = readdirSync(dir, { withFileTypes: true });
  const files: string[] = [];
  for (const entry of entries) {
    if (entry.name.startsWith('.')) {
      continue;
    }
    const fullPath = join(dir, entry.name);
    if (entry.isDirectory()) {
      if (EXCLUDED_DIRS.has(entry.name)) {
        continue;
      }
      files.push(...walkMarkdownFiles(fullPath, base));
    } else if (entry.isFile() && entry.name.endsWith('.md')) {
      files.push(relative(base, fullPath));
    }
  }
  return files.sort((a, b) => a.localeCompare(b));
}

function extractMetadata(fullPath: string): {
  summary: string | null;
  readWhen: string[];
  error?: string;
} {
  const content = readFileSync(fullPath, 'utf8');

  if (!content.startsWith('---')) {
    return { summary: null, readWhen: [], error: 'missing front matter' };
  }

  const endIndex = content.indexOf('\n---', 3);
  if (endIndex === -1) {
    return { summary: null, readWhen: [], error: 'unterminated front matter' };
  }

  const frontMatter = content.slice(3, endIndex).trim();
  const lines = frontMatter.split('\n');

  let summaryLine: string | null = null;
  const readWhen: string[] = [];
  let collectingField: 'read_when' | null = null;

  for (const rawLine of lines) {
    const line = rawLine.trim();

    if (line.startsWith('summary:')) {
      summaryLine = line;
      collectingField = null;
      continue;
    }

    if (line.startsWith('read_when:')) {
      collectingField = 'read_when';
      const inline = line.slice('read_when:'.length).trim();
      if (inline.startsWith('[') && inline.endsWith(']')) {
        try {
          const parsed = JSON.parse(inline.replace(/'/g, '"')) as unknown;
          if (Array.isArray(parsed)) {
            readWhen.push(...compactStrings(parsed));
          }
        } catch {
          // ignore malformed inline arrays
        }
      }
      continue;
    }

    if (collectingField === 'read_when') {
      if (line.startsWith('- ')) {
        const hint = line.slice(2).trim();
        if (hint) {
          readWhen.push(hint);
        }
      } else if (line === '') {
        // allow blank lines inside the list
      } else {
        collectingField = null;
      }
    }
  }

  if (!summaryLine) {
    return { summary: null, readWhen, error: 'summary key missing' };
  }

  const summaryValue = summaryLine.slice('summary:'.length).trim();
  const normalized = summaryValue
    .replace(/^['"]|['"]$/g, '')
    .replace(/\s+/g, ' ')
    .trim();

  if (!normalized) {
    return { summary: null, readWhen, error: 'summary is empty' };
  }

  return { summary: normalized, readWhen };
}

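// Fail with a clear message (instead of a raw ENOENT stack trace) when docs/ is missing.
let markdownFiles: string[];
try {
  markdownFiles = walkMarkdownFiles(DOCS_DIR);
} catch (error) {
  if ((error as NodeJS.ErrnoException).code === 'ENOENT') {
    console.error('Missing required docs directory: docs/');
    process.exit(1);
  }
  throw error;
}

console.log(`Listing all markdown files in ${relative(process.cwd(), DOCS_DIR)}/:`);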

for (const relativePath of markdownFiles) {
  const fullPath = join(DOCS_DIR, relativePath);
  const { summary, readWhen, error } = extractMetadata(fullPath);
  if (summary) {
    console.log(`${relativePath} - ${summary}`);
    if (readWhen.length > 0) {
      console.log(`  Read when: ${readWhen.join('; ')}`);
    }
  } else {
    const reason = error ? ` - [${error}]` : '';
    console.log(`${relativePath}${reason}`);
  }
}

console.log(
  'Next steps: when your task matches any "Read when" hint above (React hooks, cache directives, database work, tests, etc.), read that doc before doing anything else, and suggest new coverage when it is missing. If nothing above matches your task, mention it to the user and offer to create or update a doc.'
);

console.log(
  '\nReminder: keep markdown docs up to date as code changes (either because you made changes or you observed code / behavior that was not consistent with the docs). Include frontmatter `summary` and `read_when` (a list of hints) for each *.md doc.'
);

================== Example implementation (Ruby / Rails): lib/tasks/docs_list.rake

This is a minimal rake task that mirrors the same contract (list docs, validate frontmatter).

This is an example implementation. After adding it, run it in the repo (typically bundle exec rake docs:list) and adjust as needed (Rails/Bundler expectations, docs directory layout, exclusions, output format).

# frozen_string_literal: true

namespace :docs do
  desc "List markdown docs and validate frontmatter"
  task :list do
    require "pathname"

    docs_dir = Pathname.new("docs")
    unless docs_dir.directory?
      puts "Missing required docs directory: docs/"
      exit 1
    end

    excluded_dirs = %w[archive research]

    md_files = docs_dir
      .glob("**/*.md")
      .reject { |p| p.each_filename.any? { |name| name.start_with?(".") } }
      .reject { |p| p.each_filename.any? { |name| excluded_dirs.include?(name) } }
      .sort

    puts "Listing all markdown files in #{docs_dir}/:"

    md_files.each do |path|
      content = path.read
      relative = path.relative_path_from(docs_dir).to_s

      unless content.start_with?("---")
        puts "#{relative} - [missing front matter]"
        next
      end

      end_index = content.index("\n---", 3)
      unless end_index
        puts "#{relative} - [unterminated front matter]"
        next
      end

      front_matter = content[3...end_index].strip
      summary_line = front_matter.lines.find { |l| l.strip.start_with?("summary:") }

      unless summary_line
        puts "#{relative} - [summary key missing]"
        next
      end

      summary = summary_line.split("summary:", 2).last.to_s.strip.gsub(/\A["']|["']\z/, "")
      if summary.empty?
        puts "#{relative} - [summary is empty]"
        next
      end

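      # Collect block-style read_when hints ("- hint" lines under the read_when: key)
      # so the "Read when" output referenced in the closing message is actually printed.
      read_when = []
      collecting = false
      front_matter.each_line do |raw_line|
        line = raw_line.strip
        if line.start_with?("read_when:")
          collecting = true
        elsif collecting && line.start_with?("- ")
          hint = line[2..-1].strip
          read_when << hint unless hint.empty?
        elsif collecting && !line.empty?
          collecting = false
        end
      end

      puts "#{relative} - #{summary}"
      puts "  Read when: #{read_when.join('; ')}" unless read_when.empty?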
    end

    puts %(Next steps: when your task matches any "Read when" hint above, read that doc before doing anything else, and suggest new coverage when it is missing.)
    puts %(Reminder: keep markdown docs up to date as code changes. Include frontmatter `summary` and `read_when` for each *.md doc.)
  end
end
