understand-knowledge

Installation
SKILL.md

/understand-knowledge

Analyzes a Karpathy-pattern LLM wiki — a three-layer knowledge base with raw sources, wiki markdown, and a schema file — and produces an interactive knowledge graph dashboard.

What It Detects

The Karpathy LLM wiki pattern (see https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f):

  • Raw sources — immutable source documents (articles, papers, data files)
  • Wiki — LLM-generated markdown files with wikilinks ([[target]] syntax)
  • Schema — CLAUDE.md, AGENTS.md, or similar configuration file
  • index.md — content catalog organized by categories
  • log.md — chronological operation log

Detection signals: has index.md + multiple .md files with wikilinks. May have raw/ directory and schema file.

Instructions

Phase 1: DETECT

  1. Determine the target directory:

    • If the user provided a path argument, use that
    • Otherwise, use the current working directory
  2. Run the format detection script bundled with this skill:

    python3 <SKILL_DIR>/parse-knowledge-base.py <TARGET_DIR>
    
    • If the script exits with an error, tell the user this doesn't appear to be a Karpathy-pattern wiki and explain what was expected
    • If successful, proceed. The script writes scan-manifest.json to <TARGET_DIR>/.understand-anything/intermediate/
  3. Read the scan-manifest.json and announce the results:

    • "Detected Karpathy wiki: N articles, N sources, N topics, N wikilinks (N unresolved)"
    • List the categories found from index.md

Phase 2: SCAN (already done)

The parse script in Phase 1 already performed the deterministic scan. The scan-manifest.json contains:

  • Article nodes (one per wiki .md file) with extracted wikilinks, headings, frontmatter
  • Source nodes (one per raw/ file)
  • Topic nodes (from index.md section headings)
  • related edges (from wikilinks)
  • categorized_under edges (from index.md sections)

No additional scanning is needed. Proceed to Phase 3.

Phase 3: ANALYZE

Dispatch article-analyzer subagents to extract implicit knowledge:

  1. Read the scan-manifest.json to get the article list

  2. Prepare batches of 10-15 articles each, grouped by category when possible (articles in the same category are more likely to have implicit cross-references)

  3. For each batch, dispatch an article-analyzer subagent with:

    • The batch of articles (id, name, summary, wikilinks, category, content from knowledgeMeta)
    • The full list of existing node IDs (so the agent can reference them)
    • The batch number for output file naming
    • The intermediate directory path: $INTERMEDIATE_DIR = <TARGET_DIR>/.understand-anything/intermediate

    The agent will write analysis-batch-{N}.json to the intermediate directory.

  4. Run up to 3 batches concurrently. Wait for all batches to complete.

  5. If any batch fails, log a warning but continue — the scan-manifest provides a solid base graph even without LLM analysis.

Phase 4: MERGE

  1. Run the merge script bundled with this skill:

    python3 <SKILL_DIR>/merge-knowledge-graph.py <TARGET_DIR>
    
  2. The script:

    • Combines scan-manifest.json + all analysis-batch-*.json files
    • Deduplicates entities (case-insensitive name matching)
    • Normalizes node/edge types via alias maps
    • Builds layers from index.md categories
    • Builds a tour from index.md section ordering
    • Writes assembled-graph.json to the intermediate directory
  3. Read the merge report from stderr and announce:

    • Total nodes, edges, layers, tour steps
    • How many entities/claims the LLM analysis added

Phase 5: SAVE

  1. Read the assembled-graph.json

  2. Run basic validation:

    • Every edge source/target must reference an existing node
    • Every node must have: id, type, name, summary, tags, complexity
    • Remove any edges with dangling references
  3. Copy the validated graph to <TARGET_DIR>/.understand-anything/knowledge-graph.json

  4. Write metadata to <TARGET_DIR>/.understand-anything/meta.json:

    {
      "lastAnalyzedAt": "<ISO timestamp>",
      "gitCommitHash": "<from git rev-parse HEAD or empty>",
      "version": "1.0.0",
      "analyzedFiles": <number of wiki articles>
    }
    
  5. Clean up intermediate files:

    rm -rf <TARGET_DIR>/.understand-anything/intermediate
    
  6. Report summary to the user:

    • "Knowledge graph saved: N articles, N entities, N topics, N claims, N sources"
    • "N edges (N wikilink, N categorized, N implicit)"
    • "N layers, N tour steps"
  7. Auto-trigger the dashboard:

    /understand-dashboard <TARGET_DIR>
    

Notes

  • The parse script handles ALL deterministic extraction (wikilinks, headings, frontmatter, categories from index.md). The LLM agents only add implicit knowledge that requires inference.
  • Categories and taxonomy come from index.md section headings, NOT from filename prefixes. The Karpathy spec is intentionally abstract about naming conventions.
  • The graph uses kind: "knowledge" to signal the dashboard to use force-directed layout instead of hierarchical dagre.
  • Source nodes from raw/ are lightweight (filename + size only) — we don't parse PDFs or binary files.
Related skills
Installs
63
GitHub Stars
12.9K
First Seen
Apr 12, 2026