llm-wiki-en
LLM Wiki
Turn your LLM into a Wiki maintainer. The LLM incrementally builds and maintains a persistent, interconnected Markdown knowledge base. Knowledge is compiled once and continuously updated, rather than re-derived each time.
Inspired by:
- Karpathy - LLM Wiki — Incremental knowledge base architecture
- Compound Engineering Plugin — Knowledge compounding: every problem-solving experience should make the next one easier
When to Use
The AI should proactively identify and use this skill when:
- User wants to build a knowledge base - "Help me organize these materials", "I want to create a wiki"
- User provides new learning materials - Shares an article, paper, book chapter that needs organizing
- User wants to query existing knowledge - "What did I previously read about this concept?"
- User wants to maintain wiki health - "Check the wiki for contradictions"
- User solved a problem - "Done", "Fixed it", "That worked"
- User is doing long-term research - Weeks/months-long research topics requiring knowledge accumulation
When NOT to use:
- One-off Q&A that doesn't need persistent knowledge
Entry Point
Step 1: Determine User Intent
Based on what the user says, determine which operation to execute:
| User Intent | Operation |
|---|---|
| "Create a knowledge base", "Initialize wiki" | init |
| "Help me process this article", "I have new materials", "Check out this link" | ingest |
| "Done", "Fixed it", "Problem solved" | compound |
| "What is X?", "Summarize Y for me" | query |
| "Check the wiki", "Clean up the knowledge base" | lint |
| Intent unclear | Ask user to choose |
When intent is unclear, ask:
Which operation would you like to run?
1. init - Initialize knowledge base
2. ingest - Ingest new materials
3. compound - Document problem-solving experience
4. query - Query existing knowledge
5. lint - Health check
Step 2: Check if Initialized
Before any operation (except init itself), check whether ~/llm-wiki/ has been initialized:
- Exists → Proceed with the target operation
- Does not exist → Auto-execute init first without asking, then proceed with the target operation
# Detection method
ls ~/llm-wiki/wiki/index.md 2>/dev/null
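The detection-plus-auto-init flow above can be sketched as a shell guard (a minimal sketch: the `WIKI_ROOT` variable and the stub index contents are illustrative, not part of the skill):

```shell
# Auto-init guard: if the index is missing, create the directory
# skeleton and a stub index before proceeding with the operation.
WIKI_ROOT="$HOME/llm-wiki"
if [ ! -f "$WIKI_ROOT/wiki/index.md" ]; then
  for d in articles papers books notes assets; do
    mkdir -p "$WIKI_ROOT/raw/$d"
  done
  for d in entities concepts topics sources solutions; do
    mkdir -p "$WIKI_ROOT/wiki/$d"
  done
  printf '# Wiki Index\n' > "$WIKI_ROOT/wiki/index.md"
  echo "Auto-initialized knowledge base ~/llm-wiki/"
fi
```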
Operation: init
When to Execute
- User explicitly requests initialization
- Auto-executed when `~/llm-wiki/` doesn't exist before other operations
Workflow
1. Create Directory Structure
mkdir -p ~/llm-wiki/raw/{articles,papers,books,notes,assets}
mkdir -p ~/llm-wiki/wiki/{entities,concepts,topics,sources,solutions}
Adjust raw/ subdirectories based on knowledge base topics mentioned in prior conversation.
2. Create index.md
# Wiki Index
## Overview
- [[overview]] - Overall summary
## Sources
<!-- Ingested raw material summaries, sorted by date descending -->
## Entities
<!-- People, organizations, products, etc., sorted by name -->
## Concepts
<!-- Theories, methods, terminology, etc., sorted by name -->
## Topics
<!-- Comprehensive analyses, comparisons, etc., sorted by name -->
## Solutions
<!-- Problem-solving experiences and insights (generated by compound) -->
3. Create log.md
# Wiki Log
<!-- Operation records appended here chronologically, format: ## [YYYY-MM-DD] operation | description -->
4. Create overview.md
---
type: overview
created: YYYY-MM-DD
---
# Knowledge Base Overview
> This Wiki is automatically maintained by the LLM. You handle topic selection and questions; the LLM handles summarization, cross-referencing, archiving, and maintenance.
## Current Status
- Source count: 0
- Total pages: 0 (including index, log, overview)
- Last updated: -
## Key Findings
<!-- As knowledge accumulates, the most important findings will be summarized here -->
5. Output Confirmation
Wiki knowledge base initialized! ~/llm-wiki/
Next steps:
- Put materials in the raw/ directory, I'll organize them (ingest)
- Give me a link or text, I'll save and process it (ingest)
- Tell me when you've solved a problem, I'll document it (compound)
- Ask me about existing knowledge in the Wiki anytime (query)
- Let me check the Wiki's health (lint)
If auto-initialized (not user-initiated), simplify output to one line: Auto-initialized knowledge base ~/llm-wiki/, then proceed with the target operation.
Operation: ingest
Process new raw materials and integrate knowledge into the Wiki. A single new material may affect 10-15 Wiki pages.
Workflow
1. Determine Materials to Process
By priority:
- User specified specific material (link, text, file path) → Process only that material
- User says "process new materials" → Scan `raw/` for unprocessed files
- User says "process all new materials" → Batch process
Determining processed/unprocessed: Compare raw/ files against the `source` frontmatter field in wiki/sources/ summary pages. A file with a corresponding summary page counts as processed.
Found 3 unprocessed materials:
1. raw/articles/attention-paper.pdf
2. raw/notes/meeting-2026-04-05.md
3. raw/papers/bert-paper.pdf
Process all, or select specific ones? (Default: all)
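The processed/unprocessed comparison above can be sketched as a shell helper (illustrative only: `list_unprocessed` is a hypothetical name, and exact-line frontmatter matching is an assumption):

```shell
# List raw/ files that have no wiki/sources/ page whose frontmatter
# `source:` field points at them, i.e. files not yet ingested.
list_unprocessed() {
  root="$1"
  find "$root/raw" -type f | while read -r f; do
    rel="${f#$root/}"
    grep -rqx "source: $rel" "$root/wiki/sources" 2>/dev/null \
      || echo "unprocessed: $rel"
  done
}
```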
2. Save Raw Materials (URL/text only)
- URL → Fetch and save to `raw/articles/`
- Text → Save to `raw/notes/`
- Existing file → Read directly
3. Read and Extract
Read the raw material, identifying core arguments, key entities, important concepts, data/facts, and relationships to other sources.
4. Discuss with User (Recommended, skip for batch processing)
Core points of this material:
1. ...
2. ...
Key entities/concepts involved: A, B, C
Which aspects would you like to focus on?
5. Create Source Summary Page
Create in wiki/sources/, filename: YYYY-MM-DD-short-name.md
---
type: source
date: YYYY-MM-DD
source: raw/path/to/file
tags: [tag1, tag2]
---
# Source: Title
## Key Points
- Point 1
## Key Quotes
> Original quote
## Relationships to Other Sources
- Corroborates [[other-source]] on X
- Contradicts [[contradicting-source]] on Y
## Derived Concepts
- [[concept-a]]
6. Update Entity and Concept Pages
For each entity and concept mentioned in the material:
- Existing page → Append new information, cite source
- New page → Create using template
Entity/concept page template:
---
type: entity # or concept
created: YYYY-MM-DD
updated: YYYY-MM-DD
sources: [source-a, source-b]
---
# Name
## Definition
Brief description.
## Key Information
- Info point 1 (Source: [[source-a]])
## Relations
- Related concepts: [[concept-x]]
## Open Questions
- Unanswered questions
Note:
- When new information contradicts existing content, keep both versions with clear annotations
- Every factual claim must cite its source
7. Update Topic Pages (if needed)
8. Update index.md, overview.md
9. Append to log.md
## [YYYY-MM-DD] ingest | Material Title
- **Source**: raw/path/to/file
- **New pages**: page-a, page-b
- **Updated pages**: page-c, page-d
- **Impact scope**: N pages
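The header line of the entry format above can be produced with a small helper (a sketch: `log_entry` is a hypothetical name; the detail bullets would be appended the same way):

```shell
# Append a log header in the `## [YYYY-MM-DD] operation | description`
# format to the given log file.
log_entry() {
  printf '## [%s] %s | %s\n' "$(date +%F)" "$2" "$3" >> "$1"
}
```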
10. Output Summary
Processing complete.
New:
- Source summary: [[source-name]]
- Entities: [[entity-a]], [[entity-b]]
- Concepts: [[concept-c]]
Updated:
- [[concept-d]] - Added details about X
Warning - Contradictions found:
- Description of Y in [[concept-d]] is inconsistent with [[source-old]]
Operation: compound
Document problem-solving experiences into wiki/solutions/. Knowledge compounding: invest time researching once, document it, solve it in minutes next time.
When to Execute
- User says "Done", "Fixed it", "Problem solved"
- Just completed a valuable debugging, exploration, or analysis process
- Discovered a pattern, trick, or best practice worth recording
Not worth recording: Typos, obvious minor fixes, one-off non-reproducible issues. Just tell the user why.
Dual Tracks
Bug Track (Problem Resolution): For fixing bugs, resolving errors.
---
type: solution
track: bug
date: YYYY-MM-DD
tags: [tag1, tag2]
---
# Problem Title
## Problem
1-2 sentence description.
## Symptoms
- Observable abnormal behavior
## Investigation
1. ❌ Attempt A → Reason for failure
2. ✅ Final solution
## Root Cause
Explanation of the cause.
## Solution
\`\`\`
// Before
...
// After
...
\`\`\`
## Prevention
How to avoid recurrence.
## Relations
- [[concept-a]]
Knowledge Track (Insights): For summarizing patterns, best practices, workflow tips.
---
type: solution
track: knowledge
date: YYYY-MM-DD
tags: [tag1, tag2]
---
# Insight Title
## Background
Context in which this experience was gained.
## Guidance
Specific practices, patterns, or recommendations.
## Why It Matters
Impact of following or not following this practice.
## When to Apply
Conditions under which this experience applies.
## Relations
- [[concept-a]]
Workflow
- Extract information from context — Problem description, investigation process, root cause, solution, key code
- Choose track — Solved a specific problem → Bug Track; Summarized experience/pattern → Knowledge Track
- Check for overlap — Search `wiki/solutions/` for similar documents. High overlap → Update existing; low or none → Create new
- Write document — `wiki/solutions/YYYY-MM-DD-short-name.md`
- Update index.md, overview.md, log.md
- Output summary
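The overlap check in the workflow above can be approximated with a tag grep (a rough sketch: real overlap detection would also compare titles and content, and `overlap` is a hypothetical helper):

```shell
# List existing solution pages whose frontmatter tags line contains
# the given tag: candidates for updating instead of creating anew.
overlap() {
  grep -rl "tags: \[.*$2.*\]" "$1" 2>/dev/null
}
```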
Operation: query
Answer questions based on Wiki content. Good answers are archived back into the Wiki.
Core Principle
Good answers should be archived back into the Wiki. Multi-source synthesis, comparison tables, new discoveries → Save as new topic pages.
Workflow
- Read index.md to understand the full picture
- Locate relevant pages — Find the 2-5 most relevant pages (including sources, solutions)
- Synthesize answer — Cite with `[[wikilink]]`, annotate sources
- Archive valuable answers — Save to `wiki/topics/` as new pages
- Suggest further exploration — Information gaps, materials that could be added
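The locate step can be approximated with a ranked keyword grep (illustrative only: the real step is semantic reading, and `find_pages` is a hypothetical helper):

```shell
# Rank wiki pages by case-insensitive match count for a query term,
# keeping the top five as candidate pages to read in full.
find_pages() {
  grep -ric "$2" "$1" | grep -v ':0$' | sort -t: -k2,2 -rn | head -5
}
```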
Operation: lint
Detect contradictions, orphan pages, missing concepts, and other issues to maintain long-term Wiki health.
When to Execute
- User says "check the wiki", "clean up the knowledge base"
- Periodically when Wiki accumulates 20+ pages
- After adding a batch of important materials
6 Checks
| Check | Method |
|---|---|
| Contradiction detection | Compare descriptions of the same topic across different pages |
| Outdated information | Page updated date is much earlier than related sources |
| Orphan pages | Pages with 0 inbound [[wikilink]] |
| Missing pages | Referenced by [[wikilink]] but not yet created |
| Missing cross-references | Pages sharing 2+ sources but not linked to each other |
| Data gaps | "Open questions" on concept pages, unexplored directions in overview |
Workflow
- Read full picture (`ls -R wiki/` + `wiki/index.md`)
- Check each item
- Generate report (statistics + issues listed by priority)
- Ask user whether to auto-fix (create missing pages, add cross-references, etc.)
- Execute fixes, update log.md
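Two of the mechanical checks (missing pages and orphan pages) can be sketched in shell (a sketch under assumptions: only plain `[[wikilink]]` syntax is handled, aliased links are ignored, and `lint_links` is a hypothetical name):

```shell
# Two lint checks over plain [[wikilink]] references:
# pages that are linked but missing on disk, and orphan pages
# with zero inbound links (index/log/overview are exempt).
lint_links() {
  wiki="$1"
  linked=$(grep -rho '\[\[[^]]*\]\]' "$wiki" | tr -d '[]' | sort -u)
  for name in $linked; do
    find "$wiki" -name "$name.md" | grep -q . || echo "missing: $name"
  done
  find "$wiki" -name '*.md' | while read -r f; do
    base=$(basename "$f" .md)
    case "$base" in index|log|overview) continue ;; esac
    echo "$linked" | grep -qx "$base" || echo "orphan: $base"
  done
}
```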
Two-Layer Architecture
~/llm-wiki/
├── raw/ # Raw materials (immutable)
│ ├── articles/
│ ├── papers/
│ ├── books/
│ ├── notes/
│ └── assets/
└── wiki/ # LLM-maintained Wiki
├── index.md # Table of contents
├── log.md # Operation log
├── overview.md # Overall overview
├── entities/ # Entity pages
├── concepts/ # Concept pages
├── topics/ # Topic pages
├── sources/ # Source summaries
└── solutions/ # Solution documents
File Naming Convention
| Type | Path Format |
|---|---|
| Source summary | wiki/sources/YYYY-MM-DD-short-name.md |
| Entity page | wiki/entities/name.md |
| Concept page | wiki/concepts/concept-name.md |
| Topic page | wiki/topics/topic-name.md |
| Solution document | wiki/solutions/YYYY-MM-DD-short-name.md |
All filenames use lowercase English with hyphens.
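The convention can be enforced with a small slug helper (a sketch: `slugify` is a hypothetical name, and transliterating non-English titles is out of scope):

```shell
# Reduce a title to the lowercase-with-hyphens filename form:
# lowercase everything, collapse runs of non-alphanumerics into a
# single hyphen, and trim leading/trailing hyphens.
slugify() {
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9]\{1,\}/-/g' -e 's/^-//' -e 's/-$//'
}
```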
Writing Standards
- Start each page with YAML frontmatter (`type`, `date`, `tags`, `sources`)
- Use `[[wikilink]]` for inter-page links
- Cite sources for every factual claim
- When new and old information contradict, keep both versions annotated
- Keep pages concise, focused on one topic
Obsidian Integration
The Wiki directory can be opened directly in Obsidian, enabling Graph View and plugins such as Web Clipper, Dataview, and Marp.
Examples
ingest
User: I put the Attention Is All You Need paper in raw/papers/
Agent: I'll process this paper.
[Reads raw/papers/attention-is-all-you-need.pdf]
[Extracts key information]
Core points of this paper:
1. Proposed the Transformer architecture, entirely based on attention mechanisms, abandoning RNN/CNN
2. Introduced Multi-Head Self-Attention and Positional Encoding
3. Achieved SOTA on machine translation tasks
Entities involved: Google Brain, Ashish Vaswani and 7 other authors
Concepts involved: Self-Attention, Multi-Head Attention, Positional Encoding, Layer Normalization
Which aspects would you like to focus on?
User: Focus on the design rationale behind Self-Attention
Agent: [Creates wiki/sources/2026-04-06-attention-paper.md]
[Creates wiki/concepts/self-attention.md - Focuses on design rationale]
[Creates wiki/entities/transformer.md]
[Creates wiki/entities/google-brain.md]
[Creates wiki/concepts/multi-head-attention.md]
[Creates wiki/concepts/positional-encoding.md]
[Updates wiki/overview.md]
[Updates wiki/index.md]
[Appends to wiki/log.md]
Processing complete. 6 new pages created. The Self-Attention concept page focuses on the design motivation behind Scaled Dot-Product.
compound
Bug Track:
User: Fixed it, it was a database connection pool configuration issue
Agent: This is worth documenting. Let me organize it.
[Reviews conversation context]
[Checks wiki/solutions/ for overlap]
[Creates wiki/solutions/2026-04-06-db-connection-pool-exhausted.md]
Experience documented. Bug Track.
Summary:
Problem: Database connection pool exhaustion under high concurrency, request timeouts
Root cause: Default pool size of 5 was insufficient, no idle timeout recycling configured
Solution: Adjusted pool size to 20, added idle timeout configuration
Linked to existing [[database-connections]] concept page in the Wiki.
Knowledge Track:
User: I noticed that debounce works better than throttle for search input handling
Agent: Great insight, let me document that.
[Creates wiki/solutions/2026-04-06-debounce-vs-throttle-search.md]
[Checks for related concept pages]
Experience documented. Knowledge Track.
Core recommendation: Prefer debounce (waits until the user stops typing) over throttle (fires at fixed intervals) for search input, because search needs the complete keyword.
There are no debounce or throttle concept pages in the Wiki yet. Should I create them?
query
User: What's the difference between Self-Attention and regular Attention?
Agent: [Reads wiki/index.md]
[Locates wiki/concepts/self-attention.md, wiki/concepts/attention.md]
[Synthesizes answer]
## Self-Attention vs Regular Attention
**Regular Attention** (Source: [[source-a]]):
- Query and Key/Value come from different sequences
- Typical scenario: target language attending to source language in translation
**Self-Attention** (Source: [[2026-04-06-attention-paper]]):
- Query, Key, and Value all come from the same sequence
- Each position can attend to all other positions in the sequence
| Dimension | Regular Attention | Self-Attention |
|-----------|-------------------|----------------|
| Q/K/V Source | Q ≠ KV | Q = K = V |
| Captures | Cross-sequence | Within-sequence |
| Complexity | O(n×m) | O(n²) |
This comparison has been saved as [[self-attention-vs-attention]].
lint
User: Check the wiki
Agent: [Reads wiki/index.md]
[Scans all wiki/ pages]
[Checks each item]
#### Orphan Pages (1)
1. [[layer-normalization]] - 0 inbound links
#### Missing Pages (2)
1. [[residual-connection]] - Referenced 4 times
2. [[softmax]] - Referenced 2 times
### Suggested Actions
1. [High] Create [[residual-connection]] (referenced 4 times)
2. [High] Create [[softmax]] (referenced 2 times)
3. [Medium] Add to [[positional-encoding]]: different architectures use different schemes
4. [Low] Add inbound links to [[layer-normalization]]
Should I auto-fix these?
User: Create 1 and 2 first, I'll look at 3 and 4 later
Agent: Alright, I'll create residual-connection and softmax.
[Creates wiki/concepts/residual-connection.md]
[Creates wiki/concepts/softmax.md]
[Updates wiki/index.md]
[Appends to wiki/log.md]
Created:
- [[residual-connection]] - Based on description from [[attention-paper]]
- [[softmax]] - As a key computation step in Self-Attention
Leaving contradictions and orphan pages for now. You can ask me to handle them anytime.
Use Cases
Active Learning (ingest):
- Academic research, reading notes, competitive analysis, course notes
Experience Accumulation (compound):
- Engineering practices (bug fixes, best practices), team knowledge base, workflow optimization, personal growth