project-knowledge-harness
project-knowledge-harness
A lightweight, file-based memory harness for any software project.
Three surfaces, sharply separated by time direction and access pattern:
| Surface | Time | Question it answers | Access pattern |
|---|---|---|---|
TODO.md |
Future | "What might we do later?" | Read top to bottom (priority lanes) |
backlog/<slug>.md |
Future | "What was the analysis behind this idea?" | Indexed from TODO.md |
pitfalls/<slug>.md |
Past | "I see error X — has this happened before?" | Grep symptom keywords |
Plus, in projects that already have agent contracts, AGENTS.md Hard
invariants answer "What rules MUST agents follow?". Pitfalls graduate to
Hard invariants when serious enough — see
references/when-to-add-docs.md.
When to use this skill
Use this skill when the user surfaces any of:
- A long-term TODO list with no home, or a messy
TODO.mdthat needs structure (signals: "where should ideas go?", "maybe later", "工程量太大需要再評估") - A debugging session worth saving (signals: "keep solving the same problem twice", "save this troubleshooting", "踩過的坑", "TROUBLESHOOTING.md scattered across docs/")
- A request to consolidate
IDEAS.md/ROADMAP.md/WISHLIST.md/LESSONS.mdfiles
Do NOT use this skill for active sprint planning (use issue trackers),
ephemeral agent scratchpads (use .claude/plans/), or current feature
documentation (use docs/).
How to apply this skill (default workflow)
The skill bundles five scripts and several templates. Default to the
init script for setup, and default to add-todo.sh / sweep-inbox.sh
for capture rather than asking the agent to edit TODO.md by hand.
scripts/
init.sh # one-shot setup of TODO.md + backlog/ + pitfalls/
# + agent guidance + README snippet
todo-kanban.sh # validate TODO.md format and render kanban-style board
add-todo.sh # insert a structured entry into the right ## P* lane
promote-todo.sh # move an active TODO item to ## Done with the right syntax
sweep-inbox.sh # triage backlog/inbox.md into TODO.md via add-todo.sh
1. Run scripts/init.sh against the target repo
scripts/init.sh \
--target /path/to/project \
--project-name "My Project" \
--deployment chezmoi # or npm | pip | docker | none
The script:
- Creates
TODO.md,backlog/README.md,pitfalls/README.mdfromassets/*.template, substituting placeholders. - Appends an agent-guidance snippet to
AGENTS.md/CLAUDE.md(auto-detected, override with--agent-contract). - Appends a "Roadmap & lessons learned" section to
README.md. - Runs
scripts/todo-kanban.sh --validate-only TODO.mdso any drift is caught immediately. - Prints the deployment-exclusion lines you should add manually (it does
not edit ignore files — see
references/deployment-exclusion.md).
init.sh is idempotent: existing files are skipped unless --force is
given, and snippets append only if a sentinel marker is missing.
2. Mid-conversation, when the user surfaces a "maybe later" idea
Signals: "maybe later", "nice to have", "if I'm interested", "工程量太大需要再評估", "先記下來", "not now but…".
-
If
TODO.mddoesn't exist yet, runscripts/init.shfirst. -
Default path — call
scripts/add-todo.shwith the priority, effort, title, and description. This inserts the canonical line into the right## P*lane and re-validates. Add--backlogif the conversation produced enough investigation that abacklog/<slug>.mddoc is worth scaffolding.scripts/add-todo.sh --priority "P?" --effort M \ --title "Try Rspress for docs" \ --description "Evaluate AI-native docs framework alternative" -
Quick-capture path — append to
backlog/inbox.mdwhen the user isn't sure of priority/effort yet, or is mid-thought:echo "- $UNSTRUCTURED_THOUGHT" >> backlog/inbox.mdLater (this session or next), run
scripts/sweep-inbox.shto formalize the loose lines intoTODO.mdone at a time. The sweeper prompts for missing fields per line; in--batchmode it only processes lines that already havepriority=… effort=… title="…" description="…"pairs and leaves ambiguous ones in place. -
Manual edit path — only when scripts can't help. Use the syntax from
references/tag-schema.mdand runscripts/todo-kanban.sh --validate-onlyafterwards. -
If you wrote a
backlog/<slug>.md(either via--backlogor by hand), make sure the TODO line ends with→ [research](backlog/<slug>.md)so the index points to the doc.
3. Mid-conversation, when you finish debugging something tricky
Signals: "phew, that took a while", "weird, the error didn't say anything about X", "this is the third time we've hit this", or you find yourself reconstructing context that isn't in any doc.
- Create
pitfalls/<symptom-slug>.mdimmediately fromassets/pitfall-doc.md.template, while the trace is fresh. Title the doc by the symptom, not the root cause — you'll search by what you're seeing, not by what you eventually learned. - Copy verbatim error messages — never paraphrase, it kills grep-ability.
- If the trap is severe (silent corruption / cross-machine recurrence /
non-obvious workaround), surface it: "should this graduate to a Hard
invariant in AGENTS.md?" — see
references/when-to-add-docs.md.
4. When implementing a TODO.md item
In the same commit:
- Run
scripts/promote-todo.sh --title "<substring>" --summary "<what shipped>". It removes the active line and inserts the datedDoneentry, then re-validates. It refuses to run if the substring matches zero or more than one active item. - If a
backlog/<slug>.mdexists, set itsStatus: shipped(don't delete — historical record may inform adjacent decisions). - If shipping uncovered a trap, write a
pitfalls/<slug>.mdfor it.
When a TODO entry needs a backlog/ doc, when a debug needs a pitfalls/ doc
Decision rules and the upgrade path to AGENTS.md invariants live in
references/when-to-add-docs.md. Read it
the first time you set up the harness; consult it again whenever you're
unsure which surface a piece of knowledge belongs in.
Tag schema
Two orthogonal axes, validated by scripts/todo-kanban.sh. See
references/tag-schema.md for the full schema,
useful tag combinations, and the exact validator-checked syntax.
Anti-patterns
Common mistakes to avoid (e.g., spawning IDEAS.md alongside TODO.md,
titling pitfalls by root cause, paraphrasing errors). Full list in
references/anti-patterns.md.
Deployment exclusion
TODO.md, backlog/, and pitfalls/ are maintainer-facing repo metadata,
not files to ship. Cheatsheet for chezmoi / npm / pip / Docker in
references/deployment-exclusion.md.
scripts/init.sh --deployment ... prints the exact lines to add.
Templates and bundled assets
assets/TODO.md.template—TODO.mdskeleton with example items per laneassets/backlog-README.md.template—backlog/index + when-to-add rulesassets/backlog-doc.md.template— single backlog doc (context-first)assets/pitfalls-README.md.template—pitfalls/index + cross-reference table for traps documented elsewhereassets/pitfall-doc.md.template— single pitfall doc (symptom-first)assets/agent-guidance.md.template— snippet forAGENTS.md/CLAUDE.mdassets/readme-roadmap.md.template— snippet for projectREADME.mdscripts/init.sh— one-shot setupscripts/todo-kanban.sh— validator + Markdown kanban renderer (also supports--jsonand--validate-only)scripts/add-todo.sh— structured insert into## P*lane;--backlogalso scaffoldsbacklog/<slug>.mdscripts/promote-todo.sh— atomic active-→-Done move with re-validationscripts/sweep-inbox.sh— triagebacklog/inbox.md→add-todo.sh
Reference implementation
A live example of this harness is at
daviddwlee84/dotfiles:
TODO.md,backlog/,pitfalls/directories at repo rootAGENTS.md### Long-term backlog → TODO.md + backlog/and### Past pitfalls → pitfalls/sectionsREADME.md## Roadmap & lessons learnedsection
More from daviddwlee84/agent-skills
agent-history-hygiene
Commit SpecStory chat transcripts (`.specstory/history/*.md`), Claude Code plan files (`.claude/plans/*.md`, `plansDirectory`), and other coding-agent artifacts (`.cursor/plans/`, `.cursor/rules/`, `.opencode/plans/`, `.specify/`, `.codex/`) alongside the feature diff they produced — without leaking `.env` contents, API keys, or private-key PEM blocks into git history. Use when the user says "commit my chat", "save this specstory session", "stage the plan file", "scrub the transcript", "my .env leaked in chat", "bootstrap pre-commit for this project", or when you notice untracked `.specstory/history/*.md` or `.claude/plans/*.md` files while running `git status`. Also use after an accidental push of a secret to enforce rotate-first, rewrite-last remediation instead of reflexive `git push --force`.
11mkdocs-site-bootstrap
Bootstrap MkDocs Material docs sites with optional GitHub Pages deploy, uv-pinned tooling, llms.txt/copy-to-LLM support, page/nav helpers, and mkdocs-static-i18n languages such as zh-TW. Use when the user asks to set up docs, publish docs to GitHub Pages, create an MkDocs site, turn README or markdown notes into a site, add bilingual/multilingual docs, add zh-TW/Traditional Chinese, i18n, or translate docs. Consent-gated; records repo preferences and never auto-migrates existing docs.
11pueue-job-queue
Drive Nukesor/pueue (https://github.com/Nukesor/pueue) for queued, parallel, scheduled, and lightly-DAG'd shell jobs — wraps `pueue add --after`, `pueue status --json`, `pueue log --json`, group-level parallelism, and `pueued` daemon health. Use when the user wants to background long-running shell commands across reboots, queue dozens of jobs with capped parallelism, run a fan-out / fan-in pipeline of shell steps, says "pueue", "pueued", "pueue add", "pueue queue", "pueue group", "task queue for shell", "background this job", or asks how to schedule/parallelize CLI work without a real orchestrator (Airflow/Prefect/Dagster). Good fit for ML sweeps, long-running data pipelines, batched evaluations, scheduled `--delay` jobs, "wait for X then run Y" sequences.
4marimo-notebook
Write a marimo notebook in a Python file in the right format.
3skill-author
Author a new agent skill or refactor an existing one to follow agentskills.io best practices — gotchas sections, output templates, validation loops, calibrated specificity (fragility-based), and agentic script design (--help, --dry-run, structured stdout, stderr diagnostics, PEP 723 inline deps, pinned uvx/npx versions). Use whenever the user wants to create a new skill from scratch, scaffold a SKILL.md, write a reference file, design a script meant to be invoked by an agent, lint a draft skill for quality, or convert an ad-hoc workflow into a reusable skill. For evaluating skill output quality with test cases, benchmarking, or optimizing the description trigger rate, defer to the `skill-creator` skill instead — this skill focuses on authoring, not evaluation.
2dvc-ml-workflow
Set up and operate a DVC (Data Version Control) workflow for ML projects — `dvc init`, `dvc.yaml` pipelines, `params.yaml`, `dvc exp run --queue` for parallel sweeps with metrics auto-bound to ephemeral git commits, and remote storage (S3/SSH/GDrive). Use whenever the user wants reproducible ML pipelines, data/model versioning that lives alongside git, parameter sweeps without standing up a tracking server, queued/parallel experiment execution, or asks about `dvc.yaml` / `dvc exp run` / `dvc queue` / `params.yaml` / `dvc add` / `dvc push` / `.dvc/cache`. Always references the official docs at https://dvc.org/doc and the upstream repo https://github.com/treeverse/dvc (Iterative was acquired by Treeverse in 2024 — `pip install dvc` resolves to this repo).
2