taxonomy-builder
SKILL.md
Taxonomy Builder (router, compatibility mode)
Build outline/taxonomy.yml from papers/core_set.csv.
P0 compatibility note:
- The output contract stays the same (outline/taxonomy.yml, YAML list, >=2 levels, concrete descriptions).
- Curated domain taxonomies now live in assets/domain_packs/*.yaml instead of Python prose.
- scripts/run.py stays a deterministic scaffold/helper: detect domain pack -> load pack when available -> otherwise fall back to the generic builder.
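The output contract above (a YAML list, at least two levels, a description on every node) can be checked mechanically. The following is a minimal sketch, not the checker used by scripts/run.py. It assumes the YAML has already been parsed into Python objects and that nodes use the keys name, description, and children; those key names are illustrative assumptions, and assets/taxonomy_schema.json remains the authoritative shape.

```python
def validate_taxonomy(nodes):
    """Return a list of contract problems; an empty list means all checks passed.

    Assumes hypothetical node keys: name, description, children.
    """
    if not isinstance(nodes, list) or not nodes:
        return ["taxonomy must be a non-empty list"]
    problems = []
    for index, node in enumerate(nodes):
        if not isinstance(node, dict):
            problems.append(f"item {index}: expected a mapping")
            continue
        label = node.get("name") or f"item {index}"
        # Only checks that a description exists; judging how concrete it is
        # still needs a human read.
        if not str(node.get("description") or "").strip():
            problems.append(f"{label}: missing description")
        children = node.get("children") or []
        if not children:
            problems.append(f"{label}: needs at least one child (>=2 levels)")
        for child in children:
            if not isinstance(child, dict):
                problems.append(f"{label}: child is not a mapping")
                continue
            child_label = child.get("name") or "unnamed child"
            if not str(child.get("description") or "").strip():
                problems.append(f"{label} / {child_label}: missing description")
    return problems
```

If PyYAML is available, the parsed list could come from yaml.safe_load applied to the contents of outline/taxonomy.yml.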
Load Order
- references/overview.md
- references/taxonomy_principles.md
- If a domain pack applies, read its references/domain_pack_<domain>.md and assets/domain_packs/<domain>.yaml
- Otherwise read references/archetypes_generic.md
- Calibrate naming/description quality with references/examples_good.md and references/examples_bad.md
Current compatibility packs:
- llm_agents
- gen_image
- embodied_ai
Inputs
- papers/core_set.csv (required)
- Optional: papers/papers_dedup.jsonl
- Optional: DECISIONS.md, GOAL.md, queries.md
Outputs
outline/taxonomy.yml
Asset contract
- assets/taxonomy_schema.json: machine-readable shape for domain packs / output expectations
- assets/domain_packs/*.yaml: compatibility domain packs for supported domains
Script role
Use scripts/run.py only for deterministic help:
- never overwrite non-placeholder user taxonomy
- preserve current CLI flags / output path
- load supported domain taxonomies from assets instead of hard-coded Python prose
- keep the generic fallback builder for non-packed domains
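The first rule above (never overwrite a non-placeholder user taxonomy) amounts to a guard around the write. The sketch below shows one way to express it. How scripts/run.py actually recognizes a placeholder is not stated here, so the marker strings and both function names are hypothetical.

```python
from pathlib import Path

# Assumption: a file counts as a placeholder if it is empty or contains one of
# these markers. The real script may use a different rule.
PLACEHOLDER_MARKERS = ("TODO", "PLACEHOLDER")


def is_placeholder(text: str) -> bool:
    """Return True when the text looks like scaffold rather than user content."""
    stripped = text.strip()
    if not stripped:
        return True
    upper = stripped.upper()
    return any(marker in upper for marker in PLACEHOLDER_MARKERS)


def write_if_placeholder(path: Path, new_text: str) -> bool:
    """Write new_text unless path already holds real content; report whether it wrote."""
    if path.exists() and not is_placeholder(path.read_text(encoding="utf-8")):
        return False  # leave the user's taxonomy untouched
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(new_text, encoding="utf-8")
    return True
```

Returning a boolean rather than raising keeps a rerun on a finished workspace a harmless no-op.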
When to refine manually
Refine the generated taxonomy before marking the unit DONE if:
- top-level buckets feel like keyword clusters instead of chapter-level questions
- leaf names are generic (Overview, Benchmarks, Open Problems, Misc)
- descriptions lack scope cues or representative paper anchors
- domain detection chose the wrong pack
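Of the checks above, the generic-leaf-name one is the easiest to automate before a manual pass. Here is a small sketch that walks a parsed taxonomy and reports leaves whose names match the generic examples listed above. The node keys name and children are assumptions for illustration, and the other criteria still need human judgment.

```python
# Generic names taken from the examples in the checklist above.
GENERIC_NAMES = {"overview", "benchmarks", "open problems", "misc"}


def find_generic_leaves(nodes, path=()):
    """Return 'Parent / Leaf' paths for leaves whose name is on the generic list.

    Assumes hypothetical node keys: name, children.
    """
    flagged = []
    for node in nodes:
        here = path + (node["name"],)
        children = node.get("children") or []
        if children:
            flagged.extend(find_generic_leaves(children, here))
        elif node["name"].strip().lower() in GENERIC_NAMES:
            flagged.append(" / ".join(here))
    return flagged
```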
Quick start
- python .codex/skills/taxonomy-builder/scripts/run.py --help
- python .codex/skills/taxonomy-builder/scripts/run.py --workspace <workspace_dir>
Execution notes
When running in compatibility mode, scripts/run.py currently reads:
- papers/core_set.csv as the required corpus input
- papers/papers_dedup.jsonl when present for extra title/abstract signals
- GOAL.md, queries.md, and DECISIONS.md as optional domain/profile hints during pack selection
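To make the role of the optional hint files concrete, the sketch below gathers their text and scores it against per-pack keywords, returning None when nothing matches so that a caller can fall back to the generic builder. This is an illustration only: the real detect rules live in the domain packs, and the keyword lists, function names, and scoring scheme here are all hypothetical.

```python
from pathlib import Path

HINT_FILES = ("GOAL.md", "queries.md", "DECISIONS.md")

# Hypothetical keyword rules; the actual packs define their own detect rules.
PACK_KEYWORDS = {
    "llm_agents": ("agent", "tool use"),
    "gen_image": ("diffusion", "text-to-image"),
    "embodied_ai": ("robot", "embodied"),
}


def collect_hints(workspace: Path) -> str:
    """Concatenate whichever optional hint files exist in the workspace."""
    parts = []
    for name in HINT_FILES:
        candidate = workspace / name
        if candidate.is_file():
            parts.append(candidate.read_text(encoding="utf-8"))
    return "\n".join(parts)


def detect_pack(hint_text: str, rules=PACK_KEYWORDS):
    """Return the best-scoring pack name, or None to signal the generic fallback."""
    text = hint_text.lower()
    scores = {
        pack: sum(text.count(keyword) for keyword in keywords)
        for pack, keywords in rules.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

Because the selection depends only on file contents, the same workspace always yields the same pack, which fits the deterministic role described for the script.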
Script
Quick Start
python .codex/skills/taxonomy-builder/scripts/run.py --workspace <workspace_dir>
All Options
- --workspace <dir>
- --top-k <int>
- --min-freq <int>
- --unit-id <id>
- --inputs <a;b;...>
- --outputs <a;b;...>
- --checkpoint <C*>
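For readers wrapping the script, the flag list above can be mirrored with argparse. This is a sketch of an equivalent parser, not the parser in scripts/run.py. The flag names and the int types come from the list above; the semicolon splitting for --inputs and --outputs is inferred from the a;b;... notation, and no defaults or required flags are assumed.

```python
import argparse


def split_semicolons(value: str):
    """Turn 'a;b;c' into ['a', 'b', 'c'], dropping empty pieces."""
    return [part.strip() for part in value.split(";") if part.strip()]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented flags (illustrative only)."""
    parser = argparse.ArgumentParser(prog="run.py")
    parser.add_argument("--workspace")
    parser.add_argument("--top-k", type=int)
    parser.add_argument("--min-freq", type=int)
    parser.add_argument("--unit-id")
    parser.add_argument("--inputs", type=split_semicolons)
    parser.add_argument("--outputs", type=split_semicolons)
    parser.add_argument("--checkpoint")
    return parser
```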
Examples
python .codex/skills/taxonomy-builder/scripts/run.py --workspace workspaces/<ws>
Troubleshooting
- If the wrong domain pack is chosen, inspect GOAL.md, queries.md, and the pack detect rules before changing Python.
- If outline/taxonomy.yml already contains a real non-placeholder taxonomy, the script intentionally returns without overwriting it.
- If no pack matches, the script falls back to the generic builder.
Repository
willoscar/resea…e-skills
First Seen
Jan 23, 2026