# session-titles

## Overview
This skill owns the entire session title lifecycle:
- Generation -- Stop hook extracts context from the transcript (primary request, branch, files) and calls Haiku to produce a 4-7 word active-voice title. Detects focus shifts and tracks them with a `(N)` prefix.
- Feedback -- Each generated title is saved as a pending feedback entry for later scoring.
- Rating -- Interactive workflow where an AI judge scores first, then the human confirms or corrects. Builds dual-perspective training data.
- Evaluation -- Pattern checks (fallback, meta-language, too long, etc.) plus optional LLM judge scoring across all pending entries.
- Evolution -- GEPA-inspired prompt mutation: reflect on failures, propose targeted changes, keep improvements.
- Golden dataset -- Extract candidates from real sessions, curate ideal titles, run regression evals.
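The cleanup and pattern checks named above can be sketched in TypeScript. This is a hedged illustration only: `sanitizeTitle` appears in the pipeline, but `patternIssues` and all regex details here are assumptions, not the real `generate-core.ts` / `eval-quality.ts` logic.

```typescript
// Illustrative sketch; the real generate-core.ts / eval-quality.ts may differ.

/** Strip common LLM preambles and enforce a word-count budget. */
function sanitizeTitle(raw: string, maxWords = 7): string {
  let title = raw
    .replace(/^(title|here is a title)\s*:\s*/i, "") // drop preambles
    .replace(/^["']|["']$/g, "")                     // drop wrapping quotes
    .trim();
  const words = title.split(/\s+/);
  if (words.length > maxWords) title = words.slice(0, maxWords).join(" ");
  return title;
}

/** Pattern checks in the spirit of eval-quality.ts (hypothetical rules). */
function patternIssues(title: string): string[] {
  const issues: string[] = [];
  if (/^(untitled|new session)$/i.test(title)) issues.push("fallback");
  if (/\b(session|conversation|transcript)\b/i.test(title)) issues.push("meta-language");
  if (title.split(/\s+/).length > 7) issues.push("too-long");
  return issues;
}
```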
## How It Works

```
Session ends
  --> Stop hook (hooks/stop.sh)
    --> scripts/generate.ts (stdin: session_id, cwd, transcript_path)
      --> generate-core.ts
        1. extractSessionContext()  -- parse transcript JSONL
        2. evolveTitleWithContext() -- initial title or shift detection via Haiku
        3. sanitizeTitle()          -- strip preambles, enforce length
        4. savePendingFeedback()    -- append to title-feedback/pending.jsonl
```
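The hook entry point consumes a small JSON payload on stdin. A minimal sketch of parsing that payload, using the field names shown in the pipeline above (the validation logic and `parseHookInput` name are assumptions, not the actual `generate.ts` implementation):

```typescript
// Hypothetical payload parser; the real generate.ts may validate differently.
interface StopHookInput {
  session_id: string;
  cwd: string;
  transcript_path: string;
}

function parseHookInput(raw: string): StopHookInput {
  const input = JSON.parse(raw) as Partial<StopHookInput>;
  for (const field of ["session_id", "cwd", "transcript_path"] as const) {
    if (typeof input[field] !== "string" || input[field] === "") {
      throw new Error(`stop hook payload missing "${field}"`);
    }
  }
  return input as StopHookInput;
}

// In the actual hook the payload arrives on stdin, e.g. under bun:
//   const input = parseHookInput(await Bun.stdin.text());
```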
## Scripts

All scripts support `--help`-style flags. Run with `bun`.
| Script | Purpose |
|---|---|
| `generate.ts` | Hook entry point. Reads JSON from stdin. |
| `generate-core.ts` | Core module: context extraction, title generation, shift detection. |
| `generate-core.test.ts` | Unit + integration tests. `bun test scripts/generate-core.test.ts` |
| `schema.ts` | `TitleFeedback` types, prompt version constants. |
| `store.ts` | JSONL persistence for pending/scored feedback. |
| `eval-quality.ts` | Pattern checks + optional `--judge` LLM scoring. |
| `evolve-prompt.ts` | GEPA evolution. `--iterations N`, `--pareto-size N`. |
| `extract-candidates.ts` | Pull test cases from session transcripts. `--limit N`, `--project NAME`. |
| `run-eval.ts` | Run golden dataset eval. `--judge-model MODEL`. |
| `report.ts` | Generate report from latest eval results. `--file PATH`. |
## Rate Title

Interactive rating workflow (invoke as `/rate-title` or manually):

- AI judge assesses first -- score (1-5), reasoning, proposed better title.
- Human calibrates -- agree? Different score? Better suggestion?
- Both perspectives are saved to `~/.claude/title-feedback/scored.jsonl`.
The dual-perspective data enables DSPy optimization of both the judge prompt (learn to rate like the human) and the journalist prompt (generate titles humans rate highly).
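A dual-perspective record might be serialized like the sketch below. The field names here are assumptions for illustration; the authoritative `TitleFeedback` shape lives in `schema.ts`.

```typescript
// Hypothetical shape of a scored.jsonl entry; see schema.ts for the real types.
interface ScoredFeedback {
  sessionId: string;
  title: string;
  judge: { score: number; reasoning: string; proposedTitle?: string };
  human: { score: number; proposedTitle?: string };
  promptVersion: string;
}

/** Serialize one record as a single JSONL line, enforcing the 1-5 scale. */
function toJsonlLine(entry: ScoredFeedback): string {
  if (entry.judge.score < 1 || entry.judge.score > 5) throw new Error("judge score out of range");
  if (entry.human.score < 1 || entry.human.score > 5) throw new Error("human score out of range");
  return JSON.stringify(entry); // one record per line, appended to scored.jsonl
}
```

Keeping both scores in one record is what lets an optimizer train the judge against the human and the journalist against both.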
### Rating Scale
| Score | Meaning |
|---|---|
| 5 | Perfect -- specific, actionable, concise |
| 4 | Good -- minor phrasing improvements possible |
| 3 | Acceptable -- gets the gist but generic |
| 2 | Poor -- too vague or wrong focus |
| 1 | Bad -- completely off-base or misleading |
See `references/scoring-rubric.md` for detailed criteria.
## Data Layout

Runtime data (gitignored, at `~/.claude/title-feedback/`):

- `pending.jsonl` -- written by the Stop hook
- `scored.jsonl` -- written by `/rate-title`

Evaluation data (gitignored, at `skills/session-titles/data/`):

- `candidates.jsonl` -- extracted test cases
- `golden.jsonl` -- curated with ideal titles
- `results/` -- timestamped eval outputs
- `baseline-*.md`, `evolution-*.md` -- quality reports
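All of these files share the same JSONL convention: one JSON record per line, append-only. A minimal sketch of read/append helpers in the spirit of `store.ts` (the function names and API here are assumptions, not the real module):

```typescript
// Minimal JSONL helpers; the actual store.ts API may differ.
import { appendFileSync, existsSync, readFileSync } from "node:fs";

/** Append one record as a single line. */
function appendJsonl(path: string, record: unknown): void {
  appendFileSync(path, JSON.stringify(record) + "\n");
}

/** Read all records, tolerating a missing file and blank lines. */
function readJsonl<T>(path: string): T[] {
  if (!existsSync(path)) return [];
  return readFileSync(path, "utf8")
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as T);
}
```

Append-only JSONL keeps hook writes cheap and crash-safe: a partial last line is the worst-case damage, and readers can simply skip it.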
## References

- `references/scoring-rubric.md` -- 5-point rating criteria
- `references/adaptive-title-plan.md` -- roadmap: phases, LanceDB vectors, DSPy optimization