# ccbox-insights
Use ccbox session logs to produce an evidence-based "lessons learned" memo for the agent (and humans), plus copy-ready instruction snippets that improve future sessions.
Tool-call failures are an important signal, but the goal is broader than errors: capture what worked, what did not, and what should become standing behavior in the future.
## Requirements
- `ccbox` on your `$PATH`.
- Access to the local sessions directory scanned by `ccbox` (see `ccbox --help` if discovery looks empty).
## Quick start (single session)
- Find the latest session for the current folder:

  ```sh
  ccbox sessions --limit 5 --offset 0 --size
  ```

- Inspect the latest session timeline (increase `--limit` if needed):

  ```sh
  ccbox history --full --limit 200 --offset 0
  ```
## Workflow (recommended)
Follow a staged pipeline: collect -> filter -> summarize -> label -> aggregate -> synthesize -> propose instructions.
### Stage 0: Choose scope
- Session: one `.jsonl` log for deep root-cause analysis.
- Project: last N sessions for one project to find recurring failure patterns.
- Global: a sample across projects to find cross-project patterns.
### Stage 1: Collect evidence with `ccbox`
- Project discovery: `ccbox projects`
- Session listing: `ccbox sessions [project-path] --limit N --offset 0 --size`
- Timeline capture: `ccbox history [log-or-project] --full --limit N --offset 0`
- Skill spans (optional): `ccbox skills [log-or-project] --json`
When triaging quickly, scan the timeline for failure signals (examples): `error`, `failed`, `non-zero`, `rejected`, `permission`, `timeout`.
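A minimal triage sketch, assuming `ccbox history` writes the timeline as plain text to stdout (the flags are the ones documented above; the keyword pattern is illustrative, not exhaustive):

```sh
# Capture the timeline once, then scan it for common failure signals.
ccbox history --full --limit 500 --offset 0 > timeline.txt
# -i: case-insensitive, -n: print line numbers for follow-up inspection.
grep -inE 'error|failed|non-zero|rejected|permission|timeout' timeline.txt
```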
### Stage 1.5: Identify clarifications, corrections, and interruptions
Treat user clarifications/corrections as high-signal evidence of workflow breakdowns. Your goal is to pinpoint where the agent went off-track and what rule would prevent it next time.
Look for user messages that:
- Clarify intent ("I meant X", "not that", "use Y instead").
- Correct mistakes ("this is wrong", "stop", "revert", "you didn't follow the instructions").
- Restate constraints after the fact ("do not run X", "no emojis", "do not use cargo", "do not change version", "do not release automatically").
- Interrupt the session due to friction (hangs, repeated retries, "cancel", abandoning the thread).
For each such moment, capture a small "course-correction record":
- Trigger: what the agent did immediately before (tool call, plan, edit, or assumption).
- Correction: the exact user sentence(s) that clarified/corrected.
- Fix: what changed after the correction (new approach, different tool, narrower scope).
- Lesson: one rule that would have prevented the detour (copy-ready, scoped project/global).
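A hypothetical course-correction record, with all field values invented for illustration (the constraint quoted here is one of the examples from the list above):

```json
{
  "trigger": "Planned to run the release step automatically after the version bump",
  "correction": "do not release automatically",
  "fix": "Stopped the release; prepared the changelog and waited for approval",
  "lesson": "Never publish or release without explicit user approval (scope: global)"
}
```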
If the session ends without a clear resolution (or the user abandons it), mark the outcome accordingly and explain the likely interruption reason using evidence (for example: hang/timeout, repeated invalid tool use, conflicting constraints).
### Stage 2: Summarize long timelines (only if needed)
If the timeline is too large to analyze in one pass, summarize in chunks:
- Focus on: user request, tool calls, tool outputs/errors, and outcome.
- Preserve: tool names, command lines, error messages, and user feedback.
- Keep each chunk summary to 3-5 sentences.
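A chunking sketch under the same plain-text assumption as the triage example (the 150-line chunk size is a judgment call, not a ccbox convention):

```sh
# Split a long timeline into fixed-size pieces for pass-by-pass summarization.
ccbox history --full --limit 2000 --offset 0 > timeline.txt
split -l 150 timeline.txt chunk_
# Summarize each chunk_* file to 3-5 sentences, then merge the summaries.
```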
### Stage 3: Extract per-session tool-call facets
For each session in scope, produce one JSON object matching `references/facets.md`.
If skill spans are available (via `ccbox skills --json`), annotate each failure with the active skill context when possible (e.g., "this failure happened inside the commit skill span").
Hard rules:
- Use only evidence from the session log; do not guess missing details.
- Separate "tool failed" from "wrong approach" (a tool can succeed but still be the wrong move).
- Count explicit user rejections as their own category (the tool did not fail; the action was declined).
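The authoritative schema lives in `references/facets.md`; the object below is only a sketch with assumed field names and values, to show the expected level of detail:

```json
{
  "session": "example-session.jsonl",
  "tool": "Bash",
  "category": "user_rejection",
  "outcome": "rejected",
  "evidence": "User: \"do not run X\"",
  "skill_context": "commit",
  "wrong_approach": false
}
```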
### Stage 4: Aggregate and analyze
Aggregate the facet set to produce:
- Top failing tools and failure categories.
- Three root-cause themes with concrete evidence snippets.
- Repeated user constraints that should become standing instructions.
- Engine-neutral recommendations that reduce tool-call failures and improve UX.
Use `references/report.md` as the output template.
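A counting sketch, assuming the Stage 3 facets were saved one JSON object per line in a file such as `facets.jsonl` and using the assumed field names from the sketch above:

```sh
# Top tools involved in non-success outcomes, most frequent first.
jq -r 'select(.outcome != "success") | .tool' facets.jsonl | sort | uniq -c | sort -rn
# Histogram of failure categories.
jq -r '.category' facets.jsonl | sort | uniq -c | sort -rn
```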
### Stage 5: Propose instruction updates (project + global)
Produce two additive sets of copy-ready snippets:
- Project-level: bullets to add to `AGENTS.md`.
- Global: bullets for the user's global agent instructions.
Guidelines:
- Do not include local paths, repository names, or one-off incident details.
- Prefer rules that prevent repeated errors (2+ occurrences) over one-time fixes.
- Each instruction should include: what to do, what to avoid, and why (1 sentence).
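One instruction in the required shape (example content only, echoing a constraint from Stage 1.5 rather than a real finding):

```markdown
- Ask for explicit approval before any release or publish step; never trigger
  a release automatically. Why: releases are hard to undo, and automatic
  releases have been rejected by the user more than once.
```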
### Stage 6: Deliverables
Deliver:
- A concise lessons learned memo (Markdown), following `references/report.md`.
- Proposed additive `AGENTS.md` snippet (project-level).
- Proposed additive global instruction snippet.
Optional (for future memory systems):
- "AutoMemorial candidates": a short, structured list of rules that should become standing agent memory, each backed by evidence and scoped (project vs global).
## References
- Facet schema + taxonomy: `references/facets.md`
- Report template + instruction templates: `references/report.md`