Token Usage Auditor
Measure token usage as project telemetry: where agent attention went, how much fresh work versus cached context was spent, and which sessions or sidecar runs should be tied back to project phases and artifacts.
Skill Directory Layout
```text
<installed-skill-dir>/
├── SKILL.md
├── scripts/
│   └── collect_token_usage.py
└── references/
    └── data-model.md
```
Core Principles
- Treat token usage as attention and cost telemetry, not as quality by itself.
- Prefer exact local logs from Codex and Claude Code before asking the user for estimates.
- Include `.agent/sidecars/*/model.json` metadata, because ephemeral sidecar runs may not persist normal session logs.
- Keep raw prompts and message text out of project memory by default; record only session metadata, usage totals, classification, and artifact links.
- Preserve agent-specific token fields. Do not collapse prompt cache reads, cache creation, fresh input, reasoning output, and normal output into one unexplained number.
- Mark inferred phase/task labels with confidence. Do not invent artifact links when no commit, report, run, or paper section can be tied to a session.
Quick Start
For the current project, run:

```shell
python3 <installed-skill-dir>/scripts/collect_token_usage.py --project-root .
```

For a recent window:

```shell
python3 <installed-skill-dir>/scripts/collect_token_usage.py --project-root . --since 2026-05-01 --until 2026-05-31
```

To write repo-local reports:

```shell
python3 <installed-skill-dir>/scripts/collect_token_usage.py --project-root . --format markdown --output docs/reports/token_usage/2026-05.md
python3 <installed-skill-dir>/scripts/collect_token_usage.py --project-root . --format json --output memory/token_ledger/sessions.json
```
The script is read-only with respect to Codex and Claude Code logs. It writes only when `--output` is provided.
Workflow
- Identify the project root.
  - Use `git rev-parse --show-toplevel` when possible.
  - For research project control roots, use the shared root if the user wants cross-component accounting, or `code/`/`paper/` if they want component-local accounting.
- Run the collector.
  - Include `--since` and `--until` when the user asks for a week, month, phase, or release window.
  - Use `--format markdown` for discussion and `--format json` when updating project memory.
  - Use `--codex-root` or `--claude-root` only when logs live outside the defaults.
  - Use `--no-sidecars` only when the user wants raw agent-session logs without sidecar metadata.
- Interpret the report.
  - `total_context_tokens`: all context observed by the agent, including cached reads when the provider reports them.
  - `fresh_tokens`: non-cached input plus cache creation plus output. Use this as the closer proxy for incremental cost/effort.
  - `cached_tokens`: prompt-cache reads or cached input. Treat this as context reuse, not as equivalent to fresh work.
  - `session_count`: number of project-matched local sessions or recorded sidecar runs.
  - `codex-sidecar`: repo-local `.agent/sidecars/*/model.json` records. Exact token fields appear only when the sidecar run copied Codex CLI usage into `model.json`.
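As a rough illustration of how these buckets relate, the arithmetic is sketched below. The field names in the `usage` dict are assumptions for illustration; real provider usage records vary by agent and are not the script's actual schema.

```python
# Hypothetical provider usage record; real field names vary by agent/provider.
usage = {
    "input_tokens": 1200,          # non-cached input
    "cache_read_tokens": 8000,     # prompt-cache reads (context reuse)
    "cache_creation_tokens": 500,  # input written into the cache
    "output_tokens": 900,          # reasoning plus normal output
}

# cached_tokens: reused context, cheaper than fresh work
cached_tokens = usage["cache_read_tokens"]

# fresh_tokens: non-cached input + cache creation + output,
# the closer proxy for incremental cost/effort
fresh_tokens = (
    usage["input_tokens"]
    + usage["cache_creation_tokens"]
    + usage["output_tokens"]
)

# total_context_tokens: everything the agent observed, cached reads included
total_context_tokens = fresh_tokens + cached_tokens

print(fresh_tokens, cached_tokens, total_context_tokens)  # 2600 8000 10600
```

The split matters because a session with 10k total context tokens but only 2.6k fresh tokens spent far less incremental effort than the total suggests.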
- Add project labels only when supported.
  - `phase`: idea, literature, design, implementation, experiment, diagnosis, writing, rebuttal, release, maintenance, tooling, project-management.
  - `task_type`: design, implementation, debug, writing, review, release, sync, setup, experiment, diagnosis, literature, coordination.
  - `confidence`: exact, inferred, manual, unknown.
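One way to keep labels inside these vocabularies is a small validation helper. This is a sketch, not part of the collector script; the `label_session` helper name is hypothetical.

```python
# Allowed label vocabularies from this skill's conventions.
PHASES = {"idea", "literature", "design", "implementation", "experiment",
          "diagnosis", "writing", "rebuttal", "release", "maintenance",
          "tooling", "project-management"}
TASK_TYPES = {"design", "implementation", "debug", "writing", "review",
              "release", "sync", "setup", "experiment", "diagnosis",
              "literature", "coordination"}
CONFIDENCES = {"exact", "inferred", "manual", "unknown"}

def label_session(phase: str, task_type: str, confidence: str) -> dict:
    """Return a label record, rejecting values outside the vocabularies."""
    if phase not in PHASES:
        raise ValueError(f"unsupported phase: {phase}")
    if task_type not in TASK_TYPES:
        raise ValueError(f"unsupported task_type: {task_type}")
    if confidence not in CONFIDENCES:
        raise ValueError(f"unsupported confidence: {confidence}")
    return {"phase": phase, "task_type": task_type, "confidence": confidence}
```

Rejecting unknown values at write time keeps the ledger queryable later; a free-text phase field tends to drift.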
- Connect to artifacts.
  - Link sessions to commits, PRs, experiment runs, reports, paper sections, or memory updates only when there is evidence from git history, docs, user notes, or the session metadata.
  - If artifact linkage is not known, leave it blank and report the session as unlinked.
Updating Project Memory
Use `references/data-model.md` when writing long-lived project memory.
Recommended outputs:
- `docs/reports/token_usage/YYYY-MM.md` for human-readable monthly reports
- `memory/token_ledger/sessions.json` for structured session summaries
- `memory/token_ledger/README.md` only if the project needs policy notes
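A structured session summary might look like the sketch below. The exact schema lives in `references/data-model.md`; the field names and the `session_id` value here are illustrative assumptions, shown only to make the "metadata, totals, classification, artifact links" shape concrete.

```python
import json

# Illustrative session summary; see references/data-model.md for the real schema.
session = {
    "session_id": "codex-2026-05-14-a1b2",  # hypothetical identifier
    "agent": "codex",
    "fresh_tokens": 2600,
    "cached_tokens": 8000,
    "total_context_tokens": 10600,
    "phase": "experiment",
    "task_type": "diagnosis",
    "confidence": "inferred",
    "artifacts": [],  # left empty when no commit/report/run can be tied to the session
}

print(json.dumps(session, indent=2))
```

Note what is absent: no prompts, no message text. Only metadata and totals reach project memory, per the privacy principle above.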
Do not commit copied raw transcript logs unless the user explicitly asks and privacy has been reviewed.
Report Framing
When summarizing, separate these conclusions:
- attention allocation: what the project focused on
- fresh token burn: approximate incremental token cost
- context reuse: cached/read context that made work cheaper or less repetitive
- yield: shipped commits, reports, experiments, paper sections, decisions, or memory updates
- friction: repeated context setup, failed commands, abandoned branches, duplicate analysis, or high-burn sessions without artifacts
Avoid saying high token burn means good work. Prefer phrasing such as: "token usage was concentrated in experiment diagnosis; yield was decision-heavy rather than commit-heavy."