task
Work Task
Handle $ARGUMENTS. Be thorough, not ceremonial. Start from the source of truth, load extra skills only when they earn their keep, and verify before calling the task done.
#$ARGUMENTS
Core Rules
- Read the task source first.
- Read local repo instructions and relevant files before editing.
- Search for existing patterns before inventing new ones.
- Prefer the best durable fix over the smallest local patch. If the root cause lives in an API or abstraction seam, change that seam even when it means a broader refactor or an intentional API change.
- Prefer targeted tests and checks during iteration.
- Keep the user updated at milestones.
- Verify the actual result before claiming done.
- Do not default to research swarms, review swarms, or browser proof.
- For verified code-changing work, default to creating or updating the PR unless the user explicitly said not to.
- Do not default to compounding.
- Before calling a task blocked on a repo-wide gate, rule out local install corruption once when the failure smells wrong for the diff.
Intake
- Classify the input:
- Plain task text: the user prompt is the source of truth.
- File path or spec path: read it first.
- GitHub issue URL: fetch it with `gh issue view` first.
- GitHub PR URL: fetch it with `gh pr view` first.
- Bare GitHub issue like `#555`: resolve it against the current `gh` repo first, then fetch it with `gh issue view`.
- Linear issue link/id: fetch it with the Linear integration first.
- Read the full source-of-truth context before doing anything else.
- If the task comes from a ticket or issue, also read the comments and attachments when available.
- If the task comes from a tracker item and any attachment or linked media is a video or screen recording, you must have transcript output with `video-transcripts` before implementation.
- The canonical shared cache lives in the tracker next to the evidence it describes.
- Cache comment body should be XML only:
```xml
<video-transcripts>
<video-transcript
source-key="https://tracker-hosted-video/<stable-path-without-query>"
>
[00:00] (...)
</video-transcript>
</video-transcripts>
```
- `source-key` must be the normalized stable identifier for the media. For signed tracker-hosted URLs like `uploads.linear.app`, strip the query string so rotating signatures do not bust the cache.
- Do not add decorative metadata like `title` to cached `<video-transcript>` entries unless a later workflow truly needs it.
- Group videos by source container before transcribing:
- one cache comment for issue or PR body videos
- GitHub: one dedicated top-level cache comment for each issue or PR comment that contains video(s)
- Linear: one cache reply for each comment that contains video(s)
- Keep the cache body pure XML. Do not prepend markers, prose, or YAML.
- Before running the helper, scan existing tracker comments and, when relevant, Linear replies. Match by source container first, then check whether the cached `<video-transcript ... source-key="...">` entries cover every current video URL for that container after the same normalization.
- If the matching cache fully covers the current normalized video keys for that source container, reuse the cached XML block and do not re-transcribe.
- If the cache is missing any current video, or that source container has new video evidence, transcribe only the missing videos and create or update the matching cache comment or reply.
- Use the helper in $video-transcripts.
- Run it once per relevant video attachment or URL.
- For auth-gated tracker media like `uploads.linear.app` or private GitHub asset URLs, use the helper directly before declaring the video blocked. It can reuse local tracker auth when available.
- Convert the video evidence into a normalized block in this exact shape before continuing:
<video-transcripts>
<video-transcript source-key="...">
[00:00] (...)
</video-transcript>
</video-transcripts>
- Use the generated or cached transcript output as part of the tracker context.
- Do not hand-write, paraphrase, or skip video evidence when the skill can run.
- If the helper still cannot fetch or transcribe the media after a real attempt, hard stop and report the exact blocker you hit.
- Do not continue implementation without the transcript when a real video is present.
- If there are multiple videos, preserve each as its own `<video-transcript ...>` block under one `<video-transcripts>` wrapper.
- Classify the task shape before choosing a workflow:
- Testing or coverage work: triggered by coverage, regressions, test-suite phases, hotspots, package testing, or similar language.
- Program or batch work: triggered by multiple packages, phases, buckets, or ordered slices.
- Ordinary one-shot work: bug, feature, refactor, docs, review, or investigation that can be finished as a single slice.
- Classify whether this is heavyweight framework or library work:
- Heavyweight work: architecture or public API redesign, breaking changes, major cross-package refactors, benchmarking, profiling strategy, performance comparison, scalability work, framework comparison, migration analysis, RFCs, proposals, or spec-first major changes.
- Non-heavyweight work: ordinary bugs, one-package features, docs-only edits, routine test work, small refactors, or normal issue execution even when non-trivial.
- If the task is heavyweight work:
- load `major-task` immediately
- treat `major-task` as the source of truth for workflow and helper selection
- do not quietly inflate this task flow into a research swarm
- If the task is not heavyweight work, classify task complexity before implementation:
- Non-trivial task: multi-step work, research-heavy work, phased execution, or anything likely to take more than a handful of tool calls.
- Trivial task: quick question, small edit, or other work that does not need persistent working memory.
- If the task is non-trivial:
- load `planning-with-files` before implementation
- use persistent planning files or the repo's equivalent planning structure so progress survives context loss
- follow local repo overrides for where planning files live
- if the task touches published package code under `packages/`, record in the plan whether a changeset will be required before completion
- if the task is registry-only work under `apps/www/src/registry/`, record in the plan whether `docs/components/changelog.mdx` must be updated instead of creating a package changeset
- If the task is testing or coverage work:
- restate it as test work, not generic feature work
- load `testing` first and use that testing policy as the source of truth
- choose the smallest honest seam before loading `tdd`
- If the task is program or batch work:
- restate the ordered scope and hard constraints
- do not treat the whole batch as one implementation unit
- default to completing the first slice cleanly unless the user explicitly asks for a broader sweep
- GitHub issue rules:
- Resolve bare issue numbers like `#555` against the current repo with `gh repo view --json nameWithOwner -q '.nameWithOwner'`.
- Fetch GitHub issues with: `gh issue view <number-or-url> --comments --json number,title,body,comments,labels,state,url`
- Fetch GitHub PRs with `gh pr view ... --json`.
- If the input is ambiguous and not clearly a GitHub issue token or URL, do not guess.
- For any tracker source, restate for yourself:
- source type
- source id
- exact title
- task type: bug, feature, refactor, docs, review, investigation, testing, or batch work
- expected outcome or acceptance criteria
- caveats, blockers, or missing information from the tracker
- likely files, routes, or packages affected
- whether there is a real browser surface to verify
- likely root-cause layer: call site, helper, abstraction seam, or public API
- Read repo instructions and nearby implementation patterns before editing.
- If the task changes code:
- if already on a branch clearly dedicated to this exact task, continue there
- treat a branch as "clearly dedicated" only when the branch name, tracker id, or existing PR obviously matches the current task
- do not reuse an unrelated non-`main` branch just because you are already on it
- if the current branch is not clearly dedicated to this task, check out `main`, pull the latest `main`, then create a fresh repo-convention branch before editing
- if the task has a tracker id, prefer a branch name that includes it:
- GitHub issue: `codex/555-short-slug`
- Linear issue: `codex/LIN-123-short-slug`
- in this repo, otherwise prefer `codex/<slug>`
- run install or setup only when the repo or task actually needs it
- If the task does not change code, skip branch and setup noise.
- If anything important is still ambiguous after the source-of-truth pass and nearby code reading, ask the user the smallest useful clarifying question.
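The `source-key` normalization and cache-coverage check described in the video rules above can be sketched roughly as follows. This is a minimal illustration, not the `video-transcripts` helper itself: the function names, the regular expression, and the example URL paths are all hypothetical, and it assumes the cache comment body is the pure-XML shape shown earlier.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def normalize_source_key(url: str) -> str:
    """Drop query string and fragment so rotating signatures map to one key."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


# Hypothetical pattern: pulls source-key values out of a cached XML body.
# The \b after "video-transcript" keeps it from matching the plural wrapper tag.
SOURCE_KEY_RE = re.compile(r'<video-transcript\b[^>]*\bsource-key="([^"]+)"')


def missing_videos(cache_xml: str, current_urls: list[str]) -> list[str]:
    """Return the current video URLs whose normalized key is not cached yet."""
    cached = set(SOURCE_KEY_RE.findall(cache_xml))
    return [u for u in current_urls if normalize_source_key(u) not in cached]
```

An empty result means the cache fully covers the source container and can be reused; otherwise only the returned URLs need transcribing.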
Tracked Task Rules
Apply this section only when the task source is a tracker item.
GitHub
- Treat the GitHub issue as the source of truth.
- Use `gh` for fetch and sync-back.
- If useful, rename the thread to `<issue-number> <issue-title>`.
- If the task is code-changing, prefer a branch name that includes the issue number.
- If the task changed code and reached a verified meaningful outcome, create or update the PR before any issue comment unless blocked or the user said not to.
- If the task reaches a meaningful outcome and came from the issue, post a concise issue comment after the PR exists unless blocked or the user said not to.
Linear
- Keep the same fetch-first behavior as the dedicated Linear workflow.
- Read the issue, comments, and attachments before implementation.
- Keep comment-back QA-focused.
- Do not force browser proof unless the task actually has a browser surface.
Tracked Task Non-Rules
- Do not require PR creation for tracker tasks that did not change code, ended blocked, or were purely investigative.
- Do not require browser screenshots for every tracked task.
- Do not require tracker comments for investigations that ended blocked or inconclusive unless sync-back is useful.
Load Skills Only When Justified
- `planning-with-files`: Use for any non-trivial task that needs persistent working memory, phased execution, or likely more than a handful of tool calls. Follow repo-specific overrides for where planning artifacts should live.
- `major-task`: Use for architecture or public API redesign, benchmarking or scalability work, framework comparison or migration analysis, major cross-package refactors, or RFC and proposal work. When it triggers during intake, it becomes the source of truth instead of this file.
- `testing`: Use when the task is primarily about tests, coverage, regression gaps, or phase-based suite work. Load it before `tdd` for testing programs.
- `tdd`: Use for bugs and for feature work where behavior-level automated coverage is sane. For testing programs, load it only after choosing a concrete slice that should be done test-first. Skip for trivial docs, mechanical refactors, or painful UI-only plumbing.
- `learnings-researcher`: Use for non-trivial work in a domain with documented solutions or when the task smells like a repeated issue. Do not load it for tiny isolated edits.
- `debug`: Use when the failure mode is still fuzzy after the first repro pass or first failing test.
- `video-transcripts`: Use when a tracker issue, PR, comment thread, or attachment set includes any video or screen recording that should become structured transcript evidence before implementation.
- `ce:brainstorm`: Use when requirements are still ambiguous after reading the source of truth and nearby code.
- `framework-docs-researcher`: Use when touching unfamiliar, version-sensitive, or unstable third-party APIs after checking local clones and docs per AGENTS.
- `dev-browser`: Use only when there is a real browser surface to verify. Require real browser proof only for browser or UI tasks.
- `agent-browser-issue`: Use when browser automation is blocked by a likely reusable tool-side issue that should be split into its own GitHub follow-up.
- `changeset`: Use when verified work changes a published package under `packages/` and the repo expects release notes before completion. Do not create a package changeset for registry-only work under `apps/www/src/registry/`; update `docs/components/changelog.mdx` instead.
- `git-commit-push-pr`: Use when verified work changed code and should ship as a PR. Create or update the PR before any tracker comment. The PR description should be the exact current final handoff, not a rewritten summary. If the task changes again in a later iteration, update the PR description to match the latest handoff.
- `ce-compound`: Use only after verified, non-trivial work that produced reusable knowledge. Never load it at the start.
- `ce-review`, `correctness-reviewer`, `maintainability-reviewer`, `project-standards-reviewer`, `code-simplicity-reviewer`: Use only for risky, large, user-facing, or architecture-sensitive changes.
- `agent-native-reviewer`: Use only when the change touches `.agents/**`, `.claude/**`, AI/tooling surfaces, commands, or user actions that an agent should also be able to perform.
Execution Path
Bug
- Reproduce first if possible.
- If behavior-level coverage is sane, turn the repro into the smallest useful automated test and watch it fail for the right reason.
- Find the real ownership boundary of the bug. Prefer fixing the abstraction, contract, or API that caused it instead of patching each caller around it.
- If the best fix requires an API change, make it unless the task source or repo constraints explicitly rule that out. Do not preserve a bad API just to keep the diff small.
- If you intentionally choose a narrower fix, state why the broader architecture fix was not worth shipping now.
- Re-run targeted checks.
- Re-run the browser flow only if the bug lives on a browser surface.
Feature
- Reduce the task to the smallest slice that fully satisfies the acceptance criteria.
- Add behavior coverage when sane.
- Prefer the cleanest long-term design that fits the slice, not the quickest bolt-on.
- If existing patterns are the reason the design is weak, improve the pattern or API instead of copying it blindly.
- Verify the user-facing outcome.
Testing Or Coverage Work
- Use the testing policy as the rulebook before choosing seams or commands.
- Identify the smallest honest seam for the first hotspot or ordered slice.
- Add or deepen the narrowest useful tests instead of spraying smoke coverage.
- Verify with targeted test commands first.
- Use broader suite checks only when repo rules or change scope justify them.
Program Or Batch Work
- Respect the explicit order from the task source.
- Do not fan out across every listed slice immediately.
- Define what "done for this slice" means before implementation.
- Complete one slice at a time unless the user explicitly asks for broader execution.
Refactor Or Chore
- Preserve behavior.
- Do not do fake TDD theater.
- If the task exposes a bad API or abstraction, fix that seam instead of polishing around it.
- Run the narrowest regression checks plus the relevant build, typecheck, or lint path for the touched area.
Docs Or Content
- Skip engineering ceremony.
- Verify links, examples, formatting, and rendered output as appropriate.
Review Or Investigation
- Read the relevant diff, files, and surrounding context first.
- For review tasks, report findings first, ordered by severity, with concrete file references.
- For investigation tasks, identify the failure mode, probable cause, and next action before changing code.
- Only implement changes if the user actually asked for them.
Verification
Keep verification mandatory but proportional.
- Run targeted tests for changed behavior.
- Run package or app build and typecheck when relevant to the touched area.
- Run lint when code changed and the repo expects it.
- Run browser verification only for browser or UI tasks.
- Run broader repo-wide gates only when repo instructions require them or the change scope justifies them.
- If a failing `pnpm test`, `bun test`, or `pnpm check` shows local-corruption signals unrelated to the current diff, run `pnpm run reinstall` once and rerun the exact failing command before declaring the task blocked. Treat these as local-corruption signals:
- `Invalid hook call`
- `resolveDispatcher()` / null dispatcher crashes
- package-local `node_modules/react` or `node_modules/react-dom` paths under `packages/*`
- mixed `.bun` and `.pnpm` React paths in the same failing stack
.bunand.pnpmReact paths in the same failing stack - missing-module or package-resolution garbage that does not match the diff
- If the repo has a standard final gate, run it last.
- If verified work changes a published package under `packages/`, ensure the required changeset exists before PR or final handoff.
- If the work is registry-only under `apps/www/src/registry/`, update `docs/components/changelog.mdx` instead of creating a package changeset.
- If verified work changed code, create or update the PR before tracker sync-back unless the user explicitly said not to.
- If the task came from a tracked issue and the task reached a meaningful outcome, sync back unless the user said not to.
- If UI changed, capture proof from the real browser surface.
- Do not hardcode PR creation, screenshots, or tracker comments for every task.
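The local-corruption check above could be approximated with a small scan over the failing command's output. This is a heuristic sketch only: the function name and the returned labels are hypothetical, the path patterns are an assumed reading of the signal list, and the missing-module signal is left out because it needs the diff for comparison.

```python
import re

# Assumed shape of a package-local React path, e.g. packages/<name>/node_modules/react
PACKAGE_LOCAL_REACT = re.compile(r"packages/[^/\s]+/node_modules/react(-dom)?\b")


def local_corruption_signals(output: str) -> list[str]:
    """Return illustrative labels for each corruption signal found in the output."""
    signals = []
    if "Invalid hook call" in output:
        signals.append("invalid-hook-call")
    if "resolveDispatcher" in output:
        signals.append("null-dispatcher")
    if PACKAGE_LOCAL_REACT.search(output):
        signals.append("package-local-react")
    # Both store directories appearing in one failing output suggests mixed installs.
    if "/.bun/" in output and "/.pnpm/" in output:
        signals.append("mixed-bun-pnpm")
    return signals
```

A non-empty result on a failure unrelated to the diff is the cue to run `pnpm run reinstall` once and rerun the exact failing command.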
Final Handoff
Every final response must include:
- follow repo writing style here too:
- be extremely concise
- sacrifice grammar for concision
- no filler, no narration, no polite padding
- leading flat metadata lines in this exact order:
- `๐ PR ...`
- `๐ Fixes ...` for bug issues, otherwise use a generic issue line
- `🟢 95-100% confidence`
- then the flow table in this exact format:
| Phase | 🧪 Tests | ๐ Browser |
| --- | --- | --- |
- use these flow rows, in this order:
- `Reproduced`
- `Verified`
- use markdown links for `PR` and `Issue` when they exist
- for non-bug tracked work, do not fake a bug label; use a neutral issue line instead
- keep the issue line stable; do not add issue comment links there
- use these exact status values in the flow table:
โโโ N/A
- prefer specific browser caveat text over vague `⚠️ Partial`, for example `⚠️ Could not automate dropdown`
- in the `🧪 Tests` column:
- use `🔴` in `Reproduced` when there was a real failing test first
- use `🟢` in `Verified` when that test passed after the fix
- use `โ` when test evidence exists but did not follow a real red-green loop
- use `โ N/A` for rows or cells that do not apply; do not invent a PR, issue, or comment
- flow-table test cells mean test-based evidence, whether that came from TDD, a regression test, or another targeted test path
- if manual non-browser reproduction or verification happened, explain it in the prose below the metadata lines + flow table rather than adding extra rows
- `Confidence` must stay `100%` or lower and use this line format:
- `🟢 95-100% confidence`
- `🟡 80-94% confidence`
- `🔴 below 80% confidence`
- after the metadata lines + flow table, use these short sections in this order:
- `**๐ Browser Check**`, only when browser verification applies
- `**โ Outcome**`
- `**⚠️ Caveat**`
- `**๐๏ธ Design**`
- `**🧪 Verified**`
- keep those sections flat, concise, and easy to scan
- keep prose brutally short; prefer bullets over paragraphs here
- if browser verification applies, `**๐ Browser Check**` must include the exact human steps to verify the fix in the browser
- `**๐๏ธ Design**` is mandatory for any non-trivial code-changing task and must include:
- `Chosen seam: ...`
- `Why not quick patch: ...`
- `Why not broader change: ...` only if a broader API or abstraction change was a real option
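The three confidence bands can be expressed as a small lookup. This is only a sketch: the function name is hypothetical, and it assumes the band markers are the green, yellow, and red circle emoji.

```python
def confidence_line(percent: int) -> str:
    """Map a confidence percentage to its handoff line; the value must stay within 0-100."""
    if not 0 <= percent <= 100:
        raise ValueError("confidence must be between 0 and 100")
    if percent >= 95:
        return "🟢 95-100% confidence"
    if percent >= 80:
        return "🟡 80-94% confidence"
    return "🔴 below 80% confidence"
```

Rejecting values above 100 mirrors the rule that `Confidence` must stay `100%` or lower.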
UI Or Browser Tasks
- Include at least one real browser proof screenshot in the final response.
- The screenshot must come from `dev-browser` or the real browser workflow used for verification.
- When `**๐ Browser Check**` is present, put the screenshot immediately after that section.
- Otherwise, put the screenshot immediately after the metadata lines + flow table, before the completion summary.
- If no real browser proof exists, the task is not done unless the user explicitly waived it.
- If `dev-browser` is blocked on a likely reusable tool-side issue and the product task is still otherwise fixable, load `agent-browser-issue`.
- If that follow-up issue is opened, mention it in the caveat or handoff.
- `**๐ Browser Check**` must be a flat bullet list:
- keep it short and concrete
- prefer the exact local URL as a markdown link when one exists, for example http://localhost:3000/...
- use the real page, route, and interaction names
- focus on how to prove the fix, not on implementation detail
Non-UI Tasks
- No screenshot is required.
- Omit `**๐ Browser Check**` when there is no browser surface.
- Keep the final response concise and evidence-based.
Testing Or Batch Work
- State which slice, bucket, or hotspot was completed.
- State what verification ran for that completed slice.
- State what remains queued by design.
- State any intentional deferral in generic terms only.
Review Or Investigation Tasks
- Lead with findings or conclusion, not process notes.
Post Back To Tracker
Apply this section only when the task came from a tracker item and reached a meaningful outcome.
Pull Request
- When a PR exists, the PR description must match the exact current final handoff from chat:
- same flow table
- same metadata lines, except omit the leading `๐ PR ...` line because the PR page already identifies itself
- same screenshot when applicable
- same `**โ Outcome**`, `**๐๏ธ Design**`, `**🧪 Verified**`, `**⚠️ Caveat**`, and `**๐ Browser Check**` sections when applicable
- same caveats
- same human browser verification steps when applicable
- PR description follows the same writing style:
- extremely concise
- grammar can be sacrificed for concision
- no fluffy framing
- Do not publish a PR description with a dead local proof path.
- If screenshot proof is local, upload it first and only then write or update the final PR body with the hosted GitHub URL.
- For a brand-new PR, if the hosted proof URL is not ready yet:
- create the PR with no proof image
- upload the image immediately
- replace the placeholder with the real hosted proof before handoff
- If the PR description includes a local image path for proof, do not leave it that way on GitHub.
- Use `dev-browser --connect http://127.0.0.1:9222` to upload the image through the PR comment file input as a staging area, then replace the local proof path in the PR body with the hosted GitHub attachment URL.
- Use the PR comment textarea only as staging:
- upload image
- read generated markdown or URL from the textarea
- clear the textarea
- do not submit the staging comment
- Do not spend time reloading the PR page just to verify hosted image rendering unless the user explicitly asks.
- Do not compress, adapt, or rewrite the handoff for the PR description.
- If a later iteration changes the fix, evidence, certainty, caveats, or verification, update the PR description again so it stays in sync.
- Update the PR description before writing the issue comment.
GitHub Issue
- Use: `gh issue comment <number-or-url> --body-file -`
- Keep it user-facing and e2e-centric.
- Keep it extremely concise. Grammar can be sacrificed for concision.
- Start with a single plain line in this shape:
✅ Fixed in #<pr-number>.
- Follow with a flat bullet list only:
- manual verification steps for the real user flow
- optional final QA or caveat bullet when useful
- Do not mention:
- Codex
- file names
- tests, typecheck, or lint
- screenshot paths
- branch names
- commit, push, or staging mechanics
- Do not write the issue comment before the PR exists.
- If writing the comment after code-changing work, use the PR number form `#123`, not the full URL.
Example:
✅ Fixed in #123.
- Open the affected page.
- Follow the real user flow that triggered the bug.
- Confirm the fixed behavior in the browser.
- Manual QA is still useful if browser verification was partial.
Linear
- Keep proof local. Do not upload screenshots or other files to Linear.
- Post a concise issue comment using the Linear integration.
- Match the ticket style and write for QA, not developers.
- Keep the same terse style:
- extremely concise
- grammar can be sacrificed for concision
- avoid file names, tests, typecheck, lint, branch names, PR mechanics, and screenshot paths
Blocked Or Inconclusive Tracker Work
- If the task was blocked before meaningful progress, either skip the comment or post a short blocker note only if that helps the tracker owner.
Success Criteria
- Source-of-truth context was read first.
- Relevant local instructions and code patterns were read before editing.
- Tracker items were fetched and summarized correctly when provided.
- Video attachments or screen recordings were turned into normalized `<video-transcripts>` evidence before implementation when tracker evidence required it.
- Bare GitHub issues like `#555` were resolved against the current `gh` repo instead of guessed.
- The chosen implementation fixed the highest-leverage seam available, not just the nearest symptom.
- Code-changing tasks that did not already start on a branch clearly dedicated to the same task checked out `main` and pulled latest before branching.
- Non-trivial work loaded `planning-with-files` or the repo-equivalent planning workflow before implementation.
- Non-trivial package or registry work recorded release-artifact requirements in the plan early: changeset for published package work, `docs/components/changelog.mdx` for registry-only work.
- Testing work loaded the testing policy before implementation.
- Only the necessary skills were loaded.
- The implementation matched the task type instead of following a one-size-fits-all ritual.
- Batch work did not sprawl across multiple slices without explicit instruction.
- Verification matched the change scope.
- Final handoff matched the task type.
- Testing or batch handoff reported the completed slice, verification, and remaining queue when relevant.
- Any tracker, browser, review, or compound follow-up was done only if actually relevant.
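The release-artifact rule (changeset for published package work, changelog entry for registry-only work) can be sketched as a path check. The function name and the returned labels are hypothetical, and the sketch assumes every path under `packages/` belongs to a published package, which a real repo would need to confirm.

```python
def release_artifacts(changed_paths: list[str]) -> set[str]:
    """Return which release artifacts a set of changed paths appears to require."""
    required = set()
    for path in changed_paths:
        if path.startswith("apps/www/src/registry/"):
            # Registry-only work updates the changelog instead of a changeset.
            required.add("docs/components/changelog.mdx")
        elif path.startswith("packages/"):
            # Assumption: the touched package is published.
            required.add("changeset")
    return required
```

Recording the result in the plan early keeps the changeset or changelog update from being forgotten before the PR or final handoff.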