Deep Learning Experiment Workflow Skill
Overview
Run a staged workflow for deep-learning work where the hard parts are usually investigation quality, experiment definition, and empirical validation rather than large implementation volume. Use this skill for tasks such as model training, fine-tuning, architecture changes, loss-function changes, data-pipeline changes, ablations, benchmark comparisons, and reproducible evaluation work.
This workflow is stage-gated. Do not batch-generate all artifacts by default. Advance only when the current stage gate is satisfied or a classified re-entry path says otherwise.
Skill Layout
- SKILL.md is the workflow router.
- shared/workflow-state-template.md is the canonical stage-control artifact.
- stages/ stores stage-owned guides and templates:
  - stages/00-bootstrap/
  - stages/01-investigation/
  - stages/02-requirements-and-success-criteria/
  - stages/03-experiment-plan/
  - stages/04-implementation/
  - stages/05-training-validation/
  - stages/06-code-review/
  - stages/07-docs-sync/
  - stages/08-handoff/
Workflow
Ticket Folder Convention
- For each task, create or reuse one ticket folder under tickets/in-progress/.
- Write active workflow artifacts in tickets/in-progress/<ticket-name>/.
- Archive completed tickets in tickets/done/<ticket-name>/.
- Move a ticket to done only after explicit user verification or explicit user instruction.
- If the user reopens a completed task, move the ticket back to tickets/in-progress/<ticket-name>/ before new updates.
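The folder convention above can be sketched as a pair of small helpers. This is only an illustration under stated assumptions: the function names `open_ticket` and `archive_ticket` and the `user_verified` flag are hypothetical and not part of the skill.

```python
from pathlib import Path
import shutil


def open_ticket(root: Path, ticket_name: str) -> Path:
    """Create or reuse tickets/in-progress/<ticket-name>/ (hypothetical helper)."""
    active = root / "tickets" / "in-progress" / ticket_name
    archived = root / "tickets" / "done" / ticket_name
    if archived.exists() and not active.exists():
        # A reopened ticket moves back to in-progress before any new updates.
        active.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(archived), str(active))
    active.mkdir(parents=True, exist_ok=True)
    return active


def archive_ticket(root: Path, ticket_name: str, user_verified: bool) -> Path:
    """Move a ticket to tickets/done/ only after explicit user verification."""
    if not user_verified:
        raise PermissionError("archive only after explicit user verification or instruction")
    active = root / "tickets" / "in-progress" / ticket_name
    archived = root / "tickets" / "done" / ticket_name
    archived.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(active), str(archived))
    return archived
```

Keeping the verification check inside `archive_ticket` means a ticket cannot reach `done` by accident; the caller has to pass the flag deliberately.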
Bootstrap And Worktree Setup
- Before investigation, create or reuse the ticket folder and write requirements.md with status Draft.
- If the project is a git repository:
  - resolve the base branch from explicit user instruction when provided, otherwise infer the tracked remote default or integration branch with highest confidence,
  - refresh tracked remote refs before creating a new ticket branch or worktree,
  - create or reuse a dedicated ticket worktree,
  - create or reuse a ticket branch named codex/<ticket-name>.
- If the environment is not a git repository, continue without worktree setup and still enforce the ticket-folder and Draft requirement capture.
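One way to picture the git steps above is a helper that builds the commands without running them. This is a sketch, not the skill's own tooling: the function name `bootstrap_commands`, the remote name `origin`, and the worktree location are assumptions.

```python
from typing import List


def bootstrap_commands(ticket_name: str, base_branch: str, worktree_dir: str,
                       remote: str = "origin",
                       branch_exists: bool = False) -> List[List[str]]:
    """Return the git invocations for ticket bootstrap (hypothetical helper)."""
    branch = f"codex/{ticket_name}"
    # Refresh tracked remote refs before creating a branch or worktree.
    commands = [["git", "fetch", "--prune", remote]]
    if branch_exists:
        # Reuse the existing ticket branch in a dedicated worktree.
        commands.append(["git", "worktree", "add", worktree_dir, branch])
    else:
        # Create the ticket branch from the refreshed base branch.
        commands.append(["git", "worktree", "add", "-b", branch,
                         worktree_dir, f"{remote}/{base_branch}"])
    return commands
```

Each inner list can be passed to `subprocess.run(cmd, check=True)` once the base branch has been resolved.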
Workflow State File
- Create and maintain tickets/in-progress/<ticket-name>/workflow-state.md as the mandatory stage-control artifact.
- Initialize it during Stage 0 with:
  - Current Stage = 0
  - Code Edit Permission = Locked
  - the bootstrap record filled in
  - stage gates set to Not Started or In Progress
- Update workflow-state.md on every stage transition, gate decision, and re-entry declaration.
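The Stage 0 initial values can be modeled in memory as follows. This is a minimal sketch: the canonical layout lives in shared/workflow-state-template.md, so the class name, field names, and the choice to mark only Stage 0 as In Progress are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class WorkflowState:
    """In-memory stand-in for workflow-state.md (hypothetical field names)."""
    current_stage: int = 0
    code_edit_permission: str = "Locked"
    bootstrap_record: Dict[str, str] = field(default_factory=dict)
    stage_gates: Dict[int, str] = field(default_factory=dict)


def init_state(ticket_name: str, base_branch: str) -> WorkflowState:
    """Build the Stage 0 state: stage 0, edits locked, gates not yet passed."""
    gates = {stage: "Not Started" for stage in range(9)}  # stages 0..8
    gates[0] = "In Progress"  # assumption: bootstrap is the active gate
    record = {"ticket": ticket_name, "base_branch": base_branch}
    return WorkflowState(bootstrap_record=record, stage_gates=gates)
```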
Source-Edit Lock Rule
- No source-code edits are allowed unless workflow-state.md shows:
  - Current Stage = 4
  - Code Edit Permission = Unlocked
- Default state is Locked.
- Unlock source-code edits only after Stage 3 Experiment Plan is current enough to drive implementation.
- If Stage 5, 6, or 7 fails and a re-entry is required, lock source edits before taking the return path.
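The lock rule reduces to two small checks, sketched below. The function names are hypothetical; the stage numbers and the Locked and Unlocked values come from the rule above.

```python
def can_edit_source(current_stage: int, code_edit_permission: str) -> bool:
    """Source edits are allowed only in Stage 4 with the permission unlocked."""
    return current_stage == 4 and code_edit_permission == "Unlocked"


def permission_after_gate_failure(failed_stage: int, reentry_required: bool,
                                  current_permission: str) -> str:
    """Lock source edits before any return path out of Stage 5, 6, or 7."""
    if failed_stage in (5, 6, 7) and reentry_required:
        return "Locked"
    return current_permission
```

An agent can call `can_edit_source` before every write to a source file, so a stale Unlocked value left over from an earlier stage never passes on its own.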
Canonical Flow
- Forward path: 0 -> 1 -> 2 -> 3 -> 4 -> 5 -> 6 -> 7 -> 8
- Re-entry is mandatory when failures show the issue is upstream of the current stage.
- Do not stop after recording a re-entry path; resume work in the returned stage immediately unless blocked by the environment or waiting for an explicit user-only decision.
Stage Router
0) Bootstrap
- Primary files:
  - stages/00-bootstrap/README.md
  - stages/00-bootstrap/bootstrap-checklist.md
- Required outcome:
  - ticket context exists,
  - requirements.md exists with status Draft,
  - workflow-state.md exists and records bootstrap details.
1) Investigation
- Primary files:
  - stages/01-investigation/README.md
  - stages/01-investigation/investigation-guide.md
  - stages/01-investigation/investigation-notes-template.md
- Investigation is first-class in this workflow.
- Investigation can include:
  - reading local code, configs, logs, checkpoints, and datasets,
  - reading open-source repositories and relevant documentation,
  - checking papers or model references when needed,
  - running probes, small scripts, reproductions, and data sanity checks.
- Required outcome:
  - investigation-notes.md is a durable dossier with concrete evidence,
  - the task is triaged for scope and uncertainty,
  - later stages can reuse the findings directly.
2) Requirements & Success Criteria
- Primary files:
  - stages/02-requirements-and-success-criteria/README.md
  - stages/02-requirements-and-success-criteria/requirements-success-criteria-guide.md
- Required outcome:
  - requirements.md moves from Draft to Plan-ready or Refined,
  - task definition, baseline, metrics, thresholds, constraints, and success criteria are explicit,
  - the planned validation gate can measure pass, fail, or inconclusive results truthfully.
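To make the pass, fail, or inconclusive outcome concrete, here is one possible decision rule. It is an assumption for illustration, not the skill's prescribed gate: it treats the metric as higher-is-better, requires a minimum number of runs, and the names `judge`, `min_gain`, and `min_runs` are hypothetical.

```python
from typing import Sequence


def judge(metric_values: Sequence[float], baseline: float,
          min_gain: float, min_runs: int = 3) -> str:
    """Classify per-seed results against an explicit threshold.

    Assumes a higher-is-better metric; the rule itself is illustrative.
    """
    if len(metric_values) < min_runs:
        # Too few runs to support a claim either way.
        return "inconclusive"
    gains = [value - baseline for value in metric_values]
    if all(gain >= min_gain for gain in gains):
        return "pass"
    if all(gain < min_gain for gain in gains):
        return "fail"
    # Seeds disagree, so the evidence does not close the criterion.
    return "inconclusive"
```

Writing the rule down before training keeps the threshold from drifting after the results are in, which is the point of making success criteria explicit at this stage.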
3) Experiment Plan
- Primary files:
  - stages/03-experiment-plan/README.md
  - stages/03-experiment-plan/experiment-plan-template.md
- This stage replaces heavy software-architecture runtime modeling.
- Focus on:
  - chosen hypothesis and rationale,
  - model or algorithm changes,
  - data and split assumptions,
  - loss, optimizer, scheduler, and training recipe,
  - evaluation protocol,
  - ablations or comparison matrix,
  - reproducibility plan,
  - implementation work items.
- Required outcome:
  - experiment-plan.md is current and can drive Stage 4 implementation and Stage 5 training or evaluation.
4) Implementation
- Primary files:
  - stages/04-implementation/README.md
  - stages/04-implementation/implementation-template.md
- Implementation is important, but it is not the center of this workflow.
- Keep the artifact execution-oriented:
  - changed files,
  - config updates,
  - commands,
  - checkpoints and logging paths,
  - smoke checks,
  - readiness for training and validation.
- Required outcome:
  - implementation matches the experiment plan closely enough to run Stage 5,
  - source edits are complete for the current iteration,
  - smoke or unit checks needed before training are complete.
5) Training & Validation
- Primary files:
  - stages/05-training-validation/README.md
  - stages/05-training-validation/training-validation-guide.md
  - stages/05-training-validation/training-validation-template.md
- This is the primary evidence gate of the workflow.
- Record actual empirical evidence, not only intent:
  - run configuration,
  - commit or diff basis,
  - seed,
  - data version or split,
  - hardware or environment,
  - checkpoints,
  - metrics,
  - baseline comparison,
  - failure analysis,
  - pass, fail, or inconclusive decision.
- Required outcome:
  - training-validation-report.md truthfully closes the current success criteria,
  - blocked or infeasible cases are explicitly recorded,
  - the next action is clear.
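The evidence list above maps naturally onto a record with a completeness check. The sketch below is an assumption about how such a record might look; the class name, field names, and the `missing_evidence` helper are hypothetical and do not reflect the contents of training-validation-template.md.

```python
from dataclasses import dataclass, field, fields
from typing import Dict, List, Optional

DECISIONS = ("pass", "fail", "inconclusive")


@dataclass
class RunEvidence:
    """One Stage 5 run; every field mirrors an item in the evidence list."""
    run_config: str = ""
    commit_or_diff: str = ""
    seed: Optional[int] = None
    data_version: str = ""
    environment: str = ""
    checkpoints: List[str] = field(default_factory=list)
    metrics: Dict[str, float] = field(default_factory=dict)
    baseline_metrics: Dict[str, float] = field(default_factory=dict)
    failure_analysis: str = ""  # assumption: write "none observed" explicitly
    decision: str = ""


def _is_empty(value) -> bool:
    # None or any zero-length container or string counts as unrecorded.
    return value is None or (hasattr(value, "__len__") and len(value) == 0)


def missing_evidence(evidence: RunEvidence) -> List[str]:
    """Return names of fields that are unrecorded or hold an invalid decision."""
    missing = [f.name for f in fields(evidence)
               if _is_empty(getattr(evidence, f.name))]
    if evidence.decision and evidence.decision not in DECISIONS:
        missing.append("decision")
    return missing
```

A report would be treated as ready for the gate only when `missing_evidence` returns an empty list, which keeps intent from being recorded in place of evidence.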
6) Code Review
- Primary files:
  - stages/06-code-review/README.md
  - stages/06-code-review/code-review-guide.md
  - stages/06-code-review/code-review-template.md
- Run code review only after Stage 5 evidence is current.
- Review focus for deep-learning work includes:
  - data leakage,
  - train or eval mode mistakes,
  - metric correctness,
  - label and mask alignment,
  - checkpoint and config semantics,
  - numerical stability,
  - reproducibility gaps,
  - logging and artifact traceability.
- Required outcome:
  - code-review.md records a clear gate decision and any required re-entry classification.
7) Docs Sync
- Primary files:
  - stages/07-docs-sync/README.md
  - stages/07-docs-sync/docs-sync-guide.md
  - stages/07-docs-sync/docs-sync-template.md
- Update durable docs only after the current implementation and validation story is truthful.
- Typical sync targets:
  - training commands,
  - config assumptions,
  - dataset or split expectations,
  - best-known run summary,
  - reproduction notes,
  - important caveats.
8) Handoff
- Primary files:
  - stages/08-handoff/README.md
  - stages/08-handoff/handoff-guide.md
  - stages/08-handoff/handoff-summary-template.md
- Finish with:
  - a clear summary of what changed,
  - best run or best evidence,
  - open risks and next experiments,
  - explicit user verification,
  - ticket archival and repository finalization when applicable.
Re-Entry Model
Use classified re-entry when a later stage proves the issue is upstream:
- Local Fix: the current iteration can be corrected by revisiting implementation directly.
- Validation Gap: Stage 6 lacks enough Stage 5 evidence; return to 5 -> 6.
- Plan Impact: the experiment plan is no longer sound enough; return through Stage 3 before more implementation.
- Requirement Gap: success criteria or scope were incomplete or wrong; return through Stage 2.
- Investigation Gap: the evidence base is insufficient; return through Stage 1.
- Unclear: root cause is still uncertain or cross-cutting; reopen from Stage 0 controls and rerun the chain.
Use the transition matrix in shared/workflow-state-template.md as the canonical reference for gate behavior.
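As a rough illustration, the classifications can be read as a lookup from class to return stage. This sketch is one interpretation of the list above and does not replace the transition matrix; the names `RETURN_STAGE` and `reentry_path`, and the mapping of Local Fix to Stage 4, are assumptions.

```python
from typing import List

# Assumed return stage per classification, read off the list above.
RETURN_STAGE = {
    "Local Fix": 4,
    "Validation Gap": 5,
    "Plan Impact": 3,
    "Requirement Gap": 2,
    "Investigation Gap": 1,
    "Unclear": 0,
}


def reentry_path(classification: str, failed_stage: int) -> List[int]:
    """Stages to rerun, from the return stage back up to the failed stage."""
    start = RETURN_STAGE[classification]
    if start > failed_stage:
        raise ValueError("re-entry must return to a stage at or before the failure")
    return list(range(start, failed_stage + 1))
```

For example, a Validation Gap found in Stage 6 yields the path 5, 6, matching the "return to 5 -> 6" wording above.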