workflow-rig
Workflow Rig
Use this skill as the canonical front door for agent-driven work in joelclaw.
The user should not have to choose between agent-workloads, restate-workflows, queue internals, or runtime trivia.
If the user says the magic words —
- "run this through the workflow rig"
- "dogfood this with the workflow rig"
- "kick this off"
- "start a canary"
- "run this as a workload"
— load this skill first.
What this skill owns
- workload shaping (serial, parallel, chained)
- choosing between inline work, durable runtime, or a pure handoff
- joelclaw workload plan / dispatch / run
- explicit stage DAGs via --stages-from
- honest runtime truth: what the restate-worker can do today, and what it still cannot do
- canary/dogfood posture for real workload proofs
Core rule
Intent first, substrate second.
The caller describes the work. The workflow rig chooses the narrowest honest execution path.
Do not push Redis vs Restate vs sandbox trivia back onto the caller unless that tradeoff is the actual decision.
Current proven state (as of 2026-03-17, session StubbornFerret)
- joelclaw workload plan, joelclaw workload dispatch, and joelclaw workload run are real.
- Proven durable path: joelclaw workload plan → Redis queue → Restate dagOrchestrator → dagWorker → execution.
- Multi-stage DAGs with dependsOn are proven across 3-5 stage pipelines. Downstream stages can consume earlier outputs via {{nodeId}} interpolation.
- --stages-from stages.json is proven: duplicate ids, unknown deps, self-deps, and cycles are rejected before runtime admission; critical path and phase grouping are calculated.
- shell handler ✅ runs commands in the k8s restate-worker pod. Git clone, pi agent file writes, git commit, and git push are proven.
- infer handler ✅ full pi agent with tools: read, edit, write, bash. pi -p mode enables all tools. It can read files, edit them, run commands, and produce working code changes. Not just text generation, but full agent-style codegen.
- microvm handler ⏸️ code works (one-shot exec model proven via bun test in pod) but ADR-0230 is paused: Colima nestedVirtualization crashes the VM under load. Don't use.
- The restate-worker image is a full agent environment: pi 0.58.4, 76 skills, GitHub push auth from k8s secret, pi auth mounted from host (stays fresh).
- Autonomous codegen proven: the infer handler reads files and edits them via pi tools. The shell handler handles git clone/commit/push. Chain them in a DAG for full autonomous coding loops.
- Pre-cloned repo cache at /app/repo-cache (~200ms copy vs ~3s clone).
- dagWorker uses a 15m inactivity timeout and a 30m hard abort.
- Retry caps: dagWorker maxAttempts=5, dagOrchestrator maxAttempts=3. No more infinite retry poisoning.
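The --stages-from checks listed above (duplicate ids, unknown deps, self-deps, cycles) can be sketched in a few lines. This is an illustrative Python sketch, not the planner's actual code: the function name validate_stages and the error strings are hypothetical, and the only stage fields assumed are id and dependsOn.

```python
from collections import deque


def validate_stages(stages):
    """Return a list of error strings; an empty list means the DAG looks admissible."""
    errors = []
    ids = [s["id"] for s in stages]
    seen = set()
    for sid in ids:
        if sid in seen:
            errors.append(f"duplicate id: {sid}")
        seen.add(sid)

    for s in stages:
        for dep in s.get("dependsOn", []):
            if dep == s["id"]:
                errors.append(f"self-dependency: {s['id']}")
            elif dep not in seen:
                errors.append(f"unknown dependency: {s['id']} -> {dep}")
    if errors:
        return errors

    # Kahn's algorithm: any stage left unvisited is on, or downstream of, a cycle.
    indegree = {s["id"]: len(s.get("dependsOn", [])) for s in stages}
    dependents = {sid: [] for sid in ids}
    for s in stages:
        for dep in s.get("dependsOn", []):
            dependents[dep].append(s["id"])
    ready = deque(sid for sid, n in indegree.items() if n == 0)
    visited = 0
    while ready:
        sid = ready.popleft()
        visited += 1
        for child in dependents[sid]:
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)
    if visited != len(ids):
        errors.append("cycle detected")
    return errors
```

Rejecting before runtime admission means a bad stages.json never reaches the queue, so a malformed DAG costs a plan call rather than a stuck run.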
What does not work yet
- microvm handler paused (ADR-0230): needs dedicated Linux hardware with native KVM
- DAG completion notifications to the gateway not wired (operators poll or check OTEL)
- large-file pi agent edits can be slow (3-5 minutes) but succeed within the 15m timeout
Canonical operator flow
- Shape the work with joelclaw workload plan
- Present the shaped workload and ask approved?
- After approval:
  - execute inline if it is bounded, local, and reversible
  - use joelclaw workload run for real durable execution
  - use joelclaw workload dispatch only for a real baton pass
- If you enqueue runtime work, poll for progress with joelclaw runs, joelclaw run <run-id>, or OTEL. There is no automatic completion ping yet.
- Report outcome tersely: changed, verified, remaining, next move
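Because completion is poll-based, a caller needs a bounded polling loop. The following is a minimal Python sketch under stated assumptions: fetch_status is a hypothetical callable that would wrap joelclaw run <run-id> (its output shape is not specified here), and the status names in terminal are placeholders, not the rig's real vocabulary. The 1800-second default mirrors the 30m hard abort.

```python
import time


def poll_until_terminal(fetch_status, interval_s=15.0, deadline_s=1800.0,
                        terminal=("completed", "failed", "cancelled"),
                        sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until it returns a terminal status or the deadline passes.

    The status names are hypothetical placeholders. sleep and clock are
    injectable so the loop can be exercised without real waiting.
    """
    start = clock()
    while True:
        status = fetch_status()
        if status in terminal:
            return status
        if clock() - start >= deadline_s:
            return "timeout"
        sleep(interval_s)
```

Returning "timeout" instead of raising keeps the closeout report simple: the caller can state that the run was still in flight when polling stopped.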
Magic words → canonical commands
Plan only
joelclaw workload plan "<intent>" --repo <repo> [--paths a,b,c] [--stages-from <path>] [--write-plan <path>]
Real runtime canary / dogfood
joelclaw workload run <plan-artifact> \
[--stage <stage-id>] \
[--tool pi|codex|claude] \
[--timeout <seconds>] \
[--model <model>] \
[--execution-mode auto|host|sandbox] \
[--sandbox-backend local|k8s] \
[--sandbox-mode minimal|full] \
[--repo-url <git-url>] \
[--dry-run] \
[--skip-dep-check]
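When a script drives the command above, it helps to assemble the argument list from validated options rather than by string concatenation. This Python sketch uses only the flags and choices shown in the usage text; the helper name build_run_argv and its keyword names are hypothetical.

```python
# Flags that take a value, in the order shown in the usage text above.
VALUE_FLAGS = ("--stage", "--tool", "--timeout", "--model", "--execution-mode",
               "--sandbox-backend", "--sandbox-mode", "--repo-url")

# Enumerated choices copied from the usage text.
CHOICES = {
    "--tool": {"pi", "codex", "claude"},
    "--execution-mode": {"auto", "host", "sandbox"},
    "--sandbox-backend": {"local", "k8s"},
    "--sandbox-mode": {"minimal", "full"},
}


def build_run_argv(plan_artifact, dry_run=False, skip_dep_check=False, **options):
    """Build an argv list for `joelclaw workload run` (hypothetical helper)."""
    given = {"--" + key.replace("_", "-"): value for key, value in options.items()}
    unknown = set(given) - set(VALUE_FLAGS)
    if unknown:
        raise ValueError(f"unknown option(s): {sorted(unknown)}")
    argv = ["joelclaw", "workload", "run", plan_artifact]
    for flag in VALUE_FLAGS:
        if flag not in given:
            continue
        value = str(given[flag])
        if flag in CHOICES and value not in CHOICES[flag]:
            raise ValueError(f"{flag} must be one of {sorted(CHOICES[flag])}")
        argv += [flag, value]
    if dry_run:
        argv.append("--dry-run")
    if skip_dep_check:
        argv.append("--skip-dep-check")
    return argv
```

For example, build_run_argv("plan.json", tool="pi", sandbox_mode="full", dry_run=True) yields a list that can be passed to subprocess.run without shell quoting concerns.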
Handoff, not execution
joelclaw workload dispatch <plan-artifact> \
[--stage <stage-id>] \
[--project <mail-project>] \
[--from <agent>] \
[--to <agent>] \
[--send-mail] \
[--write-dispatch <path>]
Sandbox mode guidance
Use --sandbox-mode full when the proof needs real runtime surfaces:
- service/network lifecycle
- full environment materialization
- cleanup evidence
- anything where a minimal local sandbox would hide the real failure mode
Use --sandbox-mode minimal for cheap code/doc/test slices where full runtime provisioning is overkill.
Use --stages-from when you already have a real stage DAG. The planner preserves per-stage acceptance, validates dependencies/cycles, calculates critical path metadata, and keeps the DAG instead of collapsing it into template stages.
Use --skip-dep-check only for deliberate manual recovery. Normal joelclaw workload run blocks a stage until its explicit dependencies have terminal truth.
Do not use microvm — ADR-0230 is paused. Colima nestedVirtualization is unstable. Use shell + infer for all work.
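The dependency gate that --skip-dep-check bypasses can be pictured as a small predicate. This Python sketch is an assumption-heavy illustration: the status names in TERMINAL are hypothetical, and it treats any terminal status as satisfying the gate, since the text above does not say whether a failed dependency should keep blocking.

```python
# Hypothetical status names; the rig's real vocabulary may differ.
TERMINAL = {"completed", "failed", "cancelled"}


def blocking_dependencies(stage, status_by_id, skip_dep_check=False):
    """Return the dependency ids that still block this stage.

    A dependency with no recorded status counts as blocking. With
    skip_dep_check=True the gate is bypassed entirely, which is why the
    flag is reserved for deliberate manual recovery.
    """
    if skip_dep_check:
        return []
    return [dep for dep in stage.get("dependsOn", [])
            if status_by_id.get(dep) not in TERMINAL]
```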
Current runtime truth
- joelclaw workload run is the real bridge from workload artifacts to runtime admission.
- The durable path is Redis queue admission → Restate dagOrchestrator → dagWorker.
- dagOrchestrator resolves dependency waves correctly for chained multi-stage DAGs.
- {{nodeId}} interpolation is proven for passing upstream outputs into downstream stages.
- The shell handler is the proven path for git operations (clone, commit, push) and arbitrary CLI work.
- The infer handler is a full pi agent: it reads, edits, and writes files via tools. Use it for codegen, analysis, and any task that benefits from LLM + file access.
- Chain infer (edit files) → shell (git commit + push) for autonomous coding loops.
- The microvm handler is paused (ADR-0230). Do not use.
- Completion is poll-based for now. No gateway finish event is emitted when a DAG lands.
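The {{nodeId}} interpolation mentioned above amounts to substituting a recorded upstream output wherever a placeholder names that node. A minimal Python sketch, assuming node ids are made of letters, digits, underscores, and hyphens and that outputs are plain strings; the function name and the error behaviour for a missing output are hypothetical, not the worker's documented behaviour.

```python
import re

# Matches {{nodeId}}, tolerating inner whitespace. The id character set is an assumption.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z0-9_-]+)\s*\}\}")


def interpolate(template, outputs):
    """Replace each {{nodeId}} in template with outputs[nodeId]."""
    def replace(match):
        node_id = match.group(1)
        if node_id not in outputs:
            # Failing loudly is a design choice of this sketch.
            raise KeyError(f"no output recorded for node {node_id!r}")
        return outputs[node_id]

    # A callable replacement avoids backslash-escape surprises in outputs.
    return PLACEHOLDER.sub(replace, template)
```

For instance, interpolate("Use {{plan}} as the downstream input.", {"plan": "step 1"}) returns "Use step 1 as the downstream input."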
Real chained example
This is an honest four-stage shape the rig can run today:
[
  {
    "id": "research",
    "name": "Research current state",
    "acceptance": ["Facts gathered"],
    "executionMode": "manual"
  },
  {
    "id": "plan",
    "name": "Turn research into an execution plan",
    "dependsOn": ["research"],
    "acceptance": ["Implementation plan written"],
    "executionMode": "manual"
  },
  {
    "id": "implement",
    "name": "Apply the change in the worker",
    "dependsOn": ["plan"],
    "acceptance": ["Requested files updated", "Commit pushed"],
    "executionMode": "pi",
    "notes": "Use {{plan}} as the downstream input."
  },
  {
    "id": "verify",
    "name": "Verify and summarize",
    "dependsOn": ["implement"],
    "acceptance": ["Verification captured", "Closeout ready"],
    "executionMode": "manual",
    "notes": "Use {{implement}} for verification context."
  }
]
Run it through the front door:
joelclaw workload plan "Research, plan, implement, then verify the change" \
--repo ~/Code/joelhooks/joelclaw \
--stages-from stages.json \
--write-plan plan.json
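To see how a chained DAG like the one above breaks into dependency waves, here is a small Python sketch. It is an illustration of wave grouping in general, not dagOrchestrator's implementation; the function name is hypothetical, and only the id and dependsOn fields of each stage are read.

```python
def dependency_waves(stages):
    """Group stage ids into waves; each wave depends only on earlier waves."""
    remaining = {s["id"]: set(s.get("dependsOn", [])) for s in stages}
    done, waves = set(), []
    while remaining:
        # A stage is ready once every dependency has landed in an earlier wave.
        wave = [sid for sid, deps in remaining.items() if deps <= done]
        if not wave:
            raise ValueError("cycle or unknown dependency")
        waves.append(wave)
        done.update(wave)
        for sid in wave:
            del remaining[sid]
    return waves


stages = [
    {"id": "research"},
    {"id": "plan", "dependsOn": ["research"]},
    {"id": "implement", "dependsOn": ["plan"]},
    {"id": "verify", "dependsOn": ["implement"]},
]
print(dependency_waves(stages))
# [['research'], ['plan'], ['implement'], ['verify']]
```

The four-stage example is strictly serial, so every wave holds one stage and the critical path is the whole chain. Stages that share the same dependencies would land in the same wave.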
When to reach for compatibility skills
- agent-workloads: only when an older prompt already names it; treat it as a compatibility alias
- restate-workflows: only when the work is specifically about external-repo runtime bridging or low-level substrate contracts
Dogfood posture
When proving runtime work:
- prefer a canary first
- use the real front door (joelclaw workload run), not hand-rolled system/agent.requested, unless you are debugging below the rig
- capture honest evidence from queue admission, Restate, dagWorker, and resulting git/verification artifacts
- poll the run yourself; there is no completion event to the gateway yet
- inside a sandboxed stage run, do not launch another workflow-rig canary
- if the rig is broken, say the rig is broken; do not blame sandboxes or the gateway for a queue/worker failure
- if the task needs large-file agent edits, budget minutes, not seconds
Rules
- do not invent new workload vocabulary when docs/workloads.md already defines it
- do not force the operator to choose queue vs Restate vs sandbox when joelclaw workload run is the right bridge
- do not use microvm (ADR-0230 paused)
- do not claim a dogfood proof succeeded unless the real runtime path moved and produced evidence
- do not imply automatic completion notifications exist when they do not