# Workflow Builder 🏗️
The meta-skill for designing and building autonomous OpenClaw workflows. A workflow (steward) is an autonomous agent that runs on a schedule, maintains state, learns over time, and does real work without prompting.
Skills vs Workflows:
- Skill = single-purpose tool (how to use a CLI, API, or pattern)
- Workflow = autonomous agent with state, learning, and scheduling
## Part 1: Should You Automate This?
Not everything deserves a workflow. Use this framework to decide.
### The Automation Audit
For any candidate task, score these dimensions:
| Dimension | Question | Score |
|---|---|---|
| Frequency | How often? (daily=3, weekly=2, monthly=1, rare=0) | 0-3 |
| Repetitiveness | Same steps every time? (always=3, mostly=2, sometimes=1, never=0) | 0-3 |
| Judgment Required | Needs creative thinking? (none=3, low=2, medium=1, high=0) | 0-3 |
| Time Cost | Minutes per occurrence × frequency per month / 60 = hours/month | raw |
| Safety | How safe to automate? (harmless if wrong=3, annoying=2, costly=1, dangerous=0) | 0-3 |
Decision:
- Score ≥ 10 + Time Cost > 2 hrs/month → Build a workflow
- Score 7-9 → Add to heartbeat checklist (batch with other checks)
- Score < 7 → Keep manual or add as a cron one-liner
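The decision rules above can be sketched as a small function. The dimension scores and thresholds come from the tables in this section; the function name is an assumption, and scores ≥ 10 with low time cost fall through to the heartbeat bucket here (the text leaves that case open).

```python
def automation_decision(frequency, repetitiveness, judgment, safety,
                        time_cost_hours_per_month):
    # Each dimension is scored 0-3 per the Automation Audit table
    score = frequency + repetitiveness + judgment + safety
    if score >= 10 and time_cost_hours_per_month > 2:
        return "workflow"
    if score >= 7:
        return "heartbeat"
    return "manual-or-cron"
```

For example, a daily, fully repetitive, zero-judgment, harmless task (3+3+3+3 = 12) costing 5 hrs/month maps to "workflow".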
### ROI Calculator
```
Setup Time (hours) × $50 = Setup Cost
Time Saved (hours/month) × $50 = Monthly Value
Payback = Setup Cost / Monthly Value

< 2 months payback → Build it now
2-6 months         → Build when you have time
> 6 months         → Probably not worth it
```
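The same arithmetic as a function, for quick what-if checks; the $50/hr rate is the document's example figure, not a fixed constant.

```python
def payback_months(setup_hours, saved_hours_per_month, rate=50.0):
    setup_cost = setup_hours * rate           # Setup Time × rate
    monthly_value = saved_hours_per_month * rate  # Time Saved × rate
    return setup_cost / monthly_value         # Payback in months

# 8 hours to build, saves 5 hours/month -> pays back in under 2 months
print(payback_months(8, 5))  # 1.6
```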
### Workflow vs Heartbeat vs Cron
| Approach | When to Use |
|---|---|
| Workflow (steward) | Needs state, learning, rules, multi-step processing |
| Heartbeat item | Quick check, batch with others, context-aware |
| Cron (isolated) | Exact timing, standalone, different model |
| Cron (main) | One-shot reminder, system event injection |
Rule of thumb: If it needs `rules.md` and `agent_notes.md`, it's a workflow. If it's a 2-line check, add it to `HEARTBEAT.md`.
## Part 2: Workflow Anatomy
Every workflow follows this structure:
```
workflows/<name>/
├── AGENT.md         # The algorithm (updates with openclaw-config)
├── rules.md         # User preferences (never overwritten by updates)
├── agent_notes.md   # Learned patterns (grows over time, optional for some types)
├── state/           # Continuation state for multi-step work (optional)
│   └── active-work.json
└── logs/            # Execution history (auto-pruned)
    └── YYYY-MM-DD.md
```
### AGENT.md – The Algorithm
This is the workflow's brain. It ships with openclaw-config and can be updated.
Standard sections (adapt to your workflow – not all are required):

```markdown
---
name: <workflow-name>
version: <semver>
description: <one-line description>
---
# <Workflow Name>

<One paragraph: what this workflow does and why it exists.>

## Prerequisites
<What tools/access/labels/setup are needed before first run.>

## First Run – Setup Interview
<Interactive setup that creates rules.md. Ask preferences, scan existing data, suggest smart defaults. Always let the user skip/bail early.>

## Regular Operation
<The main loop: what to read, how to process, when to alert, what to log.>

## Housekeeping
<Daily/weekly maintenance: log pruning, data cleanup, self-audit.>
```
### rules.md – User Preferences
Created during first-run setup interview. Never overwritten by updates.
Pattern:
```markdown
# <Workflow> Rules

## Account
- account: user@example.com
- alert_channel: whatsapp (or: none, telegram, slack)

## Preferences
- <workflow-specific settings>

## VIPs / Exceptions
- <people or patterns to handle specially>
```
### agent_notes.md – Learned Patterns
The workflow writes here as it learns. Accumulates over time.
Pattern:
```markdown
# Agent Notes

## Patterns Observed
- <sender X always sends receipts on Fridays>
- <task type Y usually takes 2 hours>

## Mistakes Made
- <once archived an important email → now check for X before archiving>

## Optimizations
- <batch processing senders A, B, C saves 3 API calls>
```
### logs/ – Execution History
One file per day, auto-pruned after 30 days.
Pattern:
```markdown
# <Workflow> Log – YYYY-MM-DD

## Run: HH:MM
- Processed: N items
- Actions: archived X, deleted Y, alerted on Z
- Errors: none
- Duration: ~Ns
```
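The 30-day auto-pruning could look like this minimal sketch, assuming log filenames follow the YYYY-MM-DD.md pattern above; the function name is ours.

```python
from datetime import date, timedelta
from pathlib import Path
from typing import Optional

def prune_logs(log_dir: Path, keep_days: int = 30,
               today: Optional[date] = None) -> int:
    """Delete dated log files older than keep_days; return how many."""
    today = today or date.today()
    cutoff = today - timedelta(days=keep_days)
    removed = 0
    for f in log_dir.glob("*.md"):
        try:
            file_date = date.fromisoformat(f.stem)  # "2026-01-31.md" -> date
        except ValueError:
            continue  # not a dated log file; leave it alone
        if file_date < cutoff:
            f.unlink()
            removed += 1
    return removed
```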
## Part 3: Design Patterns
### Pattern 1: Setup Interview
Every workflow should start with an interactive setup that creates rules.md.
Best practices:
- Check prerequisites first (API access, labels, etc.)
- Ask questions one category at a time
- Offer smart defaults based on scanning existing data
- Let the user skip or bail early ("looks good, skip to the end")
- Summarize rules in plain language before saving
- Always include an escape hatch: `alert_channel: none`
### Pattern 2: Graduated Trust
Start conservative, get more aggressive as confidence grows.
Week 1: Only act on obvious items (>95% confidence)
Week 2: Expand to likely items (>85% confidence), log edge cases
Week 3: Review agent_notes.md, adjust thresholds
Week 4+: Stable operation with periodic self-audit
Write confidence thresholds to rules.md so the user can tune them.
### Pattern 3: Sub-Agent Orchestration
Match intelligence to task complexity, and always use sub-agents for loops.
**Rule: Never Loop Over Collections in the Orchestrator**
Any time you iterate over a list (contacts, emails, tasks, records), spawn a sub-agent per item. This preserves the parent context for coordination and prevents pollution.
Pattern:
```
Orchestrator (parent):
  1. Fetch the list (from API, file, database)
  2. Query tracking state to filter already-processed items
  3. FOR EACH new item: Spawn a sub-agent with that item's details
  4. Sub-agent processes one item, returns structured result
  5. Parent collects results, updates tracking state, alerts if needed

Sub-agent:
  - Receives: One item + context needed for that item
  - Does: All the reasoning, decision-making, work
  - Returns: Structured summary (status, action taken, errors, alerts)
  - Never accesses parent's full context
```
Why: Each sub-agent gets a fresh context window. Parent stays clean for orchestration logic. No pollution from per-item reasoning.
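Stripped of any particular runtime, the orchestrator loop above has this shape. `fetch_items`, `already_processed`, and `spawn_subagent` are hypothetical stand-ins for your workflow's API call, tracking-state query, and sub-agent spawn.

```python
def run_orchestrator(fetch_items, already_processed, spawn_subagent):
    results = []
    for item in fetch_items():              # 1. fetch the list
        if already_processed(item["id"]):   # 2. filter via tracking state
            continue
        result = spawn_subagent(item)       # 3-4. one sub-agent per item
        results.append(result)              # 5. collect structured results
    return results
```

The orchestrator never reasons about item contents itself; all per-item thinking happens inside `spawn_subagent`.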
#### Model Selection: Check-Work Tiering for High-Frequency Jobs
For jobs running every few minutes (e.g., every 5 min, every 15 min):
Two-stage pattern:
Stage 1 (Cheap): Use Haiku to ask "Is there any work to do?"
- Cheap to run often
- Quick predicate check (yes/no)
- Examples: "Any new emails?", "Any cron job failures?", "Any security alerts?"
Stage 2 (Expensive): If yes, spawn Opus/Sonnet to do the actual work
- Only spawned when there's real work
- Has full context for reasoning/decisions
- Saves tokens on empty runs
Example:

```
Cron job runs every 5 minutes:
1. Haiku runs: "Are there any unprocessed emails in my inbox?"
   → Returns boolean (with brief explanation)
2. If yes: Spawn Sonnet to "Process and categorize these 3 emails"
   → Does the actual work
3. If no: Skip expensive processing, return early
   → Save ~90% tokens on empty runs
```
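The two-stage dispatch above can be sketched as a function; `ask_model` is a hypothetical stand-in for however your runtime invokes a model by alias.

```python
def tiered_run(ask_model):
    # Stage 1: cheap predicate check on the small model
    has_work = ask_model("haiku",
                         "Are there any unprocessed emails? Answer yes/no.")
    if not has_work.strip().lower().startswith("yes"):
        return "skipped"  # empty run: the expensive model is never spawned
    # Stage 2: only now pay for the capable model
    return ask_model("sonnet",
                     "Process and categorize the unprocessed emails.")
```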
Model selection for different complexities:
- High-frequency checks (every 5-15 min) → Haiku to check, Sonnet/Opus to act
- Obvious/routine items → Spawn a sub-agent (cheaper model: Sonnet)
- Important/nuanced items → Handle yourself or spawn a powerful sub-agent (Opus)
- Quality verification → Can use a strong model as QA reviewer (Opus as sub-agent)
- Uncertain items → Sub-agents escalate to you rather than guessing
Note: Don't hardcode model IDs (they go stale fast). Use aliases like `sonnet`, `opus`, `haiku` or reference the model by capability level.
### Pattern 4: State Externalization – Contextual State vs Tracking State
Critical: Chat history is a cache, not the source of truth. After every meaningful step, write state to disk. But distinguish between two types:
#### 4a. Contextual State (Markdown only)
- What: Information the agent reasons about or learns over time
- Examples: `agent_notes.md`, `rules.md`, daily logs, decision summaries
- Format: Markdown. Always human-readable
- Why markdown: These belong in context so the agent can reason about them
```markdown
# agent_notes.md

## Patterns Observed
- Contact X always sends updates on Tuesdays
- Task type Y typically needs 2-hour blocks

## Mistakes Made
- Once skipped important sender → now review sender importance before filtering
```
#### 4b. Tracking State (SQLite only)
- What: Deduplication, "have I seen this?", processed IDs, state queries
- Examples: `processed.db` with tables for seen IDs, statuses, timestamps
- Format: SQLite database with structured queries
- Why SQLite: The agent doesn't reason about this – it only queries it. SQLite gives O(1) lookups without loading the entire history into context
⚠️ NEVER use JSON for state files. You are an LLM, not a JSON parser. JSON is useful for API responses and tool output flags, but state files should be markdown (human-readable) or SQLite (queryable). JSON state files create noise, parsing errors, and waste context on structure rather than content.
The workflow's db-setup.md defines the specific schema. The calling LLM writes the SQL – don't over-prescribe queries in AGENT.md. Just describe what should happen (e.g., "check if already processed", "mark as classified", "clean up entries older than 90 days") and let the LLM write the appropriate queries.
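For illustration, these are the kinds of targeted queries the calling LLM might write for "check if already processed" and "mark as classified". The table name and columns here are an example schema, not one this document prescribes.

```python
import sqlite3

con = sqlite3.connect(":memory:")  # a real workflow would open its processed.db

# Example tracking-state table: one row per seen item
con.execute("""CREATE TABLE IF NOT EXISTS processed (
    id TEXT PRIMARY KEY,
    status TEXT,
    last_checked TEXT DEFAULT CURRENT_TIMESTAMP
)""")

def is_processed(item_id: str) -> bool:
    # "check if already processed" – an O(1) indexed lookup, no history in context
    row = con.execute("SELECT 1 FROM processed WHERE id = ?",
                      (item_id,)).fetchone()
    return row is not None

def mark(item_id: str, status: str) -> None:
    # "mark as classified" – upsert the item's status
    con.execute("INSERT INTO processed (id, status) VALUES (?, ?) "
                "ON CONFLICT(id) DO UPDATE SET status = excluded.status",
                (item_id, status))
```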
#### Schema Versioning & Migration
Every workflow that uses SQLite should track schema versions using SQLite's built-in `PRAGMA user_version` (an integer stored in the database header – no extra tables):
- Put the schema inline in AGENT.md – the LLM needs it to write queries anyway
- Declare the expected version (e.g., `PRAGMA user_version: 1`)
- Each run checks `PRAGMA user_version`:
  - Matches → proceed
  - Lower or missing → create tables / apply migrations / set `user_version`
- If legacy state files exist (e.g., `processed.md`), migrate entries and archive
See workflows/contact-steward/AGENT.md for a reference implementation.
Rule in AGENT.md: "On every run, read contextual state first (`agent_notes.md`, `rules.md`). Query tracking state via SQLite – one version check, then targeted queries. After processing, update both as needed. Never load tracking history into context."
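A sketch of that version check under the assumptions above (expected version 1, an example `processed` table); a real workflow would apply its own migrations here.

```python
import sqlite3

EXPECTED_VERSION = 1

def ensure_schema(con: sqlite3.Connection) -> int:
    (version,) = con.execute("PRAGMA user_version").fetchone()
    if version == EXPECTED_VERSION:
        return version  # matches -> proceed
    if version < EXPECTED_VERSION:
        # lower or missing (a fresh db reports 0) -> create tables / migrate
        con.execute("CREATE TABLE IF NOT EXISTS processed "
                    "(id TEXT PRIMARY KEY, status TEXT)")
        # PRAGMA values can't be bound as parameters; the constant is an int
        con.execute(f"PRAGMA user_version = {EXPECTED_VERSION}")
    return EXPECTED_VERSION
```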
### Pattern 5: Error Handling & Alerting
Every workflow must handle failures gracefully:
- Log errors to the daily log with full context
- Alert on critical failures (unless `alert_channel: none`)
- Never fail silently – if something breaks, the human should know
- Quarantine, don't destroy – use labels/tags, not deletion
- Route all errors to one place – consistent error channel
### Pattern 6: Integration Points
Workflows should declare how they connect to other workflows:
```markdown
## Integration Points

### Receives From
- email-steward: Emails needing follow-up → creates task

### Sends To
- task-steward: Creates tasks when work is discovered
- message channel: Alerts when human attention needed

### Shared State
- None (or: reads from workflows/shared/contacts.md)
```
## Part 4: Scheduling & Execution
### How Workflows Run
Workflows are triggered by cron jobs (isolated sessions):
```shell
# Example: email steward runs every 30 minutes during business hours
openclaw cron add \
  --name "Email Steward" \
  --cron "*/30 8-22 * * *" \
  --tz "YOUR_TIMEZONE" \
  --session isolated \
  --message "Run email steward workflow. Read workflows/email-steward/AGENT.md and follow it." \
  --model sonnet \
  --announce
```
### Cron Configuration Guidelines
| Workflow Type | Schedule | Model Pattern | Session |
|---|---|---|---|
| High-frequency checks (every 5-15 min) | Every 5-15 min | Haiku (check) → Sonnet (act) | Isolated |
| High-frequency triage (email, notifications) | Every 15-30 min | Sonnet | Isolated |
| Daily reports/summaries | Once daily at fixed time | Opus | Isolated |
| Weekly reviews/audits | Weekly cron | Opus + thinking | Isolated |
| Reactive (triggered by events) | Via webhook or system event | Varies | Isolated |
Note on Check-Work Tiering:
- If a job runs multiple times per hour, use the two-stage pattern: cheap check (Haiku) → expensive work (Sonnet/Opus)
- This cuts token costs on empty runs (when there's no work to do)
- Example: "Email arrived?" (Haiku) → "Process these 5 emails" (Sonnet) only if yes
- Apply to: health checks, inbox scans, notification monitors, cron job monitors
### Delivery
- Routine runs: Omit `--announce` (or set delivery to `none`) – work silently, only alert when something needs attention
- Reports/summaries: Use `--announce` – delivers a summary to the configured channel after completion
- Errors/alerts: Always deliver via the workflow's configured alert channel
Note: Isolated cron jobs default to announce delivery (summary posted after run). Set `delivery: none` explicitly if you want silent operation.
## Part 5: Building a New Workflow
### Step-by-Step Process
1. Identify the opportunity (use the Automation Audit above)
2. Define the scope – What does "done" look like for one run?
3. List prerequisites – What tools, access, labels are needed?
4. Design the setup interview – What preferences does the user need to set?
5. Write AGENT.md – The algorithm, following the anatomy above
6. Test manually – Run the AGENT.md instructions yourself first
7. Set up cron – Schedule for autonomous operation
8. Monitor first week – Watch logs, tune rules, build agent_notes
### AGENT.md Template
```markdown
---
name: <name>-steward
version: 0.1.0
description: <one-line description>
---
# <Name> Steward

<What this workflow does and why.>

## Prerequisites
- **<Tool>** configured with <access>
- **<Labels/tags>** created: <list>
- **Alert channel** configured (or none)

## First Run – Setup Interview
If `rules.md` doesn't exist or is empty:

### 0. Prerequisites Check
<Verify all tools and access work.>

### 1. Basics
<Core configuration questions.>

### 2. Preferences
<How aggressive, what to touch, what to skip.>

### 3. Data Scan (Optional)
<Offer to scan existing data and suggest rules.>

### 4. Alert Preferences
<What triggers alerts vs silent processing.>

### 5. Confirm & Save
<Summarize in plain language, save rules.md.>

## Database (only if this workflow tracks processed items)
**PRAGMA user_version: 1**

<Schema definition inline – CREATE TABLE, indexes, column descriptions.>
<Setup & migration instructions – what to do if the database is missing, the version is lower, or legacy state files exist.>

## Regular Operation

### Your Tools
<List all tools/commands the workflow uses.>

### Each Run
1. Read `rules.md` for preferences
2. Read `agent_notes.md` for learned patterns (if exists)
3. Ensure database is ready (see Database section – one quick version check)
4. <Scan/fetch new items>
5. Query `processed.db` to filter items already handled
6. FOR EACH new item: Spawn a sub-agent to process it (see Sub-Agent Orchestration)
7. After each item, update `processed.db` with status
8. Collect sub-agent results
9. Alert if anything needs attention
10. Append to today's log in `logs/`
11. Update `agent_notes.md` if you learned something new about patterns/mistakes

### Judgment Guidelines
<When to act vs leave alone. Confidence thresholds.>

## Housekeeping
- Delete logs older than 30 days
- <Any other periodic cleanup>

## Integration Points
<How this connects to other workflows.>
```
### Checklist Before Deploying
- AGENT.md follows the standard anatomy
- Setup interview creates rules.md with all needed preferences
- Has clear judgment guidelines (when to act vs leave alone)
- Error handling: logs errors, alerts on critical failures
- Tracking state: if the workflow queries "have I seen this?", it uses `processed.db` (SQLite), not markdown lists
- Sub-agents: any loop over a collection spawns sub-agents per item, not in the orchestrator
- Contextual state: `agent_notes.md` and `rules.md` are markdown, not JSON
- Housekeeping: auto-prunes old logs and cleans up stale tracking entries (e.g., `DELETE FROM processed WHERE last_checked < ...`)
- Integration points documented
- Cron job configured with appropriate schedule/model
- First-week monitoring plan in place
## Part 6: Maintaining Workflows
### Monthly Audit (15 min per workflow)
For each active workflow:
- Review logs – Any recurring errors? Silent failures?
- Check agent_notes.md – Has it learned useful patterns?
- Review rules.md – Still accurate? Preferences changed?
- ROI check – Still saving time? Worth the token cost?
- Integration health – Connected workflows still working?
### When to Retire a Workflow
- ROI drops below 1x (costs more than it saves)
- The underlying process changed significantly
- A better approach exists (new tool, API, or workflow)
- It causes more problems than it solves
To retire: disable the cron job, archive the workflow directory, note in `memory/decisions/`.
## Part 7: Security Considerations
### For Workflows from ClawHub
⚠️ ClawHub has had malicious skills. Before installing any workflow:
- Inspect before installing: `npx clawhub inspect <slug> --files`
- Check for VirusTotal flags: ClawHub scans automatically; heed warnings
- Download to /tmp for review: `npx clawhub install <slug> --dir /tmp/review`
- Review all files manually – look for:
  - External API calls to unknown domains
  - Eval/exec of dynamic code
  - Hardcoded API keys or crypto addresses
  - Instructions to disable safety features
  - Data exfiltration patterns (sending data to external services)
- Never install directly into your workspace without review
### For Your Own Workflows
- Workflows should only access tools they need
- Alert channels should be explicit (no silent external sends)
- Quarantine before delete (labels > trash > permanent deletion)
- Log all actions for auditability
## Existing Workflows Reference
### email-steward
- Purpose: Inbox debris removal
- Schedule: Configured via cron (typically every 30 min during business hours)
- Tools: gog CLI (Gmail)
- Key pattern: Setup interview → graduated trust → sub-agent delegation
- Notable: Uses `agent_notes.md` heavily for learning sender patterns
### task-steward
- Purpose: Task board management with QA verification
- Schedule: Can run via heartbeat or cron (see its AGENT.md for guidance)
- Tools: Asana MCP
- Key pattern: Task classification → work execution → quality gate (Opus QA) → delivery
- Notable: Spawns Opus as QA sub-agent – demonstrates a strong model as worker, not just orchestrator