# Treasure Workflow (Digdag)

## Basic Structure
```yaml
timezone: Asia/Tokyo

schedule:
  daily>: 02:00:00

_export:
  td:
    database: my_database
    engine: presto

+extract:
  td>: queries/extract.sql
  create_table: raw_data

+transform:
  td>: queries/transform.sql
  create_table: results
```
Key points:

- `.dig` extension required; the filename becomes the workflow name
- Tasks run sequentially with the `+task_name:` prefix
- `foo>: bar` is sugar for `_type: foo` and `_command: bar`
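For example, these two tasks are equivalent (a sketch using the built-in `echo>` operator):

```yaml
+short:
  echo>: hello        # operator sugar

+long:
  _type: echo         # expanded form of the line above
  _command: hello
```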
## Session Variables

| Variable | Example |
|---|---|
| `${session_time}` | `2024-01-30T00:00:00+09:00` |
| `${session_date}` | `2024-01-30` |
| `${session_date_compact}` | `20240130` |
| `${session_unixtime}` | `1706540400` |
| `${last_session_date}` | Previous scheduled date |
| `${next_session_date}` | Next scheduled date |
Moment.js is available for date arithmetic:

```yaml
+tomorrow:
  echo>: ${moment(session_time).add(1, 'days').format("YYYY-MM-DD")}
```
## TD Operator

```yaml
+query:
  td>: queries/analysis.sql
  database: analytics
  engine: presto
  create_table: results   # or insert_into: existing_table
```
Inline SQL:

```yaml
+inline:
  td>:
  query: |
    SELECT * FROM events
    WHERE TD_TIME_RANGE(time, '${session_date}', TD_TIME_ADD('${session_date}', '1d'))
```
## Parallel Execution

```yaml
+parallel_tasks:
  _parallel: true
  +task_a:
    td>: queries/a.sql
  +task_b:
    td>: queries/b.sql

+after_parallel:
  echo>: "Runs after all parallel tasks"
```
Limited concurrency:

```yaml
+limited:
  _parallel:
    limit: 2
```
## Error Handling

```yaml
+task:
  td>: queries/important.sql
  _retry: 3

_error:
  +alert:
    sh>: python scripts/alert.py "Task failed"
```
Retry with backoff:

```yaml
+task:
  _retry:
    limit: 3
    interval: 10                # seconds
    interval_type: exponential  # doubles each retry (10s, 20s, 40s); or constant
```
## Variables

```yaml
_export:
  td:
    database: production
  my_param: value
  api_key: ${secret:api_credentials.key}  # TD parameter store

+task:
  py>: scripts.process.main
  param: ${my_param}
```
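The `py>` operator imports the named module and calls the function, passing task parameters whose names match the function's arguments. A minimal sketch of what `scripts/process.py` might contain (the file contents and logic here are illustrative, not from the source):

```python
# Hypothetical scripts/process.py for the py> task above.
# digdag calls main() and supplies matching task parameters
# (here `param`) as keyword arguments.
def main(param):
    # Process the value passed from the workflow definition
    result = f"processed:{param}"
    print(result)
    return result


if __name__ == "__main__":
    main("value")
```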
## Conditional & Loops

```yaml
+check:
  td>: queries/count.sql
  store_last_results: true

+if_data:
  if>: ${td.last_results.cnt > 0}
  _do:
    +process:
      td>: queries/process.sql

+loop:
  for_each>:
    region: [US, EU, ASIA]
  _do:
    +process:
      td>: queries/by_region.sql
```
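Variables defined by `for_each>` are substituted inside the `_do` block, including `${...}` references within the SQL itself. A sketch using an inline query (the table and column names are illustrative):

```yaml
+loop:
  for_each>:
    region: [US, EU, ASIA]
  _do:
    +process:
      td>:
      query: |
        -- runs once per region; ${region} expands to US, EU, ASIA in turn
        SELECT COUNT(1) AS cnt
        FROM events
        WHERE region = '${region}'
```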
## Event Triggers

Runs after another workflow succeeds:

```yaml
trigger:
  attempt>:
    dependent_workflow_name: segment_refresh
    dependent_project_name: customer_segments

+activate:
  td>: queries/activate.sql
```
## tdx wf Commands

For the full CLI reference, see tdx-skills/workflow. Key commands:

```shell
tdx wf pull my_project               # Pull project to local folder
tdx wf push                          # Push local changes with diff preview
tdx wf run my_project.my_workflow    # Run a specific workflow
tdx wf sessions --status error       # Find failed sessions
tdx wf timeline my_project.workflow  # Visual task execution timeline
tdx wf attempt <id> logs +task_name  # View task logs
```
## Project Structure

```
workflows/
└── my_project/          # Created by tdx wf pull
    ├── tdx.json         # Sync tracking (auto-generated)
    ├── main.dig         # Workflow definition
    ├── queries/
    │   └── analysis.sql
    └── scripts/
        └── process.py
```
## Schedule Options

```yaml
schedule:
  daily>: 02:00:00
  # hourly>: 00:00
  # cron>: "0 */4 * * *"
  # weekly>: "Mon,00:00:00"
```
## Resources

More from treasure-data/td-skills:

- **pytd**: Expert assistance for using pytd (Python SDK) to query and import data with Treasure Data. Use this skill when users need help with Python-based data analysis, querying Presto/Hive, importing pandas DataFrames, bulk data uploads, or integrating TD with Python analytical workflows.
- **tdx-basic**: Executes tdx CLI commands for Treasure Data. Covers `tdx databases`, `tdx tables`, `tdx describe`, `tdx query`, `tdx auth setup`, context management with profiles/sessions, and output formats (JSON/TSV/table). Use when users need tdx command syntax, authentication setup, database/table exploration, schema inspection, or query execution.
- **workflow**: Manages TD workflows using `tdx wf` commands. Covers project sync (pull/push/clone), running workflows, monitoring sessions/attempts, task timeline visualization, retry/kill operations, and secrets management. Use when users need to manage, monitor, or debug Treasure Workflow projects via tdx CLI.
- **journey**: Load when the client wants to create, edit, or manage a CDP customer journey. Use for building journey YAML with segments, activations, and stage steps, modifying journey stages or flow logic (decision points, condition waits, A/B tests), or pushing journey changes to Treasure Data. Also load when the client wants to analyze journey performance, query journey tables, create journey dashboards, or generate journey action reports.
- **parent-segment-analysis**: Query and analyze CDP parent segment database data. Use `tdx ps desc -o` to get the output database schema, then query the customers and behavior tables. Use when exploring parent segment data, building reports, or analyzing customer attributes and behaviors.
- **connector-config**: Writes connector_config for segment/journey activations using `tdx connection schema <type>` to discover available fields. Use when configuring activations; always run the schema command first to see connector-specific fields.