Airflow
Apache Airflow is a widely used open-source orchestrator for data engineering pipelines. Version 3.0 (2025) introduces event-driven triggers and a new React-based UI.
When to Use
- ETL/ELT: Scheduling nightly data warehouse loads.
- MLOps: Retraining models when new data arrives.
- Dependency Management: "Run Task B only if Task A succeeds".
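The dependency rule in the last bullet can be illustrated without Airflow at all. The following is a minimal plain-Python sketch, not Airflow's scheduler: `run_pipeline`, `ok`, and `boom` are hypothetical names introduced here, and the state strings simply mirror Airflow's `success`, `failed`, and `upstream_failed` task states.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_pipeline(tasks, upstream):
    """Run callables in dependency order.

    tasks:    maps a task name to a zero-argument callable.
    upstream: maps a task name to the set of task names it depends on.
    A task runs only if every upstream task finished with "success".
    """
    states = {}
    for name in TopologicalSorter(upstream).static_order():
        parents = upstream.get(name, ())
        if any(states[parent] != "success" for parent in parents):
            states[name] = "upstream_failed"  # skipped, never executed
            continue
        try:
            tasks[name]()
            states[name] = "success"
        except Exception:
            states[name] = "failed"
    return states


def ok():
    pass


def boom():
    raise RuntimeError("task A failed")


upstream = {"task_b": {"task_a"}}  # task_b depends on task_a

print(run_pipeline({"task_a": ok, "task_b": ok}, upstream))
# {'task_a': 'success', 'task_b': 'success'}
print(run_pipeline({"task_a": boom, "task_b": ok}, upstream))
# {'task_a': 'failed', 'task_b': 'upstream_failed'}
```

In a real deployment the scheduler and executor do this work; the sketch only shows the ordering and skip-on-failure behaviour that a dependency between two tasks expresses.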
Core Concepts
DAGs (Directed Acyclic Graphs)
Workflows are defined in Python as DAGs: tasks are the nodes, dependencies are the edges, and cycles are not allowed.
Task SDK
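New in v3.0. Decouples task code from Airflow internals and is designed to allow tasks to be written in languages other than Python; the SDK that ships with 3.0 is the Python one.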
Edge Executor
Runs tasks on remote edge devices outside the central Airflow deployment.
Best Practices (2025)
Do:
- Use the TaskFlow API: @task decorators are cleaner than PythonOperator.
- Use Datasets: Define data-aware scheduling (schedule=[Dataset("s3://bucket/file")]).
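Both recommendations can be combined in one DAG file. The fragment below is a sketch that needs an Airflow installation to load, and it assumes the Airflow 2.4+ import paths; in 3.0 the same names are exposed from `airflow.sdk`, where `Dataset` is renamed `Asset`. The DAG and task names are hypothetical, and the URI is the placeholder from the bullet above.

```python
from datetime import datetime

# Assumed import paths for Airflow 2.4+.
# On 3.0 the equivalent is: from airflow.sdk import Asset, dag, task
from airflow.datasets import Dataset
from airflow.decorators import dag, task

# Placeholder URI taken from the bullet above.
orders_file = Dataset("s3://bucket/file")


@dag(schedule="@daily", start_date=datetime(2025, 1, 1), catchup=False)
def produce_orders():
    # Declaring the dataset as an outlet marks it as updated
    # whenever this task succeeds.
    @task(outlets=[orders_file])
    def write_file():
        ...  # write the file here

    write_file()


@dag(schedule=[orders_file], start_date=datetime(2025, 1, 1), catchup=False)
def consume_orders():
    # This DAG has no time-based schedule: it runs after the dataset is updated.
    @task
    def extract() -> list[int]:
        return [1, 2, 3]

    @task
    def total(values: list[int]) -> int:
        return sum(values)

    # Passing one task's return value to another creates the dependency.
    total(extract())


produce_orders()
consume_orders()
```

With TaskFlow, the dependency between `extract` and `total` comes from the function call itself, so there is no separate operator wiring to keep in sync.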
Don't:
- Don't put top-level code in DAG files: It runs every time the file is parsed, which happens repeatedly, not only when the DAG runs.
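The cost of top-level code can be seen in a plain-Python simulation. This is only a sketch: `DAG_FILE_SOURCE`, `parse_dag_file`, and `load_rows` are hypothetical, and `exec` stands in for Airflow loading the file. The point is that loading a Python file executes its module-level statements, while function bodies wait until they are called.

```python
# Hypothetical DAG file contents, kept as a string so it can be "parsed" repeatedly.
DAG_FILE_SOURCE = """
parse_log.append("expensive top-level call")  # module level: runs on every parse

def load_rows():
    # Inside the task callable: runs only when the task itself executes.
    run_log.append("expensive call inside task")
"""


def parse_dag_file(parse_log, run_log):
    """Simulate one parse of the DAG file by executing its module-level code."""
    namespace = {"parse_log": parse_log, "run_log": run_log}
    exec(DAG_FILE_SOURCE, namespace)
    return namespace


parse_log, run_log = [], []

for _ in range(3):  # three parsing passes, no task executed yet
    module = parse_dag_file(parse_log, run_log)

module["load_rows"]()  # a single task execution

print(len(parse_log))  # 3
print(len(run_log))    # 1
```

Moving database queries, API calls, and heavy imports inside the task callable keeps each parse cheap.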
Repository: g1joshi/agent-skills (7 GitHub stars; first seen Feb 10, 2026)