Data Engineer
Purpose
Provides data engineering expertise for building scalable data pipelines, ETL/ELT workflows, data lakes, and data warehouses. Specializes in distributed data processing, stream processing, data quality, and modern data stack technologies (Airflow, dbt, Spark, Kafka), with a focus on reliability and cost optimization.
When to Use
- Designing end-to-end data pipelines from source to consumption layer
- Implementing ETL/ELT workflows with error handling and data quality checks
- Building data lakes or data warehouses optimized for storage efficiency and query performance
- Setting up real-time stream processing (Kafka, Flink, Kinesis)
- Optimizing data infrastructure costs (storage tiering, compute efficiency)
- Implementing data governance and compliance (GDPR, data lineage)
- Migrating legacy data systems to modern data platforms
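As a rough illustration of the "ETL/ELT workflows with error handling and data quality checks" item above, the following is a minimal sketch in plain Python, not a prescribed implementation. The field names (`order_id`, `amount`, `currency`), the `PipelineResult` container, and the two quality rules are all hypothetical, chosen only to show the pattern of validating each row, transforming the valid ones, and quarantining the rest with a reason instead of failing the whole run.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, List, Optional, Tuple

Row = Dict[str, Any]


@dataclass
class PipelineResult:
    """Hypothetical container: rows that were loaded and rows that were quarantined."""
    loaded: List[Row] = field(default_factory=list)
    rejected: List[Tuple[Row, str]] = field(default_factory=list)


def check_row(row: Row) -> Optional[str]:
    """Return a reason string if the row fails a quality check, else None."""
    if row.get("order_id") is None:
        return "missing order_id"
    amount = row.get("amount")
    if not isinstance(amount, (int, float)) or amount < 0:
        return "amount must be a non-negative number"
    return None


def transform(row: Row) -> Row:
    """Convert the amount to integer cents and upper-case the currency code."""
    return {
        "order_id": row["order_id"],
        "amount_cents": round(row["amount"] * 100),
        "currency": str(row.get("currency", "USD")).upper(),
    }


def run_pipeline(source: Iterable[Row]) -> PipelineResult:
    """Validate, transform, and collect rows; bad rows are quarantined, not dropped."""
    result = PipelineResult()
    for row in source:
        reason = check_row(row)
        if reason is not None:
            result.rejected.append((row, reason))
            continue
        try:
            result.loaded.append(transform(row))
        except Exception as exc:  # keep the run alive and record the failure
            result.rejected.append((row, f"transform failed: {exc}"))
    return result


if __name__ == "__main__":
    rows = [
        {"order_id": 1, "amount": 12.5, "currency": "eur"},
        {"order_id": None, "amount": 3},
        {"order_id": 2, "amount": -1},
    ]
    outcome = run_pipeline(rows)
    print(len(outcome.loaded), "loaded;", len(outcome.rejected), "rejected")
```

In a real pipeline the same shape would typically map onto an orchestrator task and a quarantine (dead-letter) table, so that rejected rows can be inspected and replayed later.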