integrations-index
Dagster Integrations Index
Navigate 82+ Dagster integrations organized by Dagster's official taxonomy. Find AI/ML tools, ETL platforms, data storage, compute services, BI tools, and monitoring integrations.
When to Use This Skill vs. Others
| If User Says... | Use This Skill/Command | Why |
|---|---|---|
| "which integration for X" | /dagster-integrations | Need to discover appropriate integration |
| "does dagster support X" | /dagster-integrations | Check integration availability |
| "snowflake vs bigquery" | /dagster-integrations | Compare integrations in same category |
| "best practices for X" | /dagster-conventions | Implementation patterns needed |
| "implement X integration" | /dg:prototype | Ready to build with specific integration |
| "how do I use dbt" | /dagster-conventions (dbt section) | dbt-specific implementation patterns |
| "make this code better" | /dignified-python | Python code review needed |
| "create new project" | /dg:create-project | Project initialization needed |
Quick Reference by Category
| Category | Count | Common Tools | Reference |
|---|---|---|---|
| AI & ML | 6 | OpenAI, Anthropic, MLflow, W&B | references/ai.md |
| ETL/ELT | 9 | dbt, Fivetran, Airbyte, PySpark | references/etl.md |
| Storage | 35+ | Snowflake, BigQuery, Postgres, DuckDB | references/storage.md |
| Compute | 15+ | AWS, Databricks, Spark, Docker, K8s | references/compute.md |
| BI & Visualization | 7 | Looker, Tableau, PowerBI, Sigma | references/bi.md |
| Monitoring | 3 | Datadog, Prometheus, Papertrail | references/monitoring.md |
| Alerting | 6 | Slack, PagerDuty, MS Teams, Twilio | references/alerting.md |
| Testing | 2 | Great Expectations, Pandera | references/testing.md |
| Other | 2+ | Pandas, Polars | references/other.md |
Category Taxonomy
This index aligns with Dagster's official documentation taxonomy from tags.yml:
- ai: Artificial intelligence and machine learning integrations (LLM APIs, experiment tracking)
- etl: Extract, transform, and load tools including data replication and transformation frameworks
- storage: Databases, data warehouses, object storage, and table formats
- compute: Cloud platforms, container orchestration, and distributed processing frameworks
- bi: Business intelligence and visualization platforms
- monitoring: Observability platforms and metrics systems for tracking performance
- alerting: Notification and incident management systems for pipeline alerts
- testing: Data quality validation and testing frameworks
- other: Miscellaneous integrations including DataFrame libraries
Note: Support levels (dagster-supported, community-supported) are shown inline in each integration entry.
Last verified: 2026-01-27
Finding the Right Integration
I need to...
Load data from external sources
- SaaS applications → ETL (Fivetran, Airbyte)
- Files/databases → ETL (dlt, Sling, Meltano)
- Cloud storage → Storage (S3, GCS, Azure Blob)
Transform data
- SQL transformations → ETL (dbt)
- Distributed transformations → ETL (PySpark)
- DataFrame operations → Other (Pandas, Polars)
- Large-scale processing → Compute (Spark, Dask, Ray)
Store data
- Cloud data warehouse → Storage (Snowflake, BigQuery, Redshift)
- Relational database → Storage (Postgres, MySQL)
- File/object storage → Storage (S3, GCS, Azure, LakeFS)
- Analytics database → Storage (DuckDB)
- Vector embeddings → Storage (Weaviate, Chroma, Qdrant)
Validate data quality
- Schema and data validation → Testing (Great Expectations, Pandera)
Run ML workloads
- LLM integration → AI (OpenAI, Anthropic, Gemini)
- Experiment tracking → AI (MLflow, W&B)
- Distributed training → Compute (Ray, Spark)
Execute computation
- Cloud compute → Compute (AWS, Azure, GCP, Databricks)
- Containers → Compute (Docker, Kubernetes)
- Distributed processing → Compute (Spark, Dask, Ray)
Monitor pipelines
- Team notifications → Alerting (Slack, MS Teams, PagerDuty)
- Metrics tracking → Monitoring (Datadog, Prometheus)
- Log aggregation → Monitoring (Papertrail)
Visualize data
- Dashboards and reports → BI (Looker, Tableau, PowerBI, Sigma)
Integration Categories
AI & ML
Artificial intelligence and machine learning platforms, including LLM APIs and experiment tracking.
Key integrations:
- OpenAI - GPT models and embeddings API
- Anthropic - Claude AI models
- Gemini - Google's multimodal AI
- MLflow - Experiment tracking and model registry
- Weights & Biases - ML experiment tracking
- NotDiamond - LLM routing and optimization
See references/ai.md for all AI/ML integrations.
ETL/ELT
Extract, transform, and load tools for data ingestion, transformation, and replication.
Key integrations:
- dbt - SQL-based transformation with automatic dependencies
- Fivetran - Automated SaaS data ingestion (component-based)
- Airbyte - Open-source ELT platform
- dlt - Python-based data loading (component-based)
- Sling - High-performance data replication (component-based)
- PySpark - Distributed data transformation
- Meltano - ELT for the modern data stack
See references/etl.md for all ETL/ELT integrations.
Storage
Data warehouses, databases, object storage, vector databases, and table formats.
Key integrations:
- Snowflake - Cloud data warehouse with IO managers
- BigQuery - Google's serverless data warehouse
- DuckDB - In-process SQL analytics
- Postgres - Open-source relational database
- Weaviate - Vector database for AI search
- Delta Lake - ACID transactions for data lakes
- DataHub - Metadata catalog and lineage
See references/storage.md for all storage integrations.
Compute
Cloud platforms, container orchestration, and distributed processing frameworks.
Key integrations:
- AWS - Cloud compute services (Glue, EMR, Lambda)
- Databricks - Unified analytics platform
- GCP - Google Cloud compute (Dataproc, Cloud Run)
- Spark - Distributed data processing engine
- Dask - Parallel computing framework
- Docker - Container execution with Pipes
- Kubernetes - Cloud-native orchestration
- Ray - Distributed computing for ML
See references/compute.md for all compute integrations.
BI & Visualization
Business intelligence and visualization platforms for analytics and reporting.
Key integrations:
- Looker - Google's BI platform
- Tableau - Interactive dashboards
- PowerBI - Microsoft's BI tool
- Sigma - Cloud analytics platform
- Hex - Collaborative notebooks
- Evidence - Markdown-based BI
- Cube - Semantic layer platform
See references/bi.md for all BI integrations.
Monitoring
Observability platforms and metrics systems for tracking pipeline performance.
Key integrations:
- Datadog - Comprehensive observability platform
- Prometheus - Time-series metrics collection
- Papertrail - Centralized log management
See references/monitoring.md for all monitoring integrations.
Alerting
Notification and incident management systems for pipeline alerts.
Key integrations:
- Slack - Team messaging and alerts
- PagerDuty - Incident management for on-call
- MS Teams - Microsoft Teams notifications
- Twilio - SMS and voice notifications
- Apprise - Universal notification platform
- DingTalk - Team communication for Asian markets
See references/alerting.md for all alerting integrations.
Testing
Data quality validation and testing frameworks for ensuring data reliability.
Key integrations:
- Great Expectations - Data validation with expectations
- Pandera - Statistical data validation for DataFrames
See references/testing.md for all testing integrations.
Other
Miscellaneous integrations including DataFrame libraries and utility tools.
Key integrations:
- Pandas - In-memory DataFrame library
- Polars - Fast DataFrame library with columnar storage
See references/other.md for other integrations.
References
Integration details are organized in the following files:
- AI & ML: references/ai.md - AI and ML platforms, LLM APIs, experiment tracking
- ETL/ELT: references/etl.md - Data ingestion, transformation, and replication tools
- Storage: references/storage.md - Warehouses, databases, object storage, vector DBs
- Compute: references/compute.md - Cloud platforms, containers, distributed processing
- BI & Visualization: references/bi.md - Business intelligence and analytics platforms
- Monitoring: references/monitoring.md - Observability and metrics systems
- Alerting: references/alerting.md - Notifications and incident management
- Testing: references/testing.md - Data quality and validation frameworks
- Other: references/other.md - DataFrame libraries and miscellaneous tools
Using Integrations
Most Dagster integrations follow a common pattern:
1. Install the package:

   ```bash
   pip install dagster-<integration>
   ```

2. Import and configure a resource:

   ```python
   from dagster_<integration> import <Integration>Resource

   resource = <Integration>Resource(
       config_param=dg.EnvVar("ENV_VAR"),
   )
   ```

3. Use in your assets:

   ```python
   @dg.asset
   def my_asset(integration: <Integration>Resource):
       # Use the integration
       pass
   ```
For component-based integrations (dbt, Fivetran, dlt, Sling), see the specific reference files for scaffolding and configuration patterns.