# Databricks Deploy Integration
## Overview
Deploy Databricks jobs and pipelines using Databricks Asset Bundles (DABs). Asset Bundles provide infrastructure-as-code for deploying jobs, notebooks, DLT pipelines, and ML models across workspaces with proper environment isolation and CI/CD integration.
## Prerequisites

- Databricks CLI v0.200+ installed (`databricks` command)
- Workspace access with appropriate permissions
- Service principal for automated deployments
- `databricks.yml` bundle configuration
## Instructions

### Step 1: Initialize Asset Bundle
```bash
# Create new bundle from template
databricks bundle init
```

Or manually create `databricks.yml`:
```yaml
# databricks.yml
bundle:
  name: etl-pipeline

workspace:
  host: https://myworkspace.cloud.databricks.com

resources:
  jobs:
    daily_etl:
      name: daily-etl-${bundle.environment}
      schedule:
        quartz_cron_expression: "0 0 6 * * ?"
        timezone_id: "America/New_York"
      tasks:
        - task_key: extract
          notebook_task:
            notebook_path: ./src/extract.py
          new_cluster:
            spark_version: "14.3.x-scala2.12"
            node_type_id: "i3.xlarge"
            num_workers: 2
        - task_key: transform
          depends_on:
            - task_key: extract
          notebook_task:
            notebook_path: ./src/transform.py

environments:
  development:
    default: true
    workspace:
      host: https://dev.cloud.databricks.com
  staging:
    workspace:
      host: https://staging.cloud.databricks.com
  production:
    workspace:
      host: https://prod.cloud.databricks.com
```
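Environment blocks can also override resource settings, so production can run on a larger cluster than the defaults under `resources`. A minimal sketch (the worker count and host below are illustrative, not part of the example above):

```yaml
# Hypothetical per-environment override: production scales up the
# extract task's cluster (values are illustrative)
environments:
  production:
    workspace:
      host: https://prod.cloud.databricks.com
    resources:
      jobs:
        daily_etl:
          tasks:
            - task_key: extract
              new_cluster:
                spark_version: "14.3.x-scala2.12"
                node_type_id: "i3.xlarge"
                num_workers: 8
```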
### Step 2: Deploy to Environment
```bash
# Validate bundle configuration
databricks bundle validate -e production

# Deploy resources (create/update jobs, notebooks)
databricks bundle deploy -e staging

# Run a specific job
databricks bundle run daily_etl -e staging

# Destroy resources in an environment
databricks bundle destroy -e development
```
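In a script it is worth gating the deploy on a successful validation. A minimal sketch of that pattern; `deploy_bundle` and its injectable `run` hook are hypothetical helpers, not part of the Databricks CLI, which is assumed to be on `PATH`:

```python
# Hypothetical CI helper: deploy only when validation succeeds.
import subprocess
from typing import Callable, List, Optional

def deploy_bundle(env: str,
                  run: Optional[Callable[[List[str]], int]] = None) -> bool:
    """Validate the bundle for `env`, then deploy only if validation passes."""
    if run is None:
        # Default: invoke the real CLI and use its exit code
        run = lambda cmd: subprocess.call(cmd)
    if run(["databricks", "bundle", "validate", "-e", env]) != 0:
        return False  # invalid configuration: skip the deploy
    return run(["databricks", "bundle", "deploy", "-e", env]) == 0
```

Injecting `run` keeps the gating logic testable without a live workspace.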
### Step 3: CI/CD Pipeline
```yaml
# .github/workflows/deploy.yml
name: Deploy Databricks Bundle

on:
  push:
    branches: [main]

jobs:
  deploy-staging:
    runs-on: ubuntu-latest
    environment: staging
    steps:
      - uses: actions/checkout@v4
      - uses: databricks/setup-cli@main
      - run: databricks bundle validate -e staging
        env:
          DATABRICKS_HOST: ${{ secrets.DATABRICKS_HOST }}
          DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_TOKEN }}
      - run: databricks bundle deploy -e staging
        env:
          DATABRICKS_HOST: ${{ secrets.DATABRICKS_HOST }}
          DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_TOKEN }}

  deploy-production:
    needs: deploy-staging
    runs-on: ubuntu-latest
    environment: production
    steps:
      - uses: actions/checkout@v4
      - uses: databricks/setup-cli@main
      - run: databricks bundle deploy -e production
        env:
          DATABRICKS_HOST: ${{ secrets.DATABRICKS_PROD_HOST }}
          DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_PROD_TOKEN }}
```
### Step 4: Verify Deployment
```bash
# List deployed jobs
databricks jobs list --output json | jq '.[] | select(.settings.name | contains("etl"))'

# Check recent runs
databricks jobs list-runs --job-id $JOB_ID --limit 5

# Get run output
databricks jobs get-run-output --run-id $RUN_ID
```
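When `jq` is unavailable in CI, the same job-name filter can be done in Python. The payload below is a made-up sample in the shape of `databricks jobs list --output json`; real responses carry many more fields:

```python
import json

# Illustrative sample of `databricks jobs list --output json` output
sample = json.loads("""
[
  {"job_id": 101, "settings": {"name": "daily-etl-staging"}},
  {"job_id": 102, "settings": {"name": "ml-training"}},
  {"job_id": 103, "settings": {"name": "daily-etl-production"}}
]
""")

# Keep only jobs whose name contains "etl", mirroring the jq filter
etl_jobs = [j for j in sample if "etl" in j["settings"]["name"]]
print([j["job_id"] for j in etl_jobs])  # -> [101, 103]
```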
## Error Handling

| Issue | Cause | Solution |
|---|---|---|
| Bundle validation fails | Invalid YAML | Run `databricks bundle validate` locally |
| Permission denied | Missing workspace access | Check service principal permissions |
| Cluster start fails | Quota exceeded | Request quota increase or use smaller nodes |
| Job timeout | Long-running task | Set `timeout_seconds` in job config |
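For the job-timeout row, `timeout_seconds` belongs at the job (or task) level of the bundle configuration. A minimal fragment; the two-hour value is illustrative:

```yaml
resources:
  jobs:
    daily_etl:
      timeout_seconds: 7200   # fail the run after 2 hours (illustrative)
```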
## Examples

- **Basic usage:** deploy a single-job bundle to the default development environment with the configuration shown in Step 1.
- **Advanced scenario:** extend the bundle for production with per-environment workspace hosts, a service principal for authentication, and the staged CI/CD promotion from Step 3.
## Next Steps

For multi-environment setup, see `databricks-multi-env-setup`.
## Output

- Configuration files or code changes applied to the project
- Validation report confirming correct implementation
- Summary of changes made and their rationale