Keboola Configuration
Keboola Configuration Knowledge
Provide expertise on Keboola project structure, configuration formats, and best practices for managing data pipelines.
Project Structure
A Keboola project pulled locally has this structure:
```
project-root/
├── .keboola/
│   └── manifest.json            # Project metadata and branch info
├── .env.local                   # API token (never commit)
├── .env.dist                    # Template for .env.local
├── .gitignore
└── [branch-name]/               # One directory per branch
    └── [component-id]/          # e.g., keboola.snowflake-transformation
        └── [config-name]/       # Configuration directory
            ├── config.json      # Main configuration
            ├── meta.json        # Metadata (name, description)
            └── rows/            # Configuration rows (if applicable)
```
Configuration Files
manifest.json
Located at `.keboola/manifest.json`; it contains:
- Project ID and API host
- Branch information
- Sorting and naming conventions
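A minimal sketch of what the manifest can look like. The field names and values below are illustrative, not authoritative; inspect your own `.keboola/manifest.json` for the exact shape your CLI version writes:

```json
{
  "version": 2,
  "project": {
    "id": 12345,
    "apiHost": "connection.keboola.com"
  },
  "allowedBranches": ["main"],
  "branches": [
    { "id": 1234, "path": "main" }
  ]
}
```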
config.json
The main configuration file for each component. Structure varies by component type but typically includes:
- `parameters`: Component-specific settings
- `storage`: Input/output table mappings
- `processors`: Pre/post processing steps
meta.json
Metadata about the configuration:
```json
{
  "name": "Configuration Name",
  "description": "What this configuration does",
  "isDisabled": false
}
```
Component Types
Transformations
SQL or Python/R transformations for data processing.
Snowflake Transformation (`keboola.snowflake-transformation`):
```json
{
  "parameters": {
    "blocks": [
      {
        "name": "Block Name",
        "codes": [
          {
            "name": "Script Name",
            "script": ["SELECT * FROM table"]
          }
        ]
      }
    ]
  },
  "storage": {
    "input": {
      "tables": [
        {
          "source": "in.c-bucket.table",
          "destination": "table"
        }
      ]
    },
    "output": {
      "tables": [
        {
          "source": "output_table",
          "destination": "out.c-bucket.result"
        }
      ]
    }
  }
}
```
Extractors
Components that pull data from external sources (databases, APIs, files).
Common extractors:
- `keboola.ex-db-snowflake`: Snowflake extractor
- `keboola.ex-google-analytics-v4`: Google Analytics
- `keboola.ex-generic-v2`: Generic HTTP API extractor
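As a rough illustration, a Generic Extractor (`keboola.ex-generic-v2`) configuration nests an `api` and a `config` section under `parameters`. The URL, bucket, and endpoint below are placeholders; consult the Generic Extractor documentation for the full option set:

```json
{
  "parameters": {
    "api": {
      "baseUrl": "https://api.example.com/"
    },
    "config": {
      "outputBucket": "in.c-example-api",
      "jobs": [
        { "endpoint": "users" }
      ]
    }
  }
}
```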
Writers
Components that push data to external destinations.
Common writers:
- `keboola.wr-db-snowflake`: Snowflake writer
- `keboola.wr-google-sheets`: Google Sheets writer
Orchestrations
Workflow definitions that run multiple configurations in sequence.
Located under `keboola.orchestrator/` with:
- Task definitions
- Dependencies
- Scheduling
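An orchestration's `config.json` typically groups tasks into phases, where `dependsOn` encodes ordering. The sketch below shows the general shape with illustrative ids and names; verify field names against a real orchestrator configuration pulled from your project:

```json
{
  "phases": [
    { "id": 1, "name": "Extract", "dependsOn": [] },
    { "id": 2, "name": "Transform", "dependsOn": [1] }
  ],
  "tasks": [
    {
      "id": 101,
      "name": "Extract accounts",
      "phase": 1,
      "enabled": true,
      "task": {
        "componentId": "keboola.ex-db-snowflake",
        "configId": "12345",
        "mode": "run"
      }
    }
  ]
}
```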
Best Practices
When Editing Configurations
- Always run `kbc diff` before and after changes
- Validate JSON syntax before pushing
- Use `kbc validate` to check configuration validity
- Keep descriptions updated in `meta.json`
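To catch JSON syntax errors before a push, you can also parse every config file locally. A minimal sketch, independent of the `kbc` CLI (the function name is ours, not part of any Keboola tooling):

```python
import json
from pathlib import Path


def find_invalid_json(project_root: str) -> list[tuple[str, str]]:
    """Return (path, error) pairs for config/meta files that fail to parse."""
    errors = []
    for name in ("config.json", "meta.json"):
        for path in Path(project_root).rglob(name):
            try:
                json.loads(path.read_text(encoding="utf-8"))
            except json.JSONDecodeError as exc:
                errors.append((str(path), str(exc)))
    return errors
```

Run it from the project root and fix any reported files before `kbc push`.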
Storage Mappings
- Input tables: Map source tables to working names
- Output tables: Map result tables to destination buckets
- Use consistent naming conventions
Transformations
- Break complex logic into multiple blocks
- Use meaningful names for blocks and scripts
- Document SQL with comments
- Test locally when possible
Common Tasks
Add a New Input Table to Transformation
In `config.json`, add an entry to `storage.input.tables`:

```json
{
  "source": "in.c-bucket.new_table",
  "destination": "new_table",
  "columns": []
}
```

An empty `columns` array imports all columns.
Add Output Table
In `config.json`, add an entry to `storage.output.tables`:

```json
{
  "source": "result_table",
  "destination": "out.c-bucket.result",
  "primary_key": ["id"]
}
```
Modify SQL Script
Edit the `script` array in the relevant block/code entry. Each array element is a single SQL statement, and statements run in order.
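For example, a code entry with two statements (table and column names here are purely illustrative) looks like this:

```json
{
  "name": "Build report",
  "script": [
    "CREATE TABLE \"report\" AS SELECT * FROM \"orders\" WHERE \"status\" = 'paid';",
    "DELETE FROM \"report\" WHERE \"amount\" IS NULL;"
  ]
}
```

Keeping one statement per array element makes diffs and failures easier to trace than one long concatenated script.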
Troubleshooting
Invalid Configuration
- Check JSON syntax (missing commas, brackets)
- Verify table names exist in storage
- Check column names in mappings
Push Conflicts
- Pull latest changes first
- Merge conflicts manually in config files
- Push again after resolution
Missing Tables
- Ensure input tables exist in Keboola Storage
- Check bucket permissions
- Verify table names match exactly (case-sensitive)