ci-cd
SKILL.md
CI/CD in Apache Beam
Overview
Apache Beam uses GitHub Actions for CI/CD. Workflows are located in .github/workflows/.
Workflow Types
PreCommit Workflows
- Run on PRs and merges
- Validate code changes before merge
- Naming:
beam_PreCommit_*.yml
PostCommit Workflows
- Run after merge and on schedule
- More comprehensive testing
- Naming:
beam_PostCommit_*.yml
Scheduled Workflows
- Run nightly on master
- Check for external dependency impacts
- Tag master with
nightly-master
Key Workflows
PreCommit
| Workflow | Description |
|---|---|
beam_PreCommit_Java.yml |
Java build and tests |
beam_PreCommit_Python.yml |
Python tests |
beam_PreCommit_Go.yml |
Go tests |
beam_PreCommit_RAT.yml |
License header checks |
beam_PreCommit_Spotless.yml |
Code formatting |
PostCommit - Java
| Workflow | Description |
|---|---|
beam_PostCommit_Java.yml |
Full Java test suite |
beam_PostCommit_Java_ValidatesRunner_*.yml |
Runner validation tests |
beam_PostCommit_Java_Examples_*.yml |
Example pipeline tests |
PostCommit - Python
| Workflow | Description |
|---|---|
beam_PostCommit_Python.yml |
Full Python test suite |
beam_PostCommit_Python_ValidatesRunner_*.yml |
Runner validation |
beam_PostCommit_Python_Examples_*.yml |
Examples |
Load & Performance Tests
| Workflow | Description |
|---|---|
beam_LoadTests_*.yml |
Load testing |
beam_PerformanceTests_*.yml |
I/O performance |
Triggering Tests
Automatic
- PRs trigger PreCommit tests
- Merges trigger PostCommit tests
Triggering Specific Workflows
Use trigger files to run specific workflows.
Workflow Dispatch
Most workflows support manual triggering via GitHub UI.
Understanding Test Results
Finding Logs
- Go to PR → Checks tab
- Click on failed workflow
- Expand failed job
- View step logs
Common Failure Patterns
Flaky Tests
- Random failures unrelated to change
- Solution: Use trigger files to re-run the specific workflow.
Timeout
- Increase timeout in workflow if justified
- Or optimize test
Resource Exhaustion
- GCP quota issues
- Check project settings
GCP Credentials
Workflows requiring GCP access use these secrets:
GCP_PROJECT_ID- Project ID (e.g.,apache-beam-testing)GCP_REGION- Region (e.g.,us-central1)GCP_TESTING_BUCKET- Temp storage bucketGCP_PYTHON_WHEELS_BUCKET- Python wheels bucketGCP_SA_EMAIL- Service account emailGCP_SA_KEY- Base64-encoded service account key
Required IAM roles:
- Storage Admin
- Dataflow Admin
- Artifact Registry Writer
- BigQuery Data Editor
- Service Account User
Self-hosted vs GitHub-hosted Runners
Self-hosted (majority of workflows)
- Pre-configured with dependencies
- GCP credentials pre-configured
- Naming:
beam_*.yml
GitHub-hosted
- Used for cross-platform testing (Linux, macOS, Windows)
- May need explicit credential setup
Workflow Structure
name: Workflow Name
on:
push:
branches: [master]
pull_request:
branches: [master]
schedule:
- cron: '0 0 * * *'
workflow_dispatch:
jobs:
build:
runs-on: [self-hosted, ...]
steps:
- uses: actions/checkout@v4
- name: Run Gradle
run: ./gradlew :task:name
Local Debugging
Run Same Commands as CI
Check workflow file's run commands:
./gradlew :sdks:java:core:test
./gradlew :sdks:python:test
Common Issues
- Clean gradle cache:
rm -rf ~/.gradle .gradle - Remove build directory:
rm -rf build - Check Java version matches CI
Snapshot Builds
Locations
- Java SDK: https://repository.apache.org/content/groups/snapshots/org/apache/beam/
- SDK Containers: https://gcr.io/apache-beam-testing/beam-sdk
- Portable Runners: https://gcr.io/apache-beam-testing/beam_portability
- Python SDK: gs://beam-python-nightly-snapshots
Release Workflows
| Workflow | Purpose |
|---|---|
cut_release_branch.yml |
Create release branch |
build_release_candidate.yml |
Build RC |
finalize_release.yml |
Finalize release |
publish_github_release_notes.yml |
Publish notes |
Weekly Installs
18
Repository
apache/beamGitHub Stars
8.5K
First Seen
14 days ago
Security Audits
Installed on
gemini-cli18
opencode18
codebuddy18
github-copilot18
codex18
kimi-cli18