cicd-pipeline-design
CI/CD Pipeline Design
Acknowledgement: Shared by Peter Bamuhigire, techguypeter.com, +256 784 464178.
Use When
- Use when designing or reviewing production CI/CD pipelines, deployment pipelines, artifact promotion, branch strategy, release controls, rollback paths, and delivery-system metrics for any language or platform.
- The task needs reusable judgment, domain constraints, or a proven workflow rather than ad hoc advice.
Do Not Use When
- The task is unrelated to
cicd-pipeline-designor would be better handled by a more specific companion skill. - The request only needs a trivial answer and none of this skill's constraints or references materially help.
Required Inputs
- Gather relevant project context, constraints, and the concrete problem to solve; load
referencesonly as needed. - Confirm the desired deliverable: design, code, review, migration plan, audit, or documentation.
Workflow
- Read this
SKILL.mdfirst, then load only the referenced deep-dive files that are necessary for the task. - Apply the ordered guidance, checklists, and decision rules in this skill instead of cherry-picking isolated snippets.
- Produce the deliverable with assumptions, risks, and follow-up work made explicit when they matter.
Quality Standards
- Keep outputs execution-oriented, concise, and aligned with the repository's baseline engineering standards.
- Preserve compatibility with existing project conventions unless the skill explicitly requires a stronger standard.
- Prefer deterministic, reviewable steps over vague advice or tool-specific magic.
Anti-Patterns
- Treating examples as copy-paste truth without checking fit, constraints, or failure modes.
- Loading every reference file by default instead of using progressive disclosure.
Outputs
- A concrete result that fits the task: implementation guidance, review findings, architecture decisions, templates, or generated artifacts.
- Clear assumptions, tradeoffs, or unresolved gaps when the task cannot be completed from available context alone.
- References used, companion skills, or follow-up actions when they materially improve execution.
Evidence Produced
| Category | Artifact | Format | Example |
|---|---|---|---|
| Release evidence | Pipeline design decision record | Markdown doc per skill-composition-standards/references/adr-template.md covering artifact-promotion, branch-strategy, and gate picks |
docs/ci/pipeline-design-adr.md |
References
- Use the
references/directory for deep detail after reading the core workflow below.
Use this skill when the pipeline must function as a trusted delivery system. The goal is not merely automation. The goal is to keep changes small, verifiable, promotable, observable, and reversible.
Load Order
- Load
world-class-engineering. - Load
deployment-release-engineeringfor rollout and rollback design. - Load this skill to define the pipeline, artifact flow, and branch strategy.
- Pair it with
advanced-testing-strategy,observability-monitoring, andcicd-devsecops.
Core Principles
- Pipeline is the only normal route to production.
- Build once and promote the same artifact through environments.
- Keep cheap checks early and deeper checks later, but do not remove meaningful risk checks.
- Treat broken default-branch pipelines as urgent delivery defects.
- Track throughput and stability together through DORA-style metrics.
Executable Outputs
For non-trivial pipeline work, produce:
- stage map with entry and exit criteria
- artifact promotion model
- branch and release-control model
- migration and rollback steps
- telemetry, alert, and release-marker plan
- pipeline bottlenecks and remediation priorities
Pipeline Workflow
1. Define the Delivery Path
Capture:
- source control and branch strategy
- commit stage checks
- deeper validation stages
- artifact repository and promotion flow
- rollout path to each environment
- rollback or feature-disable path
2. Build a Canonical Stage Model
Use a shape like:
- checkout and dependency restore
- build and package
- unit tests, lint, and static checks
- security and dependency checks
- artifact publish
- deploy to lower environment
- smoke, integration, contract, and acceptance checks as risk requires
- performance or resilience checks where justified
- production rollout and observation window
3. Design Artifact Promotion
- Promote the same artifact through environments.
- Keep environment differences in configuration and secrets, not rebuilt binaries.
- Version artifacts so release candidates, production builds, and ephemeral snapshots are distinguishable.
- Keep provenance and release notes attached to the promoted artifact.
4. Choose Branch and Release Controls
- Prefer trunk-based development or similarly short-lived branches.
- Use feature flags, dark launches, or canaries when deployment can complete before exposure should.
- Keep
mainor the releasable branch deployable. - Require status checks and review on the branch that feeds production.
5. Design Migration and Rollback Safety
- Put schema and data changes into the pipeline explicitly.
- Use expand-contract for overlapping-version support on live systems.
- Define migration verification queries and rollback posture before release.
- Classify data changes as reversible, compensating-only, or forward-fix-only.
6. Observe and Improve the Pipeline
- Emit release markers so telemetry can answer what changed recently.
- Measure pipeline duration, red-time, flaky stages, and rerun frequency.
- Remove stale or low-signal stages that erode trust.
- Treat pipeline pain as engineering debt with owners and follow-up dates.
Standards
Commit Stage
- Fast enough to run on every normal integration.
- Strong enough to reject obviously unsafe changes.
- Clear failure messages with links to evidence when possible.
Promotion
- Artifact immutability is preferred.
- Rebuild drift between staging and production is not acceptable.
- Pipeline definition should live in version control.
Branching
- Long-lived branches are a warning sign, not a default.
- Protect trunk quality with review, tests, and rapid repair of failures.
- Do not hide poor release slicing behind long integration delay.
Release Evidence
- Keep records of what passed, what was skipped, and what remains unproven.
- Attach migration notes, rollback notes, and post-deploy watch lists to risky releases.
- Make observation ownership explicit for production rollout windows.
Review Checklist
- The pipeline is the normal route to production.
- The same artifact is promoted through environments.
- Branch strategy keeps integration delay low.
- Stage purpose, owner, and failure evidence are explicit.
- Migration and rollback steps are part of the pipeline, not side notes.
- Release markers and telemetry support post-deploy diagnosis.
- Pipeline bottlenecks and flaky stages have remediation owners.
FinOps & Cost Governance
Resource Tagging Strategy
Mandatory tag keys that every cloud resource must carry:
Environment—dev/staging/productionTeam— owning team name (e.g.,payments,platform)CostCenter— finance code for chargebackProject— logical project or product nameOwner— primary on-call email
Enforce via Terraform default_tags at provider level:
provider "aws" {
region = "eu-west-1"
default_tags {
tags = {
Environment = var.environment
Team = var.team
CostCenter = var.cost_center
Project = var.project
Owner = var.owner_email
ManagedBy = "terraform"
}
}
}
Enforce at admission time via AWS Config Rule required-tags or a Gatekeeper Constraint on K8s.
AWS Cost Explorer & Budgets
- Cost Explorer: enable granularity by
DAILY, group byTAG:TeamandTAG:Environment - Create a monthly budget via AWS CLI:
aws budgets create-budget \
--account-id 123456789012 \
--budget '{
"BudgetName": "monthly-cap",
"BudgetLimit": {"Amount": "500", "Unit": "USD"},
"TimeUnit": "MONTHLY",
"BudgetType": "COST"
}' \
--notifications-with-subscribers '[
{
"Notification": {
"NotificationType": "FORECASTED",
"ComparisonOperator": "GREATER_THAN",
"Threshold": 80,
"ThresholdType": "PERCENTAGE"
},
"Subscribers": [{"SubscriptionType": "EMAIL", "Address": "finance@example.com"}]
}
]'
- Anomaly detection:
AWS Cost Anomaly Detection— subscribe to a SNS topic forDAILYanomaly alerts above threshold.
Kubernetes Resource Quotas
Per-namespace ResourceQuota:
apiVersion: v1
kind: ResourceQuota
metadata:
name: team-quota
namespace: payments
spec:
hard:
requests.cpu: "20"
requests.memory: 40Gi
limits.cpu: "40"
limits.memory: 80Gi
persistentvolumeclaims: "10"
services.loadbalancers: "2"
Cost per namespace tracking via Kubecost or OpenCost; correlate with CostExplorer via the kubernetes.io/cluster/<name> tag.
Utilisation Targets
- CPU ≥ 70% average, memory ≥ 60% average across 30-day window
- Under-utilised = < 40% average CPU for 7+ days → right-size or consolidate
- Use AWS Compute Optimizer for EC2/RDS recommendations
- For K8s: VPA recommendations + HPA targets tuned to keep average between 60–80%
Spot Instance Strategy
- CI runners: use
github-actions-runner-controlleron K8s with Spot node pool,karpenterprovisioner filtering on Spot interruption rate - Batch jobs / ML training:
aws ec2 run-instanceswithInstanceMarketOptions.MarketType=spotandMaxPriceat 50% of on-demand - Production web: only on Spot with multi-AZ / multi-instance-type diversification and graceful interruption handling (draining on
spot-instance-interruption-warning) - Never Spot for: stateful databases (RDS, Elasticsearch primary), single-replica critical services
FinOps Maturity Model
Three stages — crawl / walk / run:
- Crawl: monthly manual cost review meeting, tags enforced on new resources, budget alerts wired to Slack. Expected savings: 10–15%.
- Walk: automated anomaly detection, dedicated cost owner per team, reserved instance strategy for steady-state workloads, right-sizing recommendations reviewed monthly. Expected savings: 20–30%.
- Run: chargeback to teams in internal accounting, cost impact reviewed per PR for significant infra changes, automated Spot/Reserved mix optimisation, unit economics dashboard ($/user, $/transaction). Expected savings: 30–45% plus improved decision quality.
Choose the stage that matches current operational maturity; do not skip stages.
References
- references/pipeline-governance.md: Pipeline trust, evidence, and stop-the-line response.
- ../deployment-release-engineering/references/deployment-pipeline.md: Canonical release stage model and release packet.
- ../world-class-engineering/references/source-patterns.md: CI/CD and DevOps patterns derived from the supplied books.
More from peterbamuhigire/skills-web-dev
google-play-store-review
Google Play Store compliance and review readiness for Android apps. Use
76multi-tenant-saas-architecture
Use when designing or reviewing a multi-tenant SaaS platform — tenant
62jetpack-compose-ui
Jetpack Compose UI standards for beautiful, sleek, minimalistic Android
49gis-mapping
Use for web apps that need Leaflet-first GIS mapping, location selection,
48saas-accounting-system
Implement a complete double-entry accounting system inside any SaaS app.
47manual-guide
Generate end-user manuals and reference guides for ERP modules. Use when
38