gitops-cd-developer
GitOps CD Developer
CD half of the GitOps loop: a root Application spawns child Applications, each of which renders a shareable Helm chart with per-environment values and a chosen rollout strategy. ArgoCD sees the diff, syncs, self-heals, prunes — no kubectl apply from a CI pipeline.
When to use
- The user wants to set up ArgoCD / Flux for the first time.
- The user wants to design an App-of-Apps tree so one root manages many envs / instances / tenants.
- The user wants a sharable Helm chart their other repos can consume (multi-runtime, multi-env).
- The user wants to choose / wire a rollout strategy: RollingUpdate, Recreate, blue-green, canary.
- The user wants `values-dev.yaml` / `values-staging.yaml` / `values-prod.yaml` laid out cleanly.
- A task chain ends in "and ArgoCD picks it up automatically".
Pairs with gitops-pipeline-developer (the CI half — produces images and updates values.yaml). Together: CI pushes the image + tag + digest to GitOps; CD reconciles cluster state from the GitOps repo.
The release loop in one paragraph
CI produces a signed image and writes the new tag + digest into a values.yaml in a GitOps repo. The root ArgoCD Application in the management cluster watches a folder of child Application templates; each child points at the GitOps repo and renders one or more workloads via a shared Helm chart. ArgoCD's automated sync sees the values-file change, performs helm template, diffs against the live cluster, applies changes with ServerSideApply, and reports Healthy / Synced. Self-heal undoes drift; prune removes resources no longer in git. The cluster's state is, by definition, what's in git.
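As a rough illustration of the hand-off described above, the fragment that CI writes into an env-specific values file might look like this. The `image.tag` and `image.digest` keys are the ones this skill names; the registry path, the tag, and the digest placeholder are hypothetical, and the exact nesting is an assumption about your chart.

```yaml
# consumer-chart/values-prod.yaml (illustrative; all values are placeholders)
# Assumption: if the shared chart is consumed as a subchart, Helm expects these
# keys nested under the dependency's name or alias rather than at the top level.
image:
  repository: registry.example.com/team/sample-app   # hypothetical registry path
  tag: "1.5.0"                                        # human-readable, written by CI
  digest: "sha256:<digest-written-by-ci>"             # placeholder; CI writes the real value
```

A commit that changes only these lines is what ArgoCD's automated sync would pick up and reconcile.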
Required structure
A typical GitOps CD layout has three repos (or three trees in one monorepo):
gitops/
├── app-of-apps/                    # Root + per-env child Applications
│   ├── projects.yaml               # AppProject (security boundary)
│   ├── root-application.yaml       # The root Application (the one a human applies once)
│   ├── envs/
│   │   ├── dev/
│   │   │   ├── Chart.yaml          # Helm chart that renders ArgoCD Applications
│   │   │   ├── values.yaml         # which apps to spawn in dev
│   │   │   └── templates/
│   │   │       └── apps.yaml       # one Application per app
│   │   ├── staging/
│   │   └── prod/
│   └── README.md
├── helm-chart-template/            # The shareable chart (published to OCI registry)
│   └── helm-chart-template/
│       ├── Chart.yaml
│       ├── values.yaml             # full schema with sane defaults
│       ├── templates/
│       └── README.md
└── consumer-chart/                 # Per-app chart that depends on the template
    ├── Chart.yaml                  # dependency: helm-chart-template
    ├── values-dev.yaml             # tag, digest, env-specific overrides
    ├── values-staging.yaml
    └── values-prod.yaml
Don't invent a parallel layout if the team already has one — extend it.
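A minimal sketch of the consumer chart's `Chart.yaml` from the layout above, assuming the shared chart is published under the name `helm-chart-template`. The chart name `sample-app` and the version numbers are hypothetical, and the registry path is left elided as in the rest of this document.

```yaml
# consumer-chart/Chart.yaml (sketch; name and versions are hypothetical)
apiVersion: v2
name: sample-app
description: Per-app chart that consumes the shared template
type: application
version: 0.1.0
dependencies:
  - name: helm-chart-template
    version: 0.4.2              # exact version, not a range such as ^0.4
    repository: "oci://..."     # elided here; point this at your OCI registry path
```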
Workflow
1. Discover the existing setup. Check for existing ArgoCD / Flux installs, repo layout, registered AppProjects, current Helm charts, and current `values-*.yaml` shapes. The new design extends what's there.
2. Pick the controller. Default to ArgoCD. Flux 2 is the alternative: same mental model, different CRDs (`Kustomization` + `HelmRelease` instead of `Application`). Don't migrate one to the other inside this task.
3. Design the AppProject(s). Each project is a security boundary: which repos can be sources, which namespaces / clusters are valid destinations, which resource kinds are allowed. Load `references/projects-rbac.md` for the catalog and the per-team / per-tenant patterns. Use `assets/templates/argocd/appproject.yaml.template`.
4. Write the root Application. A single `Application` that points at the `app-of-apps/envs/<env>/` chart, with `automated.prune: true` + `automated.selfHeal: true` + `syncOptions: [CreateNamespace=true, PrunePropagationPolicy=foreground, ServerSideApply=true]` + a retry policy. Apply it manually once; from then on it manages itself. Use `assets/templates/argocd/root-application.yaml.template`.
5. Spawn child Applications. Each `envs/<env>/templates/apps.yaml` renders one or more `Application` manifests, one per workload. They live under the AppProject from step 3 and point at the consumer chart with the right `values-<env>.yaml`. Use `assets/templates/argocd/child-application.yaml.template`. See `references/app-of-apps.md` for the parent → children → grandchildren depth trade-offs.
6. Build the shareable Helm chart. Copy `assets/templates/chart/` into your template repo; rename `chart-template` to your team's name; review `values.yaml`. The schema covers deployment + service + ingress + HPA + probes + init containers + configmap + secret + service account + pod security context + node-pool / tolerations + extra ports. Multi-runtime: Node.js / PHP / Java work out of the box because the values schema is runtime-agnostic. Load `references/helm-chart-design.md` for the design principles.
7. Pick a rollout strategy. Defaults: `RollingUpdate` (`maxSurge: 1`, `maxUnavailable: 0`) for stateless apps; `Recreate` for single-instance stateful apps; Argo Rollouts for canary or blue-green when you need traffic-shaped progressive delivery. Load `references/deployment-strategies.md` for which to pick when, plus the Argo Rollouts wiring.
8. Lay out per-environment values. `values.yaml` ships sane defaults; `values-dev.yaml` / `values-staging.yaml` / `values-prod.yaml` override per env. Image tag + digest land in the env-specific file (the CI half writes them). Multi-instance / multi-tenant setups use an `instances:` map at the root that the chart's `range $name, $cfg := .Values.instances` walks. Load `references/multi-environment.md` for the layout choices.
9. Configure sync policy. Per Application: `automated: { prune: true, selfHeal: true, allowEmpty: true }`, `syncOptions: [CreateNamespace=true, PrunePropagationPolicy=foreground, PruneLast=true, ServerSideApply=true]`, `retry: { limit: 5, backoff: { duration: 5s, factor: 2, maxDuration: 3m } }`. Leave `Replace=true` out of the default list: ArgoCD gives it precedence over `ServerSideApply=true`, so combining the two cancels the server-side apply behavior this skill relies on. On the root Application, set `allowEmpty: false` (see the gotchas). Load `references/sync-policies.md` to understand each toggle and when to flip it.
10. Package + publish the chart to an OCI registry. `helm package`, then `helm push <chart>.tgz oci://<registry>/<path>`. Use `scripts/package-and-push.sh`, which pulls credentials from a yq-readable secrets file, signs the chart with cosign if configured, and is idempotent. The CI half installs the chart by referencing `oci://...@<digest>` from consumer charts.
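The root Application from steps 4 and 9 could be sketched roughly as follows. This is not the content of `root-application.yaml.template`; the Application name, the `platform` project, the repository URL, the branch, and the `argocd` namespace are all placeholders chosen for illustration.

```yaml
# root-application.yaml (sketch; repoURL, project, and names are placeholders)
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: root-dev
  namespace: argocd                     # assumes ArgoCD runs in the argocd namespace
spec:
  project: platform                     # hypothetical AppProject from step 3
  source:
    repoURL: https://git.example.com/org/gitops.git   # hypothetical GitOps repo
    targetRevision: main
    path: app-of-apps/envs/dev          # the chart that renders child Applications
    helm:
      valueFiles:
        - values.yaml
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
      allowEmpty: false                 # root: refuse a sync that renders zero children
    syncOptions:
      - CreateNamespace=true
      - PrunePropagationPolicy=foreground
      - ServerSideApply=true
    retry:
      limit: 5
      backoff:
        duration: 5s
        factor: 2
        maxDuration: 3m
```

Applied once with `kubectl apply -f root-application.yaml`, an Application shaped like this would then reconcile its own children from git.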
Available resources
- `assets/templates/argocd/root-application.yaml.template`: the one Application a human applies once.
- `assets/templates/argocd/child-application.yaml.template`: the per-app spawn pattern.
- `assets/templates/argocd/appproject.yaml.template`: the security boundary.
- `assets/templates/argocd/values-env.yaml.template`: values shape for spawning N apps in one env.
- `assets/templates/chart/`: complete starter chart (`Chart.yaml`, `values.yaml`, `templates/{deployment,service,ingress,hpa,configmap,_helpers.tpl,NOTES.txt}`).
- `assets/examples/sample-app/`: fully worked tiny example (root + dev + prod children + consumer chart).
- `scripts/package-and-push.sh`: OCI chart packaging + push + cosign sign. Idempotent.
- `references/app-of-apps.md`: parent → children → grandchildren depth, sync waves, scaling beyond ~50 apps.
- `references/helm-chart-design.md`: schema design, multi-runtime, library vs application charts, value-file inheritance, pinning.
- `references/deployment-strategies.md`: RollingUpdate vs Recreate vs blue-green vs canary, Argo Rollouts wiring, traffic-shaping, abort criteria.
- `references/multi-environment.md`: `values-<env>.yaml` layout, instance maps, region axes, `helm template --debug` per-env testing.
- `references/sync-policies.md`: every ArgoCD sync toggle, when to use each, retry tuning, ServerSideApply vs ClientSide.
- `references/projects-rbac.md`: AppProject as a security boundary, per-tenant / per-team patterns, ArgoCD RBAC.
Top gotchas (always inline — do not skip)
- Apply the root Application once, manually. From then on it manages itself, including its own updates. Never `kubectl apply` child Applications by hand: the root is the source of truth, and manual applies create drift the root will undo.
- Pin chart dependencies to a version. `dependencies: [{ name: chart-template, version: 0.4.2, repository: "oci://..." }]`, never floating ranges. A floating dep means "production silently changes shape on a chart-only release".
- Production deploys pin image to digest, not tag. The CI half writes both `image.tag` (for humans) and `image.digest` (for kubelet) into `values-<env>.yaml`. Templates should render `repo@digest` when digest is present.
- `automated.selfHeal: true` is non-negotiable for shared envs. It's what undoes "I'll just kubectl edit this real quick" drift before it compounds. Without it, the cluster slowly diverges from git.
- `syncOptions: ServerSideApply=true`. Client-side apply mangles fields managed by other controllers (HPA's `replicas`, kube-controller-manager's defaults, ingress controllers). Server-side apply respects field ownership. Default to true unless you have a documented reason.
- `allowEmpty: true` is a footgun on the root Application. With `allowEmpty: false`, ArgoCD refuses an automated sync that would leave the Application with no resources, which is what stops an accidental "I deleted all the apps" commit from pruning everything. On child Applications `allowEmpty: true` is fine; leave it on. On the root, default it to false unless the root really might legitimately render zero children.
- Sync waves only order resources inside one Application. Waves run in ascending numeric order, so a resource annotated `argocd.argoproj.io/sync-wave: "1"` syncs after wave 0. Setting wave -1 inside a child does not make it run before the parent's wave 0; it only sorts ahead of wave 0 within that same Application. Use waves to stage CRDs → operators → workloads, not as cross-application ordering (that's what App-of-Apps levels are for).
- Don't put secrets in the GitOps repo unencrypted. SealedSecrets / SOPS / External Secrets Operator / Vault Secrets Operator: pick one. The chart consumes Secrets by `valueFrom.secretKeyRef` and never inlines the value. See `references/projects-rbac.md` for the four common shapes.
- `prune: true` deletes things you remove from git. Test it on dev first. A common surprise: removing an Application from `envs/<env>/templates/apps.yaml` actually deletes all of its workloads. Sometimes that's what you want; sometimes it isn't. Use `argocd app sync --dry-run` to preview a prune.
- Per-env values files override defaults; they don't merge across envs. `values.yaml` + `values-prod.yaml` is the full composition for prod; `values-staging.yaml` is not layered into prod. If a setting must be the same across envs, put it in `values.yaml`. If it must differ, override per env.
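The digest-over-tag gotcha could be implemented in the chart's Deployment template along these lines. This is a sketch, not the shipped `templates/deployment.yaml`; it assumes `image.digest` is stored with its `sha256:` prefix and that `image.repository`, `image.tag`, and `image.digest` are top-level values.

```yaml
# templates/deployment.yaml (excerpt; sketch of the digest-or-tag choice)
      containers:
        - name: {{ .Chart.Name }}
          {{- if .Values.image.digest }}
          # Digest present: pin to immutable content
          image: "{{ .Values.image.repository }}@{{ .Values.image.digest }}"
          {{- else }}
          # No digest (for example in dev): fall back to the tag
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          {{- end }}
```

Running `helm template` with each `values-<env>.yaml` is a quick way to confirm which form a given environment renders.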
What you DO
- Discover existing ArgoCD / Flux state and existing repo layout before designing.
- Write one root Application that spawns all children for one environment, applied manually once.
- Use AppProjects as security boundaries. Leaving Applications in the `default` project means "ArgoCD allows everything", which is wrong.
- Build a shareable Helm chart with a runtime-agnostic schema (probes, ports, env vars, image, HPA), publish it to an OCI registry, version it semver, depend on it from consumer charts pinned to a version.
- Choose a rollout strategy by stateful vs stateless + tolerance for downtime: RollingUpdate (default) → Recreate (stateful single-instance) → Argo Rollouts canary (need traffic shaping).
- Pin `image.repository@digest` in production `values-prod.yaml`. Tag is human-readable; digest is what kubelet pulls.
- Configure every Application with `automated: { prune, selfHeal, allowEmpty }` + `ServerSideApply` + a retry policy.
- Keep plaintext secrets out of git (SealedSecrets / SOPS / ESO / VSO); the chart consumes them via `secretKeyRef`.
- Lay values out as `values.yaml` (defaults) + `values-<env>.yaml` (env overrides). Document the diff in the chart README.
- Package + push the chart to OCI with `scripts/package-and-push.sh`; sign the chart artifact with cosign.
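For the two built-in strategies named above, the relevant Deployment fields look roughly like this. Both fragments belong under a Deployment's `spec`; how the shared chart exposes them through `values.yaml` is not shown here and would depend on the chart's schema.

```yaml
# Stateless default: start a new pod before removing an old one
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 1
    maxUnavailable: 0
---
# Single-instance stateful app: stop the old pod before starting the new one
strategy:
  type: Recreate
```

Canary and blue-green are not expressible with these fields alone, which is why the skill points to Argo Rollouts for them.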
What you do NOT do
- Apply child Applications by hand. The root manages them.
- Pin chart dependencies to a floating range (`^0.4`, `~1.2`).
- Pin production deploys to a tag (`:latest`, `:1.5.0`) instead of a digest.
- Disable `selfHeal` on shared envs to "make troubleshooting easier"; drift accumulates fast.
- Use ClientSideApply on workloads with HPAs / external controllers; owned-field collisions break apply.
- Inline secret values into the GitOps repo, even when the repo itself is encrypted at rest. Use a sealed-secret / external-secret integration.
- Treat `prune: true` casually. Removing a child Application file deletes its workloads.
- Migrate ArgoCD ↔ Flux as part of this task. That's a separate, multi-PR project.
- Skip the AppProject. "Default project allows everything" is the worst posture.
- Use sync waves to order across Applications; that's what the App-of-Apps tree levels are for.
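To make the AppProject boundary concrete, a restrictive project might be sketched as below. The project name, repository URL, and namespace pattern are hypothetical, and this is not the content of `appproject.yaml.template`.

```yaml
# projects.yaml (sketch; names, repo URL, and namespaces are placeholders)
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: team-a
  namespace: argocd                     # assumes ArgoCD runs in the argocd namespace
spec:
  description: Workloads owned by team A
  sourceRepos:
    - https://git.example.com/org/gitops.git   # only this repo may be a source
  destinations:
    - server: https://kubernetes.default.svc
      namespace: team-a-*               # only namespaces matching this pattern
  clusterResourceWhitelist:
    - group: ""
      kind: Namespace                   # needed if Applications use CreateNamespace=true
```

Applications that set `spec.project: team-a` would then be rejected if they reference any other repository or destination.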