skills/laurigates/claude-plugins/helm-release-recovery

helm-release-recovery

SKILL.md

Helm Release Recovery

Comprehensive guidance for recovering from failed Helm deployments, rolling back releases, and managing stuck or corrupted release states.

When to Use

Use this skill automatically when:

  • User needs to rollback a failed or problematic deployment
  • User reports stuck releases (pending-install, pending-upgrade)
  • User mentions failed upgrades or partial deployments
  • User needs to recover from corrupted release state
  • User wants to view release history
  • User needs to clean up failed releases

Core Recovery Operations

Rollback to Previous Revision

# Rollback to previous revision (most recent successful)
helm rollback <release> --namespace <namespace>

# Rollback to specific revision number
helm rollback <release> 3 --namespace <namespace>

# Rollback with wait and atomic behavior
helm rollback <release> \
  --namespace <namespace> \
  --wait \
  --timeout 5m \
  --cleanup-on-fail

# Rollback without waiting (faster but less safe)
helm rollback <release> \
  --namespace <namespace> \
  --no-hooks

Key Flags:

  • --wait - Wait for resources to be ready
  • --timeout - Maximum time to wait (default 5m)
  • --cleanup-on-fail - Delete new resources on failed rollback
  • --no-hooks - Skip running rollback hooks
  • --force - Force resource updates through deletion/recreation
  • --recreate-pods - Perform pods restart for the resource if applicable

View Release History

# View all revisions
helm history <release> --namespace <namespace>

# View detailed history (YAML format)
helm history <release> \
  --namespace <namespace> \
  --output yaml

# Limit number of revisions shown
helm history <release> \
  --namespace <namespace> \
  --max 10

History Output Fields:

  • REVISION: Sequential version number
  • UPDATED: Timestamp of deployment
  • STATUS: deployed, superseded, failed, pending-install, pending-upgrade
  • CHART: Chart name and version
  • APP VERSION: Application version
  • DESCRIPTION: What happened (Install complete, Upgrade complete, Rollback to X)

Check Release Status

# Check current release status
helm status <release> --namespace <namespace>

# Show deployed resources
helm status <release> \
  --namespace <namespace> \
  --show-resources

# Get status of specific revision
helm status <release> \
  --namespace <namespace> \
  --revision 5

Common Recovery Scenarios

Scenario 1: Recent Deploy Failed - Simple Rollback

Symptoms:

  • Recent upgrade/install failed
  • Application not working after deployment
  • Want to restore previous working version

Recovery Steps:

# 1. Check release status
helm status myapp --namespace production

# 2. View history to identify good revision
helm history myapp --namespace production

# Output example:
# REVISION  STATUS      CHART        DESCRIPTION
# 1         superseded  myapp-1.0.0  Install complete
# 2         superseded  myapp-1.1.0  Upgrade complete
# 3         deployed    myapp-1.2.0  Upgrade "myapp" failed

# 3. Rollback to previous working revision (2)
helm rollback myapp 2 \
  --namespace production \
  --wait \
  --timeout 5m

# 4. Verify rollback
helm history myapp --namespace production

# 5. Verify application health
kubectl get pods -n production -l app.kubernetes.io/instance=myapp
helm status myapp --namespace production

Scenario 2: Stuck Release (pending-install/pending-upgrade)

Symptoms:

helm list -n production
# NAME   STATUS          CHART
# myapp  pending-upgrade myapp-1.0.0

Recovery Steps:

# 1. Check what's actually deployed
kubectl get all -n production -l app.kubernetes.io/instance=myapp

# 2. Check release history
helm history myapp --namespace production

# Option A: Rollback to previous working revision
helm rollback myapp <previous-working-revision> \
  --namespace production \
  --wait

# Option B: Force new upgrade to unstick
helm upgrade myapp ./chart \
  --namespace production \
  --force \
  --wait \
  --atomic

# Option C: If rollback fails, delete and reinstall
# WARNING: This will cause downtime
helm uninstall myapp --namespace production --keep-history
helm install myapp ./chart --namespace production --atomic

# 3. Verify recovery
helm status myapp --namespace production

For additional recovery scenarios (partial deployments, corrupted history, failed rollbacks, cascading failures), history management, atomic deployment patterns, recovery best practices, troubleshooting, and CI/CD integration, see REFERENCE.md.

Agentic Optimizations

Context Command
Release history (JSON) helm history <release> -n <ns> --output json
Release status (JSON) helm status <release> -n <ns> -o json
Revision values (JSON) helm get values <release> -n <ns> --revision <N> -o json
Pod status (compact) kubectl get pods -n <ns> -l app.kubernetes.io/instance=<release> -o wide
Helm secrets (list) kubectl get secrets -n <ns> -l owner=helm,name=<release> -o json

Related Skills

  • Helm Release Management - Install, upgrade operations
  • Helm Debugging - Troubleshooting deployment failures
  • Helm Values Management - Managing configuration
  • Kubernetes Operations - Managing deployed resources

References

Weekly Installs
50
GitHub Stars
13
First Seen
Jan 29, 2026
Installed on
github-copilot49
opencode49
amp48
codex48
kimi-cli48
gemini-cli48