kubespray-certificates
Kubespray Certificate Management
Overview
Kubernetes uses TLS certificates for all internal communication. Kubespray-deployed clusters have certificates that expire after 1 year by default. Expired certificates cause immediate cluster failure.
Core principle: Enable auto-renewal during deployment. If certificates expire, manual renewal is required and time-sensitive.
When to Use
- Checking certificate expiration dates
- Enabling certificate auto-renewal
- Manually renewing expired certificates
- Troubleshooting x509 certificate errors
Not for: Initial deployment (use kubespray-deployment), cluster upgrades (use kubespray-operations)
Certificate Types
| Certificate | Location | Purpose |
|---|---|---|
| CA cert | /etc/kubernetes/pki/ca.crt |
Signs all other certs |
| API server | /etc/kubernetes/pki/apiserver.crt |
API server identity |
| kubelet client | /etc/kubernetes/pki/apiserver-kubelet-client.crt |
API → kubelet auth |
| etcd CA | /etc/ssl/etcd/ssl/ca.pem |
Signs etcd certs |
| etcd member | /etc/ssl/etcd/ssl/member-*.pem |
etcd peer auth |
Checking Expiration
Using kubeadm
kubeadm certs check-expiration
Output shows all certificates with expiration dates.
Using openssl
# Check specific certificate
openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -dates
# Check etcd certificate
openssl x509 -in /etc/ssl/etcd/ssl/member-$(hostname).pem -noout -dates
# One-liner for all K8s PKI certs
for cert in /etc/kubernetes/pki/*.crt; do
echo "=== $cert ==="
openssl x509 -in "$cert" -noout -dates
done
Check Certificate Details
openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -text | grep -A2 Validity
Enabling Auto-Renewal (Recommended)
During Initial Deployment
Add to inventory/mycluster/group_vars/k8s_cluster/k8s-cluster.yml:
auto_renew_certificates: true
auto_renew_certificates_systemd_calendar: "Mon *-*-1,15 03:00:00"
This creates a systemd timer that runs twice monthly.
Verify Timer is Active
systemctl list-timers | grep cert
# Shows k8s-cert-renew.timer
systemctl status k8s-cert-renew.timer
What Auto-Renewal Does
The timer runs:
kubeadm certs renew all
Then restarts control plane components via kubelet.
Manual Certificate Renewal
When Auto-Renewal is Not Configured
Run Kubespray's certificate renewal playbook:
cd kubespray
ansible-playbook -i inventory/mycluster/inventory.ini cluster.yml \
--tags=k8s-secrets,kubernetes/control-plane \
-b
Using kubeadm Directly
On each control plane node:
# Renew all certificates
kubeadm certs renew all
# Restart control plane (method depends on deployment)
# For static pods:
mv /etc/kubernetes/manifests/*.yaml /tmp/
sleep 10
mv /tmp/*.yaml /etc/kubernetes/manifests/
# Verify
kubectl get nodes
Renewing Specific Certificates
kubeadm certs renew apiserver
kubeadm certs renew apiserver-kubelet-client
kubeadm certs renew front-proxy-client
kubeadm certs renew admin.conf
etcd Certificate Renewal
etcd certificates are separate from Kubernetes certificates.
Check etcd Certificate Expiration
openssl x509 -in /etc/ssl/etcd/ssl/member-$(hostname).pem -noout -dates
openssl x509 -in /etc/ssl/etcd/ssl/admin-$(hostname).pem -noout -dates
Renew etcd Certificates via Kubespray
ansible-playbook -i inventory/mycluster/inventory.ini cluster.yml \
--tags=etcd \
-b
Troubleshooting Certificate Issues
Error: x509 certificate has expired
x509: certificate has expired or is not yet valid
Immediate fix:
# On control plane
kubeadm certs renew all
# Restart kubelet
systemctl restart kubelet
# Wait for static pods
sleep 30
kubectl get nodes
Error: certificate signed by unknown authority
x509: certificate signed by unknown authority
Indicates CA mismatch - fundamental trust chain issue. Options:
- Full reset - Use
reset.ymland redeploy (loses all data) - CA rotation - Advanced procedure, see Kubernetes CA rotation docs
Warning: CA rotation is complex and can break the cluster if done incorrectly. Test in non-production first.
kubectl: Unable to connect to server
If certificates expired and kubectl fails:
# Renew admin.conf certificate
kubeadm certs renew admin.conf
# Copy new kubeconfig
cp /etc/kubernetes/admin.conf ~/.kube/config
Certificate Validity Period
Kubespray default: 1 year for most certificates, 10 years for CA.
To change (before deployment):
# group_vars/k8s_cluster/k8s-cluster.yml
kube_cert_validity_duration: 365 # days
Systemd Timer Configuration
# /etc/systemd/system/k8s-cert-renew.timer
[Unit]
Description=Kubernetes Certificate Renewal
[Timer]
OnCalendar=Mon *-*-1,15 03:00:00
Persistent=true
[Install]
WantedBy=timers.target
# /etc/systemd/system/k8s-cert-renew.service
[Unit]
Description=Renew Kubernetes certificates
[Service]
Type=oneshot
ExecStart=/usr/local/bin/kube-scripts/k8s-certs-renew.sh
Monitoring Certificate Expiration
Prometheus Alert Rule
groups:
- name: kubernetes-certificates
rules:
- alert: KubernetesCertificateExpiringSoon
expr: apiserver_client_certificate_expiration_seconds_bucket{le="604800"} > 0
labels:
severity: warning
annotations:
summary: "Kubernetes certificate expiring within 7 days"
Simple Monitoring Script
#!/bin/bash
# cert-check.sh - Run via cron, alert if < 30 days remaining
for cert in /etc/kubernetes/pki/*.crt; do
expiry=$(openssl x509 -in "$cert" -noout -enddate | cut -d= -f2)
expiry_epoch=$(date -d "$expiry" +%s)
now_epoch=$(date +%s)
days_left=$(( (expiry_epoch - now_epoch) / 86400 ))
if [ $days_left -lt 30 ]; then
echo "WARNING: $cert expires in $days_left days"
fi
done
Common Mistakes
| Mistake | Consequence |
|---|---|
Not enabling auto_renew_certificates |
Certs expire, cluster down |
| Forgetting etcd certs | etcd fails, cluster unrecoverable |
| Not restarting components after renewal | Old certs still in use |
| Renewing only some certs | Mismatched certs, auth failures |
| Ignoring expiration warnings | Scrambling at midnight when cluster dies |
Quick Reference
# Check all expirations
kubeadm certs check-expiration
# Renew all certs
kubeadm certs renew all
# Enable auto-renewal in Kubespray
auto_renew_certificates: true
# Verify timer
systemctl list-timers | grep cert
More from sigridjineth/kubespray-skills
rke2-operations
Use when managing RKE2 cluster certificates, performing manual or automated version upgrades, rotating TLS certificates, deploying the System Upgrade Controller, or troubleshooting RKE2 certificate and upgrade errors. Use when seeing "x509 certificate has expired" or "CertificateExpirationWarning" events or "Job has reached the specified backoff limit" errors.
3rke2-deployment
Use when deploying Kubernetes clusters with RKE2 (Rancher Kubernetes Engine 2), configuring server and agent nodes, managing built-in Helm chart addons, or setting up CIS-hardened clusters. Use when seeing "rke2-server failed to start" or "unable to join cluster" errors.
3kubeadm-troubleshooting
Use when kubeadm init fails, join fails, nodes show NotReady, pods stuck Pending, certificate errors, or kubelet crashlooping
3kubeadm-init
Use when initializing a Kubernetes control plane with kubeadm, setting up certificates, static pods, or troubleshooting init failures
2cluster-api
Use when managing Kubernetes clusters as Kubernetes resources with Cluster API (CAPI), provisioning workload clusters from a management cluster, performing declarative upgrades, or working with ClusterClass blueprints. Use when seeing "failed to connect to management cluster" or clusterctl errors.
2kubespray-airgap
Use when deploying Kubernetes in air-gapped or offline environments using kubespray-offline tool, setting up private container registries, staging binaries and images for offline use, configuring containerd registry mirrors, or troubleshooting image pull failures in isolated networks.
2