kubernetes-specialist
SKILL.md
Kubernetes Specialist
When to Use This Skill
- Deploying workloads (Deployments, StatefulSets, DaemonSets, Jobs)
- Configuring networking (Services, Ingress, NetworkPolicies)
- Managing configuration (ConfigMaps, Secrets, environment variables)
- Setting up persistent storage (PV, PVC, StorageClasses)
- Creating Helm charts for application packaging
- Troubleshooting cluster and workload issues
- Implementing security best practices
Core Workflow
- Analyze requirements — Understand workload characteristics, scaling needs, security requirements
- Design architecture — Choose workload types, networking patterns, storage solutions
- Implement manifests — Create declarative YAML with proper resource limits, health checks
- Secure — Apply RBAC, NetworkPolicies, Pod Security Standards, least privilege
- Validate — Run
kubectl rollout status,kubectl get pods -w, andkubectl describe pod <name>to confirm health; roll back withkubectl rollout undoif needed
Reference Guide
Load detailed guidance based on context:
| Topic | Reference | Load When |
|---|---|---|
| Workloads | references/workloads.md |
Deployments, StatefulSets, DaemonSets, Jobs, CronJobs |
| Networking | references/networking.md |
Services, Ingress, NetworkPolicies, DNS |
| Configuration | references/configuration.md |
ConfigMaps, Secrets, environment variables |
| Storage | references/storage.md |
PV, PVC, StorageClasses, CSI drivers |
| Helm Charts | references/helm-charts.md |
Chart structure, values, templates, hooks, testing, repositories |
| Troubleshooting | references/troubleshooting.md |
kubectl debug, logs, events, common issues |
| Custom Operators | references/custom-operators.md |
CRD, Operator SDK, controller-runtime, reconciliation |
| Service Mesh | references/service-mesh.md |
Istio, Linkerd, traffic management, mTLS, canary |
| GitOps | references/gitops.md |
ArgoCD, Flux, progressive delivery, sealed secrets |
| Cost Optimization | references/cost-optimization.md |
VPA, HPA tuning, spot instances, quotas, right-sizing |
| Multi-Cluster | references/multi-cluster.md |
Cluster API, federation, cross-cluster networking, DR |
Constraints
MUST DO
- Use declarative YAML manifests (avoid imperative kubectl commands)
- Set resource requests and limits on all containers
- Include liveness and readiness probes
- Use secrets for sensitive data (never hardcode credentials)
- Apply least privilege RBAC permissions
- Implement NetworkPolicies for network segmentation
- Use namespaces for logical isolation
- Label resources consistently for organization
- Document configuration decisions in annotations
MUST NOT DO
- Deploy to production without resource limits
- Store secrets in ConfigMaps or as plain environment variables
- Use default ServiceAccount for application pods
- Allow unrestricted network access (default allow-all)
- Run containers as root without justification
- Skip health checks (liveness/readiness probes)
- Use latest tag for production images
- Expose unnecessary ports or services
Common YAML Patterns
Deployment with resource limits, probes, and security context
apiVersion: apps/v1
kind: Deployment
metadata:
name: my-app
namespace: my-namespace
labels:
app: my-app
version: "1.2.3"
spec:
replicas: 3
selector:
matchLabels:
app: my-app
template:
metadata:
labels:
app: my-app
version: "1.2.3"
spec:
serviceAccountName: my-app-sa # never use default SA
securityContext:
runAsNonRoot: true
runAsUser: 1000
fsGroup: 2000
containers:
- name: my-app
image: my-registry/my-app:1.2.3 # never use latest
ports:
- containerPort: 8080
resources:
requests:
cpu: "100m"
memory: "128Mi"
limits:
cpu: "500m"
memory: "512Mi"
livenessProbe:
httpGet:
path: /healthz
port: 8080
initialDelaySeconds: 15
periodSeconds: 20
readinessProbe:
httpGet:
path: /ready
port: 8080
initialDelaySeconds: 5
periodSeconds: 10
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
capabilities:
drop: ["ALL"]
envFrom:
- secretRef:
name: my-app-secret # pull credentials from Secret, not ConfigMap
Minimal RBAC (least privilege)
apiVersion: v1
kind: ServiceAccount
metadata:
name: my-app-sa
namespace: my-namespace
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: my-app-role
namespace: my-namespace
rules:
- apiGroups: [""]
resources: ["configmaps"]
verbs: ["get", "list"] # grant only what is needed
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: my-app-rolebinding
namespace: my-namespace
subjects:
- kind: ServiceAccount
name: my-app-sa
namespace: my-namespace
roleRef:
kind: Role
name: my-app-role
apiGroup: rbac.authorization.k8s.io
NetworkPolicy (default-deny + explicit allow)
# Deny all ingress and egress by default
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: default-deny-all
namespace: my-namespace
spec:
podSelector: {}
policyTypes: ["Ingress", "Egress"]
---
# Allow only specific traffic
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: allow-my-app
namespace: my-namespace
spec:
podSelector:
matchLabels:
app: my-app
policyTypes: ["Ingress"]
ingress:
- from:
- podSelector:
matchLabels:
app: frontend
ports:
- protocol: TCP
port: 8080
Validation Commands
After deploying, verify health and security posture:
# Watch rollout complete
kubectl rollout status deployment/my-app -n my-namespace
# Stream pod events to catch crash loops or image pull errors
kubectl get pods -n my-namespace -w
# Inspect a specific pod for failures
kubectl describe pod <pod-name> -n my-namespace
# Check container logs
kubectl logs <pod-name> -n my-namespace --previous # use --previous for crashed containers
# Verify resource usage vs. limits
kubectl top pods -n my-namespace
# Audit RBAC permissions for a service account
kubectl auth can-i --list --as=system:serviceaccount:my-namespace:my-app-sa
# Roll back a failed deployment
kubectl rollout undo deployment/my-app -n my-namespace
Output Templates
When implementing Kubernetes resources, provide:
- Complete YAML manifests with proper structure
- RBAC configuration if needed (ServiceAccount, Role, RoleBinding)
- NetworkPolicy for network isolation
- Brief explanation of design decisions and security considerations
Weekly Installs
2.7K
Repository
jeffallan/claude-skillsGitHub Stars
6.6K
First Seen
Jan 21, 2026
Security Audits
Installed on
opencode2.3K
gemini-cli2.2K
codex2.2K
github-copilot2.2K
amp2.0K
kimi-cli2.0K