Kubernetes Operations
Expert knowledge for Kubernetes cluster management, deployment, and troubleshooting with mastery of kubectl and cloud-native patterns.
Core Expertise
Kubernetes Operations
- Workload Management: Deployments, StatefulSets, DaemonSets, Jobs, and CronJobs
- Networking: Services, Ingress, NetworkPolicies, and DNS configuration
- Configuration & Storage: ConfigMaps, Secrets, PersistentVolumes, and PersistentVolumeClaims
- Troubleshooting: Debugging pods, analyzing logs, and inspecting cluster events
Cluster Operations Process
- Manifest First: Always prefer declarative YAML manifests for resource management
- Validate & Dry-Run: Use `kubectl apply --dry-run=client` to validate changes
- Inspect & Verify: After applying changes, verify with `kubectl get`, `kubectl describe`, `kubectl logs`
- Monitor Health: Continuously check status of nodes, pods, and services
- Clean Up: Ensure old or unused resources are properly garbage collected
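The process above can be sketched as a small shell helper. This is a minimal sketch under stated assumptions, not a definitive workflow: the function name `apply_and_verify`, the `KUBECTL` override variable, and the assumption that the manifest defines a Deployment are all illustrative and do not come from the original skill.

```shell
#!/usr/bin/env bash
# Hypothetical helper: validate, apply, then verify a Deployment manifest.
# Usage: apply_and_verify <manifest.yaml> <deployment-name>
apply_and_verify() {
  local manifest="$1" deployment="$2"
  # KUBECTL is an illustrative override (e.g. KUBECTL=echo previews commands).
  local kubectl="${KUBECTL:-kubectl}"

  # 1. Validate on the client side without changing the cluster.
  "$kubectl" apply --dry-run=client -f "$manifest" || return 1

  # 2. Apply the declarative manifest.
  "$kubectl" apply -f "$manifest" || return 1

  # 3. Verify: wait for the rollout, then list the pods.
  "$kubectl" rollout status "deployment/$deployment" || return 1
  "$kubectl" get pods
}
```

Calling `KUBECTL=echo apply_and_verify app.yaml web` prints the four commands instead of running them, which can be useful for reviewing what would be executed.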
Essential Commands
# Resource management
kubectl apply -f manifest.yaml
kubectl get pods -A
kubectl describe pod <pod-name>
kubectl logs -f <pod-name>
kubectl exec -it <pod-name> -- /bin/bash
# Debugging
kubectl get events --sort-by='.lastTimestamp'
kubectl top nodes
kubectl top pods --containers
kubectl port-forward <pod-name> 8080:80
# Deployment management
kubectl rollout status deployment/<name>
kubectl rollout history deployment/<name>
kubectl rollout undo deployment/<name>
# Cluster inspection
kubectl cluster-info
kubectl get nodes -o wide
kubectl api-resources
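The deployment management commands above can be combined into a rollback-on-failure step. The following is a sketch only: the function name `rollout_or_undo`, the default timeout of `120s`, and the `KUBECTL` override are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: wait for a rollout and undo it if it does not complete.
# Usage: rollout_or_undo <deployment-name> [timeout]
rollout_or_undo() {
  local deployment="$1" timeout="${2:-120s}"
  # KUBECTL is an illustrative override, defaulting to the real binary.
  local kubectl="${KUBECTL:-kubectl}"

  # rollout status exits non-zero if the rollout fails or the timeout expires.
  if ! "$kubectl" rollout status "deployment/$deployment" --timeout="$timeout"; then
    echo "rollout of $deployment did not complete; rolling back" >&2
    "$kubectl" rollout undo "deployment/$deployment"
    return 1
  fi
}
```

The helper returns non-zero after a rollback so that a calling script or CI job still registers the failed rollout.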
Key Debugging Patterns
Pod Debugging
# Pod inspection
kubectl describe pod <pod-name>
kubectl get pod <pod-name> -o yaml
kubectl logs <pod-name> --previous
# Interactive debugging
kubectl exec -it <pod-name> -- /bin/bash
kubectl debug <pod-name> -it --image=busybox
kubectl port-forward <pod-name> 8080:80
Networking Troubleshooting
# Service debugging
kubectl get svc -o wide
kubectl get endpoints
kubectl describe svc <service>
# Network connectivity
kubectl run test-pod --image=busybox -it --rm -- sh
# Inside pod: nslookup, wget, nc commands
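As one way to make the connectivity check above repeatable, the sketch below builds a Service's in-cluster DNS name and resolves it from a throwaway busybox pod. The function names, the pod name `dns-test`, and the `KUBECTL` override are hypothetical, and the code assumes the default cluster domain `cluster.local`, which a cluster may override.

```shell
#!/usr/bin/env bash
# Build the in-cluster DNS name for a Service.
# Assumes the default cluster domain "cluster.local".
svc_fqdn() {
  local service="$1" namespace="${2:-default}"
  printf '%s.%s.svc.cluster.local\n' "$service" "$namespace"
}

# Hypothetical one-shot check: resolve the Service name from a temporary pod
# that is deleted when the command exits (--rm).
check_service_dns() {
  local service="$1" namespace="${2:-default}"
  "${KUBECTL:-kubectl}" run dns-test --image=busybox --rm -i --restart=Never \
    -- nslookup "$(svc_fqdn "$service" "$namespace")"
}
```

For example, `check_service_dns web prod` would look up `web.prod.svc.cluster.local`. A failed lookup points toward DNS or Service configuration rather than the application itself.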
Common Issues
# CrashLoopBackOff debugging
kubectl logs <pod> --previous
kubectl describe pod <pod>
kubectl get events --field-selector involvedObject.name=<pod>
# Resource constraints
kubectl top pod <pod>
kubectl describe pod <pod> | grep -A 5 Limits
# Resource state inspection
kubectl get all -n <ns>
kubectl get <resource> <name> -o yaml
Best Practices
Context Safety (CRITICAL)
- Always specify `--context` explicitly in every kubectl command
- Never rely on the current context - it may have been changed by another process
- Use the `kubectl --context=<context-name> get pods` format for all operations
- This prevents accidental operations on the wrong cluster (e.g., running production commands against staging)
# CORRECT: Explicit context
kubectl --context=gke_myproject_us-central1_prod get pods
kubectl --context=staging-cluster apply -f deployment.yaml
# WRONG: Relying on current context
kubectl get pods # Which cluster is this targeting?
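One possible way to enforce the explicit-context rule is a thin wrapper that refuses to run without a context argument. This is a sketch, not part of the original skill: the function name `kctx` and the `KUBECTL` override variable are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: refuses to run kubectl unless a context is named.
# Usage: kctx <context-name> <kubectl arguments...>
kctx() {
  if [ "$#" -lt 2 ]; then
    echo "usage: kctx <context-name> <kubectl arguments...>" >&2
    return 2
  fi
  local context="$1"
  shift
  # KUBECTL is an illustrative override, defaulting to the real binary.
  "${KUBECTL:-kubectl}" --context="$context" "$@"
}
```

With this in place, `kctx staging-cluster apply -f deployment.yaml` expands to `kubectl --context=staging-cluster apply -f deployment.yaml`, while `kctx get` fails with a usage message because it has no context and command pair.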
Resource Definitions
- Use declarative YAML manifests
- Implement proper labels and selectors
- Define resource requests and limits
- Configure health checks (liveness/readiness probes)
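The points above can be illustrated with a single manifest. The snippet below writes a sample Deployment to `deployment.yaml`. Every name, image tag, port, probe path, and resource value in it is a placeholder assumption chosen for illustration, not a recommendation from the original skill.

```shell
#!/usr/bin/env bash
# Write an illustrative Deployment manifest (all values are placeholders).
cat > deployment.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
  labels:
    app: web
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web            # must match the pod template labels below
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.27
          ports:
            - containerPort: 80
          resources:
            requests:       # what the scheduler reserves for the container
              cpu: 100m
              memory: 128Mi
            limits:         # hard ceiling enforced at runtime
              cpu: 500m
              memory: 256Mi
          readinessProbe:   # gates traffic until the container responds
            httpGet:
              path: /
              port: 80
            initialDelaySeconds: 5
            periodSeconds: 10
          livenessProbe:    # restarts the container if it stops responding
            httpGet:
              path: /
              port: 80
            initialDelaySeconds: 15
            periodSeconds: 20
EOF

# Validate before applying:
#   kubectl apply --dry-run=client -f deployment.yaml
```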
Security
- Use NetworkPolicies to restrict traffic
- Implement RBAC for access control
- Store sensitive data in Secrets
- Run containers as non-root users
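As a starting point for the NetworkPolicy item above, the sketch below writes a default-deny ingress policy to a file. The file name and the namespace `my-namespace` are placeholders. NetworkPolicies only take effect when the cluster's network plugin enforces them.

```shell
#!/usr/bin/env bash
# Write an illustrative default-deny ingress policy (names are placeholders).
cat > default-deny-ingress.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: my-namespace
spec:
  podSelector: {}     # empty selector: applies to every pod in the namespace
  policyTypes:
    - Ingress         # no ingress rules are listed, so all ingress is denied
EOF

# Validate before applying:
#   kubectl apply --dry-run=client -f default-deny-ingress.yaml
```

Additional policies that allow specific traffic can then be layered on top, since NetworkPolicies are additive.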
Monitoring
- Configure proper logging and metrics
- Set up alerts for critical conditions
- Use health checks and readiness probes
- Monitor resource usage and quotas
Agentic Optimizations
| Context | Command |
|---|---|
| Pod status (structured) | kubectl get pods -n <ns> -o json \| jq '.items[] \| {name:.metadata.name, status:.status.phase}' |
| Quick overview | kubectl get pods -n <ns> -o wide |
| Events (compact) | kubectl get events -n <ns> --sort-by='.lastTimestamp' -o json |
| Resource details | kubectl get <resource> -o json |
| Logs (bounded) | kubectl logs <pod> -n <ns> --tail=50 |
For detailed debugging commands, troubleshooting patterns, Helm workflows, and advanced K8s operations, see REFERENCE.md.
Repository
laurigates/clau…-plugins
GitHub Stars
13
First Seen
Jan 29, 2026