Kubectl Skill
This skill enables comprehensive Kubernetes cluster management using kubectl and related tools.
Environment
Cluster Aliases
Three cluster/namespace combinations are pre-configured:
| Alias | Cluster | Namespace | Purpose |
|---|---|---|---|
| k1 | AWS EKS Production | production | Production environment |
| k2 | AWS EKS Production | staging | Staging (pre-release) environment |
| k | K3s (192.168.10.117) | simplex | Local development environment |
Usage:
k1 get pods # List pods in the production environment
k2 get pods # List pods in the staging environment
k get pods # List pods in the local environment
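The aliases themselves are not defined in this document. One way they could be set up is sketched below, as shell functions that wrap kubectl's `--context` and `--namespace` flags. The context names `eks-production` and `k3s-local` are hypothetical placeholders; substitute the names reported by `kubectl config get-contexts`. Functions are used instead of `alias` because aliases are not expanded in non-interactive scripts by default.

```shell
# Hypothetical definitions for the three aliases.
# The context names below are assumptions, not taken from this skill.
k1() { kubectl --context eks-production --namespace production "$@"; }
k2() { kubectl --context eks-production --namespace staging "$@"; }
k()  { kubectl --context k3s-local --namespace simplex "$@"; }
```

With these in place, `k1 get pods` expands to `kubectl --context eks-production --namespace production get pods`.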
Additional Tools
- kubectx: Switch between clusters
- kubens: Switch between namespaces
- argocd: GitOps deployments (see separate skill)
- kargo: Progressive delivery (see separate skill)
Safety Protocol
Dangerous Operations Requiring Confirmation
Before executing any of the following operations, explicitly confirm with the user:
- Delete operations: delete pod, delete deployment, delete service, delete pvc
- Scale to zero: scale --replicas=0
- Production modifications: Any k1 command that modifies resources
- Drain/cordon nodes: drain, cordon, uncordon
- Apply/patch: Changes to production resources
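The rules above could be encoded as a small pre-flight check. The function below is a minimal sketch: its name, `needs_confirmation`, is hypothetical, and the list of production verbs treated as modifying is an illustrative assumption rather than an exhaustive one.

```shell
# Return 0 (true) when a command should be confirmed with the user first.
# Usage: needs_confirmation <alias> <verb> [args...]
needs_confirmation() {
  local env_alias="$1" verb="$2"
  shift 2
  case "$verb" in
    delete|drain|cordon|uncordon)
      return 0 ;;
    scale)
      # Scaling to zero is treated as dangerous in every environment.
      case " $* " in
        *" --replicas=0 "*|*" --replicas 0 "*) return 0 ;;
      esac ;;
  esac
  # In production (k1), any verb that modifies resources needs confirmation.
  if [ "$env_alias" = "k1" ]; then
    case "$verb" in
      apply|patch|edit|replace|create|scale|set|label|annotate|autoscale|expose)
        return 0 ;;
      rollout)
        # Read-only rollout subcommands do not modify anything.
        case "${1:-}" in
          status|history) return 1 ;;
          *) return 0 ;;
        esac ;;
    esac
  fi
  return 1
}
```

For example, `needs_confirmation k2 scale deployment/api --replicas=0` succeeds, while `needs_confirmation k get pods` does not.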
Confirmation Format
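⚠️ Dangerous operation confirmation
Environment: [Production/Staging/Local]
Operation: [description of the specific operation]
Resource: [affected resources]
Impact: [description of the potential impact]
Proceed?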
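In an interactive terminal, the same four fields could be printed and the answer read from standard input. This is a sketch only; the function name `confirm_dangerous` and the `[y/N]` convention are assumptions, and an agent would normally ask the user in conversation instead.

```shell
# Print the confirmation fields and succeed only on an explicit "yes".
# Usage: confirm_dangerous <environment> <operation> <resource> <impact>
confirm_dangerous() {
  local answer
  {
    printf 'Dangerous operation confirmation\n'
    printf 'Environment: %s\n' "$1"
    printf 'Operation: %s\n' "$2"
    printf 'Resource: %s\n' "$3"
    printf 'Impact: %s\n' "$4"
    printf 'Proceed? [y/N] '
  } >&2
  # No input (for example a closed stdin) counts as "no".
  read -r answer || return 1
  case "$answer" in
    y|Y|yes|YES) return 0 ;;
    *) return 1 ;;
  esac
}
```

A typical call would be `confirm_dangerous Production "delete pod" "pod/<pod-name>" "brief downtime" && k1 delete pod <pod-name>`.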
Common Operations Reference
Resource Viewing
Pods
# List pods with status
k1 get pods
k1 get pods -o wide # Include node and IP info
k1 get pods --show-labels # Show labels
k1 get pods -l app=simplex-api # Filter by label
# Pod details
k1 describe pod <pod-name>
# Watch pods in real-time
k1 get pods -w
Deployments
# List deployments
k1 get deployments
k1 get deploy -o wide
# Deployment details
k1 describe deployment <name>
# Rollout status
k1 rollout status deployment/<name>
# Rollout history
k1 rollout history deployment/<name>
Services & Endpoints
# List services
k1 get services
k1 get svc
# Service details with endpoints
k1 describe svc <name>
k1 get endpoints <name>
All Resources
# Get all common resources
k1 get all
# Get specific resource types
k1 get pods,svc,deploy
# Get all resources with labels
k1 get all -l app=simplex-api
Logs & Debugging
Viewing Logs
# Basic logs
k1 logs <pod-name>
# Follow logs (streaming)
k1 logs -f <pod-name>
# Last N lines
k1 logs --tail=100 <pod-name>
# Logs since time
k1 logs --since=1h <pod-name>
k1 logs --since=10m <pod-name>
# Previous container logs (after restart)
k1 logs --previous <pod-name>
# Multi-container pod
k1 logs <pod-name> -c <container-name>
# All containers in pod
k1 logs <pod-name> --all-containers=true
Executing Commands
# Execute command in container
k1 exec <pod-name> -- <command>
# Interactive shell
k1 exec -it <pod-name> -- /bin/sh
k1 exec -it <pod-name> -- /bin/bash
# Specific container in multi-container pod
k1 exec -it <pod-name> -c <container> -- /bin/sh
Debugging
# Pod events and status
k1 describe pod <pod-name>
# Get pod YAML
k1 get pod <pod-name> -o yaml
# Debug with ephemeral container
k1 debug <pod-name> -it --image=busybox
# Check resource usage
k1 top pods
k1 top nodes
Deployment Management
Scaling
# Scale deployment
k1 scale deployment/<name> --replicas=3
# Autoscale
k1 autoscale deployment/<name> --min=2 --max=5 --cpu-percent=80
Rolling Updates
# Update image
k1 set image deployment/<name> <container>=<image>:<tag>
# Rollout status
k1 rollout status deployment/<name>
# Pause/resume rollout
k1 rollout pause deployment/<name>
k1 rollout resume deployment/<name>
# Rollback
k1 rollout undo deployment/<name>
k1 rollout undo deployment/<name> --to-revision=2
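The update, status, and rollback commands above can be chained so that a failed rollout is reverted automatically. The sketch below assumes `k1` is callable as a shell function or executable; the name `safe_set_image` and the 120s default timeout are illustrative choices. Because it modifies production, it falls under the Safety Protocol and should only run after confirmation.

```shell
# Update a container image, wait for the rollout, and undo it on failure.
# Usage: safe_set_image <deployment> <container> <image:tag> [timeout]
safe_set_image() {
  local deploy="$1" container="$2" image="$3" timeout="${4:-120s}"
  k1 set image "deployment/$deploy" "$container=$image" || return 1
  # rollout status exits non-zero if the rollout does not finish in time.
  if ! k1 rollout status "deployment/$deploy" --timeout="$timeout"; then
    echo "Rollout of $deploy failed; rolling back" >&2
    k1 rollout undo "deployment/$deploy"
    return 1
  fi
}
```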
Restart
# Restart deployment (rolling restart)
k1 rollout restart deployment/<name>
Configuration Resources
ConfigMaps
# List ConfigMaps
k1 get configmaps
k1 get cm
# View ConfigMap content
k1 describe cm <name>
k1 get cm <name> -o yaml
# Create from file
k1 create configmap <name> --from-file=<path>
# Create from literal
k1 create configmap <name> --from-literal=key=value
Secrets
# List Secrets
k1 get secrets
# View Secret (base64 encoded)
k1 get secret <name> -o yaml
# Decode Secret value
k1 get secret <name> -o jsonpath='{.data.password}' | base64 -d
# Create Secret
k1 create secret generic <name> --from-literal=password=xxx
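The decode command above hard-codes the key `password`. A small wrapper can take the key as a parameter; the sketch below (the name `secret_value` is hypothetical) also escapes dots, since a key such as `tls.crt` would otherwise be read by JSONPath as two nested fields. It assumes `k1` is callable as a shell function or executable.

```shell
# Print the decoded value of one key in a Secret.
# Usage: secret_value <secret-name> <key>
secret_value() {
  local name="$1" key="$2" escaped
  # Escape dots so the key is treated as a single JSONPath field.
  escaped=$(printf '%s' "$key" | sed 's/\./\\./g')
  k1 get secret "$name" -o "jsonpath={.data.$escaped}" | base64 -d
}
```

For example, `secret_value <name> tls.crt` queries `{.data.tls\.crt}`.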
PersistentVolumeClaims
# List PVCs
k1 get pvc
# PVC details
k1 describe pvc <name>
Network Operations
Port Forwarding
# Forward local port to pod
k1 port-forward pod/<name> 8080:80
# Forward to service
k1 port-forward svc/<name> 8080:80
# Background port-forward
k1 port-forward pod/<name> 8080:80 &
Service Exposure
# Expose deployment as service
k1 expose deployment/<name> --port=80 --target-port=8080
# Get service external IP
k1 get svc <name> -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'
Cluster Management
Nodes
# List nodes
k1 get nodes
k1 get nodes -o wide
# Node details
k1 describe node <name>
# Node resource usage
k1 top nodes
Namespaces
# List namespaces
k1 get namespaces
# Switch namespace (using kubens)
kubens <namespace>
# Create namespace
k1 create namespace <name>
Context Management
# List contexts
kubectx
# Switch context
kubectx <context-name>
# Show current context
kubectl config current-context
Resource Monitoring
# Pod resource usage
k1 top pods
k1 top pods --sort-by=cpu
k1 top pods --sort-by=memory
# Node resource usage
k1 top nodes
# HPA status
k1 get hpa
k1 describe hpa <name>
Output Formatting
For Status Checks
Provide concise summaries:
✅ Pod status (production)
┌──────────────────────────┬─────────┬──────────┬─────────┐
│ Pod                      │ Status  │ Restarts │ Age     │
├──────────────────────────┼─────────┼──────────┼─────────┤
│ simplex-api-xxx-abc      │ Running │ 0        │ 2d      │
│ simplex-api-xxx-def      │ Running │ 0        │ 2d      │
└──────────────────────────┴─────────┴──────────┴─────────┘
For Troubleshooting
When investigating issues, gather:
- Pod status: k1 get pod <name>
- Pod events: k1 describe pod <name>
- Recent logs: k1 logs --tail=50 <name>
- Resource usage: k1 top pod <name>
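The four checks above can be collected in one pass. The helper below is a sketch with a hypothetical name, `gather_pod_diagnostics`; it assumes `k1` is callable as a shell function or executable, and it tolerates a missing metrics source because `top` only works when one (such as metrics-server) is installed.

```shell
# Run the four troubleshooting checks for one pod, with section headers.
# Usage: gather_pod_diagnostics <pod-name>
gather_pod_diagnostics() {
  local pod="$1"
  echo "== Pod status =="
  k1 get pod "$pod"
  echo "== Pod events =="
  k1 describe pod "$pod"
  echo "== Recent logs =="
  k1 logs --tail=50 "$pod"
  echo "== Resource usage =="
  # top needs a metrics source in the cluster; do not abort if it is absent.
  k1 top pod "$pod" || echo "(metrics unavailable)"
}
```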
Custom Output Formats
# JSON output
k1 get pods -o json
# YAML output
k1 get pod <name> -o yaml
# Custom columns
k1 get pods -o custom-columns=NAME:.metadata.name,STATUS:.status.phase
# JSONPath
k1 get pods -o jsonpath='{.items[*].metadata.name}'
Troubleshooting Workflows
Pod Not Starting
- Check pod status: k1 get pod <name>
- Check events: k1 describe pod <name> (look at the Events section)
- Check logs: k1 logs <name> or k1 logs --previous <name>
- Common issues:
  - ImagePullBackOff: Check image name and registry credentials
  - CrashLoopBackOff: Check application logs
  - Pending: Check resource requests and node capacity
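The mapping from status to next step can be written as a lookup. In this sketch the function name `pod_start_hint` is hypothetical, and `ErrImagePull` is added as an assumption because it commonly precedes `ImagePullBackOff`; the hints themselves are the ones listed above.

```shell
# Map a pod status or waiting reason to the suggested next check.
# Usage: pod_start_hint <reason>
pod_start_hint() {
  case "$1" in
    ImagePullBackOff|ErrImagePull)
      echo "Check image name and registry credentials" ;;
    CrashLoopBackOff)
      echo "Check application logs" ;;
    Pending)
      echo "Check resource requests and node capacity" ;;
    *)
      echo "No specific hint for '$1'; review the Events section" ;;
  esac
}
```

The reason can be read from the STATUS column of `k1 get pod <name>`, or queried with `-o jsonpath='{.status.containerStatuses[*].state.waiting.reason}'`.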
High Resource Usage
- Check pod usage: k1 top pods --sort-by=memory
- Check node usage: k1 top nodes
- Check HPA status: k1 get hpa
- Consider scaling: k1 scale deployment/<name> --replicas=N
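To narrow the `top pods` output to the heavy consumers, the table can be filtered with awk. This is a sketch under two assumptions: the output has a header row followed by NAME, CPU, and MEMORY columns, and memory is reported in Mi (for example `512Mi`). The function name `pods_over_memory` is hypothetical.

```shell
# Read `top pods` output on stdin and print pods above a memory threshold.
# Usage: k1 top pods | pods_over_memory <limit-in-Mi>
pods_over_memory() {
  # NR > 1 skips the header row; rows not reported in Mi are ignored.
  awk -v limit="$1" 'NR > 1 && $3 ~ /Mi$/ {
    mem = $3
    sub(/Mi$/, "", mem)
    if (mem + 0 > limit + 0) print $1
  }'
}
```

Do not combine this with `--no-headers`, because the first row is always skipped.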
Service Not Accessible
- Check service: k1 get svc <name>
- Check endpoints: k1 get endpoints <name>
- Check pod labels match service selector
- Test from within cluster: k1 exec -it <pod> -- curl <service>:<port>
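The label check has no command listed above. One possible approach, sketched below, reads the Service's selector with a Go template and then lists the pods that match it. The function name `service_pods` is hypothetical, and `k1` is assumed to be callable as a shell function or executable. An empty result suggests the selector and the pod labels disagree.

```shell
# List the pods matched by a Service's selector.
# Usage: service_pods <service-name>
service_pods() {
  local svc="$1" selector
  # Render the selector map as "key=value,key=value,".
  selector=$(k1 get svc "$svc" \
    -o go-template='{{range $k, $v := .spec.selector}}{{$k}}={{$v}},{{end}}')
  selector=${selector%,}  # drop the trailing comma
  if [ -z "$selector" ]; then
    echo "Service $svc has no selector" >&2
    return 1
  fi
  k1 get pods -l "$selector" --show-labels
}
```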
Integration Notes
For GitOps operations (deployments via git), use the ArgoCD and Kargo skills:
- ArgoCD: Application sync, rollback, status
- Kargo: Progressive delivery, freight promotion
For AWS infrastructure operations, use the AWS CLI skill.
Repository: oldwinter/skills
GitHub Stars: 1
First Seen: Jan 20, 2026