k8s
Accessing Clusters
The available clusters are dev, integration, and live. Use the corresponding kube context to access each one.
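As a minimal sketch of context selection, the wrapper below rejects anything other than the three cluster names before calling kubectl. The function name `kube_on` is hypothetical, and the sketch assumes each kube context is named exactly after its cluster, which this document implies through `--context <cluster>` but does not state outright.

```shell
# Hypothetical helper: run kubectl against one of the known clusters.
# Assumption: kube context names match the cluster names exactly.
kube_on() {
  cluster="$1"
  if [ -z "$cluster" ]; then
    echo "usage: kube_on <dev|integration|live> <kubectl args...>" >&2
    return 2
  fi
  shift
  case "$cluster" in
    dev|integration|live) ;;
    *)
      echo "unknown cluster: $cluster" >&2
      return 2
      ;;
  esac
  # Pass the remaining arguments through unchanged.
  kubectl --context "$cluster" "$@"
}
```

For example, `kube_on integration get pods -A` lists pods on the integration cluster, while a mistyped cluster name fails before any request is sent.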
Common kubectl Patterns
| Command | Purpose |
|---|---|
| `kubectl get pods -n <ns>` | List pods in a namespace |
| `kubectl get pods -A` | List pods across all namespaces |
| `kubectl describe pod <pod> -n <ns>` | Detailed pod info with events |
| `kubectl logs <pod> -n <ns> --tail=100` | Recent logs from a pod |
| `kubectl logs <pod> -n <ns> --previous` | Logs from previous container instance |
| `kubectl get events -n <ns> --sort-by='.lastTimestamp'` | Recent events timeline |
| `kubectl top pods -n <ns>` | CPU/memory usage per pod |
| `kubectl top nodes` | CPU/memory usage per node |
| `kubectl get ns <ns> --show-labels` | Namespace labels (network policy profiles) |
| `kubectl explain <resource>` | API schema reference for a resource type |
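Several of these commands are often run together when a single pod misbehaves. The sketch below chains them in one plausible order; the function name `pod_triage` is hypothetical, and the ordering is an assumption rather than a prescribed procedure.

```shell
# Hypothetical helper: collect the usual first-look data for one pod.
pod_triage() {
  ns="$1"
  pod="$2"
  if [ -z "$ns" ] || [ -z "$pod" ]; then
    echo "usage: pod_triage <ns> <pod>" >&2
    return 2
  fi
  # Pod spec, status, and recent events attached to the pod.
  kubectl describe pod "$pod" -n "$ns"
  # Last 100 log lines from the current container.
  kubectl logs "$pod" -n "$ns" --tail=100
  # Logs from the previous container instance; this errors if the
  # container never restarted, so the failure is tolerated.
  kubectl logs "$pod" -n "$ns" --previous || true
  # Namespace-wide events, oldest first.
  kubectl get events -n "$ns" --sort-by='.lastTimestamp'
}
```

Add `--context <cluster>` to each call, or set the current context first, to target a specific cluster.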
Flux GitOps Commands
# Check status
flux --context <cluster> get all
flux --context <cluster> get kustomizations
flux --context <cluster> get helmreleases -A
# Trigger reconciliation
flux --context <cluster> reconcile source git flux-system
flux --context <cluster> reconcile kustomization <name>
flux --context <cluster> reconcile helmrelease <name> -n <namespace>
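The reconcile commands above are typically run source first, then the kustomization that consumes it. A minimal sketch of that sequence follows; the function name `flux_sync` is hypothetical, and it assumes the kustomization lives in Flux's default namespace.

```shell
# Hypothetical helper: refresh the Git source, reconcile one
# kustomization, then print the resulting status.
# Assumption: the kustomization is in the default flux-system namespace.
flux_sync() {
  cluster="$1"
  name="$2"
  if [ -z "$cluster" ] || [ -z "$name" ]; then
    echo "usage: flux_sync <cluster> <kustomization>" >&2
    return 2
  fi
  # Each step runs only if the previous one succeeded.
  flux --context "$cluster" reconcile source git flux-system &&
    flux --context "$cluster" reconcile kustomization "$name" &&
    flux --context "$cluster" get kustomizations
}
```

Stopping at the first failure keeps the error from the failing step as the last thing printed.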
Flux Status Interpretation
| Status | Meaning | Action |
|---|---|---|
| Ready: True | Reconciled and healthy | None |
| Ready: False | Failed to reconcile | Check the message/reason |
| Stalled: True | Stopped retrying after repeated failures | Suspend/resume to reset (see sre skill) |
| Suspended: True | Intentionally paused | Resume: `flux resume <type> <name>` |
| Reconciling | Actively being applied | Wait for completion |
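The Ready rows of this table can be checked from a script by reading the Ready condition with a kubectl JSONPath filter. The sketch below is one way to do that; the function name `flux_ready_action` is hypothetical, the default namespace is an assumption, and it treats any value other than True or False as "still in progress", which is a simplification that ignores the Stalled and Suspended cases.

```shell
# Hypothetical helper: read a Flux object's Ready condition and print
# the matching action from the status table.
# Assumption: the object is in flux-system unless a namespace is given.
flux_ready_action() {
  cluster="$1"
  kind="$2"
  name="$3"
  ns="${4:-flux-system}"
  # The JSONPath filter selects the status of the condition typed "Ready".
  ready=$(kubectl --context "$cluster" get "$kind" "$name" -n "$ns" \
    -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}') || return 1
  case "$ready" in
    True)  echo "healthy: no action needed" ;;
    False) echo "failed: check the message/reason" ;;
    *)     echo "in progress or unknown: wait, then re-check" ;;
  esac
}
```

For example, `flux_ready_action dev kustomization <name>` prints one line describing the next step for that kustomization.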
Researching Unfamiliar Services
When investigating unknown services, spawn a haiku agent to research documentation:
Task tool:
- subagent_type: "general-purpose"
- model: "haiku"
- prompt: "Research [service] troubleshooting docs. Focus on:
1. Common failure modes
2. Health indicators
3. Configuration gotchas
Start with: [docs-url]"
Chart URL to Docs mapping:
| Chart Source | Documentation |
|---|---|
| charts.jetstack.io | cert-manager.io/docs |
| charts.longhorn.io | longhorn.io/docs |
| grafana.github.io | grafana.com/docs |
| prometheus-community.github.io | prometheus.io/docs |
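To fill in the [docs-url] placeholder from a chart repository URL, the mapping above can be expressed as a small lookup. This is only a sketch: the function name `docs_for_chart` is hypothetical, and it covers just the four sources listed in the table.

```shell
# Hypothetical helper: map a chart repository URL to its documentation site.
docs_for_chart() {
  # Strip the scheme and any path so only the host remains.
  host="${1#*://}"
  host="${host%%/*}"
  case "$host" in
    charts.jetstack.io)             echo "cert-manager.io/docs" ;;
    charts.longhorn.io)             echo "longhorn.io/docs" ;;
    grafana.github.io)              echo "grafana.com/docs" ;;
    prometheus-community.github.io) echo "prometheus.io/docs" ;;
    *)
      echo "no docs mapping for: $host" >&2
      return 1
      ;;
  esac
}
```

Unknown hosts return a non-zero status, so a caller can fall back to a general search instead of passing a wrong URL to the research agent.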
Common Confusions
BAD: Use helm list to check Helm release status
GOOD: Use kubectl get helmrelease -A. Flux manages releases through HelmRelease CRDs, not the Helm CLI.