Infrastructure Management
Quick Reference
Deploy with Colmena
# Single host
colmena apply --on <hostname> --impure
# Multiple hosts
colmena apply --on host1,host2,host3 --impure
# Build only (no deploy)
colmena build --on <hostname> --impure
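The build and apply commands above can be combined so that a deploy only happens after a successful build. The following is a minimal sketch, not part of the repository: `join_hosts`, `deploy`, and the `COLMENA` override variable are hypothetical names introduced here for illustration.

```shell
# Hypothetical helpers; these names are illustrative and not defined in the repo.

# Join host arguments into the comma-separated form that --on expects.
join_hosts() {
  local IFS=,
  printf '%s\n' "$*"
}

# Build first, and apply only if the build succeeds.
# COLMENA overrides the binary (for example COLMENA=echo for a dry run).
deploy() {
  local targets
  targets="$(join_hosts "$@")"
  "${COLMENA:-colmena}" build --on "$targets" --impure &&
    "${COLMENA:-colmena}" apply --on "$targets" --impure
}
```

With these defined, `deploy host1 host2 host3` would build and then apply to all three hosts, and `COLMENA=echo deploy host1` prints the commands without running them.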
Proxmox Container Management
Run pct on the Proxmox host over SSH:
# List containers on a host
ssh <proxmox-host> "pct list"
# Container status
ssh <proxmox-host> "pct status <vmid>"
ssh <proxmox-host> "pct status <vmid> --verbose"
# Start/stop/reboot
ssh <proxmox-host> "pct start <vmid>"
ssh <proxmox-host> "pct stop <vmid>"
ssh <proxmox-host> "pct reboot <vmid>"
# Execute command in container
ssh <proxmox-host> "pct exec <vmid> -- /run/current-system/sw/bin/<command>"
# Common commands via pct exec
ssh <proxmox-host> "pct exec <vmid> -- /run/current-system/sw/bin/systemctl status <service>"
ssh <proxmox-host> "pct exec <vmid> -- /run/current-system/sw/bin/journalctl -u <service> -n 50"
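Every `pct exec` call above repeats the `/run/current-system/sw/bin/` prefix, so a small wrapper can cut down on typing. This is a sketch under assumptions: `pct_run`, `NIX_BIN`, and `SSH_CMD` are hypothetical names, and the VMID in the usage note is a placeholder.

```shell
# Hypothetical wrapper around the pct exec pattern shown above.
# SSH_CMD can be overridden (for example SSH_CMD=echo) to print instead of run.
NIX_BIN=/run/current-system/sw/bin

# Usage: pct_run <proxmox-host> <vmid> <command> [args...]
# Arguments are joined with spaces, so arguments that themselves contain
# spaces are not preserved as single words on the remote side.
pct_run() {
  local host="$1"
  local vmid="$2"
  shift 2
  "${SSH_CMD:-ssh}" "$host" "pct exec $vmid -- $NIX_BIN/$*"
}
```

For example, `pct_run thrall 101 systemctl status grafana` would expand to the same form as the commands above, with `101` standing in for a real VMID.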
Server Inventory
Proxmox Hosts
| Host | Description |
|---|---|
| thrall | Proxmox cluster node |
| sylvanas | Proxmox cluster node |
| voljin | Proxmox cluster node |
Proxmox LXC Containers
All other hosts are LXC containers. Use pct list on Proxmox hosts to see VMIDs.
Common hosts: gitea-runner-1/2/3, prometheus, grafana, uptime-kuma, sonarqube, jellyseerr, prowlarr, n8n, minio, scanner, external-metrics, ironforge (gitea, woodpecker, paperless, calibre, nixarr, resume)
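Since the other commands need a VMID rather than a hostname, it can help to look the VMID up from `pct list` output. The sketch below assumes the VMID is the first column and the container name is the last column of that output; `vmid_for` is a hypothetical helper name.

```shell
# Print the VMID for a container name, reading `pct list` output on stdin.
# Assumes VMID is the first column and the name is the last column.
# Exits non-zero when no row matches the name.
vmid_for() {
  awk -v name="$1" '
    NR > 1 && $NF == name { print $1; found = 1 }
    END { exit !found }
  '
}
```

A possible invocation is `ssh thrall "pct list" | vmid_for grafana`, repeated per Proxmox host until the container is found.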
NixOS Workstation Services
fredpc: glance dashboard (native NixOS module, port 8084)
Troubleshooting Workflows
Container Won't Respond
- Check status:
ssh <proxmox-host> "pct status <vmid> --verbose"
- If running but commands fail:
ssh <proxmox-host> "pct reboot <vmid>"
- Wait 15-30 seconds, verify:
ssh <proxmox-host> "pct status <vmid>"
- Re-deploy if needed:
colmena apply --on <hostname> --impure
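The status, reboot, wait, and verify steps above can be scripted roughly as follows. This is a sketch, not the repository's tooling: `recover_container` and `WAIT_SECS` are hypothetical names, and the function reboots unconditionally, so it is meant for the case where you have already decided a reboot is needed. Re-deploying is left as a manual step.

```shell
# Sketch of the recovery steps above; recover_container is a hypothetical name.
# WAIT_SECS defaults to 30, the upper end of the 15-30 second wait.
recover_container() {
  local host="$1"
  local vmid="$2"
  # Show the detailed state before touching anything.
  ssh "$host" "pct status $vmid --verbose"
  ssh "$host" "pct reboot $vmid" || return 1
  sleep "${WAIT_SECS:-30}"
  # pct status reports a line of the form "status: running".
  if ssh "$host" "pct status $vmid" | grep -q 'status: running'; then
    echo "container $vmid is running"
  else
    echo "container $vmid did not come back; check the Proxmox host" >&2
    return 1
  fi
}
```

If the function reports success but the service is still misbehaving, the re-deploy step (`colmena apply --on <hostname> --impure`) is the next thing to try.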
Service Not Working
- Check service status:
ssh <proxmox-host> "pct exec <vmid> -- /run/current-system/sw/bin/systemctl status <service>"
- Check logs:
ssh <proxmox-host> "pct exec <vmid> -- /run/current-system/sw/bin/journalctl -u <service> -n 100"
- Restart service:
ssh <proxmox-host> "pct exec <vmid> -- /run/current-system/sw/bin/systemctl restart <service>"
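The three steps above can be chained so that logs are captured before a restart wipes the immediate context. The sketch below uses `systemctl is-active --quiet` as the health check instead of `systemctl status`, and assumes that `pct exec` passes the command's exit status back through SSH. `triage_service` is a hypothetical name.

```shell
# Sketch of the service workflow above; triage_service is a hypothetical name.
# Assumes pct exec returns the exit status of the command it runs.
triage_service() {
  local host="$1"
  local vmid="$2"
  local service="$3"
  local bin=/run/current-system/sw/bin
  # Nothing to do if the unit is already active.
  if ssh "$host" "pct exec $vmid -- $bin/systemctl is-active --quiet $service"; then
    echo "$service is active"
    return 0
  fi
  # Print recent logs first, then restart.
  ssh "$host" "pct exec $vmid -- $bin/journalctl -u $service -n 100 --no-pager"
  ssh "$host" "pct exec $vmid -- $bin/systemctl restart $service"
}
```

Printing the logs before restarting is a deliberate choice here: it keeps the failure output in your terminal scrollback even if the restart succeeds.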
Podman/Container Issues
Check socket status:
ssh <proxmox-host> "pct exec <vmid> -- /run/current-system/sw/bin/systemctl status podman.socket"
List running containers:
ssh <proxmox-host> "pct exec <vmid> -- /run/current-system/sw/bin/podman ps -a"
SSH Connection Issues
If colmena fails with SSH errors:
- Verify container is running on Proxmox
- Check if SSH is listening:
pct exec <vmid> -- /run/current-system/sw/bin/ss -tlnp | grep ':22 '
- Reboot container if necessary
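A plain grep for 22 can also match other ports or PIDs that happen to contain those digits. As a stricter alternative, the sketch below checks the `ss -tlnp` output for a listening socket whose local address ends in `:22`. It assumes the usual `ss -t` column layout (state first, local address fourth) and that sshd is on port 22; `ssh_listening` is a hypothetical name.

```shell
# Succeeds if stdin (output of `ss -tlnp`) contains a LISTEN socket on port 22.
# Assumes column 1 is the state and column 4 is the local address:port.
ssh_listening() {
  awk '
    $1 == "LISTEN" && $4 ~ /:22$/ { found = 1 }
    END { exit !found }
  '
}
```

It could be used as `pct exec <vmid> -- /run/current-system/sw/bin/ss -tlnp | ssh_listening && echo "sshd is listening"`.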
Common Colmena Patterns
Deploy All Gitea Runners
colmena apply --on gitea-runner-1,gitea-runner-2,gitea-runner-3 --impure
Deploy Monitoring Stack
colmena apply --on prometheus,grafana --impure
Update Secrets Before Deploy
just update-secrets
colmena apply --on <hostname> --impure
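When running the two commands above from a script, chaining them with `&&` avoids deploying with stale secrets if the update step fails. A minimal sketch, with `deploy_with_secrets` as a hypothetical name:

```shell
# Deploy only if the secrets update succeeded.
# deploy_with_secrets is a hypothetical helper, not defined in the repo.
deploy_with_secrets() {
  just update-secrets && colmena apply --on "$1" --impure
}
```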
File Locations
| Purpose | Path |
|---|---|
| Colmena host configs | colmena/hosts/<hostname>.nix |
| NixOS host configs | modules/nixos/host/<hostname>/configuration.nix |
| Application configs | apps/<appname>.nix |
| Secrets configs | modules/secrets/<hostname>.nix |
| Container image SHAs | apps/fetcher/containers-sha.nix |
| Container definitions | apps/fetcher/containers.toml |
Related Skills
- provision-nixos-server: Create new servers from scratch
- For creating new hosts, use the /provision-nixos-server skill instead