Infrastructure Management

Quick Reference

Deploy with Colmena

# Single host
colmena apply --on <hostname> --impure

# Multiple hosts
colmena apply --on host1,host2,host3 --impure

# Build only (no deploy)
colmena build --on <hostname> --impure
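
The build and apply commands above can be chained so that nothing is deployed unless the build succeeds. This is a minimal sketch, not part of the repository: the function names `join_hosts` and `build_then_apply` and the `DRY_RUN` variable are assumptions introduced here for illustration.

```shell
# Join host names with commas, the form `colmena --on` accepts.
join_hosts() {
    local IFS=,
    printf '%s\n' "$*"
}

# Build first, then apply only if the build succeeded.
# DRY_RUN=1 (a convention of this sketch) prints the commands instead of running them.
build_then_apply() {
    if [ "$#" -eq 0 ]; then
        echo "usage: build_then_apply <hostname> [hostname...]" >&2
        return 2
    fi
    local targets
    targets=$(join_hosts "$@")
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "colmena build --on $targets --impure"
        echo "colmena apply --on $targets --impure"
        return 0
    fi
    colmena build --on "$targets" --impure &&
        colmena apply --on "$targets" --impure
}
```

Running `DRY_RUN=1 build_then_apply host1 host2` prints the two commands without touching any host, which is a cheap way to check the host list before a real deploy.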

Proxmox Container Management

Run pct on the Proxmox host over SSH:

# List containers on a host
ssh <proxmox-host> "pct list"

# Container status
ssh <proxmox-host> "pct status <vmid>"
ssh <proxmox-host> "pct status <vmid> --verbose"

# Start/stop/reboot
ssh <proxmox-host> "pct start <vmid>"
ssh <proxmox-host> "pct stop <vmid>"
ssh <proxmox-host> "pct reboot <vmid>"

# Execute command in container
ssh <proxmox-host> "pct exec <vmid> -- /run/current-system/sw/bin/<command>"

# Common commands via pct exec
ssh <proxmox-host> "pct exec <vmid> -- /run/current-system/sw/bin/systemctl status <service>"
ssh <proxmox-host> "pct exec <vmid> -- /run/current-system/sw/bin/journalctl -u <service> -n 50"
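
Since every `pct exec` call above repeats the same `/run/current-system/sw/bin/` prefix, a small wrapper can cut down on typing. The sketch below is an illustration only: `pve_exec`, `NIXOS_BIN`, and `DRY_RUN` are names made up here, and arguments are joined with plain spaces, so it suits simple commands without embedded quoting.

```shell
# Prefix used by the commands above: the NixOS system profile's bin directory.
NIXOS_BIN=/run/current-system/sw/bin

# pve_exec <proxmox-host> <vmid> <command> [args...]
# Runs <command> inside the container via `pct exec` on the Proxmox host.
# DRY_RUN=1 (a convention of this sketch) prints the ssh command instead.
pve_exec() {
    if [ "$#" -lt 3 ]; then
        echo "usage: pve_exec <proxmox-host> <vmid> <command> [args...]" >&2
        return 2
    fi
    local host="$1" vmid="$2"
    shift 2
    # "$*" joins the remaining arguments with spaces.
    local remote="pct exec $vmid -- $NIXOS_BIN/$*"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf 'ssh %s "%s"\n' "$host" "$remote"
        return 0
    fi
    ssh "$host" "$remote"
}
```

For example, `pve_exec <proxmox-host> <vmid> journalctl -u <service> -n 50` expands to the journalctl command shown above.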

Server Inventory

Proxmox Hosts

Host       Description
thrall     Proxmox cluster node
sylvanas   Proxmox cluster node
voljin     Proxmox cluster node

Proxmox LXC Containers

All other hosts are LXC containers. Run pct list on each Proxmox host to see the VMIDs of its containers.

Common hosts: gitea-runner-1/2/3, prometheus, grafana, uptime-kuma, sonarqube, jellyseerr, prowlarr, n8n, minio, scanner, external-metrics, ironforge (gitea, woodpecker, paperless, calibre, nixarr, resume)
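
Looking up a VMID by container name can be scripted by running `pct list` on each Proxmox host in turn. The sketch below makes two assumptions: that the three host names are usable as SSH targets, and that the name shown by `pct list` matches the host name used elsewhere in this document. The function names are hypothetical.

```shell
# Print the VMID for a container name, reading `pct list` output on stdin.
# The name is the last column; the Lock column is usually empty, so the
# name's position cannot be counted reliably from the left.
vmid_from_pct_list() {
    awk -v name="$1" 'NR > 1 && $NF == name { print $1 }'
}

# Search each Proxmox host from the inventory and print "<host> <vmid>".
find_container() {
    local name="$1" host vmid
    for host in thrall sylvanas voljin; do
        vmid=$(ssh "$host" "pct list" | vmid_from_pct_list "$name")
        if [ -n "$vmid" ]; then
            echo "$host $vmid"
            return 0
        fi
    done
    echo "container '$name' not found on any Proxmox host" >&2
    return 1
}
```

`find_container grafana` would print the Proxmox host and VMID to plug into the `pct` commands in the Quick Reference.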

NixOS Workstation Services

  • fredpc: glance dashboard (native NixOS module, port 8084)

Troubleshooting Workflows

Container Won't Respond

  1. Check status: ssh <proxmox-host> "pct status <vmid> --verbose"
  2. If running but commands fail: ssh <proxmox-host> "pct reboot <vmid>"
  3. Wait 15-30 seconds, then verify: ssh <proxmox-host> "pct status <vmid>"
  4. Re-deploy if needed: colmena apply --on <hostname> --impure
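
Steps 1 to 3 above can be sketched as a single function. This is an illustration under stated assumptions, not the repository's tooling: `recover_container` and `RECOVER_WAIT` are invented names, the probe assumes `true` is available in the container's system profile, and the check relies on the exit status of the probe passing back through `pct exec` and `ssh`.

```shell
# recover_container <proxmox-host> <vmid>
# Reboots the container only if it is running but does not respond.
# RECOVER_WAIT (seconds, default 30) is a knob invented for this sketch.
recover_container() {
    local host="$1" vmid="$2" delay="${RECOVER_WAIT:-30}" status
    # Step 1: check status.
    status=$(ssh "$host" "pct status $vmid") || return 1
    echo "$status"
    case "$status" in
        *running*) ;;
        *) echo "container $vmid is not running; try: pct start $vmid" >&2
           return 1 ;;
    esac
    # Step 2: probe with a trivial command; reboot only if the probe fails.
    if ssh "$host" "pct exec $vmid -- /run/current-system/sw/bin/true"; then
        echo "container $vmid responds; no reboot needed"
        return 0
    fi
    echo "rebooting container $vmid"
    ssh "$host" "pct reboot $vmid" || return 1
    # Step 3: wait, then verify.
    sleep "$delay"
    ssh "$host" "pct status $vmid"
}
```

Step 4, the re-deploy with colmena, is left as a manual decision because it changes the system configuration rather than just restarting it.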

Service Not Working

  1. Check service status:
    ssh <proxmox-host> "pct exec <vmid> -- /run/current-system/sw/bin/systemctl status <service>"
    
  2. Check logs:
    ssh <proxmox-host> "pct exec <vmid> -- /run/current-system/sw/bin/journalctl -u <service> -n 100"
    
  3. Restart service:
    ssh <proxmox-host> "pct exec <vmid> -- /run/current-system/sw/bin/systemctl restart <service>"
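
    The three steps above can be combined so that the restart only happens when the unit is not active. A minimal sketch, assuming the exit status of `systemctl is-active` is passed back through `pct exec` and `ssh`; `triage_service` is a hypothetical name.

```shell
# triage_service <proxmox-host> <vmid> <service>
# Shows status and recent logs, then restarts the unit only if it is inactive.
triage_service() {
    local host="$1" vmid="$2" service="$3"
    local bin=/run/current-system/sw/bin
    # Steps 1 and 2: status and the last 100 log lines.
    # `systemctl status` exits non-zero for inactive units; that is ignored here.
    ssh "$host" "pct exec $vmid -- $bin/systemctl status $service"
    ssh "$host" "pct exec $vmid -- $bin/journalctl -u $service -n 100"
    # Step 3: restart only when the unit is not active.
    if ssh "$host" "pct exec $vmid -- $bin/systemctl is-active --quiet $service"; then
        echo "$service is active; not restarting"
    else
        ssh "$host" "pct exec $vmid -- $bin/systemctl restart $service"
    fi
}
```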
    

Podman/Container Issues

Check socket status:

ssh <proxmox-host> "pct exec <vmid> -- /run/current-system/sw/bin/systemctl status podman.socket"

List all containers, including stopped ones:

ssh <proxmox-host> "pct exec <vmid> -- /run/current-system/sw/bin/podman ps -a"

SSH Connection Issues

If colmena fails with SSH errors:

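  1. Verify the container is running on Proxmox: ssh <proxmox-host> "pct status <vmid>"
  2. Check if SSH is listening on port 22: ssh <proxmox-host> "pct exec <vmid> -- /run/current-system/sw/bin/ss -tlnp | grep ':22 '"
  3. Reboot the container if necessary: ssh <proxmox-host> "pct reboot <vmid>"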

Common Colmena Patterns

Deploy All Gitea Runners

colmena apply --on gitea-runner-1,gitea-runner-2,gitea-runner-3 --impure

Deploy Monitoring Stack

colmena apply --on prometheus,grafana --impure

Update Secrets Before Deploy

just update-secrets
colmena apply --on <hostname> --impure

File Locations

Purpose                Path
Colmena host configs   colmena/hosts/<hostname>.nix
NixOS host configs     modules/nixos/host/<hostname>/configuration.nix
Application configs    apps/<appname>.nix
Secrets configs        modules/secrets/<hostname>.nix
Container image SHAs   apps/fetcher/containers-sha.nix
Container definitions  apps/fetcher/containers.toml

Related Skills

  • provision-nixos-server: Create new servers from scratch. To create a new host, use the /provision-nixos-server skill instead of this one.
Repository
fred-drake/nix
First Seen
Feb 21, 2026