virtualization
Virtualization: Hypervisors, VMs, and Infrastructure
Create, configure, and manage virtual machines across hypervisors - from single-node Proxmox setups to multi-node clusters with HA, live migration, and GPU passthrough. The goal is production-ready VM infrastructure with correct storage, memory, and CPU config that won't bite you at 3 AM.
Target versions (verified April 2026):
| Tool | Version | Release date | Notes |
|---|---|---|---|
| Proxmox VE | 9.1 | Nov 2025 | Debian 13.2 (trixie), kernel 6.17.2, QEMU 10.1.2 |
| Proxmox Backup Server | 4.1 | Nov 2025 | Dedup, incremental, prune policies |
| bpg/proxmox (Terraform) | 0.100.0 | Apr 2026 | Primary Proxmox IaC provider |
| QEMU | 10.2.2 | Mar 2026 | Stable (11.0-rc2 in progress) |
| libvirt | 12.0.0 | Jan 2026 | Hypervisor abstraction layer |
| XCP-ng | 8.3 LTS | Oct 2024 | Xen-based, LTS since Jun 2025, EOL Nov 2028 |
| VMware ESXi | 8.0 U3i | Feb 2026 | Broadcom-owned, licensing upheaval |
| VirtualBox | 7.2.6 | Jan 2026 | Dev/testing only |
| Packer | 1.15.1 | Mar 2026 | Image builder, multi-platform |
| cloud-init | 26.1 | Feb 2026 | Instance initialization standard |
When to use
- Creating or configuring VMs on Proxmox, libvirt/KVM, XCP-ng, or VMware
- Provisioning Proxmox VMs with Terraform (bpg/proxmox provider)
- Building VM templates with Packer and cloud-init
- Configuring PCI/GPU passthrough for compute or display GPUs
- Managing storage backends (LVM-thin, ZFS, Ceph, NFS)
- Setting up Proxmox clustering, HA, and live migration
- Troubleshooting VM performance (disk I/O, memory, CPU)
- Planning backup strategies (Proxmox Backup Server, snapshots)
- Tuning disk performance (virtio-scsi, iothread, discard/fstrim)
- Memory management (ballooning, NUMA topology, hugepages)
When NOT to use
- Kubernetes manifests, Helm charts, container orchestration (use kubernetes)
- General Terraform/OpenTofu HCL patterns, state, modules (use terraform)
- Network config not hypervisor-specific: DNS, VPNs, reverse proxies (use networking)
- Ansible playbooks and configuration management (use ansible)
- Docker/container image optimization (use docker)
- OPNsense/pfSense firewall management (use firewall-appliance)
AI Self-Check
AI tools consistently produce the same VM configuration mistakes. Before returning any generated VM config, Terraform HCL, or Packer template, verify against this list:
- No hardcoded IPs, passwords, or SSH keys - use variables or cloud-init injection
- Disk interface is virtio (scsi0 with virtio-scsi controller), not IDE, unless legacy OS
- `iothread = true` on virtio-scsi disks for SSD-backed storage
- `ssd = true` emulation enabled when backing store is SSD (enables guest TRIM)
- `discard = on` on QEMU disk config for thin-provisioned storage (fstrim passthrough)
- Memory ballooning disabled unless tested on the specific guest OS (Alpine, some BSDs can't hotplug DIMMs - balloon changes need full power-cycle, not reboot)
- CPU type is `host` for production (full feature passthrough), not `kvm64`/`qemu64`
- NUMA enabled for multi-socket or large-memory VMs
- QEMU guest agent enabled (cloud-init installs it, but verify)
- Cloud-init interface specified (bpg/proxmox defaults to ide2 when null)
- Terraform lifecycle: `prevent_destroy` on VMs, `ignore_changes` on `disk` and `node_name`
- No disk resize via Terraform - use `qm resize` on host, then update Terraform var to match
- PCI passthrough: `pcie = false` for standard passthrough, `xvga = false` unless display GPU
- PCI passthrough: machine type is `q35` when `pcie = true` is needed
- GPU passthrough: AMD GPUs are prone to reset bugs (vendor-reset kernel module or `pcie_port_pm=off` may be required); NVIDIA generally resets cleanly but verify with your card model before production use
- BIOS type matches use case: `seabios` default, `ovmf` for UEFI/Secure Boot/Windows 11
- Backup retention configured (not unlimited snapshots eating storage)
- Network device uses `virtio` model, not `e1000` or `rtl8139`
- `fstrim.timer` enabled in guest for thin-provisioned storage (completes the discard chain)
- SCSI controller explicitly set (`virtio-scsi-single` for high IOPS, `virtio-scsi-pci` default)
- Machine type matches BIOS: `i440fx` with `seabios`, `q35` with `ovmf` (UEFI). Mixing `i440fx` + `ovmf` causes boot failures. `q35` + `seabios` works but wastes q35 features.
- VGA type matches use case: `serial0` for headless cloud images, `virtio` for GUI VMs, omit for PCI passthrough display GPUs (`x-vga=1` replaces the virtual display)
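A minimal `qm` sequence that satisfies the checklist, sketched with placeholder values - VMID 9001, storage `local-lvm`, and bridge `vmbr0` are assumptions, not requirements:
# Sketch: checklist-compliant headless Linux VM (placeholder VMID/storage/bridge)
qm create 9001 --name demo --memory 4096 --balloon 0 --cores 4 --cpu host --numa 1 \
  --machine q35 --bios ovmf --scsihw virtio-scsi-single \
  --net0 virtio,bridge=vmbr0 --agent enabled=1 --serial0 socket --vga serial0
# EFI vars disk for OVMF, then a thin-provisioned disk with discard/iothread/ssd
# (blank 32G disk shown for brevity - import a cloud image via qm importdisk in practice)
qm set 9001 --efidisk0 local-lvm:1,efitype=4m
qm set 9001 --scsi0 local-lvm:32,discard=on,iothread=1,ssd=1 --boot order=scsi0
# Cloud-init drive, SSH-key-only login
qm set 9001 --ide2 local-lvm:cloudinit --ciuser admin --sshkeys ~/.ssh/id_ed25519.pub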
Workflow
Step 1: Identify the task
| Task | Start with | Reference |
|---|---|---|
| Proxmox VM creation | CLI (qm) or API (pvesh), cloud-init | references/proxmox.md |
| Terraform provisioning | bpg/proxmox provider, lifecycle rules | references/proxmox.md (Terraform section) |
| Image building | Packer + cloud-init templates | references/image-building.md |
| libvirt/KVM management | virsh, XML domain definitions | references/libvirt-qemu-kvm.md |
| GPU/PCI passthrough | IOMMU groups, vfio-pci | references/proxmox.md (PCI section) |
| Performance tuning | Disk, memory, CPU config | This file + references |
| Migration to Proxmox | From VMware, XCP-ng, or bare metal | references/proxmox.md |
Step 2: Gather requirements
Before creating or modifying VMs:
- Hypervisor and version - Proxmox VE 9.x? libvirt? VMware migration?
- Guest OS - Linux distro, Windows, BSD? (affects virtio drivers, ballooning, agent)
- CPU - core count, type (host vs emulated), pinning needs, NUMA topology
- Memory - dedicated amount, ballooning (usually: don't), hugepages for databases
- Storage - backend (LVM-thin, ZFS, Ceph, NFS), disk size, format (raw vs qcow2)
- Network - bridge, VLAN tag, virtio, firewall
- Passthrough - GPU/PCI devices, USB, serial ports
- Provisioning method - manual, Terraform, Packer template + cloud-init
- HA requirements - clustered? live migration? fencing?
- Backup strategy - PBS, snapshots, vzdump, frequency
Step 3: Build
Follow the domain-specific reference file. Key principles:
- Use cloud-init for provisioning. Don't manually configure VMs after creation. Inject SSH keys, network config, packages, and user accounts via cloud-init.
- Use virtio everywhere. Disk (virtio-scsi), network (virtio-net), display (virtio-gpu for headless). IDE and e1000 exist for legacy OS compatibility only.
- Pin CPU type to `host`. Emulated CPU types (kvm64, qemu64) hide features the guest needs (AES-NI, AVX, SSE4). Use `host` unless you need live migration across heterogeneous hardware.
- Test disk config changes with stop/start, not reboot. Guest reboot doesn't restart QEMU - disk config changes (discard, cache, iothread) only take effect after `qm stop` + `qm start`.
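Building on the cloud-init principle above, a hedged sketch of injecting custom user-data through Proxmox snippets - it assumes a directory storage named `local` with the snippets content type enabled; VMID 100, the username, and the SSH key are placeholders:
# Write a cloud-config snippet on the Proxmox host (storage must allow "snippets" content)
cat > /var/lib/vz/snippets/user-data.yaml <<'EOF'
#cloud-config
users:
  - name: admin
    groups: [sudo]
    shell: /bin/bash
    lock_passwd: true
    ssh_authorized_keys:
      - ssh-ed25519 AAAA... admin@example
package_update: true
packages: [qemu-guest-agent]
runcmd:
  - systemctl enable --now qemu-guest-agent
EOF
# Point the VM's cloud-init user config at the snippet
qm set 100 --cicustom "user=local:snippets/user-data.yaml"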
Step 4: Validate
| What to validate | How |
|---|---|
| VM boots and agent responds | qm agent <vmid> ping (Proxmox) |
| Cloud-init completed | cloud-init status --wait in guest |
| Disk performance | fio --name=test --rw=randread --bs=4k --direct=1 --numjobs=4 --runtime=30 |
| TRIM/discard working | fstrim -v / in guest, check thin pool data_percent on host |
| PCI passthrough | lspci in guest shows passed device, driver loaded |
| Network connectivity | ping gateway, check ip addr matches cloud-init config |
| Memory | free -h in guest matches expected (not balloon-reduced) |
| Live migration | Test with qm migrate <vmid> <target> --online on non-critical VM first |
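The same checks as a hedged command sequence, assuming VMID 100 and a target node named `pve2`:
# From the Proxmox host
qm agent 100 ping                          # guest agent responds
lvs -o lv_name,data_percent | grep vm-100  # thin pool usage for the VM's disks
# Inside the guest (ssh in)
cloud-init status --wait                   # provisioning finished without errors
sudo fstrim -v /                           # TRIM reaches the storage backend
free -h                                    # memory matches config (not balloon-reduced)
ip addr                                    # matches the cloud-init ipconfig
# Live migration dry run on a non-critical VM (from the host)
qm migrate 100 pve2 --online --with-local-disks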
Quick Task: VM from Cloud Image (Proxmox)
The fastest path to a production-ready VM. Skip Packer and ISO installs for standard setups.
- Download a cloud image to the Proxmox node: `wget -P /var/lib/vz/template/iso/ https://cloud.debian.org/images/cloud/trixie/daily/latest/debian-13-generic-amd64.qcow2`
- Create the VM shell: `qm create 100 --name myvm --memory 2048 --cores 2 --cpu host --net0 virtio,bridge=vmbr0 --agent enabled=1 --scsihw virtio-scsi-pci`
- Import and attach the disk with SSD optimizations: `qm importdisk 100 /var/lib/vz/template/iso/debian-13-generic-amd64.qcow2 local-lvm`, then `qm set 100 --scsi0 local-lvm:vm-100-disk-0,discard=on,iothread=1,ssd=1 --boot order=scsi0`
- Add cloud-init drive and configure: `qm set 100 --ide2 local-lvm:cloudinit`, then `qm set 100 --ciuser admin --sshkeys ~/.ssh/id_ed25519.pub --ipconfig0 ip=10.10.10.100/24,gw=10.10.10.1`
- Start: `qm start 100`
- Verify: `qm agent 100 ping` (may take 1-2 min on first boot while cloud-init runs)
To make a reusable template, stop the VM after verification and run qm template 100.
Clone with qm clone 100 101 --name new-vm --full.
Proxmox VE Quick Reference
Read references/proxmox.md for full coverage of API, CLI, storage backends, clustering,
HA, migration, PCI passthrough, backup, and Terraform patterns.
Essential CLI
# VM management
qm list # List all VMs
qm create 100 --name test --memory 2048 --cores 2 --net0 virtio,bridge=vmbr0
qm start/stop/shutdown/reset 100 # Power operations
qm config 100 # Show VM config
qm set 100 --memory 4096 # Modify config (some need stop/start)
qm resize 100 scsi0 +10G # Extend disk (can't shrink)
qm agent 100 ping # Test QEMU agent
qm migrate 100 pve2 --online --with-local-disks # Live migrate
# Container management
pct list # List LXC containers
pct create 200 local:vztmpl/debian-12-standard_12.7-1_amd64.tar.zst
# Storage
pvesm status # Storage pool overview
lvs -a -o+devices # LVM thin pool status (on LVM backend)
# Cluster
pvecm status # Cluster status
pvecm nodes # Node list
ha-manager status # HA status
Critical gotchas (battle-tested)
These come from production Proxmox environments and will save hours of debugging:
Stop/start vs reboot: Guest reboot does NOT restart the QEMU process. Disk config
changes (discard, cache mode, iothread, bus type) only apply when QEMU starts fresh.
Always use qm stop then qm start for hardware config changes. This also applies to
memory balloon device changes.
LVM thin pool at 100%: When data_percent hits 100%, ALL VM I/O on that pool fails
instantly - guests hang, no graceful degradation. Recovery requires lvextend on the
thin pool or migrating VMs off. Monitor thin pool usage and alert well before 100% (80%
warning, 90% critical). data_percent measures blocks ever written, not current filesystem
usage - a VM that wrote then deleted 50GB still shows that 50GB in data_percent until
fstrim reclaims it.
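A minimal alerting sketch along those lines - `pve/data` is the default thin pool name on a stock Proxmox install; adjust if yours differs, and wire the output into your monitoring instead of echo:
# Alert on LVM thin pool usage before it hits 100%
USAGE=$(lvs --noheadings -o data_percent pve/data | tr -d ' ' | cut -d. -f1)
if [ "$USAGE" -ge 90 ]; then
  echo "CRITICAL: thin pool pve/data at ${USAGE}%"
elif [ "$USAGE" -ge 80 ]; then
  echo "WARNING: thin pool pve/data at ${USAGE}%"
fi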
Live migration via SSH: qm migrate runs in the foreground. If the SSH session drops,
the migration aborts. For large VMs (32GB+ disk), use:
nohup qm migrate <vmid> <target> --online --with-local-disks \
> /tmp/migrate-<vmid>.log 2>&1 &
Migration is abort-safe: source VM stays running on failure, target LVs are cleaned up.
KVM ballooning: The balloon device lets the host reclaim unused guest memory. Sounds
great, causes pain. Alpine Linux (and some BSDs) can't hotplug DIMMs - balloon changes
need full power-cycle (stop/start, not reboot). Even on Debian, balloon behavior is
unpredictable under memory pressure. Recommendation: disable ballooning (memory_min_mb = 0
in Terraform, or set balloon to 0 in qm) and provision VMs with the memory they actually need.
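For example, assuming VMID 100, pin memory and drop the balloon device (takes effect after a stop/start, not a guest reboot):
qm set 100 --memory 4096 --balloon 0   # fixed 4 GB allocation, ballooning disabled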
fail2ban on Proxmox (Debian 13): /var/log/daemon.log doesn't exist under journald.
Use backend = systemd with journalmatch = _COMM=pvedaemon in the jail config.
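A sketch of that jail as a drop-in file - it assumes the standard `proxmox` filter (from the Proxmox wiki) already exists in /etc/fail2ban/filter.d/; the ban settings are illustrative:
# /etc/fail2ban/jail.d/proxmox.local - journald-backed jail for Debian 13 hosts
cat > /etc/fail2ban/jail.d/proxmox.local <<'EOF'
[proxmox]
enabled = true
port = https,8006
filter = proxmox
backend = systemd
journalmatch = _COMM=pvedaemon
maxretry = 3
bantime = 3600
EOF
systemctl restart fail2ban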
openipmi on non-IPMI hardware: Fails on boot, generates spurious alerts. Safe to
systemctl mask openipmi on nodes without BMC/IPMI hardware. Masking survives package
updates; disabling doesn't.
Storage Performance
| Backend | Best for | Thin provision | Snapshot | Live migration |
|---|---|---|---|---|
| LVM-thin | Local SSDs, production | Yes | Yes (copy-on-write) | With --with-local-disks |
| ZFS | Data integrity, compression | Yes | Yes (native) | With replication |
| Ceph/RBD | Multi-node shared storage | Yes | Yes | Native (shared) |
| NFS | ISOs, templates, backups | Depends on NAS | Depends | Yes (shared) |
| local (dir) | Small/test | No (file-based) | qcow2 only | No |
Disk interface hierarchy (fastest to slowest):
- virtio-scsi-single + iothread - one controller per disk, best IOPS
- virtio-scsi-pci + iothread - shared controller, good for most workloads
- virtio-blk - legacy virtio, good performance but fewer features
- IDE - legacy only, needed for some old OSes
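As a sketch, moving an existing disk to the top of that hierarchy (VMID 100 and the volume name are placeholders; apply with stop/start, not reboot):
# One virtio-scsi controller per disk plus a dedicated iothread
qm set 100 --scsihw virtio-scsi-single
qm set 100 --scsi0 local-lvm:vm-100-disk-0,discard=on,iothread=1,ssd=1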
SSD optimization checklist:
- `ssd = 1` on disk config (tells guest it's on SSD, enables TRIM in guest)
- `discard = on` on QEMU disk (passes TRIM/UNMAP to storage backend)
- `fstrim.timer` enabled in guest (weekly by default on systemd distros)
- Verify with: `fstrim -v /` in guest, then check `lvs -o data_percent` on host
Disk resize (the Terraform trap): Can't resize disks via the bpg/proxmox Terraform provider. The correct procedure:
- `qm resize <vmid> scsi0 +10G` on the Proxmox host
- `growpart /dev/sda 1` in the guest (expand partition)
- `resize2fs /dev/sda1` in the guest (expand filesystem)
- Update the `disk_size_gb` variable in Terraform to match
GPU / PCI Passthrough Quick Reference
Full details (IOMMU groups, ACS override, Terraform patterns) in references/proxmox.md.
The minimum viable path for a Windows 11 + NVIDIA desktop GPU on Proxmox VE 9.x:
- Host prereqs. Enable VT-d / AMD-Vi in BIOS. Add `intel_iommu=on` (or `amd_iommu=on`) plus `iommu=pt` to the kernel command line (`/etc/kernel/cmdline` on PVE with systemd-boot, `/etc/default/grub` on legacy). Run `proxmox-boot-tool refresh` (or `update-grub`) and reboot.
- Bind to vfio-pci. `lspci -nn | grep -i nvidia` to find vendor:device IDs, then: `echo "options vfio-pci ids=10de:XXXX,10de:YYYY" > /etc/modprobe.d/vfio-pci.conf` (GPU + its audio function). Blacklist `nouveau` and `nvidia`. `update-initramfs -u` and reboot. Verify with `lspci -nnk | grep -A3 NVIDIA` - driver in use must be `vfio-pci`.
- VM settings for Windows 11. Machine type `q35`, BIOS `ovmf` (add an EFI disk), TPM v2.0 state disk, `cpu: host,hidden=1` to dodge NVIDIA's Code 43 on older drivers. Example: `qm set 100 --machine q35 --bios ovmf --cpu host,hidden=1 --efidisk0 local-lvm:1,format=raw`
- Attach the GPU. Prefer hardware mappings (PVE 8.1+) for migration safety: `pvesh create /cluster/mapping/pci --id gpu-rtx4070 --map 'node=pve1,path=0000:01:00.0'`, then `qm set 100 --hostpci0 mapping=gpu-rtx4070,pcie=1,x-vga=1`. Legacy form: `--hostpci0 01:00,pcie=1,x-vga=1`. Drop `x-vga` for compute-only passthrough.
- Check IOMMU groups with `find /sys/kernel/iommu_groups/ -type l | sort -V` before anything else - every device in the target group gets passed through together. Single-GPU hosts need early vfio binding (initramfs) or the host driver claims it first.
Reset bug: NVIDIA consumer cards (including RTX 4070) usually reset cleanly, but verify
with two successive VM restarts before production. AMD RX 5000/6000 series often need the
vendor-reset kernel module or pcie_port_pm=off. If the second VM start hangs, you hit it.
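A hedged way to exercise the reset path before trusting it - two full stop/start cycles on an assumed VMID 100; if the second start hangs at GPU init, you hit the reset bug:
# Force the GPU through its reset path twice in a row
for i in 1 2; do
  qm stop 100
  sleep 5
  qm start 100
  sleep 60         # give the guest time to initialize the GPU
  qm status 100    # should report: status: running
done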
Memory Management
Ballooning - the short version: Don't use it unless you've tested it on your exact
guest OS and workload. Disable with balloon: 0 in VM config.
NUMA: Enable for VMs with 4+ cores or 8GB+ RAM. Proxmox: numa: 1 in VM config.
QEMU auto-creates NUMA nodes matching the host topology.
Hugepages: 2MB or 1GB pages reduce TLB misses. Significant for databases and memory-intensive workloads. Configure on the host:
# Reserve 1024 x 2MB hugepages (2GB total)
echo 1024 > /proc/sys/vm/nr_hugepages
# Persistent: add to /etc/sysctl.d/
vm.nr_hugepages = 1024
Then enable in VM config. Note: hugepages memory can't be shared or ballooned.
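On Proxmox, the VM-side switch is per VM - a sketch assuming VMID 100:
# Back VM memory with 2 MB hugepages (use 1024 for 1 GB pages, "any" to take what's free)
qm set 100 --hugepages 2
# Optional: keep the hugepages reserved after VM shutdown for faster restarts
qm set 100 --keephugepages 1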
CPU hotplug vs memory hotplug: CPU hotplug works live on most modern Linux guests. Memory hotplug (adding DIMMs at runtime) is fragile - Alpine can't do it at all, and even Debian requires specific kernel config. Size memory correctly at creation time.
libvirt/QEMU Quick Start
For libvirt/KVM without Proxmox, the fastest path to a running VM:
# Download a Debian cloud image and resize it
wget -O /var/lib/libvirt/images/myvm.qcow2 \
https://cloud.debian.org/images/cloud/bookworm/latest/debian-12-generic-amd64.qcow2
qemu-img resize /var/lib/libvirt/images/myvm.qcow2 20G
# Boot from the cloud image with cloud-init
virt-install --name myvm --ram 2048 --vcpus 2 --cpu host \
--disk path=/var/lib/libvirt/images/myvm.qcow2,bus=virtio \
--network network=default,model=virtio \
--cloud-init user-data=user-data.yaml,meta-data=meta-data.yaml \
--import --os-variant debian12 --noautoconsole
Build the cloud-init ISO with cloud-localds cidata.iso user-data.yaml meta-data.yaml and attach
it as a second disk, or use the --cloud-init flag shown above (virt-install 4.0+).
Reusable template pattern: keep the downloaded cloud image as a read-only golden image
(/var/lib/libvirt/images/debian-13-template.qcow2) and create each VM disk as a qcow2
overlay backed by it - qemu-img create -f qcow2 -b debian-13-template.qcow2 -F qcow2 myvm.qcow2. Overlays only store per-VM changes and boot in seconds. Regenerate the base
when you need a new OS minor. Validate first boot with cloud-init status --wait inside
the guest. Never set password: or chpasswd: with plaintext values in user-data - use
ssh_authorized_keys and rely on lock_passwd: true (the cloud-image default).
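A sketch of the overlay pattern end to end - the VM name `web01`, the 20G size, and the template path are placeholders, and the user-data/meta-data files are the ones from the virt-install example above:
# Golden base image stays untouched; each VM gets a thin qcow2 overlay on top of it
qemu-img create -f qcow2 \
  -b /var/lib/libvirt/images/debian-13-template.qcow2 -F qcow2 \
  /var/lib/libvirt/images/web01.qcow2 20G
virt-install --name web01 --memory 2048 --vcpus 2 --cpu host \
  --disk path=/var/lib/libvirt/images/web01.qcow2,bus=virtio \
  --network network=default,model=virtio \
  --cloud-init user-data=user-data.yaml,meta-data=meta-data.yaml \
  --import --osinfo detect=on,require=off --noautoconsole
  # detect=on lets virt-install probe the image; avoids a missing debian13 osinfo entry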
virsh list --all to confirm state; virsh console myvm to attach. For full XML domain
definitions, network and storage pool management, and virsh lifecycle commands, see
references/libvirt-qemu-kvm.md.
Hypervisor Selection
| Hypervisor | Type | Best for | Avoid when |
|---|---|---|---|
| Proxmox VE | Type 1 (KVM+LXC) | Homelab, SMB, API-driven automation | Need VMware ecosystem tooling |
| libvirt/KVM | Type 1 (bare) | Custom setups, OpenStack, direct control | Want a GUI or clustering OOB |
| XCP-ng | Type 1 (Xen) | Xen-based infra, XenOrchestra UI | KVM-specific features (virtio-fs) |
| VMware ESXi | Type 1 | Enterprise with existing VMware investment | Post-Broadcom: licensing costs exploded |
| VirtualBox | Type 2 | Dev workstations, testing | Production. Ever. |
VMware post-Broadcom (2026): Broadcom acquired VMware (closed Nov 2023). Perpetual licenses eliminated, subscription-only model, free ESXi discontinued then partially reinstated (ESXi 8.0 U3e "free hypervisor", April 2025). Many organizations are migrating to Proxmox or XCP-ng. The migration path from VMware is well-documented but non-trivial for large estates.
Reference Files
- `references/proxmox.md` - Proxmox VE deep-dive: API, CLI, storage, clustering, HA, live migration, PCI passthrough, Proxmox Backup Server, and Terraform (bpg/proxmox provider patterns, lifecycle gotchas, cloud-init)
- `references/libvirt-qemu-kvm.md` - libvirt/QEMU/KVM: virsh commands, XML domain definitions, QEMU command-line, KVM modules, disk formats, networking
- `references/image-building.md` - Packer templates, cloud-init configuration, cloud image workflows, template management
- `references/gotchas.md` - Battle-tested pitfalls and failure modes from production Proxmox/KVM environments. Read this before any non-trivial change.
Related Skills
- terraform - owns HCL patterns, module design, state management. This skill owns Proxmox-specific provider patterns (bpg/proxmox lifecycle rules, cloud-init interface, disk resize workarounds). Use terraform for general IaC; this skill for Proxmox-specific Terraform.
- kubernetes - for container orchestration running on top of VMs. This skill provisions the VM infrastructure; kubernetes manages what runs inside the cluster.
- networking - for network config not specific to hypervisors (DNS, VPNs, reverse proxies, nftables). This skill covers VM networking (bridges, VLANs, virtio-net).
- ansible - for day-2 configuration of VMs after provisioning. This skill creates the VM; ansible configures what runs on it.
- docker - for container image optimization. This skill manages VMs that may host Docker/container workloads.
Rules
These are non-negotiable. Violating any of these is a bug.
- virtio for everything. Disk (virtio-scsi), network (virtio-net). IDE and e1000 are for legacy OS compatibility only.
- CPU type `host` in production. Emulated types hide features. Only use emulated types for live migration across heterogeneous CPU generations.
- Stop/start for hardware changes, not reboot. Guest reboot doesn't restart QEMU. Disk, memory, and device config changes need `qm stop` + `qm start`.
- No disk resize via Terraform. Use `qm resize` on host, growpart/resize2fs in guest, then update the Terraform variable.
- Disable ballooning by default. Enable only after testing on the specific guest OS.
- Monitor thin pool data_percent. Alert at 80%, critical at 90%. At 100%, all I/O fails.
- nohup for long migrations. SSH disconnect kills foreground `qm migrate`.
- `prevent_destroy` + `ignore_changes` on Terraform VMs. Protect disk and node_name from accidental destruction and migration drift.
- Run the AI self-check. Every generated VM config gets verified against the checklist above before returning.
- Test before production. New VM configs, passthrough setups, storage backends - test on a non-critical VM first.