NYC
skills/smithery/ai/proxmox-infrastructure

proxmox-infrastructure

SKILL.md

Proxmox Infrastructure: Matrix/Virgo-Core Cluster

Manage 3-node Proxmox VE cluster with CEPH storage, VLAN networking, and cloud-init automation.

Trigger Phrases

Use this skill when you see:

  • "Clone the Ubuntu template to create a new VM"
  • "Check CEPH storage health on the cluster"
  • "Configure VLAN 30 networking for a Proxmox VM"
  • "Create a cloud-init template from Ubuntu noble image"
  • "What's the status of nodes Foxtrot, Golf, or Hotel?"
  • "Deploy VMs using Terraform and the Proxmox provider"
  • "Configure MTU 9000 for CEPH storage networks"
  • "Troubleshoot VM deployment or networking issues"
  • "Use Ansible community.general.proxmox modules"
  • "Validate template configuration via Proxmox API"

Available Tools

Python Scripts (uv-based)

  • validate_template.py - Validate template health via Proxmox API
  • cluster_status.py - Cluster health metrics and node status
  • check_ceph_health.py - CEPH storage pool health monitoring
  • check_cluster_health.py - Comprehensive cluster diagnostics

All scripts support --help for usage. Run with: uv run tools/<script>.py

Automation Examples

  • Ansible playbooks - Template creation, VLAN bridging setup
  • Terraform modules - VM cloning, multi-node deployments with dual NICs

See examples/ and workflows/ for working configurations.

Core Capabilities

Template Management:

  • Ubuntu/Debian cloud-init templates with virtio-scsi
  • Serial console configuration for cloud images
  • Proper boot order and cloud-init CD-ROM (ide2)

Network Infrastructure:

  • vmbr0: Management (VLAN 9 for Corosync)
  • vmbr1: CEPH Public (192.168.5.0/24, MTU 9000)
  • vmbr2: CEPH Private (192.168.7.0/24, MTU 9000)
  • VLAN-aware bridging (supports VLANs 2-4094)
  • 802.3ad LACP bonding

CEPH Storage:

  • Multi-OSD configuration (2× 4TB NVMe per node)
  • MTU 9000 for storage networks
  • Public/private network separation
  • Health monitoring and diagnostics

API Automation:

  • Python via proxmoxer library
  • Ansible via community.general.proxmox_* modules
  • Terraform via Telmate/proxmox provider

Cluster Architecture (Matrix)

Hardware: 3× MINISFORUM MS-A2

  • AMD Ryzen 9 9955HX (16C/32T)
  • 64GB DDR5 @ 5600 MT/s
  • 3× NVMe: 1× 1TB boot, 2× 4TB CEPH OSDs
  • 4× NICs: 2× 10GbE SFP+, 2× 2.5GbE

Nodes: Foxtrot, Golf, Hotel

Quick Examples

Clone template to VM:

qm clone 9000 101 --name web-01
qm set 101 --ipconfig0 ip=192.168.3.100/24,gw=192.168.3.1
qm set 101 --net0 virtio,bridge=vmbr0,tag=30
qm start 101

Check cluster health:

uv run tools/cluster_status.py
uv run tools/check_ceph_health.py

For Details

  • reference/ - Cloud-init, networking, API, storage, QEMU guest agent
  • workflows/ - Cluster formation, CEPH deployment automation
  • examples/ - Terraform configs, Ansible playbooks
  • anti-patterns/ - Common mistakes from real deployments
Weekly Installs
1
Repository
smithery/ai
First Seen
11 days ago
Installed on
claude-code1