SKILLS LAUNCH PARTY
skills/jackspace/claudeskillz/infrastructure-skill-builder

infrastructure-skill-builder

SKILL.md

Infrastructure Skill Builder

Convert your infrastructure documentation into powerful, reusable Claude Code skills.

Overview

Infrastructure knowledge is often scattered across:

  • README files
  • Runbooks and wiki pages
  • Configuration files
  • Troubleshooting guides
  • Team Slack/Discord history
  • Mental models of senior engineers

This skill helps you systematically capture that knowledge as Claude Code skills for:

  • Faster onboarding
  • Consistent operations
  • Disaster recovery
  • Knowledge preservation
  • Team scaling

When to Use

Use this skill when:

  • Documenting complex infrastructure setups
  • Creating runbooks for operations teams
  • Onboarding new team members to infrastructure
  • Preserving expert knowledge before team changes
  • Standardizing infrastructure operations
  • Building organizational infrastructure library
  • Migrating from manual to automated operations

Skill Extraction Process

Step 1: Identify Infrastructure Domains

Common domains:

  • Container Orchestration: Docker, Kubernetes, Proxmox LXC
  • Cloud Platforms: AWS, GCP, Azure, DigitalOcean
  • Databases: PostgreSQL, MongoDB, Redis, MySQL
  • Web Servers: Nginx, Apache, Caddy, Traefik
  • Monitoring: Prometheus, Grafana, ELK Stack
  • CI/CD: Jenkins, GitLab CI, GitHub Actions
  • Networking: VPNs, Load Balancers, DNS, Firewalls
  • Storage: S3, MinIO, NFS, Ceph
  • Security: Authentication, SSL/TLS, Firewalls

Step 2: Extract Core Operations

For each domain, document:

  1. Setup/Provisioning: How to create new instances
  2. Configuration: How to configure for different use cases
  3. Operations: Day-to-day management tasks
  4. Troubleshooting: Common issues and resolutions
  5. Scaling: How to scale up/down
  6. Backup/Recovery: Disaster recovery procedures
  7. Monitoring: Health checks and alerts
  8. Security: Security best practices

Infrastructure Skill Template

---
name: [infrastructure-component]-manager
description: Expert guidance for [component] management, provisioning, troubleshooting, and operations
license: MIT
tags: [infrastructure, [component], operations, troubleshooting]
---

# [Component] Manager

Expert knowledge for managing [component] infrastructure.

## Authentication & Access

### Access Methods
```bash
# How to access the infrastructure component
ssh user@host
# or
kubectl config use-context cluster-name

Credentials & Configuration

  • Where credentials are stored
  • How to configure access
  • Common authentication issues

Architecture Overview

Component Topology

  • How components are organized
  • Network topology
  • Resource allocation
  • Redundancy setup

Key Resources

  • Resource 1: Purpose and specs
  • Resource 2: Purpose and specs
  • Resource 3: Purpose and specs

Common Operations

Operation 1: [e.g., Create New Instance]

# Step-by-step commands
command1 --flags
command2 --flags

# Verification
verify-command

Operation 2: [e.g., Update Configuration]

# Commands and explanations

Operation 3: [e.g., Scale Resources]

# Commands and explanations

Monitoring & Health Checks

Check System Status

# Health check commands
status-command

# Expected output
# What healthy output looks like

Common Metrics

  • Metric 1: What it means, normal range
  • Metric 2: What it means, normal range
  • Metric 3: What it means, normal range

Troubleshooting

Issue 1: [Common Problem]

Symptoms: What you observe Cause: Why it happens Fix: Step-by-step resolution

# Fix commands

Issue 2: [Another Problem]

Symptoms: Cause: Fix:

# Fix commands

Backup & Recovery

Backup Procedures

# How to backup
backup-command

# Verification
verify-backup

Recovery Procedures

# How to restore
restore-command

# Verification
verify-restore

Security Best Practices

  • Security practice 1
  • Security practice 2
  • Security practice 3

Quick Reference

Task Command
Task 1 command1
Task 2 command2
Task 3 command3

Additional Resources

  • Official documentation links
  • Related skills
  • External references

## Real-World Example: Proxmox Skill

Based on the proxmox-auth skill in this repository:

### Extracted Knowledge

**From**: Proxmox VE cluster documentation + operational experience

**Structured as**:
1. **Authentication**: SSH access patterns, node IPs
2. **Architecture**: Cluster topology (2 nodes, resources)
3. **Operations**: Container/VM management commands
4. **Troubleshooting**: Common errors and fixes
5. **Networking**: Bridge configuration, IP management
6. **GPU Passthrough**: Special container configurations

**Result**: Comprehensive skill covering:
- Quick access to any node
- Container lifecycle management
- GPU-accelerated containers
- Network troubleshooting
- Backup procedures
- Common gotchas and solutions

## Extraction Scripts

### Extract from Runbooks

```bash
#!/bin/bash
# extract-from-runbook.sh - Convert runbook to skill

RUNBOOK_FILE="$1"
SKILL_NAME="$2"

if [ -z "$RUNBOOK_FILE" ] || [ -z "$SKILL_NAME" ]; then
    echo "Usage: $0 <runbook.md> <skill-name>"
    exit 1
fi

SKILL_DIR="skills/$SKILL_NAME"
mkdir -p "$SKILL_DIR"

# Extract sections from runbook
cat > "$SKILL_DIR/SKILL.md" << EOF
---
name: $SKILL_NAME
description: $(head -5 "$RUNBOOK_FILE" | grep -v "^#" | head -1 | xargs)
license: MIT
extracted-from: $RUNBOOK_FILE
---

# ${SKILL_NAME^}

$(cat "$RUNBOOK_FILE")

---

**Note**: This skill was auto-extracted from runbook documentation.
Review and refine before use.
EOF

echo "✓ Created skill: $SKILL_DIR/SKILL.md"
echo "Review and edit to add:"
echo "  - Metadata and tags"
echo "  - Troubleshooting section"
echo "  - Quick reference"
echo "  - Examples"

Extract from Docker Compose

#!/bin/bash
# docker-compose-to-skill.sh - Extract skill from docker-compose.yaml

COMPOSE_FILE="${1:-docker-compose.yaml}"
PROJECT_NAME=$(basename $(pwd))

SKILL_DIR="skills/docker-$PROJECT_NAME"
mkdir -p "$SKILL_DIR"

# Extract services
SERVICES=$(yq eval '.services | keys | .[]' "$COMPOSE_FILE")

cat > "$SKILL_DIR/SKILL.md" << EOF
---
name: docker-$PROJECT_NAME
description: Docker Compose configuration and management for $PROJECT_NAME
license: MIT
---

# Docker $PROJECT_NAME

Manage Docker Compose stack for $PROJECT_NAME.

## Services

$(yq eval '.services | to_entries | .[] | "### " + .key + "\n" + (.value.image // "custom") + "\n"' "$COMPOSE_FILE")

## Quick Start

\`\`\`bash
# Start all services
docker-compose up -d

# Check status
docker-compose ps

# View logs
docker-compose logs -f

# Stop all services
docker-compose down
\`\`\`

## Service Details

### Ports

$(yq eval '.services | to_entries | .[] | select(.value.ports) | "- **" + .key + "**: " + (.value.ports | join(", "))' "$COMPOSE_FILE")

### Volumes

$(yq eval '.services | to_entries | .[] | select(.value.volumes) | "- **" + .key + "**: " + (.value.volumes | join(", "))' "$COMPOSE_FILE")

## Configuration

See \`$COMPOSE_FILE\` for full configuration.

## Common Operations

### Restart Service

\`\`\`bash
docker-compose restart SERVICE_NAME
\`\`\`

### Update Service

\`\`\`bash
docker-compose pull SERVICE_NAME
docker-compose up -d SERVICE_NAME
\`\`\`

### View Service Logs

\`\`\`bash
docker-compose logs -f SERVICE_NAME
\`\`\`

## Troubleshooting

### Service Won't Start

1. Check logs: \`docker-compose logs SERVICE_NAME\`
2. Verify ports not in use: \`netstat -tulpn | grep PORT\`
3. Check disk space: \`df -h\`

### Network Issues

\`\`\`bash
# Recreate network
docker-compose down
docker network prune
docker-compose up -d
\`\`\`
EOF

echo "✓ Created skill from docker-compose.yaml"

Extract from Kubernetes Manifests

#!/bin/bash
# k8s-to-skill.sh - Extract skill from Kubernetes manifests

K8S_DIR="${1:-.}"
APP_NAME="${2:-$(basename $(pwd))}"

SKILL_DIR="skills/k8s-$APP_NAME"
mkdir -p "$SKILL_DIR"

cat > "$SKILL_DIR/SKILL.md" << EOF
---
name: k8s-$APP_NAME
description: Kubernetes deployment and management for $APP_NAME
license: MIT
---

# Kubernetes $APP_NAME

Manage Kubernetes resources for $APP_NAME.

## Resources

$(find "$K8S_DIR" -name "*.yaml" -o -name "*.yml" | while read file; do
    KIND=$(yq eval '.kind' "$file" 2>/dev/null)
    NAME=$(yq eval '.metadata.name' "$file" 2>/dev/null)
    echo "- **$KIND**: $NAME ($(basename $file))"
done)

## Deployment

### Apply All Resources

\`\`\`bash
kubectl apply -f $K8S_DIR/
\`\`\`

### Check Status

\`\`\`bash
# Pods
kubectl get pods -l app=$APP_NAME

# Services
kubectl get svc -l app=$APP_NAME

# Deployments
kubectl get deploy -l app=$APP_NAME
\`\`\`

## Common Operations

### Scale Deployment

\`\`\`bash
kubectl scale deployment $APP_NAME --replicas=3
\`\`\`

### Update Image

\`\`\`bash
kubectl set image deployment/$APP_NAME container=new-image:tag
\`\`\`

### View Logs

\`\`\`bash
kubectl logs -f deployment/$APP_NAME
\`\`\`

### Port Forward

\`\`\`bash
kubectl port-forward svc/$APP_NAME 8080:80
\`\`\`

## Troubleshooting

### Pod Not Starting

\`\`\`bash
# Check pod events
kubectl describe pod POD_NAME

# Check logs
kubectl logs POD_NAME

# Previous instance logs
kubectl logs POD_NAME --previous
\`\`\`

### Service Not Reachable

\`\`\`bash
# Check endpoints
kubectl get endpoints $APP_NAME

# Check service
kubectl describe svc $APP_NAME

# Test from another pod
kubectl run -it --rm debug --image=busybox --restart=Never -- wget -O- http://$APP_NAME
\`\`\`

## Quick Reference

| Task | Command |
|------|---------|
| Apply | \`kubectl apply -f $K8S_DIR/\` |
| Status | \`kubectl get all -l app=$APP_NAME\` |
| Logs | \`kubectl logs -f deployment/$APP_NAME\` |
| Scale | \`kubectl scale deployment $APP_NAME --replicas=N\` |
| Delete | \`kubectl delete -f $K8S_DIR/\` |
EOF

echo "✓ Created Kubernetes skill"

Infrastructure Patterns to Capture

Pattern 1: SSH Access Matrix

## SSH Access

| Host | IP | Purpose | Access |
|------|--------|---------|--------|
| node1 | 192.168.1.10 | Primary | `ssh node1` |
| node2 | 192.168.1.11 | Secondary | `ssh node2` |
| bastion | 203.0.113.5 | Jump host | `ssh -J bastion node1` |

Pattern 2: Service Port Mapping

## Service Ports

| Service | Internal | External | Protocol |
|---------|----------|----------|----------|
| Web | 8080 | 80 | HTTP |
| API | 3000 | 443 | HTTPS |
| DB | 5432 | - | TCP |

Pattern 3: Configuration Files

## Configuration Locations

### Application Config
- Path: `/etc/app/config.yaml`
- Format: YAML
- Requires restart: Yes

### Database Config
- Path: `/var/lib/postgres/postgresql.conf`
- Format: INI
- Requires restart: Yes

Pattern 4: Command Workflows

## Deployment Workflow

1. **Backup current state**
   ```bash
   ./backup.sh
  1. Pull latest code

    git pull origin main
    
  2. Build application

    docker build -t app:latest .
    
  3. Deploy

    docker-compose up -d
    
  4. Verify

    curl http://localhost/health
    

## Best Practices

### ✅ DO

1. **Document assumptions** - What's required before operations
2. **Include verification** - How to verify each operation succeeded
3. **Add troubleshooting** - Common issues and fixes
4. **Show outputs** - Expected command outputs
5. **Link resources** - Related documentation and skills
6. **Version information** - Software versions, configurations
7. **Security notes** - Security implications of operations
8. **Update regularly** - Keep skills current with infrastructure

### ❌ DON'T

1. **Don't hardcode secrets** - Use placeholders or env vars
2. **Don't skip context** - Explain why, not just how
3. **Don't assume knowledge** - Explain terminology
4. **Don't omit edge cases** - Document special scenarios
5. **Don't forget cleanup** - Include teardown procedures
6. **Don't ignore dependencies** - Document prerequisites
7. **Don't skip testing** - Verify all commands work
8. **Don't leave TODO** - Complete all sections

## Quality Checklist

- [ ] Clear component description
- [ ] Authentication/access documented
- [ ] Architecture overview provided
- [ ] Common operations with examples
- [ ] Troubleshooting section complete
- [ ] Health checks documented
- [ ] Backup/recovery procedures
- [ ] Security considerations noted
- [ ] Quick reference table
- [ ] All commands tested
- [ ] No hardcoded secrets
- [ ] Links to resources

## Quick Start Workflow

```bash
# 1. Identify infrastructure component
COMPONENT="nginx-reverse-proxy"

# 2. Gather documentation
# - Collect README files
# - Export wiki pages
# - Capture team knowledge
# - Document current setup

# 3. Create skill structure
mkdir -p skills/$COMPONENT

# 4. Fill in template
# Use the Infrastructure Skill Template above

# 5. Test all commands
# Verify every command in skill works

# 6. Review and refine
# Have team review for completeness

# 7. Commit to repository
git add skills/$COMPONENT
git commit -m "docs: Add $COMPONENT infrastructure skill"

Version: 1.0.0 Author: Harvested from proxmox-auth skill pattern Last Updated: 2025-11-18 License: MIT Key Principle: Convert tribal knowledge into structured, searchable, actionable skills.

Examples in This Repository

  • proxmox-auth: Proxmox VE cluster management
  • docker-*: Docker-based infrastructure
  • cloudflare-*: Cloudflare infrastructure services

Transform your infrastructure documentation into skills today! 🏗️

Weekly Installs
9
First Seen
Jan 24, 2026
Installed on
claude-code7
gemini-cli6
antigravity6
opencode6
windsurf5
trae5