terragrunt

SKILL.md

Terragrunt Infrastructure Skill

Manage bare-metal Kubernetes infrastructure from PXE boot to running clusters.

For architecture overview (units vs modules, config centralization), see infrastructure/CLAUDE.md. For detailed unit patterns, see infrastructure/units/CLAUDE.md.

Task Commands (Always Use These)

# Validation (run in order)
task tg:fmt                    # Format HCL files
task tg:test-<module>          # Test specific module (e.g., task tg:test-config)
task tg:validate-<stack>       # Validate stack (e.g., task tg:validate-integration)

# Operations
task tg:list                   # List available stacks
task tg:plan-<stack>           # Plan (e.g., task tg:plan-integration)
task tg:apply-<stack>          # Apply (REQUIRES HUMAN APPROVAL)
task tg:gen-<stack>            # Generate stack files
task tg:clean-<stack>          # Clean generated files

NEVER run terragrunt or tofu directly—always use task commands.

How to Add a Machine

  1. Edit inventory.hcl:
node50 = {
  cluster = "live"
  type    = "worker"
  install = {
    selector     = "disk.model == 'Samsung'"
    architecture = "amd64"
  }
  interfaces = [{
    id           = "eth0"
    hardwareAddr = "aa:bb:cc:dd:ee:ff"  # VERIFY correct
    addresses    = [{ ip = "192.168.10.50" }]  # VERIFY available
  }]
}
  1. Run task tg:plan-live
  2. Review plan—config module auto-includes machines where cluster == "live"
  3. Request human approval before apply

How to Add a Feature Flag

  1. Add version to versions.hcl if needed
  2. Add feature detection in modules/config/main.tf:
locals {
  new_feature_enabled = contains(var.features, "new-feature")
}
  1. Enable in stack's features list:
features = ["gateway-api", "longhorn", "new-feature"]

How to Create a New Unit

  1. Create units/new-unit/terragrunt.hcl:
include "root" {
  path = find_in_parent_folders("root.hcl")
}

terraform {
  source = "../../../.././/modules/new-unit"
}

dependency "config" {
  config_path = "../config"
  mock_outputs = { new_unit = {} }
}

inputs = dependency.config.outputs.new_unit
  1. Create corresponding modules/new-unit/ with variables.tf, main.tf, outputs.tf, versions.tf
  2. Add output from config module
  3. Add unit block to stacks that need it

How to Write Module Tests

Tests use OpenTofu native testing in modules/<name>/tests/*.tftest.hcl:

# Top-level variables set defaults for ALL run blocks
variables {
  name     = "test-cluster"
  features = ["gateway-api"]
  machines = {
    node1 = {
      cluster = "test-cluster"
      type    = "controlplane"
      # ... complete machine definition
    }
  }
}

run "feature_enabled" {
  command = plan
  variables {
    features = ["prometheus"]  # Only override what differs
  }
  assert {
    condition     = output.prometheus_enabled == true
    error_message = "Prometheus should be enabled"
  }
}

Run with task tg:test-config or task tg:test for all modules.

Safety Rules

  • NEVER run apply without explicit human approval
  • NEVER use --auto-approve flags
  • NEVER guess MAC addresses or IPs—verify against inventory.hcl
  • NEVER commit .terragrunt-cache/ or .terragrunt-stack/
  • NEVER manually edit Terraform state

State Operations

When removing state entries with indexed resources (e.g., this["rpi4"]), xargs strips the quotes causing errors. Use a while loop instead:

# WRONG - xargs mangles quotes in resource names
terragrunt state list | xargs -n 1 terragrunt state rm

# CORRECT - while loop preserves quotes
terragrunt state list | while read -r resource; do terragrunt state rm "$resource"; done

This applies to any state operation on resources with map keys like data.talos_machine_configuration.this["rpi4"].

Validation Checklist

Before requesting apply approval:

  • task tg:fmt passes
  • task tg:test passes (if module tests exist)
  • task tg:validate passes for ALL stacks
  • task tg:plan-<stack> reviewed
  • No unexpected destroys in plan
  • Network changes won't break connectivity

References

  • stacks.md - Detailed Terragrunt stacks documentation
  • units.md - Detailed Terragrunt units documentation
Weekly Installs
51
Repository
ionfury/homelab
GitHub Stars
22
First Seen
Jan 25, 2026
Installed on
cursor51
gemini-cli51
github-copilot51
codex51
opencode51
amp50