iac-terraform
Infrastructure as Code - Terraform & Terragrunt
Comprehensive guidance for infrastructure as code using Terraform and Terragrunt, from development through production deployment.
Core Workflows
1. New Infrastructure Development
Workflow Decision Tree:
Is this reusable across environments/projects?
├─ Yes → Create a Terraform module
│   └─ See "Creating Terraform Modules" below
└─ No → Create environment-specific configuration
    └─ See "Environment Configuration" below
Creating Terraform Modules
When building reusable infrastructure:
- Scaffold new module with script:
python3 scripts/init_module.py my-module-name
This automatically creates:
- Standard module file structure
- Template files with proper formatting
- Examples directory
- README with documentation
- Use module template structure:
  - See assets/templates/MODULE_TEMPLATE.md for complete structure
  - Required files: main.tf, variables.tf, outputs.tf, versions.tf, README.md
  - Recommended: examples/ directory with working examples
- Follow module best practices:
  - Single responsibility: one module, one purpose
  - Sensible defaults for optional variables
  - Complete descriptions for all variables and outputs
  - Input validation using validation blocks
  - Mark sensitive values with sensitive = true
- Validate module:
python3 scripts/validate_module.py /path/to/module
This checks for:
- Required files present
- Variables have descriptions and types
- Outputs have descriptions
- README exists and is complete
- Naming conventions followed
- Sensitive values properly marked
- Test module:
cd examples/complete
terraform init
terraform plan
- Document module:
  - Use terraform-docs to auto-generate: terraform-docs markdown . > README.md
  - Include usage examples
  - Document all inputs and outputs
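The validation and sensitive-value practices listed above can be illustrated with a small scaffolding sketch. This is not what scripts/init_module.py does internally; the function name write_example_variables and the db_password variable are hypothetical, chosen only to show a validation block and sensitive = true in context.

```shell
# Hypothetical helper: write a starter variables.tf into a module directory.
write_example_variables() {
  local module_dir="$1"
  mkdir -p "$module_dir"
  # Quoted heredoc delimiter so the shell leaves var.* references untouched.
  cat > "$module_dir/variables.tf" <<'EOF'
variable "environment" {
  description = "Deployment environment name."
  type        = string

  # Reject values outside the expected set at plan time.
  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "The environment must be one of: dev, staging, prod."
  }
}

variable "db_password" {
  description = "Master password for the database."
  type        = string
  sensitive   = true # Redacted in plan and apply output.
}
EOF
}
```

Calling write_example_variables ./modules/my-module-name would produce a variables.tf that terraform validate can check once the rest of the module exists.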
Key Module Patterns:
See references/best_practices.md "Module Design" section for:
- Composability patterns
- Variable organization
- Output design
- Module versioning strategies
Environment Configuration
For environment-specific infrastructure:
- Structure by environment:
environments/
├── dev/
├── staging/
└── prod/
- Use consistent file organization:
environment/
├── main.tf # Resource definitions
├── variables.tf # Variable declarations
├── terraform.tfvars # Default values (committed)
├── secrets.auto.tfvars # Sensitive values (.gitignore)
├── backend.tf # State configuration
├── outputs.tf # Output values
└── versions.tf # Version constraints
- Reference modules:
module "vpc" {
  source      = "git::https://github.com/company/terraform-modules.git//vpc?ref=v1.2.0"
  name        = "${var.environment}-vpc"
  vpc_cidr    = var.vpc_cidr
  environment = var.environment
}
2. State Management & Inspection
When to inspect state:
- Before major changes
- Investigating drift
- Debugging resource issues
- Auditing infrastructure
Inspect state and check health:
# List all managed resources
terraform state list
# Show detailed state for a specific resource
terraform state show <resource_address>
# Show full state summary (all resources, outputs, providers)
terraform show
Check for drift:
# Exit code 0 = no changes, 1 = error, 2 = changes pending (drift or unapplied configuration)
terraform plan -detailed-exitcode
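For scheduled drift checks, the exit code can be turned into a status word. The following is a minimal sketch; the function name check_drift and the status words are assumptions, not part of Terraform.

```shell
# Hypothetical helper: classify the result of `terraform plan -detailed-exitcode`.
check_drift() {
  local rc=0
  # Send the plan itself to stderr so stdout carries only the status word.
  terraform plan -detailed-exitcode -input=false >&2 || rc=$?
  case "$rc" in
    0) echo "in-sync" ;;   # plan succeeded, nothing to change
    2) echo "changes" ;;   # plan succeeded, changes are pending
    *) echo "error" ;;     # plan itself failed
  esac
  return "$rc"
}
```

A CI job could capture the output with status="$(check_drift)" and send a notification whenever the value is not in-sync.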
State operations:
# List all resources
terraform state list
# Show specific resource
terraform state show aws_instance.web
# Remove from state (doesn't destroy)
terraform state rm aws_instance.web
# Move/rename resource
terraform state mv aws_instance.web aws_instance.web_server
# Import existing resource
terraform import aws_instance.web i-1234567890abcdef0
State best practices: See references/best_practices.md "State Management" section for:
- Remote backend setup (S3 + DynamoDB)
- State file organization strategies
- Encryption and security
- Backup and recovery procedures
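Because terraform state rm and terraform state mv rewrite state in place, it can help to take a snapshot first. The sketch below uses terraform state pull, which prints the current state to stdout; the function name backup_state and the default file name are assumptions.

```shell
# Hypothetical helper: save a copy of the current state before editing it.
backup_state() {
  local backup_file="${1:-state-snapshot-$(date +%Y%m%d-%H%M%S).json}"
  # Abort without printing a path if the state could not be read.
  terraform state pull > "$backup_file" || return 1
  echo "$backup_file"
}
```

The snapshot contains every attribute in state, including secrets, so it should stay out of version control.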
3. Standard Terraform Workflow
# 1. Initialize (first time or after module changes)
terraform init
# 2. Format code
terraform fmt -recursive
# 3. Validate syntax
terraform validate
# 4. Plan changes (always review!)
terraform plan -out=tfplan
# 5. Apply changes
terraform apply tfplan
# 6. Verify outputs
terraform output
With Terragrunt:
# Run for single module
terragrunt plan
terragrunt apply
# Run for all modules in directory tree (newer Terragrunt releases spell this `terragrunt run --all <command>`)
terragrunt run-all plan
terragrunt run-all apply
4. Troubleshooting Issues
When encountering errors:
- Read the complete error message; don't skip details
- Consult references/troubleshooting.md, which covers:
  - State lock errors
  - State drift/corruption
  - Provider authentication failures
  - Resource errors (already exists, dependency errors, timeouts)
  - Module source issues
  - Terragrunt-specific issues (dependency cycles, hooks)
  - Performance problems
- Enable debug logging if needed:
export TF_LOG=DEBUG
export TF_LOG_PATH=terraform-debug.log
terraform plan
- Isolate the problem:
# Test specific resource
terraform plan -target=aws_instance.web
terraform apply -target=aws_instance.web
- Common quick fixes:
State locked (full resolution guide: references/troubleshooting.md → State Lock Error):
# Verify no one else is running Terraform against this state, then:
terraform force-unlock <lock-id>
Provider cache issues:
rm -rf .terraform
terraform init -upgrade
Module cache issues:
rm -rf .terraform/modules
terraform init
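The debug-logging and isolation steps above can be combined so that logging is enabled for a single targeted plan rather than the whole shell session. This is a sketch under that assumption; the function name debug_plan_target is hypothetical.

```shell
# Hypothetical helper: plan one resource with debug logging scoped to this call.
debug_plan_target() {
  local target="$1"
  local log_file="${2:-terraform-debug.log}"
  # Variables given as a command prefix apply to this terraform invocation only,
  # so later commands in the session are not left in DEBUG mode.
  TF_LOG=DEBUG TF_LOG_PATH="$log_file" terraform plan -target="$target"
}
```

For example, debug_plan_target aws_instance.web writes the debug log to terraform-debug.log in the current directory.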
5. Code Review & Quality
Before committing:
- Format code:
terraform fmt -recursive
- Validate syntax:
terraform validate
- Lint with tflint:
tflint --module   # newer tflint releases replace --module with --call-module-type=all
- Security scan with checkov:
checkov -d .
- Validate modules:
python3 scripts/validate_module.py modules/vpc
- Generate documentation:
terraform-docs markdown modules/vpc > modules/vpc/README.md
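The formatting, validation, lint, and scan steps above can be chained into one gate that runs every check and reports a single result. This is a minimal sketch; the function name run_checks is hypothetical, and tflint is called without module flags because their spelling varies between tflint versions.

```shell
# Hypothetical pre-commit gate: run every check, then report one overall status.
run_checks() {
  local failed=0
  # -check reports unformatted files without rewriting them.
  terraform fmt -check -recursive || failed=1
  terraform validate || failed=1
  tflint || failed=1
  checkov -d . || failed=1
  if [ "$failed" -ne 0 ]; then
    echo "quality checks failed" >&2
  fi
  return "$failed"
}
```

Running all checks before returning, instead of stopping at the first failure, surfaces every problem in one pass.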
Review checklist:
- All variables have descriptions
- Sensitive values marked as sensitive
- Outputs have descriptions
- Resources follow naming conventions
- No hardcoded values (use variables)
- README is complete and current
- Examples directory exists and works
- Version constraints specified
- Security best practices followed
See references/best_practices.md for comprehensive guidelines.
Terragrunt Patterns
Project Structure
terragrunt-project/
├── terragrunt.hcl # Root config
├── account.hcl # Account-level vars
├── region.hcl # Region-level vars
└── environments/
    ├── dev/
    │   ├── env.hcl # Environment vars
    │   └── us-east-1/
    │       ├── vpc/
    │       │   └── terragrunt.hcl
    │       └── eks/
    │           └── terragrunt.hcl
    └── prod/
        └── us-east-1/
            ├── vpc/
            └── eks/
Dependency Management
# In eks/terragrunt.hcl
dependency "vpc" {
  config_path = "../vpc"
  # Mock outputs for plan/validate; keys must match the outputs read in inputs below
  mock_outputs = {
    vpc_id             = "vpc-mock"
    private_subnet_ids = ["subnet-mock"]
  }
  mock_outputs_allowed_terraform_commands = ["validate", "plan"]
}
inputs = {
  vpc_id     = dependency.vpc.outputs.vpc_id
  subnet_ids = dependency.vpc.outputs.private_subnet_ids
}
Common Patterns
See assets/templates/MODULE_TEMPLATE.md for complete Terragrunt configuration templates including:
- Root terragrunt.hcl with provider generation
- Remote state configuration
- Module-level terragrunt.hcl patterns
- Dependency handling
Reference Documentation
- references/best_practices.md: Project structure, state management, module design, security, CI/CD integration
- references/troubleshooting.md: State lock errors, drift, provider issues, resource errors, Terragrunt-specific problems
- references/cost_optimization.md: Right-sizing, Spot/RI strategies, storage optimization, cost tagging, multi-cloud
CI/CD Workflows
Ready-to-use templates in assets/workflows/:
| Template | Platform | Features |
|---|---|---|
| github-actions-terraform.yml | GitHub Actions | Validation, TFLint, Checkov, plan on PRs, apply on main, OIDC |
| github-actions-terragrunt.yml | GitHub Actions | Changed module detection, parallel planning, dependency-aware apply |
| gitlab-ci-terraform.yml | GitLab CI | Multi-stage pipeline, artifact management, manual gates |
Scripts
| Script | Purpose | Usage |
|---|---|---|
| init_module.py | Scaffold new module with standard structure | python3 scripts/init_module.py <name> [--path ./modules] [--json] |
| validate_module.py | Validate module against best practices | python3 scripts/validate_module.py <path> |
Assets
- templates/MODULE_TEMPLATE.md: Complete module template with file structure, examples, and Terragrunt configs
Quick Reference
Essential Commands
# Initialize
terraform init
terraform init -upgrade # Upgrade providers and modules to the newest versions allowed by constraints
# Validate
terraform validate
terraform fmt -recursive
# Plan
terraform plan
terraform plan -out=tfplan
# Apply
terraform apply
terraform apply tfplan
terraform apply -auto-approve # CI/CD only
# State
terraform state list
terraform state show <resource>
terraform state rm <resource>
terraform state mv <old> <new>
# Import
terraform import <resource_address> <resource_id>
# Destroy
terraform destroy
terraform destroy -target=<resource>
# Outputs
terraform output
terraform output <output_name>
Terragrunt Commands
# Single module
terragrunt init
terragrunt plan
terragrunt apply
# All modules
terragrunt run-all plan
terragrunt run-all apply
terragrunt run-all destroy
# With specific modules
terragrunt run-all apply --terragrunt-include-dir vpc --terragrunt-include-dir eks