Terraform Best Practices
This skill provides guidance for writing reliable, secure Terraform configurations in RedC deployment scenarios. The recommendations here come from real-world experience with multi-cloud red team infrastructure — where a misconfigured state file or leaked credential can compromise an entire operation.
State Management
Use remote state because local state files are a single point of failure. If the file gets deleted, corrupted, or conflicts with another operator's changes, you lose track of what's deployed — which in red team scenarios means orphaned infrastructure you're still paying for.
- Store state in cloud-native backends (S3+DynamoDB for AWS, OSS for Alibaba, COS for Tencent)
- Enable state locking to prevent two operators from applying simultaneously
- Keep `.tfstate` out of version control; it often contains sensitive outputs like IP addresses and credentials
- Separate state per environment (dev/staging/prod) so a bad apply in dev doesn't corrupt prod state
Example — remote backend configuration:
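terraform {
  backend "s3" {
    bucket         = "myteam-tfstate"
    key            = "prod/infra.tfstate" # one key per environment
    region         = "us-east-1"
    dynamodb_table = "tf-lock" # lock table, blocks concurrent applies
    encrypt        = true      # encrypt the state object at rest
  }
}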
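Backend blocks cannot reference variables, so one way to separate state per environment is Terraform's partial backend configuration. The sketch below assumes the same bucket and lock table as the example above; the file name `prod.s3.tfbackend` is a hypothetical choice, not a RedC convention.

```hcl
# backend.tf: settings shared by every environment.
# The bucket and key are deliberately left out and supplied at init time.
terraform {
  backend "s3" {
    region         = "us-east-1"
    dynamodb_table = "tf-lock"
    encrypt        = true
  }
}

# prod.s3.tfbackend (hypothetical file name), one such file per environment.
# Supplied at init time with:
#   terraform init -backend-config=prod.s3.tfbackend
bucket = "myteam-tfstate"
key    = "prod/infra.tfstate"
```

A matching `dev.s3.tfbackend` with `key = "dev/infra.tfstate"` would give dev its own state object, so a bad apply there never touches the prod state.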
Module Design
Modules keep configurations DRY and composable. A well-designed module encapsulates one concern (e.g., "a VPC with public/private subnets") so you can reuse it across deployments without copy-pasting.
- Keep each module focused on a single responsibility — a "vpc" module shouldn't also create EC2 instances
- Use `validation` blocks on variables to reject bad input at plan time, before anything is created
- Output the values downstream resources need (IDs, IPs, endpoints, DNS names)
- Pin module versions in production (`source = "git::...?ref=v1.2.0"`) to avoid surprise changes
Example — variable validation:
variable "instance_type" {
  type        = string
  description = "EC2 instance type for the deployment"
  default     = "t3.small"

  validation {
    condition     = can(regex("^(t3|t4g|m5|c5)\\.", var.instance_type))
    error_message = "Use t3, t4g, m5, or c5 family instances for red team workloads."
  }
}
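The single-responsibility, output, and version-pinning points above could look roughly like this. It is a minimal sketch: the module layout, the repository URL, and the `cidr_block` input are all hypothetical.

```hcl
# modules/vpc/main.tf (hypothetical layout): the module creates only the VPC.
variable "cidr_block" {
  type        = string
  description = "CIDR range for the VPC"
}

resource "aws_vpc" "this" {
  cidr_block = var.cidr_block
}

# Expose what downstream resources need instead of creating them here.
output "vpc_id" {
  description = "ID of the VPC, for subnets and security groups defined elsewhere"
  value       = aws_vpc.this.id
}

# Root configuration: consume the module from a pinned tag.
module "vpc" {
  # Hypothetical repository URL; ?ref pins the release tag.
  source     = "git::https://example.com/myteam/terraform-vpc.git?ref=v1.2.0"
  cidr_block = "10.20.0.0/16"
}
```

Other resources in the root configuration would then read the ID as `module.vpc.vpc_id`, and moving to a newer module release becomes an explicit change to the `ref` value.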
Security
Red team infrastructure is a high-value target. A leaked credential or overly permissive security group doesn't just cost money; it can expose your operation.
- Never hardcode credentials in `.tf` files. Use environment variables (`TF_VAR_*`), a secrets manager, or RedC's profile-based credential injection
- Restrict security group ingress to the CIDRs you actually need. Opening `0.0.0.0/0` on port 22 invites scanners within minutes
- Enable encryption at rest for all storage resources (EBS, S3, disks); it's usually free and prevents data exposure on decommission
- Prefer IAM roles attached to instances over embedding access keys, because role credentials rotate automatically and don't appear in state files
- Tag every resource with `owner`, `purpose`, and `expires` so forgotten infrastructure can be identified and cleaned up
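Several of the points above can be combined in one AWS sketch: restricted SSH ingress, an encrypted root volume, an instance profile instead of access keys, and cleanup tags. Every variable name and tag value here is an illustrative assumption.

```hcl
variable "operator_cidr" {
  type        = string
  description = "CIDR allowed to reach SSH, for example an operator VPN egress range"

  validation {
    condition     = can(cidrhost(var.operator_cidr, 0)) && var.operator_cidr != "0.0.0.0/0"
    error_message = "Provide a valid CIDR other than 0.0.0.0/0."
  }
}

variable "vpc_id" {
  type        = string
  description = "VPC that hosts the instance"
}

variable "ami_id" {
  type        = string
  description = "AMI for the instance"
}

variable "instance_profile_name" {
  type        = string
  description = "Existing IAM instance profile to attach instead of access keys"
}

locals {
  # Hypothetical values for the three cleanup tags.
  cleanup_tags = {
    owner   = "alice"
    purpose = "redirector"
    expires = "2025-12-31"
  }
}

resource "aws_security_group" "ssh" {
  name_prefix = "redteam-ssh-"
  vpc_id      = var.vpc_id

  ingress {
    description = "SSH from the operator range only"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = [var.operator_cidr]
  }

  tags = local.cleanup_tags
}

resource "aws_instance" "redirector" {
  ami                    = var.ami_id
  instance_type          = "t3.small"
  vpc_security_group_ids = [aws_security_group.ssh.id]
  iam_instance_profile   = var.instance_profile_name

  root_block_device {
    encrypted = true # encryption at rest for the root EBS volume
  }

  tags = local.cleanup_tags
}
```

The security group defines no egress rule, so a real deployment would probably add one for whatever outbound traffic the instance needs.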
Variables and Locals
Good variable hygiene makes templates reusable and self-documenting.
- Set sensible defaults for optional variables so users can deploy with minimal config
- Write a `description` for every variable; RedC's template system surfaces these to users
- Mark sensitive variables with `sensitive = true` to prevent their values from appearing in logs and plan output
- Use `locals` for computed values to avoid repeating expressions across resources
Example — locals for computed naming:
locals {
  name_prefix = "${var.project}-${var.environment}"
  common_tags = {
    Project     = var.project
    Environment = var.environment
    ManagedBy   = "RedC"
  }
}
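The `description` and `sensitive` points can be sketched the same way. The variable name `admin_password` is hypothetical and used only for illustration.

```hcl
variable "admin_password" {
  type        = string
  description = "Password for the service admin account (hypothetical example variable)"
  sensitive   = true # redacted in plan and apply output
}

# An output that exposes a sensitive variable must be marked sensitive too;
# Terraform reports an error otherwise.
output "admin_password" {
  description = "Admin password, redacted in CLI output"
  value       = var.admin_password
  sensitive   = true
}
```

Note that `sensitive = true` only redacts CLI output. The value is still written to the state file in plain text, which is one more reason to keep state in an encrypted remote backend.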
Common Pitfalls
These are the mistakes that come up most often in RedC deployments:
| Symptom | Likely Cause | Fix |
|---|---|---|
| "Could not load plugin" after adding a provider | `terraform init` was not re-run | Run `terraform init -upgrade` |
| Resources created in wrong order | Missing implicit dependency | Add explicit `depends_on` |
| Plan shows unexpected changes | State drift from manual console changes | Run `terraform plan -refresh-only` (the replacement for the deprecated `terraform refresh`), then review |
| Sensitive value visible in output | Forgot `sensitive = true` | Add it to the variable and output blocks |
| `-target` left partial state | Targeted apply skipped dependent resources | Apply the full plan without `-target` to reconcile |
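As an illustration of the `depends_on` fix in the table, the sketch below shows a case where Terraform cannot infer the ordering: the instance references only the instance profile, yet software on it needs the role policy to exist at boot. All resource names, the bucket ARN, and the `ami_id` variable are hypothetical.

```hcl
variable "ami_id" {
  type        = string
  description = "AMI for the example instance (hypothetical input)"
}

resource "aws_iam_role" "app" {
  name = "example-app-role"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action    = "sts:AssumeRole"
      Effect    = "Allow"
      Principal = { Service = "ec2.amazonaws.com" }
    }]
  })
}

resource "aws_iam_instance_profile" "app" {
  name = "example-app-profile"
  role = aws_iam_role.app.name
}

resource "aws_iam_role_policy" "app" {
  name = "example-app-policy"
  role = aws_iam_role.app.name
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action   = "s3:GetObject"
      Effect   = "Allow"
      Resource = "arn:aws:s3:::example-bucket/*"
    }]
  })
}

resource "aws_instance" "app" {
  ami                  = var.ami_id
  instance_type        = "t3.small"
  iam_instance_profile = aws_iam_instance_profile.app.name

  # Nothing above references the policy, so the ordering is declared explicitly.
  depends_on = [aws_iam_role_policy.app]
}
```

Where one resource can reference an attribute of another directly, that implicit dependency is usually preferable; `depends_on` is for hidden orderings like this one.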