eks-cluster

EKS Cluster + Node Group + EC2 Setup

Provision EKS clusters with managed node groups for running services on AWS.

Prerequisites

aws --version              # AWS CLI v2
eksctl version             # eksctl 0.175+
kubectl version --client   # kubectl (client only, since no cluster exists yet)

Ensure your AWS credentials have permissions for EKS, EC2, VPC, IAM, and CloudFormation.
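The prerequisite checks above can be wrapped in a small preflight helper. This is a minimal sketch, not part of the skill's scripts: `require_tools` is a hypothetical function name, and the credential check in the usage comment relies only on the standard `aws sts get-caller-identity` call.

```bash
# Hypothetical helper (not shipped with this skill): report every required CLI
# that is missing from PATH and return non-zero if at least one is absent.
require_tools() {
  local missing=0 tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage (commented out so that sourcing this snippet has no side effects):
#   require_tools aws eksctl kubectl || exit 1
#   aws sts get-caller-identity --query Arn --output text   # prints the active IAM identity
```

Checking all tools before reporting, rather than stopping at the first missing one, gives a complete list in a single run.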

Quick Start

Create a production-ready cluster with a single command:

scripts/create-eks-cluster.sh \
  --name my-cluster \
  --region us-east-1 \
  --version 1.31 \
  --node-type m6i.xlarge \
  --nodes 3 \
  --nodes-min 2 \
  --nodes-max 8

This creates:

VPC with public + private subnets across 3 AZs
EKS cluster with specified Kubernetes version
Managed node group with autoscaling
Essential addons (VPC CNI, CoreDNS, kube-proxy, EBS CSI, Pod Identity Agent)
NGINX Ingress Controller

Workflow

Step 1: Create EKS Cluster

For a new cluster in a newly created VPC with the default settings:

scripts/create-eks-cluster.sh \
  --name <CLUSTER_NAME> \
  --region <REGION> \
  --version <K8S_VERSION>

Options:

--version — Kubernetes version (default: 1.31)
--vpc-cidr — VPC CIDR block (default: 10.0.0.0/16)
--node-type — EC2 instance type (default: m6i.xlarge)
--nodes — Desired node count (default: 3)
--nodes-min — Minimum nodes for autoscaling (default: 2)
--nodes-max — Maximum nodes for autoscaling (default: 8)
--node-volume-size — EBS volume size in GB (default: 80)
--ssh-key — EC2 key pair name for SSH access
--existing-vpc — Use an existing VPC ID instead of creating a new one
--private-subnets — Comma-separated private subnet IDs (with --existing-vpc)
--public-subnets — Comma-separated public subnet IDs (with --existing-vpc)
--spot — Use Spot instances for the node group
--dry-run — Generate eksctl config without creating resources

For more control, generate an eksctl config file and edit it before creating the cluster:

scripts/create-eks-cluster.sh --name my-cluster --config-only
# Generates eksctl-config.yaml — edit it, then:
eksctl create cluster -f eksctl-config.yaml
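For orientation, the sketch below prints what a minimal eksctl `ClusterConfig` matching the documented defaults might look like. It is an assumption, not the actual output of `create-eks-cluster.sh`: `render_cluster_config` is a hypothetical helper, the node group name `default` is made up, and the real generated file may contain more or different fields.

```bash
# Hypothetical sketch: print a minimal eksctl ClusterConfig to stdout.
# Field values mirror the defaults listed above; the file generated by
# create-eks-cluster.sh may differ.
render_cluster_config() {
  local name="$1" region="$2" version="$3"
  cat <<EOF
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: ${name}
  region: ${region}
  version: "${version}"
vpc:
  cidr: 10.0.0.0/16
managedNodeGroups:
  - name: default            # assumed node group name
    instanceType: m6i.xlarge
    desiredCapacity: 3
    minSize: 2
    maxSize: 8
    volumeSize: 80
    privateNetworking: true  # place nodes in the private subnets
addons:
  - name: vpc-cni
  - name: coredns
  - name: kube-proxy
  - name: aws-ebs-csi-driver
  - name: eks-pod-identity-agent
EOF
}

# Usage:
#   render_cluster_config my-cluster us-east-1 1.31 > my-config.yaml
#   eksctl create cluster -f my-config.yaml
```

The version is quoted in the YAML so that it is read as a string rather than a floating-point number.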

Step 2: Add Node Groups

Add specialized node groups for different workload types:

# General workloads
scripts/create-node-group.sh \
  --cluster my-cluster \
  --name general \
  --type m6i.xlarge \
  --nodes 3

# GPU / ML workloads
scripts/create-node-group.sh \
  --cluster my-cluster \
  --name gpu \
  --type g5.xlarge \
  --nodes 1 \
  --nodes-max 4 \
  --taint "nvidia.com/gpu=true:NoSchedule" \
  --label workload-type=gpu

# Spot instances for batch / non-critical workloads
scripts/create-node-group.sh \
  --cluster my-cluster \
  --name spot-workers \
  --type m6i.xlarge,m5.xlarge,m5a.xlarge \
  --spot \
  --nodes 2 \
  --nodes-max 10 \
  --label workload-type=spot

Options:

--type — Instance type(s), comma-separated for mixed (default: m6i.xlarge)
--nodes / --nodes-min / --nodes-max — Autoscaling range
--spot — Use Spot instances
--taint — Apply taint (key=value:effect)
--label — Apply label (key=value), repeatable
--volume-size — EBS volume in GB (default: 80)
--ami-family — AMI family: AmazonLinux2023, Bottlerocket (default: AmazonLinux2023)
--ssh-key — EC2 key pair for SSH access
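A taint keeps pods off a node group unless they explicitly tolerate it, so workloads meant for the GPU group above need a matching toleration and, typically, a node selector on the label. The sketch below renders such a Pod manifest. `render_gpu_pod` is a hypothetical helper, and the manifest assumes the taint `nvidia.com/gpu=true:NoSchedule` and label `workload-type=gpu` from the example. It omits GPU resource requests, which additionally require a device plugin on the nodes.

```bash
# Hypothetical sketch: print a Pod manifest that tolerates the GPU taint and
# selects the GPU node group by label. Name and image are supplied by the caller.
render_gpu_pod() {
  local name="$1" image="$2"
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: ${name}
spec:
  nodeSelector:
    workload-type: gpu
  tolerations:
    - key: nvidia.com/gpu
      operator: Equal
      value: "true"
      effect: NoSchedule
  containers:
    - name: main
      image: ${image}
EOF
}

# Usage:
#   render_gpu_pod gpu-smoke-test <IMAGE_URI> | kubectl apply -f -
```

Without the toleration the pod stays `Pending`. Without the node selector it may land on any untainted node instead of the GPU group.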

Step 3: Install Addons

The cluster script installs essential addons automatically. To manage them separately:

# Pod Identity Agent (required for aws-s3-eks skill)
eksctl create addon --cluster my-cluster --name eks-pod-identity-agent

# EBS CSI Driver (for PersistentVolumeClaims)
eksctl create addon --cluster my-cluster --name aws-ebs-csi-driver \
  --service-account-role-arn arn:aws:iam::ACCOUNT:role/ebs-csi-role

# EFS CSI Driver (for ReadWriteMany PVCs)
eksctl create addon --cluster my-cluster --name aws-efs-csi-driver

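# AWS Load Balancer Controller (alternative to NGINX Ingress)
# Not published as an EKS managed addon, so install it with Helm instead.
# The controller also needs an IAM role for its service account.
helm repo add eks https://aws.github.io/eks-charts
helm upgrade --install aws-load-balancer-controller eks/aws-load-balancer-controller \
  -n kube-system --set clusterName=my-cluster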

See references/addons.md for full addon details and IAM role setup.
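After installing addons it can help to confirm that each one reached the `ACTIVE` state. The following is a minimal sketch under the assumption that the AWS CLI is configured for the cluster's region. `check_addons` is a hypothetical helper, and it relies only on the standard `aws eks describe-addon` call.

```bash
# Hypothetical helper: print the status of each named EKS addon and return
# non-zero if any of them is not ACTIVE (or cannot be described at all).
check_addons() {
  local cluster="$1"; shift
  local addon status rc=0
  for addon in "$@"; do
    status="$(aws eks describe-addon --cluster-name "$cluster" --addon-name "$addon" \
      --query 'addon.status' --output text 2>/dev/null || echo "NOT_FOUND")"
    echo "${addon}: ${status}"
    [ "$status" = "ACTIVE" ] || rc=1
  done
  return "$rc"
}

# Usage:
#   check_addons my-cluster vpc-cni coredns kube-proxy aws-ebs-csi-driver eks-pod-identity-agent
```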

Step 4: Deploy Your Services

After the cluster is ready, deploy your services:

# Create a namespace
kubectl create namespace my-app

# Deploy via manifests
kubectl apply -f k8s/

# Deploy via Helm
helm upgrade --install my-app ./helm/ -n my-app --create-namespace
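Because the cluster script installs the NGINX Ingress Controller, a deployed service is usually exposed through an `Ingress` resource. The sketch below renders a minimal one. `render_ingress` is a hypothetical helper, and the manifest assumes an IngressClass named `nginx` and an existing Service whose name matches the app and which listens on port 80; adjust these to your deployment.

```bash
# Hypothetical sketch: print a minimal Ingress that routes a host to a Service.
# Assumes an IngressClass called "nginx" and a Service named after the app on port 80.
render_ingress() {
  local app="$1" namespace="$2" host="$3"
  cat <<EOF
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ${app}
  namespace: ${namespace}
spec:
  ingressClassName: nginx
  rules:
    - host: ${host}
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: ${app}
                port:
                  number: 80
EOF
}

# Usage:
#   render_ingress my-app my-app app.example.com | kubectl apply -f -
```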

Step 5: Verify

# Cluster health
kubectl get nodes -o wide
kubectl get pods -A

# Service endpoints
kubectl get ingress -A
kubectl get svc -A
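For scripted checks, the manual inspection above can be turned into a pass/fail gate. This is a minimal sketch with a hypothetical function name, `verify_cluster`; it uses only the standard `kubectl wait` and `kubectl get --field-selector` commands and assumes the current kubeconfig context points at the new cluster.

```bash
# Hypothetical helper: wait for all nodes to become Ready, then list pods whose
# phase is neither Running nor Succeeded so problems are easy to spot.
verify_cluster() {
  local timeout="${1:-300s}"
  # Block until every node reports Ready, or fail once the timeout expires.
  kubectl wait --for=condition=Ready nodes --all --timeout="$timeout" || return 1
  # Pods in Pending, Failed, or Unknown phase show up here.
  kubectl get pods -A --field-selector='status.phase!=Running,status.phase!=Succeeded'
}

# Usage:
#   verify_cluster 600s
```

A pod in the `Running` phase is not necessarily ready to serve traffic, so treat this as a coarse first check rather than a full health probe.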

Workload → Instance Type Mapping

| Workload Profile | Recommended Instance | Min Nodes |
| --- | --- | --- |
| Low CPU, low memory (auth, proxies) | t3.medium or m6i.large | 1 |
| Medium CPU, medium memory (APIs, agents) | m6i.xlarge | 2 |
| High CPU, medium memory (request routing) | c6i.xlarge or m6i.xlarge | 2 |
| Memory-optimized (databases, caches) | r6i.xlarge | 2 |
| GPU inference | g5.xlarge / g5.2xlarge | 1 |

See references/ec2-instance-types.md for detailed sizing guidance.
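Instance choice also caps how many pods fit on a node. With the VPC CNI in its default mode (no prefix delegation), the usual estimate is ENIs × (IPv4 addresses per ENI − 1) + 2. The sketch below encodes that formula. `max_pods` is a hypothetical helper, and the ENI figures in the usage comments are taken from the EC2 network interface limits for those instance types.

```bash
# Hypothetical helper: estimate max pods per node for the VPC CNI default mode.
#   $1 = number of ENIs the instance type supports
#   $2 = IPv4 addresses per ENI
max_pods() {
  local enis="$1" ips_per_eni="$2"
  echo $(( enis * (ips_per_eni - 1) + 2 ))
}

# Usage:
#   max_pods 4 15   # m6i.xlarge (4 ENIs, 15 IPs each) -> 58
#   max_pods 3 6    # t3.medium  (3 ENIs, 6 IPs each)  -> 17
```

Small instances such as t3.medium can run out of pod slots long before they run out of CPU or memory, which is worth weighing against their lower price.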

Reference Files

eksctl configs: See references/eksctl-configs.md for full ClusterConfig YAML templates (dev, staging, production)
Instance types: See references/ec2-instance-types.md for EC2 sizing by workload with pricing
Addons: See references/addons.md for EKS addon installation, IAM roles, and configuration

Troubleshooting

| Symptom | Cause | Fix |
| --- | --- | --- |
| `eksctl create cluster` hangs | CloudFormation stack stuck | Check AWS Console > CloudFormation for failed events |
| Nodes `NotReady` | VPC CNI or kubelet issue | Run `kubectl describe node <name>` and check Conditions |
| Pods `Pending` | No capacity / taint mismatch | Run `kubectl describe pod <name>` and check Events |
| `CreateContainerConfigError` | Missing ConfigMap/Secret | Verify configmaps and secrets exist in the namespace |
| `ImagePullBackOff` | ECR auth or image not found | Check `aws ecr get-login-password` and image URI |
| Node group create fails | Instance type unavailable in AZ | Try different AZs or instance types |
| `InsufficientFreeAddressesInSubnet` | VPC subnet exhausted | Check subnet CIDR size or use `--vpc-cidr` with larger range |
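Two of the rows above (instance type unavailable in an AZ, and subnet exhaustion) can be checked from the command line. The helpers below are a sketch with hypothetical names. `usable_ips` relies on the fact that AWS reserves 5 addresses in every subnet, and `instance_type_azs` wraps the standard `aws ec2 describe-instance-type-offerings` call.

```bash
# Hypothetical helper: usable IPv4 addresses in a subnet of the given prefix
# length. AWS reserves 5 addresses per subnet.
usable_ips() {
  local prefix="$1"
  echo $(( (1 << (32 - prefix)) - 5 ))
}

# Hypothetical helper: list the Availability Zones in a region that offer
# the given instance type.
instance_type_azs() {
  local instance_type="$1" region="$2"
  aws ec2 describe-instance-type-offerings \
    --location-type availability-zone \
    --filters "Name=instance-type,Values=${instance_type}" \
    --region "$region" \
    --query 'InstanceTypeOfferings[].Location' \
    --output text
}

# Usage:
#   usable_ips 24                          # -> 251
#   instance_type_azs g5.xlarge us-east-1
#   # Free addresses currently left in each subnet of a VPC:
#   aws ec2 describe-subnets --filters "Name=vpc-id,Values=<VPC_ID>" \
#     --query 'Subnets[].[SubnetId,AvailableIpAddressCount]' --output text
```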

Repository

deepparser/skills

First Seen

Mar 10, 2026
