EKS Security

Overview

Comprehensive security hardening guide for Amazon EKS clusters following 2025 best practices. This skill covers control plane security, workload isolation, secrets management, network policies, image scanning, runtime security, and compliance frameworks.

Keywords: EKS security, cluster hardening, IRSA, Pod Security Standards, network policies, secrets management, compliance, vulnerability scanning, runtime security, incident response

Status: Production-ready (2025 best practices)

When to Use This Skill

Hardening new EKS clusters for production
Implementing security controls and policies
Configuring RBAC and IAM access
Setting up secrets management
Preparing for compliance audits (CIS, NIST, SOC2)
Responding to security incidents
Scanning and remediating vulnerabilities
Implementing zero-trust networking
Setting up runtime security monitoring

Security Architecture Overview

┌─────────────────────────────────────────────────────────────┐
│                     EKS Security Layers                     │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  Layer 1: Control Plane Security                           │
│  • Private API endpoint                                     │
│  • Audit logging enabled                                    │
│  • Secrets encryption with KMS                             │
│  • IP allowlisting                                          │
│                                                             │
│  Layer 2: Authentication & Authorization                    │
│  • IAM Roles for Service Accounts (IRSA)                   │
│  • RBAC with least privilege                               │
│  • Pod Identity for workloads                              │
│  • Service account isolation                               │
│                                                             │
│  Layer 3: Workload Security                                │
│  • Pod Security Standards (restricted)                      │
│  • Security contexts                                        │
│  • Read-only root filesystems                              │
│  • Non-root users                                           │
│  • Resource limits                                          │
│                                                             │
│  Layer 4: Network Security                                  │
│  • Network Policies (VPC CNI 1.14+)                        │
│  • Security Groups for Pods                                │
│  • Private subnets for nodes                               │
│  • VPC Flow Logs                                            │
│  • mTLS with service mesh                                  │
│                                                             │
│  Layer 5: Secrets & Data Protection                        │
│  • External Secrets Operator                               │
│  • AWS Secrets Manager integration                         │
│  • Encrypted etcd                                           │
│  • Automatic rotation                                       │
│                                                             │
│  Layer 6: Image & Runtime Security                         │
│  • Amazon Inspector scanning                               │
│  • Admission controllers (OPA/Gatekeeper)                  │
│  • Runtime monitoring (Falco, GuardDuty)                   │
│  • Image signing/verification                              │
│                                                             │
│  Layer 7: Compliance & Audit                               │
│  • CloudTrail logging                                       │
│  • GuardDuty for EKS                                       │
│  • Security Hub integration                                 │
│  • CIS/NIST compliance checks                              │
│                                                             │
└─────────────────────────────────────────────────────────────┘

Quick Start: Essential Security Checklist

Cluster Level (Day 0)

Enable private API endpoint (or public + IP allowlist)
Enable all control plane logging
Configure secrets encryption with KMS
Run a recent Kubernetes version (no more than two minor versions behind the latest)
Enable audit logging
Configure VPC with private subnets
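
As one possible illustration of the cluster-level items above, the following eksctl ClusterConfig is a minimal sketch. eksctl is an assumption here (the checklist does not prescribe a tool), and the cluster name, region, and KMS key ARN are placeholders.

```yaml
# Sketch only: eksctl ClusterConfig covering endpoint access, logging, and KMS encryption.
# Name, region, and key ARN are placeholders, not values from this guide.
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: production-cluster
  region: us-east-1
vpc:
  clusterEndpoints:
    privateAccess: true   # private API endpoint
    publicAccess: false   # if set to true, pair it with vpc.publicAccessCIDRs as an IP allowlist
cloudWatch:
  clusterLogging:
    # all five control plane log types; "audit" is the audit log
    enableTypes: ["api", "audit", "authenticator", "controllerManager", "scheduler"]
secretsEncryption:
  keyARN: arn:aws:kms:REGION:ACCOUNT_ID:key/KEY_ID
```

With publicAccess set to false, eksctl and kubectl must run from inside the VPC or over a VPN/Direct Connect link.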

Node Level (Day 0)

Use Amazon Linux 2023 or Bottlerocket AMI
Enable IMDSv2 enforcement
Grant minimal IAM permissions to the node role
Deploy nodes in private subnets only
Enable SSM for remote access (disable SSH)
Plan for regular node rotation (21 days max)
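
The node-level items could be expressed roughly as follows. This is a sketch assuming eksctl managed node groups; the node group name and capacity are hypothetical.

```yaml
# Sketch only: a managed node group reflecting the node-level checklist.
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: production-cluster
  region: us-east-1
managedNodeGroups:
  - name: hardened-nodes        # hypothetical name
    amiFamily: Bottlerocket     # minimal, container-focused OS
    privateNetworking: true     # nodes are placed in private subnets only
    disableIMDSv1: true         # require IMDSv2 session tokens
    ssh:
      allow: false              # no SSH key pair; use SSM Session Manager for access
    desiredCapacity: 3
```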

Workload Level (Day 1)

Implement Pod Security Standards (restricted level)
Use IRSA for all AWS service access
Disallow privileged containers
Configure security contexts (runAsNonRoot, readOnlyRootFilesystem)
Set resource limits and requests
Use dedicated service accounts per application
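
A workload that satisfies these items might look like the sketch below. The application name, image, user ID, port-free container, and resource figures are illustrative assumptions; my-app-sa matches the service account used later in this guide.

```yaml
# Sketch only: a Deployment intended to pass the "restricted" Pod Security Standard.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  namespace: production
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      serviceAccountName: my-app-sa      # dedicated service account, not "default"
      securityContext:
        runAsNonRoot: true
        runAsUser: 10001                 # any non-zero UID the image supports
        seccompProfile:
          type: RuntimeDefault
      containers:
        - name: app
          image: ACCOUNT_ID.dkr.ecr.REGION.amazonaws.com/my-app:1.0.0  # placeholder image
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            capabilities:
              drop: ["ALL"]
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 500m
              memory: 256Mi
          volumeMounts:
            - name: tmp
              mountPath: /tmp            # writable scratch space for a read-only root
      volumes:
        - name: tmp
          emptyDir: {}
```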

Network Level (Day 1)

Enable network policies (VPC CNI 1.14+ or Calico/Cilium)
Configure default deny-all policies
Use Security Groups for Pods for AWS resource access
Enable VPC Flow Logs
Restrict egress traffic
Deploy private load balancers where possible

Secrets Management (Day 1)

Deploy External Secrets Operator
Migrate secrets to AWS Secrets Manager
Enable automatic secret rotation
Remove hardcoded credentials
Audit secret access via CloudTrail

Image Security (Day 1-2)

Enable Amazon Inspector for ECR repositories
Configure automatic scanning on push
Block deployment of images with critical vulnerabilities
Implement image signing (Sigstore/Notary)
Use minimal base images (distroless, Chainguard)
Configure ECR lifecycle policies to expire old images

Compliance & Monitoring (Day 2-3)

Enable GuardDuty for EKS
Configure Security Hub
Run kube-bench for CIS compliance
Deploy runtime security (Falco)
Set up CloudWatch alarms for security events
Configure SIEM integration

Security Workflow

Phase 1: Foundation (Control Plane)

Review control plane endpoint configuration
Enable comprehensive logging
Configure KMS encryption for secrets
Set up IAM authentication
Implement API access controls

See: references/cluster-security.md

Phase 2: Workload Hardening

Implement Pod Security Standards
Configure security contexts
Set up IRSA for service accounts
Deploy RBAC policies
Enable admission controllers

See: references/workload-security.md
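
For the RBAC step, a least-privilege grant could be sketched as a namespaced Role bound to a single service account. The role name and the choice of ConfigMap read access are hypothetical examples.

```yaml
# Sketch only: read-only access to ConfigMaps in one namespace for one service account.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: config-reader            # hypothetical role name
  namespace: production
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: my-app-config-reader
  namespace: production
subjects:
  - kind: ServiceAccount
    name: my-app-sa
    namespace: production
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: config-reader
```

Preferring Role over ClusterRole keeps the grant scoped to the namespace; `kubectl auth can-i --as=system:serviceaccount:production:my-app-sa get configmaps -n production` is one way to check the result.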

Phase 3: Secrets & Data Protection

Deploy External Secrets Operator
Migrate secrets to AWS Secrets Manager
Configure automatic rotation
Set up CSI Secrets Store Driver (if needed)
Audit and monitor secret access

See: references/secrets-management.md
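
Where the CSI Secrets Store Driver is used instead of (or alongside) External Secrets Operator, a SecretProviderClass is the central object. The sketch below assumes the AWS provider for the driver is installed; the resource name is hypothetical and the secret key reuses the prod/db/password example from this guide.

```yaml
# Sketch only: mounts one Secrets Manager secret as a file via the CSI driver.
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: app-aws-secrets          # hypothetical name
  namespace: production
spec:
  provider: aws
  parameters:
    objects: |
      - objectName: "prod/db/password"
        objectType: "secretsmanager"
```

A pod consumes it through a `csi` volume with driver `secrets-store.csi.k8s.io` and `volumeAttributes.secretProviderClass: app-aws-secrets`; the pod's service account still needs IAM permission to read the secret.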

Phase 4: Network Security

Enable network policies
Configure default deny rules
Set up Security Groups for Pods
Implement service mesh with mTLS (optional)
Enable VPC Flow Logs

Phase 5: Runtime Security

Deploy image scanning
Configure admission controllers
Set up runtime monitoring (Falco)
Enable GuardDuty for EKS
Configure incident response

Phase 6: Compliance & Audit

Run security benchmarks
Configure continuous compliance
Set up security dashboards
Document security controls
Conduct regular security reviews

Critical Security Controls

1. IAM Roles for Service Accounts (IRSA)

Why: Provides pod-level AWS permissions without node-level credentials

Quick Implementation:

# Service Account with IRSA
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-app-sa
  namespace: production
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::ACCOUNT_ID:role/my-app-role

Best Practices:

One service account per application
Explicit trust policies with namespace and SA name
Least privilege IAM policies
Regular audit of role usage

Details: references/cluster-security.md#irsa
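
An explicit trust policy pinned to one namespace and service account could look like this. The sketch is written as a CloudFormation resource purely to stay in YAML (an assumption, not a tool this guide prescribes); REGION, ACCOUNT_ID, and OIDC_ID are placeholders for the cluster's OIDC provider.

```yaml
# Sketch only: IAM role that only production/my-app-sa may assume through IRSA.
Resources:
  MyAppRole:
    Type: AWS::IAM::Role
    Properties:
      RoleName: my-app-role
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Federated: arn:aws:iam::ACCOUNT_ID:oidc-provider/oidc.eks.REGION.amazonaws.com/id/OIDC_ID
            Action: sts:AssumeRoleWithWebIdentity
            Condition:
              StringEquals:
                # "sub" restricts the role to exactly one namespace and service account
                "oidc.eks.REGION.amazonaws.com/id/OIDC_ID:sub": "system:serviceaccount:production:my-app-sa"
                "oidc.eks.REGION.amazonaws.com/id/OIDC_ID:aud": "sts.amazonaws.com"
```

Using StringEquals on the `sub` claim, rather than StringLike with a wildcard, is what prevents other service accounts in the cluster from assuming the role.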

2. Pod Security Standards

Why: Prevents privilege escalation and enforces security best practices

Quick Implementation:

# Restricted namespace
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted

Levels:

privileged: No restrictions (avoid in production)
baseline: Minimal restrictions that block known privilege escalations (development)
restricted: Comprehensive restrictions (REQUIRED for production)

Details: references/workload-security.md#pod-security-standards
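
When an existing namespace cannot meet the restricted level immediately, one possible migration path is to enforce baseline while auditing and warning against restricted. The namespace name below is hypothetical.

```yaml
# Sketch only: staged rollout. Violations of "restricted" are logged and warned about,
# but only "baseline" violations are rejected.
apiVersion: v1
kind: Namespace
metadata:
  name: staging                  # hypothetical namespace being migrated
  labels:
    pod-security.kubernetes.io/enforce: baseline
    pod-security.kubernetes.io/enforce-version: latest
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
```

Once the warnings stop appearing, the enforce label can be raised to restricted to match the production example.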

3. Network Policies

Why: Implement microsegmentation and zero-trust networking

Quick Implementation:

# Default deny all traffic
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress

Capabilities:

Pod-to-pod traffic control
Namespace isolation
External service access control
Defense in depth with Security Groups for Pods

Details: references/workload-security.md#network-policies
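
A default-deny policy also blocks DNS, so it is usually paired with explicit allow rules. The sketch below assumes CoreDNS runs in kube-system with the k8s-app: kube-dns label (the EKS default); the frontend and api labels and port 8080 are hypothetical.

```yaml
# Sketch only: allow DNS lookups for every pod in the namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
---
# Sketch only: let frontend pods reach api pods on one port.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-api
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: api
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 8080
```

Because egress is also denied by default, the frontend pods would need a matching egress rule toward the api pods for this traffic to flow.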

4. External Secrets Operator

Why: Centralized secret management with automatic rotation

Quick Implementation:

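# ExternalSecret resource
# Recent ESO releases serve external-secrets.io/v1; use whichever version your installed CRDs serve
apiVersion: external-secrets.io/v1beta1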
kind: ExternalSecret
metadata:
  name: app-secrets
  namespace: production
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets-manager
    kind: SecretStore
  target:
    name: app-secrets-k8s
  data:
  - secretKey: db-password
    remoteRef:
      key: prod/db/password

Benefits:

Automatic synchronization
Works with Fargate (unlike CSI driver)
Multiple backend support
Audit trail via CloudTrail

Details: references/secrets-management.md#external-secrets-operator
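
The ExternalSecret above refers to a SecretStore named aws-secrets-manager. A minimal sketch of that store follows, assuming authentication through an IRSA-enabled service account; the service account name external-secrets-sa and the region are hypothetical, and the apiVersion should match the one used for the ExternalSecret.

```yaml
# Sketch only: SecretStore backed by AWS Secrets Manager, authenticated via IRSA.
apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
  name: aws-secrets-manager
  namespace: production
spec:
  provider:
    aws:
      service: SecretsManager
      region: us-east-1
      auth:
        jwt:
          serviceAccountRef:
            name: external-secrets-sa   # hypothetical; its IAM role needs secretsmanager:GetSecretValue
```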

5. Container Image Scanning

Why: Identify and remediate vulnerabilities before deployment

Amazon Inspector 2025 Features:

Container image mapping (shows running containers)
Extended coverage (distroless, scratch, Chainguard)
Continuous monitoring with automatic rescans
Prioritization based on actively running images

Quick Setup:

# Enable enhanced scanning
aws ecr put-registry-scanning-configuration \
  --scan-type ENHANCED \
  --rules '[{"repositoryFilters":[{"filter":"*","filterType":"WILDCARD"}],"scanFrequency":"CONTINUOUS_SCAN"}]'

Details: references/workload-security.md#image-scanning

6. Runtime Security Monitoring

Why: Detect and respond to threats in real-time

Tools:

Amazon GuardDuty for EKS: Managed threat detection
Falco: Open-source runtime security
OPA/Gatekeeper: Policy enforcement

GuardDuty for EKS Capabilities:

Suspicious API calls
Privilege escalation attempts
Cryptocurrency mining detection
Anomalous network activity
Container escape attempts

Details: references/workload-security.md#runtime-security
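
For Falco, detections are expressed as rules. The custom rule below is a sketch that assumes Falco's default ruleset is loaded (it relies on the spawned_process and container macros defined there); the rule name, shell list, and namespace filter are illustrative.

```yaml
# Sketch only: alert when an interactive shell starts inside a production container.
- rule: Shell Spawned in Production Container
  desc: Detect a shell process starting in a container in the production namespace
  condition: >
    spawned_process and container
    and proc.name in (bash, sh, zsh)
    and k8s.ns.name = "production"
  output: >
    Shell started in container
    (user=%user.name pod=%k8s.pod.name ns=%k8s.ns.name cmd=%proc.cmdline)
  priority: WARNING
  tags: [container, shell]
```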

Security Patterns

Pattern 1: Zero-Trust Cluster (Maximum Security)

Configuration:

Private API endpoint only
All nodes in private subnets
Network policies enabled (default deny)
Pod Security Standards (restricted)
mTLS service mesh (Istio)
No public load balancers
VPN/Direct Connect for access

Use Case: Healthcare, finance, regulated industries
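
If Istio is the mesh, mesh-wide mutual TLS is typically switched on with a PeerAuthentication resource in the root namespace. This sketch assumes the default root namespace istio-system.

```yaml
# Sketch only: require mTLS for all workloads in the mesh.
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system        # Istio root namespace (default install)
spec:
  mtls:
    mode: STRICT                 # reject plaintext traffic between sidecars
```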

Pattern 2: Defense-in-Depth Production Cluster

Configuration:

Public + private API endpoints
IP allowlist on public endpoint
Nodes in private subnets
Network policies + Security Groups for Pods
Pod Security Standards (restricted)
External Secrets Operator
GuardDuty + Inspector enabled

Use Case: Standard production workloads
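
Security Groups for Pods are attached through a SecurityGroupPolicy. The sketch below assumes the VPC CNI has pod ENI support enabled; the policy name, pod label, and security group ID are placeholders.

```yaml
# Sketch only: attach a dedicated security group to pods labeled app=my-app.
apiVersion: vpcresources.k8s.aws/v1beta1
kind: SecurityGroupPolicy
metadata:
  name: my-app-sgp               # hypothetical name
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: my-app
  securityGroups:
    groupIds:
      - sg-0123456789abcdef0     # placeholder; e.g. a group allowed to reach the database
```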

Pattern 3: Multi-Tenant Cluster

Configuration:

Namespace isolation with RBAC
ResourceQuotas per namespace
Network policies per namespace
Dedicated node groups with taints/tolerations
Pod Security Standards per namespace
Audit logging of all API calls

Use Case: Platform teams, SaaS applications
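
Per-tenant quotas could be sketched as a ResourceQuota plus a LimitRange that supplies defaults, so pods without explicit requests are still admitted under the quota. The tenant-a namespace and all figures are hypothetical.

```yaml
# Sketch only: cap aggregate usage for one tenant namespace.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: tenant-a-quota
  namespace: tenant-a
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    pods: "50"
---
# Sketch only: default requests and limits for containers that omit them.
apiVersion: v1
kind: LimitRange
metadata:
  name: tenant-a-defaults
  namespace: tenant-a
spec:
  limits:
    - type: Container
      defaultRequest:
        cpu: 100m
        memory: 128Mi
      default:
        cpu: 500m
        memory: 256Mi
```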

Compliance Frameworks

CIS Kubernetes Benchmark

Tool: kube-bench

# Run CIS benchmark
kubectl apply -f https://raw.githubusercontent.com/aquasecurity/kube-bench/main/job-eks.yaml

# View results (the upstream job manifest sets no namespace, so the job runs in the current one)
kubectl logs job/kube-bench

Key Controls:

Control plane security
Worker node configuration
Pod security (Pod Security Standards; PodSecurityPolicy was removed in Kubernetes 1.25)
Network policies
Authentication/authorization

NIST 800-190 (Container Security)

Five Areas:

Image security and integrity
Registry security
Orchestrator security
Container runtime security
Host OS security

Implementation: See detailed mapping in references/cluster-security.md#compliance

SOC2 / HIPAA / PCI-DSS

Common Requirements:

Encryption at rest and in transit
Audit logging and monitoring
Access controls (RBAC + IRSA)
Vulnerability scanning
Incident response procedures
Regular security assessments

Security Incident Response

Detection Sources

GuardDuty alerts
CloudWatch alarms
Falco runtime alerts
Audit log anomalies
Network traffic analysis
Container image scan results

Response Workflow

Identify: Determine scope and severity
Contain: Isolate affected workloads
Eradicate: Remove malicious components
Recover: Restore from known-good state
Review: Post-incident analysis

Common Scenarios

Compromised Pod:

# Immediate isolation
kubectl label pod <pod-name> security=isolated
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: isolate-compromised-pod
spec:
  podSelector:
    matchLabels:
      security: isolated
  policyTypes:
  - Ingress
  - Egress
EOF

# Collect forensics
kubectl logs <pod-name> > pod-logs.txt
kubectl exec <pod-name> -- ps aux > processes.txt

# Delete pod
kubectl delete pod <pod-name>

Details: references/workload-security.md#incident-response

Detailed Documentation

For comprehensive security configurations and advanced topics:

Cluster-Level Security: references/cluster-security.md
- Control plane hardening
- API endpoint configuration
- Audit logging setup
- IRSA detailed configuration
- RBAC best practices
- Compliance frameworks

Workload Security: references/workload-security.md
- Pod Security Standards implementation
- Security contexts configuration
- Network policies patterns
- Image scanning and verification
- Runtime security (Falco, GuardDuty)
- Admission controllers (OPA/Gatekeeper)
- Incident response procedures

Secrets Management: references/secrets-management.md
- External Secrets Operator setup
- AWS Secrets Manager integration
- CSI Secrets Store Driver
- Secret rotation strategies
- Encryption configuration
- Audit and monitoring

Security Assessment Tools

Scanning and Assessment

kube-bench: CIS Kubernetes benchmark
kube-hunter: Active vulnerability scanning
Polaris: Configuration validation
Trivy: Vulnerability and misconfiguration scanning
Checkov: IaC security scanning

Runtime Security

Falco: Runtime threat detection
GuardDuty for EKS: AWS-managed threat detection
Sysdig: Container security platform

Policy Enforcement

OPA/Gatekeeper: Policy as code
Kyverno: Kubernetes-native policy engine
Pod Security Admission: Built-in PSS enforcement
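
As an example of policy as code, the Kyverno policy below rejects pods whose images use the latest tag. It is a sketch assuming Kyverno is installed; the policy and rule names are hypothetical, and newer Kyverno releases may prefer setting the failure action per rule.

```yaml
# Sketch only: block mutable "latest" image tags at admission time.
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-latest-tag      # hypothetical name
spec:
  validationFailureAction: Enforce
  background: true
  rules:
    - name: require-pinned-tag
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "Images must use a tag other than 'latest'."
        pattern:
          spec:
            containers:
              - image: "!*:latest"
```

Starting with validationFailureAction: Audit and reviewing the policy reports before switching to Enforce is a lower-risk way to introduce such a rule.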

Common Security Anti-Patterns to Avoid

Anti-Pattern	Risk	Solution
Using default service accounts	Overly permissive	Create dedicated service accounts per app
Privileged containers	Host access, container escape	Use specific capabilities, PSS restricted
Hardcoded secrets in manifests	Credential exposure	Use External Secrets Operator
No network policies	Lateral movement	Implement default-deny policies
Running as root	Privilege escalation	Set runAsNonRoot: true
Public API endpoint without restrictions	Unauthorized access	Use private endpoint or IP allowlist
No image scanning	Vulnerability deployment	Enable Amazon Inspector
Shared node IAM roles	Excessive permissions	Use IRSA for pod-level permissions
No resource limits	Resource exhaustion	Set requests and limits
Missing audit logs	No forensic capability	Enable all control plane logs

Automated Security Hardening

Terraform Module (Recommended)

module "eks_security" {
  source = "./modules/eks-security"

  cluster_name = "production-cluster"

  # Control plane
  enable_private_endpoint = true
  enable_public_endpoint  = false
  enable_audit_logging    = true
  kms_key_arn            = aws_kms_key.eks.arn

  # Workload security
  pod_security_standard = "restricted"
  enable_network_policies = true

  # Secrets
  deploy_external_secrets = true
  secrets_manager_role_arn = aws_iam_role.secrets.arn

  # Monitoring
  enable_guardduty = true
  enable_inspector = true

  # Compliance
  cis_compliance_mode = true
}

See full examples: references/cluster-security.md#terraform

Security Roadmap

Week 1: Foundation

Review and harden control plane
Configure audit logging
Set up IRSA for critical workloads
Enable Pod Security Standards

Week 2: Network & Secrets

Deploy network policies
Implement External Secrets Operator
Configure Security Groups for Pods
Enable VPC Flow Logs

Week 3: Scanning & Runtime

Enable Amazon Inspector
Deploy GuardDuty for EKS
Configure Falco (optional)
Set up admission controllers

Week 4: Compliance & Operations

Quick Reference Commands

Security Audit

# Check Pod Security Standards (quoted so the shell keeps the escaped dots)
kubectl get namespaces -o custom-columns='NAME:.metadata.name,PSS:.metadata.labels.pod-security\.kubernetes\.io/enforce'

# List service accounts with IRSA
kubectl get sa -A -o jsonpath='{range .items[?(@.metadata.annotations.eks\.amazonaws\.com/role-arn)]}{.metadata.namespace}{"\t"}{.metadata.name}{"\t"}{.metadata.annotations.eks\.amazonaws\.com/role-arn}{"\n"}{end}'

# Check for privileged pods (requires jq; the kubectl JSONPath filter cannot compare across multiple containers)
kubectl get pods -A -o json | jq -r '.items[] | select(any(.spec.containers[]; .securityContext.privileged == true)) | "\(.metadata.namespace)\t\(.metadata.name)"'

# List pods that do not set runAsNonRoot: true at the pod level (container-level settings are not checked)
kubectl get pods -A -o json | jq -r '.items[] | select(.spec.securityContext.runAsNonRoot != true) | "\(.metadata.namespace)\t\(.metadata.name)"'

# Check network policies
kubectl get networkpolicies -A

# View audit logs
aws logs tail /aws/eks/production-cluster/cluster --follow --filter-pattern '{ $.verb != "get" && $.verb != "list" && $.verb != "watch" }'

Security Monitoring

# GuardDuty findings for EKS clusters
aws guardduty list-findings --detector-id <detector-id> --finding-criteria '{"Criterion":{"resource.resourceType":{"Equals":["EKSCluster"]}}}'

# Inspector scan results
aws inspector2 list-findings --filter-criteria '{"ecrImageRepositoryName":[{"comparison":"EQUALS","value":"my-repo"}]}'

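# CloudWatch Container Insights (time range, period, and statistic are required)
# Times are ISO 8601, for example 2025-11-01T00:00:00Z
aws cloudwatch get-metric-statistics \
  --namespace ContainerInsights \
  --metric-name pod_cpu_utilization \
  --dimensions Name=ClusterName,Value=production-cluster \
  --start-time <start-time> \
  --end-time <end-time> \
  --period 300 \
  --statistics Average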

Last Updated: November 2025
Kubernetes Version: 1.33
Security Standards: CIS Kubernetes Benchmark 1.8, NIST 800-190, AWS Well-Architected
Status: Production-ready