ECS Deployment Strategies

Complete guide to deploying ECS services safely and efficiently, from rolling updates to blue-green deployments.

Quick Reference

Strategy	Downtime	Rollback Speed	Complexity	Best For
Rolling Update	Zero	Medium	Low	Most workloads
Blue-Green	Zero	Instant	High	Critical services
Canary	Zero	Fast	High	Risk mitigation

Rolling Updates (Default)

Configuration

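resource "aws_ecs_service" "app" {
  # Other required arguments (name, cluster, task_definition, ...) omitted

  # In the Terraform AWS provider these are top-level arguments,
  # not a nested deployment_configuration block
  deployment_maximum_percent         = 200  # Allow up to 2x desired count during deployment
  deployment_minimum_healthy_percent = 100  # Keep 100% of desired count healthy

  deployment_circuit_breaker {
    enable   = true   # Auto-detect failures
    rollback = true   # Auto-rollback on failure
  }
}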

Behavior

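A new task definition revision is registered
New tasks are launched (the running count may rise to the maximum percent of the desired count)
New tasks pass their health checks
Old tasks are drained and stopped (the healthy count stays at or above the minimum healthy percent)
The cycle repeats until every task runs the new revision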
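
The limits that apply during the steps above can be worked out by hand. The sketch below assumes the rounding described in the ECS documentation (the minimum healthy count is rounded up, the maximum running count is rounded down); `rolling_update_bounds` is a hypothetical helper name, not part of boto3.

```python
def rolling_update_bounds(desired_count: int,
                          minimum_healthy_percent: int = 100,
                          maximum_percent: int = 200) -> tuple[int, int]:
    """Return (min_healthy_tasks, max_running_tasks) for a rolling update."""
    if desired_count < 0:
        raise ValueError("desired_count must be non-negative")
    # Lower bound is rounded up: ceil(desired * pct / 100) with integer math
    min_healthy = -(-desired_count * minimum_healthy_percent // 100)
    # Upper bound is rounded down: floor(desired * pct / 100)
    max_running = desired_count * maximum_percent // 100
    return min_healthy, max_running


print(rolling_update_bounds(4))           # (4, 8)
print(rolling_update_bounds(3, 50, 150))  # (2, 4)
```

With 100/200, ECS can start a full second set of tasks before stopping any old ones, which is fast but needs spare capacity. With 50/150 and three tasks, at least two stay healthy and at most four run at once.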

Boto3 Deployment

import boto3

ecs = boto3.client('ecs')

def deploy_rolling_update(cluster: str, service: str,
                          new_image: str, container_name: str):
    """Deploy new image via rolling update"""

    # 1. Get current task definition
    svc = ecs.describe_services(cluster=cluster, services=[service])
    current_task_def = svc['services'][0]['taskDefinition']

    # 2. Create new task definition revision
    task_def = ecs.describe_task_definition(taskDefinition=current_task_def)
    new_task_def = task_def['taskDefinition'].copy()

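    # Remove response-only fields that register_task_definition rejects
    for field in ['taskDefinitionArn', 'revision', 'status',
                  'requiresAttributes', 'compatibilities',
                  'registeredAt', 'registeredBy', 'deregisteredAt']:
        new_task_def.pop(field, None)

    # Update image; fail loudly if no container matches, otherwise an
    # identical revision would be registered and deployed
    updated = False
    for container in new_task_def['containerDefinitions']:
        if container['name'] == container_name:
            container['image'] = new_image
            updated = True
    if not updated:
        raise ValueError(f"Container {container_name!r} not found in {current_task_def}")

    response = ecs.register_task_definition(**new_task_def)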
    new_task_def_arn = response['taskDefinition']['taskDefinitionArn']

    # 3. Update service
    ecs.update_service(
        cluster=cluster,
        service=service,
        taskDefinition=new_task_def_arn,
        forceNewDeployment=True
    )

    print(f"Deploying {new_task_def_arn}")
    return new_task_def_arn

# Usage
deploy_rolling_update(
    cluster='production',
    service='api',
    new_image='123456789012.dkr.ecr.us-east-1.amazonaws.com/api:v2.0',  # placeholder 12-digit account ID
    container_name='api'
)

Monitor Deployment

def wait_for_deployment(cluster: str, service: str, timeout: int = 600):
    """Wait for deployment to complete"""
    import time

    start = time.time()
    while time.time() - start < timeout:
        response = ecs.describe_services(cluster=cluster, services=[service])
        svc = response['services'][0]

        for deployment in svc['deployments']:
            # rolloutState is only returned for rolling-update (ECS) deployments
            rollout_state = deployment.get('rolloutState', 'UNKNOWN')
            # Deployment IDs share the "ecs-svc/" prefix, so print the full ID
            print(f"Deployment {deployment['id']}: "
                  f"{rollout_state} "
                  f"({deployment['runningCount']}/{deployment['desiredCount']})")

            if deployment['status'] == 'PRIMARY':
                if rollout_state == 'COMPLETED':
                    print("Deployment successful!")
                    return True
                elif rollout_state == 'FAILED':
                    print(f"Deployment failed: {deployment.get('rolloutStateReason')}")
                    return False

        time.sleep(15)

    print("Deployment timed out")
    return False
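
As an alternative to a hand-written polling loop, boto3 provides a `services_stable` waiter for ECS. The following is a minimal sketch that takes the client as an argument; the `wait_until_stable` name and its defaults are illustrative. The waiter only reports that the service reached a steady state, not which task definition is running, so after a circuit-breaker rollback it can succeed on the old revision.

```python
def wait_until_stable(ecs_client, cluster: str, service: str,
                      delay: int = 15, max_attempts: int = 40) -> None:
    """Block until the service is stable.

    The waiter raises botocore.exceptions.WaiterError if the service has
    not stabilized after max_attempts polls.
    """
    waiter = ecs_client.get_waiter("services_stable")
    waiter.wait(
        cluster=cluster,
        services=[service],
        # Poll every `delay` seconds, up to `max_attempts` times
        WaiterConfig={"Delay": delay, "MaxAttempts": max_attempts},
    )


# Usage (requires AWS credentials):
#   import boto3
#   wait_until_stable(boto3.client("ecs"), "production", "api")
```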

Blue-Green Deployments

Architecture

                    ┌─────────────┐
                    │    ALB      │
                    └──────┬──────┘
                           │
           ┌───────────────┴───────────────┐
           │                               │
    ┌──────▼──────┐                 ┌──────▼──────┐
    │ Target Group│                 │ Target Group│
    │    (Blue)   │                 │   (Green)   │
    └──────┬──────┘                 └──────┬──────┘
           │                               │
    ┌──────▼──────┐                 ┌──────▼──────┐
    │ ECS Service │                 │ ECS Service │
    │   (Blue)    │                 │   (Green)   │
    └─────────────┘                 └─────────────┘

Terraform with CodeDeploy

# Two target groups
resource "aws_lb_target_group" "blue" {
  name        = "app-blue"
  port        = 8080
  protocol    = "HTTP"
  vpc_id      = module.vpc.vpc_id
  target_type = "ip"

  health_check {
    path = "/health"
  }
}

resource "aws_lb_target_group" "green" {
  name        = "app-green"
  port        = 8080
  protocol    = "HTTP"
  vpc_id      = module.vpc.vpc_id
  target_type = "ip"

  health_check {
    path = "/health"
  }
}

# ALB with two listeners
resource "aws_lb_listener" "prod" {
  load_balancer_arn = aws_lb.app.arn
  port              = 443
  protocol          = "HTTPS"
  certificate_arn   = var.certificate_arn  # Assumed variable; HTTPS listeners require a certificate

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.blue.arn
  }

  lifecycle {
    ignore_changes = [default_action]  # Managed by CodeDeploy
  }
}

resource "aws_lb_listener" "test" {
  load_balancer_arn = aws_lb.app.arn
  port              = 8443
  protocol          = "HTTPS"
  certificate_arn   = var.certificate_arn  # Assumed variable; HTTPS listeners require a certificate

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.green.arn
  }

  lifecycle {
    ignore_changes = [default_action]
  }
}

# ECS Service with CodeDeploy
resource "aws_ecs_service" "app" {
  name            = "app"
  cluster         = module.ecs.cluster_id
  task_definition = aws_ecs_task_definition.app.arn
  desired_count   = 3

  deployment_controller {
    type = "CODE_DEPLOY"
  }

  load_balancer {
    target_group_arn = aws_lb_target_group.blue.arn
    container_name   = "app"
    container_port   = 8080
  }

  lifecycle {
    ignore_changes = [task_definition, load_balancer]
  }
}

# CodeDeploy Application
resource "aws_codedeploy_app" "app" {
  compute_platform = "ECS"
  name             = "app-deploy"
}

# CodeDeploy Deployment Group
resource "aws_codedeploy_deployment_group" "app" {
  app_name               = aws_codedeploy_app.app.name
  deployment_group_name  = "app-dg"
  deployment_config_name = "CodeDeployDefault.ECSAllAtOnce"
  service_role_arn       = aws_iam_role.codedeploy.arn

  auto_rollback_configuration {
    enabled = true
    events  = ["DEPLOYMENT_FAILURE", "DEPLOYMENT_STOP_ON_REQUEST"]
  }

  blue_green_deployment_config {
    deployment_ready_option {
      action_on_timeout = "CONTINUE_DEPLOYMENT"
    }

    terminate_blue_instances_on_deployment_success {
      action                           = "TERMINATE"
      termination_wait_time_in_minutes = 5
    }
  }

  deployment_style {
    deployment_option = "WITH_TRAFFIC_CONTROL"
    deployment_type   = "BLUE_GREEN"
  }

  ecs_service {
    cluster_name = module.ecs.cluster_name
    service_name = aws_ecs_service.app.name
  }

  load_balancer_info {
    target_group_pair_info {
      prod_traffic_route {
        listener_arns = [aws_lb_listener.prod.arn]
      }

      test_traffic_route {
        listener_arns = [aws_lb_listener.test.arn]
      }

      target_group {
        name = aws_lb_target_group.blue.name
      }

      target_group {
        name = aws_lb_target_group.green.name
      }
    }
  }
}

Trigger Blue-Green Deployment

import boto3
import json

codedeploy = boto3.client('codedeploy')

def deploy_blue_green(app_name: str, deployment_group: str,
                      task_definition_arn: str, container_name: str,
                      container_port: int):
    """Trigger blue-green deployment via CodeDeploy"""

    app_spec = {
        "version": "0.0",
        "Resources": [{
            "TargetService": {
                "Type": "AWS::ECS::Service",
                "Properties": {
                    "TaskDefinition": task_definition_arn,
                    "LoadBalancerInfo": {
                        "ContainerName": container_name,
                        "ContainerPort": container_port
                    }
                }
            }
        }]
    }

    response = codedeploy.create_deployment(
        applicationName=app_name,
        deploymentGroupName=deployment_group,
        revision={
            'revisionType': 'AppSpecContent',
            'appSpecContent': {
                'content': json.dumps(app_spec)
            }
        }
    )

    deployment_id = response['deploymentId']
    print(f"Started deployment: {deployment_id}")
    return deployment_id

# Usage
deploy_blue_green(
    app_name='app-deploy',
    deployment_group='app-dg',
    task_definition_arn='arn:aws:ecs:us-east-1:123456789012:task-definition/app:5',  # placeholder account ID
    container_name='app',
    container_port=8080
)
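
Once a deployment ID is returned, its progress can be followed with the CodeDeploy `get_deployment` call. The sketch below polls until a terminal status; the `wait_for_codedeploy` name, the timeout, and the poll interval are assumptions chosen for illustration, and the client is passed in so the function can be exercised without AWS access.

```python
import time

# Statuses after which a CodeDeploy deployment no longer changes
TERMINAL_STATES = {"Succeeded", "Failed", "Stopped"}


def wait_for_codedeploy(codedeploy_client, deployment_id: str,
                        timeout: int = 1800, poll_seconds: int = 15) -> str:
    """Poll a CodeDeploy deployment and return its terminal status."""
    deadline = time.monotonic() + timeout
    while True:
        info = codedeploy_client.get_deployment(
            deploymentId=deployment_id)["deploymentInfo"]
        status = info["status"]
        print(f"Deployment {deployment_id}: {status}")
        if status in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"{deployment_id} still {status} after {timeout}s")
        time.sleep(poll_seconds)


# Usage (requires AWS credentials):
#   import boto3
#   status = wait_for_codedeploy(boto3.client("codedeploy"), deployment_id)
```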

Canary Releases

ALB Weighted Routing

resource "aws_lb_listener_rule" "canary" {
  listener_arn = aws_lb_listener.prod.arn
  priority     = 100

  action {
    type = "forward"
    forward {
      target_group {
        arn    = aws_lb_target_group.stable.arn
        weight = 90
      }
      target_group {
        arn    = aws_lb_target_group.canary.arn
        weight = 10
      }
    }
  }

  condition {
    path_pattern {
      values = ["/*"]
    }
  }
}
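
ALB weights are relative values (0 to 999), not percentages: each target group receives its weight divided by the sum of all weights. The sketch below makes that arithmetic explicit; `traffic_shares` is a hypothetical helper used only for illustration.

```python
def traffic_shares(weights: dict[str, int]) -> dict[str, float]:
    """Convert ALB target group weights into fractions of traffic."""
    for name, weight in weights.items():
        if not 0 <= weight <= 999:  # ALB accepts weights from 0 to 999
            raise ValueError(f"weight for {name!r} must be between 0 and 999")
    total = sum(weights.values())
    if total == 0:
        # Shares cannot be computed when every weight is zero
        raise ValueError("at least one weight must be non-zero")
    return {name: weight / total for name, weight in weights.items()}


print(traffic_shares({"stable": 90, "canary": 10}))
print(traffic_shares({"stable": 9, "canary": 1}))  # same split as 90/10
```

Keeping the two weights summing to 100, as the rule above does, lets them be read directly as percentages.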

Gradual Traffic Shift

import boto3

def shift_traffic(listener_rule_arn: str, stable_tg_arn: str,
                  canary_tg_arn: str, canary_weight: int):
    """Shift traffic percentage to canary"""
    if not 0 <= canary_weight <= 100:
        raise ValueError("canary_weight must be between 0 and 100")

    elb = boto3.client('elbv2')

    stable_weight = 100 - canary_weight

    elb.modify_rule(
        RuleArn=listener_rule_arn,
        Actions=[{
            'Type': 'forward',
            'ForwardConfig': {
                'TargetGroups': [
                    {
                        'TargetGroupArn': stable_tg_arn,
                        'Weight': stable_weight
                    },
                    {
                        'TargetGroupArn': canary_tg_arn,
                        'Weight': canary_weight
                    }
                ]
            }
        }]
    )

    print(f"Traffic: {stable_weight}% stable, {canary_weight}% canary")

# Progressive rollout (rule_arn, stable_arn, and canary_arn are placeholders
# for the listener rule and target group ARNs)
shift_traffic(rule_arn, stable_arn, canary_arn, 10)   # 10% to canary
# Monitor metrics...
shift_traffic(rule_arn, stable_arn, canary_arn, 25)   # 25% to canary
# Monitor metrics...
shift_traffic(rule_arn, stable_arn, canary_arn, 50)   # 50% to canary
# Monitor metrics...
shift_traffic(rule_arn, stable_arn, canary_arn, 100)  # 100% to canary (promote)
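
The "Monitor metrics..." steps can be automated with a health gate between shifts. The following is a sketch under stated assumptions: `shift` is any callable that accepts the canary weight (for example a wrapper around `shift_traffic`), and `is_healthy` is a hypothetical check you supply, such as reading a CloudWatch alarm state. The step sizes and bake time are illustrative defaults.

```python
import time
from typing import Callable, Sequence


def progressive_rollout(shift: Callable[[int], None],
                        is_healthy: Callable[[], bool],
                        steps: Sequence[int] = (10, 25, 50, 100),
                        bake_seconds: float = 300) -> bool:
    """Raise the canary weight step by step; revert to 0 on a failed check."""
    for weight in steps:
        shift(weight)
        time.sleep(bake_seconds)  # let metrics accumulate at this weight
        if not is_healthy():
            shift(0)  # send all traffic back to the stable target group
            return False
    return True
```

Returning a boolean keeps the decision about what to do after a failed rollout (alerting, stopping the canary service) with the caller.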

Deployment Circuit Breaker

How It Works

ECS monitors deployment health
Detects repeated task failures
Automatically stops deployment
Optional: Rolls back to previous version

Configuration

resource "aws_ecs_service" "app" {
  deployment_circuit_breaker {
    enable   = true
    rollback = true  # Auto-rollback on failure
  }
}

Failure Detection

Circuit breaker triggers when:

Tasks fail to reach RUNNING state
Health checks fail repeatedly
Tasks crash shortly after starting
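
How many failed tasks it takes to trip the breaker depends on the service size. The sketch below assumes the formula given in the AWS documentation at the time of writing: half the desired count, rounded up, clamped between 3 and 200. Treat the constants as assumptions and check the current ECS documentation before relying on them; `circuit_breaker_threshold` is a hypothetical helper name.

```python
import math


def circuit_breaker_threshold(desired_count: int) -> int:
    """Failed-task count at which the circuit breaker trips (assumed formula)."""
    # 0.5 * desired count, rounded up, then clamped to the range [3, 200]
    return max(3, min(200, math.ceil(0.5 * desired_count)))


for count in (1, 25, 400):
    print(count, circuit_breaker_threshold(count))
```

Small services therefore tolerate at least three failed task launches before the deployment is marked as failed.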

GitOps Workflow

GitHub Actions Example

name: Deploy to ECS

on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1

      - name: Login to Amazon ECR
        id: login-ecr
        uses: aws-actions/amazon-ecr-login@v2

      - name: Build and push image
        env:
          ECR_REGISTRY: ${{ steps.login-ecr.outputs.registry }}
          IMAGE_TAG: ${{ github.sha }}
        run: |
          docker build -t $ECR_REGISTRY/myapp:$IMAGE_TAG .
          docker push $ECR_REGISTRY/myapp:$IMAGE_TAG

      - name: Update task definition
        id: task-def
        uses: aws-actions/amazon-ecs-render-task-definition@v1
        with:
          task-definition: task-definition.json
          container-name: myapp
          image: ${{ steps.login-ecr.outputs.registry }}/myapp:${{ github.sha }}

      - name: Deploy to ECS
        uses: aws-actions/amazon-ecs-deploy-task-definition@v2
        with:
          task-definition: ${{ steps.task-def.outputs.task-definition }}
          service: myapp-service
          cluster: production
          wait-for-service-stability: true

Rollback Strategies

Manual Rollback

def rollback_to_previous(cluster: str, service: str):
    """Rollback to previous task definition"""

    # Get current task definition
    svc = ecs.describe_services(cluster=cluster, services=[service])
    current_td = svc['services'][0]['taskDefinition']

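    # Parse family and revision
    # arn:aws:ecs:region:account:task-definition/family:revision
    family, revision = current_td.split('/')[-1].rsplit(':', 1)
    current_revision = int(revision)
    if current_revision <= 1:
        raise ValueError(f"{family} has no earlier revision to roll back to")

    # Go back to the previous revision number. This assumes that revision
    # still exists and is ACTIVE; a deregistered revision cannot be deployed.
    previous_td = f"{family}:{current_revision - 1}"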

    # Update service
    ecs.update_service(
        cluster=cluster,
        service=service,
        taskDefinition=previous_td
    )

    print(f"Rolling back to {previous_td}")

# Usage
rollback_to_previous('production', 'api')
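
Subtracting one from the revision number fails when that revision has been deregistered. A more defensive sketch looks up the newest ACTIVE revision that is older than the current one, using the `list_task_definitions` paginator. The `previous_active_revision` name is hypothetical, and the client is passed in as an argument.

```python
def previous_active_revision(ecs_client, current_arn: str):
    """Return the newest ACTIVE revision ARN older than current_arn, or None."""
    family, revision = current_arn.split("/")[-1].rsplit(":", 1)
    current_revision = int(revision)

    paginator = ecs_client.get_paginator("list_task_definitions")
    best = None  # (revision number, ARN) of the best candidate so far
    for page in paginator.paginate(familyPrefix=family, status="ACTIVE"):
        for arn in page["taskDefinitionArns"]:
            name, rev = arn.split("/")[-1].rsplit(":", 1)
            # Defensive: only consider revisions of exactly this family
            if name != family or int(rev) >= current_revision:
                continue
            if best is None or int(rev) > best[0]:
                best = (int(rev), arn)
    return best[1] if best else None


# Usage (requires AWS credentials):
#   import boto3
#   ecs = boto3.client("ecs")
#   target = previous_active_revision(ecs, current_task_definition_arn)
```

The returned ARN can be passed to `update_service` as `taskDefinition`; a `None` result means there is nothing to roll back to.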

Automatic Rollback (Circuit Breaker)

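Enabled via deployment_circuit_breaker.rollback = true. When the circuit breaker trips, ECS rolls the service back to the last deployment that completed successfully; no manual action is needed.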

Best Practices

Always enable circuit breaker with rollback for production
Use blue-green for critical services requiring instant rollback
Implement health checks at container, task, and ALB levels
Pin image digests instead of tags for reproducibility
Use immutable image tags in ECR
Monitor deployments with CloudWatch alarms
Test rollback procedures regularly
Keep previous task definitions for quick rollback
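
Pinning by digest can be done at deploy time by resolving the tag once and registering the task definition with the digest form of the image URI. A minimal sketch using the ECR `describe_images` call; `pin_image_digest` and its `registry` parameter are hypothetical names, and the client is passed in as an argument.

```python
def pin_image_digest(ecr_client, registry: str, repository: str, tag: str) -> str:
    """Return 'registry/repository@sha256:...' for the image currently at tag."""
    response = ecr_client.describe_images(
        repositoryName=repository,
        imageIds=[{"imageTag": tag}],
    )
    # describe_images raises an error if the tag does not exist,
    # so imageDetails has exactly one entry here
    digest = response["imageDetails"][0]["imageDigest"]
    return f"{registry}/{repository}@{digest}"


# Usage (requires AWS credentials):
#   import boto3
#   image = pin_image_digest(boto3.client("ecr"),
#                            "123456789012.dkr.ecr.us-east-1.amazonaws.com",
#                            "api", "v2.0")
```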

Progressive Disclosure

Quick Start (This File)

Rolling updates
Blue-green basics
Canary releases
Circuit breaker

Detailed References

Blue-Green Setup: Complete CodeDeploy configuration
CI/CD Pipelines: GitHub Actions, CodePipeline
Monitoring: CloudWatch, alarms

Related Skills

boto3-ecs: SDK patterns
terraform-ecs: Infrastructure as Code
ecs-troubleshooting: Debugging deployments
