Wednesday, November 19, 2025

CI/CD - Canary Deployment

Below is a clear and practical explanation of how traffic is gradually shifted (“canaried”) in three different environments:


One-line summary of each approach:
  1. EC2-based environment: two ASGs (old and new) registered in separate weighted target groups of an ALB
  2. ECS (EC2/Fargate)-based environment: two services (old and new) registered in separate weighted target groups of an ALB
  3. Lambda-based environment: weighted alias (two Lambda versions registered under one alias with different weights)

1. EC2-Based Environment (Classic servers / Auto Scaling Groups)

How Canary Traffic Shifting Works

In EC2-based deployments, canary rollout is typically done at the Load Balancer + Auto Scaling Group (ASG) level.

Step-by-Step Traffic Shift

  1. Two ASGs are created

    • ASG1 → Stable version (v1)
    • ASG2 → Canary version (v2)
  2. Each ASG is registered with its own ALB target group.

  3. The ALB listener forwards to both target groups by weight (ALB weighted target groups feature)
    Example:

    • v1 TG weight = 90
    • v2 TG weight = 10
  4. Gradual shift is done by modifying weights

    • 90/10 → initial canary
    • 70/30 → monitor logs/errors
    • 30/70
    • 0/100 → success
  5. If issues appear:

    • instantly push traffic back to v1 by changing weights to 100/0.
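
The weight changes in steps 3 to 5 can be sketched in Python as follows. This is a minimal sketch, not a definitive implementation: it assumes `elbv2` is a boto3 client created with `boto3.client("elbv2")`, that the split is applied to the listener's default action, and that weights are expressed as percentages. The function names and ARNs are hypothetical placeholders.

```python
def weighted_forward_action(stable_tg_arn, canary_tg_arn, canary_weight):
    """Build an ALB 'forward' action that splits traffic between two target groups."""
    if not 0 <= canary_weight <= 100:
        raise ValueError("canary_weight must be between 0 and 100")
    return {
        "Type": "forward",
        "ForwardConfig": {
            "TargetGroups": [
                # v1 (stable) receives whatever the canary does not
                {"TargetGroupArn": stable_tg_arn, "Weight": 100 - canary_weight},
                # v2 (canary)
                {"TargetGroupArn": canary_tg_arn, "Weight": canary_weight},
            ]
        },
    }


def shift_traffic(elbv2, listener_arn, stable_tg_arn, canary_tg_arn, canary_weight):
    """Apply the split to the listener's default action.

    Assumption: `elbv2` behaves like boto3.client("elbv2").
    """
    action = weighted_forward_action(stable_tg_arn, canary_tg_arn, canary_weight)
    return elbv2.modify_listener(ListenerArn=listener_arn, DefaultActions=[action])
```

Calling `shift_traffic(elbv2, listener_arn, v1_tg_arn, v2_tg_arn, 10)` would give the initial 90/10 split, and the same call with `0` as the last argument is the 100/0 rollback.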

Tools commonly used

  • AWS ALB Weighted Target Groups
  • AWS CodeDeploy (in-place and blue/green deployments for EC2/On-Prem; its predefined canary and linear configurations apply to Lambda and ECS)
  • ASG + Launch Templates


2. AWS Lambda-Based Environment

Canary deployments are built into Lambda: the alias itself splits traffic between versions, so no load balancer or extra infrastructure is needed.

How Traffic Shifting Works

Lambda uses Aliases and Versions.

Example:

  • Version 1 → Stable
  • Version 2 → New release
  • Alias “LIVE” points to version 1 initially



Traffic Shift Steps

  1. Deploy the new code and publish it as version 2.

  2. Set alias traffic weights:

    LIVE alias: v1 = 90%, v2 = 10%
  3. Let the system run and monitor logs (CloudWatch).

  4. Shift more traffic:

    • v1=70% / v2=30%
    • v1=30% / v2=70%
  5. Final cutover:

    • v1=0% / v2=100%

Rollback

Just update the alias back to:

  • v1=100%, v2=0%
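
The alias updates above can be sketched in Python. This is a hedged sketch, assuming the arguments are passed to boto3's Lambda `update_alias` call, where the alias's `FunctionVersion` is the primary version and `RoutingConfig.AdditionalVersionWeights` assigns a fraction of traffic to a second version. The helper name `alias_update_args` and the function name `my-fn` are hypothetical.

```python
def alias_update_args(function_name, alias, stable_version, canary_version, canary_percent):
    """Build keyword arguments for a Lambda update_alias call for a given split."""
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    if canary_percent == 100:
        # Final cutover: the alias points only at the new version.
        primary, extra = canary_version, {}
    elif canary_percent == 0:
        # Rollback (or initial state): the alias points only at the stable version.
        primary, extra = stable_version, {}
    else:
        # Canary: the stable version stays primary, the new one gets a fraction.
        primary, extra = stable_version, {canary_version: canary_percent / 100}
    return {
        "FunctionName": function_name,
        "Name": alias,
        "FunctionVersion": primary,
        "RoutingConfig": {"AdditionalVersionWeights": extra},
    }
```

With a boto3 client, the 90/10 step would look like `boto3.client("lambda").update_alias(**alias_update_args("my-fn", "LIVE", "1", "2", 10))`.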

Tools

  • AWS Lambda Aliases + Versions
  • AWS SAM / CDK for deployment
  • AWS CodeDeploy (Lambda Canary support)


3. Container-Based Environment (ECS / EKS / Kubernetes)

Traffic shifting is done at the load balancer (ECS) or at the ingress / service mesh layer (EKS / Kubernetes).


3A. ECS (EC2 or Fargate)

Traffic Shift Method

  • Use two ECS services:

    • service-v1 (old)
    • service-v2 (new)
  • Attach services to two ALB Target Groups.

  • ALB Weighted Target Groups control traffic:

    • 90/10 → 70/30 → 50/50 → 0/100
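
The 90/10 → 70/30 → 50/50 → 0/100 progression can be driven by a small loop that rolls back on the first failed health check. The sketch below is illustrative only: `set_canary_weight` and `is_healthy` are hypothetical callbacks you would supply (for example, one that updates the ALB weights and one that checks an error-rate alarm), and the soak time is an assumed parameter.

```python
import time


def run_canary(set_canary_weight, is_healthy, steps=(10, 30, 50, 100), soak_seconds=0):
    """Step through canary weights; revert to 0 on the first failed health check.

    Returns the canary weight in effect when the rollout ends.
    """
    for weight in steps:
        set_canary_weight(weight)   # e.g. 10 means a 90/10 split
        time.sleep(soak_seconds)    # let metrics accumulate before judging
        if not is_healthy():
            set_canary_weight(0)    # rollback: all traffic back to service-v1
            return 0
    return steps[-1]
```

Keeping the weight setter and the health check as callbacks means the same loop can be reused regardless of which mechanism actually shifts the traffic.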

Tools

  • ALB Weighted Target Groups
  • ECS Deployment Controller (Blue/Green via CodeDeploy)


3B. EKS / Kubernetes (most common)

How Canary Traffic Shift Works

Canary is controlled by:

  • Ingress Controller (NGINX / ALB Ingress)
  • Service Mesh (Istio / Linkerd / Consul)

Traffic Shift Example (Istio VirtualService)

    spec:
      http:
      - route:
        - destination:
            host: myapp
            subset: v1
          weight: 90
        - destination:
            host: myapp
            subset: v2
          weight: 10

Shift pattern

  • 90/10 → Initial Canary
  • 70/30 → Monitoring metrics
  • 30/70
  • 0/100 → Rollout complete

Rollback

Instantly revert weights to:

  • v1=100, v2=0
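
For scripted rollouts, the VirtualService manifest can be generated per step rather than edited by hand. The sketch below is one possible approach, not the only one: it assumes the `networking.istio.io/v1beta1` API version, a resource named `myapp`, and that subsets `v1` and `v2` are already defined in a matching DestinationRule. The helper name is hypothetical.

```python
import json


def virtual_service(host, canary_weight, name="myapp"):
    """Return an Istio VirtualService manifest splitting traffic between v1 and v2."""
    if not 0 <= canary_weight <= 100:
        raise ValueError("canary_weight must be between 0 and 100")
    return {
        "apiVersion": "networking.istio.io/v1beta1",  # assumed API version
        "kind": "VirtualService",
        "metadata": {"name": name},
        "spec": {
            "hosts": [host],
            "http": [
                {
                    "route": [
                        {"destination": {"host": host, "subset": "v1"},
                         "weight": 100 - canary_weight},
                        {"destination": {"host": host, "subset": "v2"},
                         "weight": canary_weight},
                    ]
                }
            ],
        },
    }


def to_manifest(host, canary_weight):
    """Serialize the manifest as JSON, which kubectl apply also accepts."""
    return json.dumps(virtual_service(host, canary_weight), indent=2)
```

Writing the output of `to_manifest("myapp", 10)` to a file and applying it gives the 90/10 step; `to_manifest("myapp", 0)` produces the rollback manifest.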

Tools

  • Istio / Linkerd / Consul Mesh
  • Argo Rollouts (purpose-built for progressive delivery)
  • Kubernetes Ingress with canary annotations


🎯 Summary Table

Environment      | Mechanism        | Traffic Shift Method                | Tools
EC2              | ASGs + ALB       | Weighted target groups (10% → 100%) | ALB, CodeDeploy
Lambda           | Versions + Alias | Alias weight routing                | CodeDeploy, SAM, CDK
Containers (ECS) | Services + ALB   | Weighted target groups              | ALB, ECS B/G deploy
EKS / K8s        | Ingress / Mesh   | Weighted routing rules              | Istio, Argo Rollouts, NGINX Ingress
