Section 8 - High Availability & Scalability: ELB & ASG

Scalability & High Availability
- Scalability means that an application/system can handle greater loads by adapting
- High availability: run instances of the same application across multiple AZs
  - High availability usually goes hand in hand with horizontal scaling
  - Means running your application/system in at least 2 AZs
  - The goal is to survive a data center loss
  - Can be passive (e.g. RDS Multi-AZ) or active (e.g. horizontal scaling)
- 2 types of scalability:
  - Vertical: increasing the size of the instance
    - Common for non-distributed systems, such as a database
    - RDS and ElastiCache are services that can scale vertically
    - There's usually a hardware limit to how much you can vertically scale
    - Ex: r4.large ‒> r4.4xlarge
  - Horizontal: increasing the number of instances/systems for your application
    - Auto Scaling Group, Load Balancer
    - Implies distributed systems
    - Common for web applications/modern applications
    - Easy to scale horizontally thanks to EC2
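
To make the horizontal-scaling and multi-AZ points above concrete, here is a minimal sizing sketch. The function names, the requests-per-second figures, and the AZ labels are all hypothetical, and the "survive one AZ loss" rule is just one simple way to interpret the goal stated above, not an AWS formula.

```python
import math
from typing import Dict, List


def instances_needed(peak_rps: float, rps_per_instance: float, az_count: int = 2) -> int:
    """Smallest fleet that serves peak_rps with at least one instance per AZ."""
    if rps_per_instance <= 0 or az_count < 1:
        raise ValueError("rps_per_instance must be > 0 and az_count >= 1")
    by_load = math.ceil(peak_rps / rps_per_instance)
    return max(by_load, az_count)


def instances_to_survive_az_loss(peak_rps: float, rps_per_instance: float, az_count: int = 2) -> int:
    """Fleet size such that the remaining AZs still carry peak_rps if one AZ is lost."""
    if az_count < 2:
        raise ValueError("surviving an AZ loss requires at least 2 AZs")
    by_load = math.ceil(peak_rps / rps_per_instance)
    per_az = math.ceil(by_load / (az_count - 1))  # capacity each AZ must hold
    return per_az * az_count


def spread_across_azs(total: int, azs: List[str]) -> Dict[str, int]:
    """Distribute instances as evenly as possible across the given AZs."""
    base, extra = divmod(total, len(azs))
    return {az: base + (1 if i < extra else 0) for i, az in enumerate(azs)}


# Hypothetical workload: 2500 requests/s at peak, 400 requests/s per instance.
fleet = instances_needed(2500, 400)
print(fleet, spread_across_azs(fleet, ["az-a", "az-b"]))
print(instances_to_survive_az_loss(2500, 400, az_count=2))
```

Vertical scaling would instead keep the instance count fixed and raise the per-instance capacity (the `rps_per_instance` figure here) by moving to a larger instance type.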

Elastic Load Balancing
- Load Balancers ‒> servers that forward traffic to multiple servers (Ex: EC2 Instances) downstream
- Spread load across multiple downstream instances
- Expose a single point of access (DNS)
- Seamlessly handle failures of downstream instances
- Perform regular health checks on instances
  - Enable the load balancer to know whether the instances it forwards traffic to are available to reply to requests
  - Done on a port and a route (/health is common)
  - If the response is not 200 (OK), the instance is considered unhealthy
- Provide SSL termination (HTTPS) for websites
- Enforce stickiness with cookies
- High availability across zones
- Separate public traffic from private traffic
- Elastic Load Balancer ‒> managed load balancer
- AWS guarantees that it will be working
- AWS takes care of upgrades, maintenance, high availability
- AWS provides only a few configuration knobs
- It costs less to set up your own load balancer, but it takes more effort
- Integrated with many AWS offerings
  - EC2, EC2 Auto Scaling, ECS, …
- 4 types of managed load balancers:
  - Classic Load Balancer (v1) - CLB: HTTP, HTTPS, TCP, SSL
    - Supports TCP (Layer 4), HTTP & HTTPS (Layer 7)
    - Health checks are TCP or HTTP based
    - Fixed hostname: XXX.region.elb.amazonaws.com
  - Application Load Balancer (v2) - ALB: HTTP, HTTPS, WebSocket
    - Layer 7 (HTTP)
    - Load balancing to multiple HTTP applications across machines (target groups)
    - Load balancing to multiple applications on the same machine (ex: containers)
    - Support for HTTP/2 and WebSocket
    - Supports redirects (from HTTP to HTTPS, for example)
    - Routing tables to different target groups:
      - Routing based on path in URL
      - Routing based on hostname in URL
      - Routing based on Query String, Headers
    - ALBs are a great fit for microservices & container-based applications (example: Docker & Amazon ECS)
    - Has a port mapping feature to redirect to a dynamic port in ECS
    - In comparison, we'd need multiple Classic Load Balancers, one per application
    - Target Groups:
      - EC2 Instances (can be managed by an Auto Scaling Group) - HTTP
      - ECS tasks (managed by ECS itself) - HTTP
      - Lambda functions - HTTP request is translated into a JSON event
      - IP Addresses - must be private IPs
    - ALB can route to multiple target groups
    - Health checks are at the target group level
    - Good to Know:
      - Fixed hostname (XXX.region.elb.amazonaws.com)
      - The application servers don't see the IP of the client directly
      - The true IP of the client is inserted in the header X-Forwarded-For
      - We can also get the port (X-Forwarded-Port) and protocol (X-Forwarded-Proto)
  - Network Load Balancer (v2) - NLB: TCP, TLS (secure TCP), UDP
  - Gateway Load Balancer - GWLB: Operates at layer 3 (Network layer) - IP Protocol
  - It is recommended to use the newer generation load balancers
  - Some load balancers can be set up as internal (private) or external (public) ELBs
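
The ALB notes above (routing rules, target groups, health checks, and the X-Forwarded headers) can be illustrated with a toy, in-memory model. This is only a sketch for intuition and does not use or mirror the AWS API: the class names, the round-robin choice, the hostnames, and the IP addresses are hypothetical, and the listener is assumed to be HTTPS on port 443.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict, List, Optional, Tuple


@dataclass
class Target:
    name: str
    health_status: int = 200  # status this toy target returns on its health-check route


@dataclass
class TargetGroup:
    name: str
    targets: List[Target]
    _turn: count = field(default_factory=count, repr=False)

    def pick(self) -> Optional[Target]:
        """Round-robin over targets whose health check returned 200."""
        healthy = [t for t in self.targets if t.health_status == 200]
        if not healthy:
            return None  # toy behaviour only: nothing to forward to
        return healthy[next(self._turn) % len(healthy)]


@dataclass
class Rule:
    group: TargetGroup
    host: Optional[str] = None         # match on hostname, if set
    path_prefix: Optional[str] = None  # match on URL path prefix, if set

    def matches(self, host: str, path: str) -> bool:
        if self.host is not None and host != self.host:
            return False
        if self.path_prefix is not None and not path.startswith(self.path_prefix):
            return False
        return True


def route(rules: List[Rule], default_group: TargetGroup, host: str, path: str,
          client_ip: str) -> Tuple[Optional[str], Dict[str, str]]:
    """Return (chosen target name, headers the target would receive)."""
    # First matching rule wins; otherwise fall back to the default target group.
    group = next((r.group for r in rules if r.matches(host, path)), default_group)
    target = group.pick()
    if target is None:
        return None, {}
    headers = {
        "X-Forwarded-For": client_ip,  # the target sees the client IP only here
        "X-Forwarded-Port": "443",     # assumed listener port
        "X-Forwarded-Proto": "https",  # assumed listener protocol
    }
    return target.name, headers


api = TargetGroup("api", [Target("api-1"), Target("api-2", health_status=500)])
web = TargetGroup("web", [Target("web-1"), Target("web-2")])
rules = [Rule(group=api, path_prefix="/api")]

print(route(rules, web, "example.com", "/api/users", "203.0.113.7"))
print(route(rules, web, "example.com", "/index.html", "203.0.113.7"))
```

In this sketch the unhealthy `api-2` target never receives traffic, and requests that match no rule go to the `web` group. On the application side, reading the client address means looking at the `X-Forwarded-For` header rather than the connection's source IP.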