AWS Cert Notes

AWS Certification Notes - Identity

  • AWS IAM Identity Center supports only SAML 2.0–based applications

AWS DataSync

  • Use AWS DataSync to automate and accelerate online data transfers to AWS storage services.
  • AWS DataSync is an online data transfer service that simplifies, automates, and accelerates copying large amounts of data to and from AWS storage services over the internet or AWS Direct Connect.
  • AWS DataSync fully automates and accelerates moving large active datasets to AWS, up to 10 times faster than command-line tools.
  • Natively integrated with Amazon S3, Amazon EFS, Amazon FSx for Windows File Server, Amazon CloudWatch, and AWS CloudTrail, which provides seamless and secure access to your storage services, as well as detailed monitoring of the transfer.
  • DataSync uses a purpose-built network protocol and scale-out architecture to transfer data.
  • A single DataSync agent is capable of saturating a 10 Gbps network link.
  • It comes with retry and network resiliency mechanisms, network optimizations, built-in task scheduling, monitoring via the DataSync API and Console, and CloudWatch metrics, events, and logs that provide granular visibility into the transfer process.
  • DataSync performs data integrity verification both during the transfer and at the end of the transfer.
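The retry, scheduling, and integrity-verification behavior above is configured per task. A minimal sketch of the boto3 `datasync.create_task` parameters (the ARNs are placeholders, not real resources):

```python
# Sketch of a DataSync task definition (parameters for boto3's
# datasync.create_task). Location ARNs are placeholders.
task_params = {
    "SourceLocationArn": "arn:aws:datasync:us-east-1:123456789012:location/loc-src",       # placeholder
    "DestinationLocationArn": "arn:aws:datasync:us-east-1:123456789012:location/loc-dst",  # placeholder
    "Options": {
        "VerifyMode": "POINT_IN_TIME_CONSISTENT",  # data integrity verification after transfer
        "TransferMode": "CHANGED",                 # only copy data that has changed
    },
}
# With boto3: boto3.client("datasync").create_task(**task_params)
```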

Amazon Kinesis Data Firehose

  • It is the easiest way to reliably load streaming data into data lakes, data stores, and analytics tools.
  • It is a fully managed service that automatically scales to match the throughput of your data and requires no ongoing administration.
  • It can also batch, compress, transform, and encrypt the data before loading it, minimizing the amount of storage used at the destination and increasing security.
  • When a Kinesis data stream is configured as the source of a Firehose delivery stream, Firehose's PutRecord and PutRecordBatch operations are disabled, and the Kinesis Agent cannot write to the Firehose delivery stream directly.
  • Data needs to be added to the Kinesis data stream through the Kinesis Data Streams PutRecord and PutRecords operations instead.
  • Firehose does not support Neptune DB as a destination.
  • Firehose cannot write directly to DynamoDB.
  • Firehose's valid destinations are Amazon S3, Redshift, OpenSearch, Splunk, and any third-party or custom HTTP endpoint.
  • Firehose is for loading data into data stores; it is not for other services to consume data from. Kinesis Data Streams is the service for data consumers.
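When a data stream is the Firehose source, producers write to the data stream, not to Firehose. A sketch of building the Kinesis Data Streams PutRecord parameters (the stream name and payload are illustrative):

```python
import json

def build_put_record_request(stream_name: str, payload: dict, partition_key: str) -> dict:
    """Build parameters for a Kinesis Data Streams PutRecord call.

    When a data stream is the source of a Firehose delivery stream,
    producers must write here -- Firehose's own PutRecord/PutRecordBatch
    operations are disabled in that configuration.
    """
    return {
        "StreamName": stream_name,
        "Data": json.dumps(payload).encode("utf-8"),  # Kinesis expects bytes
        "PartitionKey": partition_key,                # determines the target shard
    }

params = build_put_record_request("clickstream", {"event": "page_view"}, "user-42")
# With boto3: boto3.client("kinesis").put_record(**params)
```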

Kinesis Client Library (KCL) and DynamoDB

  • Each KCL application must use its own DynamoDB table.
  • Reasons:
    • Scan operations are used to obtain leases from a DynamoDB table. Therefore, if a table contains leases of different KCL applications, each application could receive a lease that isn't related to the application itself.
    • Shard IDs in streams are used as primary keys in DynamoDB tables during checkpointing. When different KCL applications use the same DynamoDB table and the same shard IDs are used in the streams, inconsistencies in checkpoints can occur.
  • Users can only use DynamoDB as a checkpointing table for the KCL.
  • The KCL behavior and implementation are interconnected with DynamoDB in the following ways:
    • The KCL includes ShardSyncTask.java, which guarantees that shard leases in a stream are included in the DynamoDB table. This check is conducted periodically in the KCL.
    • The KCL includes DynamoDBLeaseTaker.java and DynamoDBLeaseRenewer.java, which are components that manage and update leases in the KCL. DynamoDBLeaseTaker.java and DynamoDBLeaseRenewer.java work with DynamoDBLeaseRefresher.java to make frequent API requests to DynamoDB.
    • When the KCL makes checkpoints, requests from DynamoDBCheckpointer.java and DynamoDBLeaseCoordinator.java are made to DynamoDB.
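A toy model (not the real KCL code) of why each application needs its own lease table — leases are keyed by shard ID, so two applications sharing one table overwrite each other's checkpoints:

```python
# Simplified stand-in for the DynamoDB lease table; key = shard ID only,
# which is exactly what makes sharing a table across KCL apps unsafe.
lease_table = {}

def checkpoint(app_name: str, shard_id: str, sequence_number: str) -> None:
    # In the real table the shard ID is the primary key ("leaseKey"),
    # so a second application writing the same shard ID clobbers the first.
    lease_table[shard_id] = {"app": app_name, "checkpoint": sequence_number}

checkpoint("app-A", "shardId-000000000000", "seq-100")
checkpoint("app-B", "shardId-000000000000", "seq-007")  # overwrites app-A's checkpoint

assert lease_table["shardId-000000000000"]["app"] == "app-B"
```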

VPC Flow Logs

  • VPC flow logs capture traffic metadata only; they cannot inspect the contents of network packets
  • Interface VPC endpoints use private IP addresses from your VPC to access AWS services such as S3. Gateway endpoints for S3 route via S3's public IPs and are not billed.
  • VPC flow logs can be published to CloudWatch Logs (or S3) for analysis, not to CloudTrail
  • VPC flow logs cannot determine whether packet loss is occurring
  • VPC sharing is part of AWS Resource Access Manager (RAM)
  • A network ACL cannot be associated with an ENI; network ACLs are associated with subnets
  • Private Direct Connect VIFs cannot be used to create a VPN connection, Public VIFs must be used for that.
  • Private Direct Connect VIFs cannot be used to directly connect to any public AWS IPs (S3 ..), Public VIFs must be used for that.
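Since flow logs record only metadata, each record is a fixed set of space-separated fields. A sketch of parsing one record in the default format (the sample line is illustrative):

```python
# Fields of a flow log record in the default format. Flow logs capture
# IP-traffic metadata only (addresses, ports, byte counts, ACCEPT/REJECT),
# never packet payloads.
FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
]

def parse_flow_log(line: str) -> dict:
    return dict(zip(FIELDS, line.split()))

record = parse_flow_log(
    "2 123456789010 eni-1235b8ca 172.31.16.139 172.31.16.21 "
    "20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK"
)
assert record["action"] == "ACCEPT"
```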

Accessing Amazon S3 Using a Private Direct Connect Virtual Interface (VIF)

  1. Establish a Direct Connect Connection:
    • You'll need an AWS Direct Connect connection between your on-premises network and AWS.
  2. Create a Private VIF:
    • Within your Direct Connect connection, create a private VIF. This VIF will be used to connect to your VPC, which is where your S3 bucket resides.
  3. Configure VPC Endpoints:
    • Create a VPC endpoint for S3 within your VPC. This endpoint will provide a private IP address that your on-premises network can use to access S3.
  4. Route Traffic:
    • Configure your on-premises routers to route traffic destined for the S3 endpoint (via the private IP) through the Direct Connect connection and its private VIF.
  5. Configure On-Premises DNS:
    • Ensure your on-premises DNS resolvers are configured to point the S3 domain name to the private IP address of the S3 endpoint.
  • Ways to establish access to Amazon S3 from on-premises network via Direct Connect:
    • Use a public IP address over Direct Connect
    • Use a private IP address over Direct Connect (with an interface VPC endpoint)
    • Create a private virtual interface for your connection and then create an interface VPC endpoint for S3 in your VPC that is associated with the virtual private gateway.
    • The VGW must connect to a Direct Connect private virtual interface.
    • This interface VPC endpoint resolves to a private IP address even if you enable a VPC endpoint for S3.
    • When you access Amazon S3, you need to use the same DNS name provided under the details of the VPC endpoint.
  • AWS PrivateLink is a highly available, scalable technology that enables you to privately connect your VPC to supported AWS services, services hosted by other AWS accounts (VPC endpoint services), and supported AWS Marketplace partner services.
  • You can create your own application in your VPC and configure it as an AWS PrivateLink-powered service (referred to as an endpoint service).
  • Other AWS principals can create a connection from their VPC to your endpoint service using an interface VPC endpoint or a Gateway Load Balancer endpoint, depending on the type of service.
  • You are the service provider, and the AWS principals that create connections to your service are service consumers.
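Step 3 above (the interface VPC endpoint for S3) can be sketched as the parameters for boto3's `ec2.create_vpc_endpoint`; all IDs below are placeholders:

```python
# Sketch of creating an interface VPC endpoint for S3. IDs are
# placeholders; PrivateDnsEnabled is left off here to match the note
# above about using the endpoint-specific DNS name.
endpoint_params = {
    "VpcEndpointType": "Interface",
    "VpcId": "vpc-0abc1234",                     # placeholder
    "ServiceName": "com.amazonaws.us-east-1.s3",
    "SubnetIds": ["subnet-0abc1234"],            # placeholder
    "SecurityGroupIds": ["sg-0abc1234"],         # placeholder
    "PrivateDnsEnabled": False,
}
# With boto3: boto3.client("ec2").create_vpc_endpoint(**endpoint_params)
```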

Security Groups vs Network ACLs

  • In the typical scenario, the inbound traffic is accepted but the response traffic is rejected.
  • Security groups are stateful — this means that responses to allowed traffic are also allowed, even if the rules in your security group do not permit it.
  • Network ACLs are stateless, therefore responses to allowed traffic are subject to network ACL rules.
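A toy model of the stateful/stateless difference: a security group tracks allowed connections so return traffic passes automatically, while a network ACL evaluates every packet, so the response needs its own (ephemeral-port) allow rule:

```python
def security_group_allows(inbound_allowed: bool, is_response: bool,
                          tracked: set, flow: tuple) -> bool:
    if is_response:
        return flow in tracked  # stateful: response rides the tracked connection
    if inbound_allowed:
        tracked.add(flow)       # remember the connection for the return path
    return inbound_allowed

def nacl_allows(rules: set, port: int) -> bool:
    return port in rules        # stateless: every packet checked against the rules

tracked = set()
assert security_group_allows(True, False, tracked, ("10.0.0.5", 443))  # request allowed in
assert security_group_allows(True, True, tracked, ("10.0.0.5", 443))   # response out, no rule needed

outbound_nacl_rules = {443}                          # missing the ephemeral port range
assert not nacl_allows(outbound_nacl_rules, 50123)   # response on ephemeral port rejected
```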

AWS Organizations and Service Control Policies (SCPs)

  • SCPs cannot grant permissions; they are guard rails to control maximum permissions available to users.
  • SCPs affect root users and IAM users and roles in member accounts, NOT service-linked roles or users and roles from outside the organization.
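An example SCP, written here as a Python dict, illustrating the guardrail idea: it denies leaving the organization but grants nothing — identity policies in each account must still allow actions explicitly:

```python
# Example SCP guardrail: a Deny-only policy. SCPs never grant
# permissions; they cap what identity policies can allow.
scp = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Deny",
            "Action": "organizations:LeaveOrganization",
            "Resource": "*",
        }
    ],
}

assert all(s["Effect"] in ("Allow", "Deny") for s in scp["Statement"])
```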

AWS Organizations and Trusted Access

  • You can use trusted access to enable an AWS service that you specify, called the trusted service, to perform tasks in your organization and its accounts on your behalf.
  • This involves granting permissions to the trusted service but does not otherwise affect the permissions for IAM users or roles.
  • When you enable access, the trusted service can create an IAM role called a service-linked role in every account in your organization.
  • That role has a permissions policy that allows the trusted service to do the tasks that are described in that service's documentation.
  • This enables you to specify settings and configuration details that you would like the trusted service to maintain in your organization's accounts on your behalf.

AWS Storage

  • EFS cannot be used as an origin for CloudFront
  • S3 Intelligent-Tiering, S3 Standard-IA, and S3 One Zone-IA have a minimum 30-day storage charge and are not cost-effective if data is moved out before the 30 days
  • Snowmobile is recommended for 10 PB+; use Snowball Edge Storage Optimized for smaller amounts
  • gp2 provides 3 IOPS per GB (3,000 per TB), up to a maximum of 16,000 IOPS per volume
  • S3 Glacier Select does not work on compressed objects; S3 Select does
  • S3 object-level logging and event notifications do not let you see the permissions assigned to an object on a PUT call
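The gp2 baseline above (3 IOPS per GB with a 100 IOPS floor and a 16,000 IOPS cap, assuming those published limits) is easy to express as a formula:

```python
def gp2_baseline_iops(size_gib: int) -> int:
    """Baseline IOPS for a gp2 volume: 3 IOPS/GiB, floored at 100,
    capped at 16,000 per volume."""
    return min(max(100, 3 * size_gib), 16000)

assert gp2_baseline_iops(10) == 100       # small volumes get the 100 IOPS floor
assert gp2_baseline_iops(1000) == 3000    # ~3,000 IOPS per TiB
assert gp2_baseline_iops(16384) == 16000  # cap is reached well before max volume size
```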

AWS Storage Gateway - Tape Gateway

  • AWS Storage Gateway offers file-based, volume-based, and tape-based storage solutions.
  • With a tape gateway, you can cost-effectively and durably archive backup data in GLACIER or DEEP_ARCHIVE.
  • A tape gateway provides a virtual tape infrastructure that scales seamlessly with your business needs and eliminates the operational burden of provisioning, scaling, and maintaining a physical tape infrastructure.
  • You can run AWS Storage Gateway either on-premises as a VM appliance, as a hardware appliance, or in AWS as an Amazon Elastic Compute Cloud (Amazon EC2) instance.
  • You deploy your gateway on an EC2 instance to provision iSCSI storage volumes in AWS.
  • You can use gateways hosted on EC2 instances for disaster recovery, data mirroring, and providing storage for applications hosted on Amazon EC2.

SQS

  • FIFO queues support 300 API calls per second; with batching of up to 10 messages per call, that is 3,000 messages per second
  • Delay queues postpone the delivery of all messages
  • Message timers allow you to set an initial invisibility period for a message added to a queue with a max of 15 minutes
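The per-message timer is just the `DelaySeconds` parameter on `send_message`; a sketch with a placeholder queue URL, enforcing the 900-second (15-minute) ceiling:

```python
MAX_DELAY_SECONDS = 900  # 15-minute ceiling for message timers and delay queues

def build_send_message(queue_url: str, body: str, delay_seconds: int) -> dict:
    if not 0 <= delay_seconds <= MAX_DELAY_SECONDS:
        raise ValueError("DelaySeconds must be between 0 and 900")
    return {"QueueUrl": queue_url, "MessageBody": body, "DelaySeconds": delay_seconds}

params = build_send_message(
    "https://sqs.us-east-1.amazonaws.com/123456789012/example-queue",  # placeholder
    "process-later",
    600,
)
# With boto3: boto3.client("sqs").send_message(**params)
```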

Operational Tools

  • Systems Manager is for EC2 instances, on-premises devices (VMs and physical), and edge devices. It is not for any other AWS services
  • Systems Manager – AWS-ApplyPatchBaseline is for Windows systems only. Use AWS-RunPatchBaseline for Linux and Windows systems
  • AWS Config allows you to track and review changes in configuration, like what did resource X look like at a point in time
  • OpsWorks is for Chef and Puppet to automate deployments

AWS Config Managed Rules

  • AWS Config provides AWS managed rules, which are predefined, customizable rules that AWS Config uses to evaluate whether your AWS resources comply with common best practices.
  • For example, you could use a managed rule to quickly start assessing whether your Amazon Elastic Block Store (Amazon EBS) volumes are encrypted or whether specific tags are applied to your resources.
  • You can set up and activate these rules without writing the code to create an AWS Lambda function, which is required if you want to create custom rules.
  • The AWS Config console guides you through the process of configuring and activating a managed rule.
  • You can also use the AWS Command Line Interface or AWS Config API to pass the JSON code that defines your configuration of a managed rule.

AWS Security

  • WAF ACL logs cannot go directly to CloudTrail
  • The aws:PrincipalOrgID global condition key can be used with the Principal element in a resource-based policy with AWS KMS.
  • Instead of listing all the AWS account IDs in an Organization, you can specify the Organization ID in the Condition element.
  • Create an AWS KMS key policy to allow all accounts in an AWS Organization to perform AWS KMS actions using the AWS global condition context key aws:PrincipalOrgID.
  • It is a best practice to grant the least privilege permissions with AWS Identity and Access Management (IAM) policies. Specify your AWS Organization ID in the condition element of the statement to make sure that only the principals from the accounts in your Organization can access the AWS KMS key.
  • aws:PrincipalOrgID - Use this key to compare the identifier of the organization in AWS Organizations to which the requesting principal belongs with the identifier specified in the policy.
  • This global key provides an alternative to listing all the account IDs for all AWS accounts in an organization.
  • You can use this condition key to simplify specifying the Principal element in a resource-based policy.
  • You can specify the organization ID in the condition element. When you add and remove accounts, policies that include the aws:PrincipalOrgID key automatically include the correct accounts and don't require manual updating.
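A KMS key policy statement (written as a dict) using `aws:PrincipalOrgID` as described above — the organization ID and actions are placeholders for illustration:

```python
# Example KMS key policy statement: any principal from the organization
# (current and future accounts) is allowed, without listing account IDs.
statement = {
    "Sid": "AllowOrgAccounts",
    "Effect": "Allow",
    "Principal": "*",
    "Action": ["kms:Decrypt", "kms:GenerateDataKey"],  # least-privilege subset (illustrative)
    "Resource": "*",
    "Condition": {
        "StringEquals": {"aws:PrincipalOrgID": "o-exampleorgid"}  # placeholder org ID
    },
}

assert "aws:PrincipalOrgID" in statement["Condition"]["StringEquals"]
```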

AWS Application Discovery Service (ADS)

  • Install the AWS Application Discovery Service on each of the VMs to collect the configuration and utilization data
  • AWS Application Discovery Service helps enterprise customers plan migration projects by gathering information about their on-premises data centers.
  • ADS will discover on-premises or other hosted infrastructure. This includes details such as server hostnames, IP and MAC addresses, resource allocation, and utilization details of key resources.
  • Planning data center migrations can involve thousands of workloads that are often deeply interdependent. Server utilization data and dependency mapping are important early first steps in the migration process.
  • AWS Application Discovery Service collects and presents configuration, usage, and behavior data from your servers to help you better understand your workloads.
  • ADS will identify server dependencies by recording inbound and outbound network activity for each server.
  • ADS will provide details on server performance. It captures performance information about applications and processes by measuring metrics such as host CPU, memory, and disk utilization.
  • It also will allow you to search in Amazon Athena with predefined queries.
  • The collected data is retained in encrypted format in an AWS Application Discovery Service data store.
  • In addition, this data is also available in AWS Migration Hub, where you can migrate the discovered servers and track their progress as they get migrated to AWS.

AWS Server Migration Service (SMS)

  • AWS SMS combines data collection tools with automated server replication to speed the migration of on-premises servers to AWS.
  • AWS SMS is an agentless service, which makes it easier and faster for you to migrate thousands of on-premises workloads to AWS.
  • As of March 31, 2022, AWS has discontinued AWS Server Migration Service (AWS SMS).
  • Going forward, AWS recommends AWS Application Migration Service (AWS MGN) as the primary migration service for lift-and- shift migrations.

AWS Database

  • Redshift cannot use auto scaling
  • Redshift audit logging supports only Amazon S3-managed keys (SSE-S3) encryption (AES-256)
  • RDS read replica failover is not automatic
  • RDS read replicas can be promoted to a standalone instance
  • Aurora read replicas can be promoted to the primary instance
  • RDS read replicas use asynchronous replication
  • RDS standby instances use synchronous replication
  • All Aurora instances use synchronous replication
  • RDS primary and standby instances are upgraded at the same time
  • All Aurora instances are updated at the same time
  • Neptune DB is a graph database
  • DMS replication instances must be in the same account and Region as the Redshift cluster. The Redshift cluster must have a security group allowing inbound connections from the DMS replication instance
  • DynamoDB charges you for reading, writing, and storing data in your DynamoDB tables. Amazon Simple Queue Service (SQS) is a fully managed message queuing service that enables you to decouple and scale microservices, distributed systems, and serverless applications. You can use SQS to create a buffer to handle peak load and process the task asynchronously later, allowing you to reduce the provisioned write capacity on DynamoDB and thereby letting you save costs.
  • Set up a new DynamoDB table each day and drop the table for the previous day after its data is written on S3 - Deleting an entire table is significantly more efficient than removing items one by one, which essentially doubles the throughput requirements as you need to query/scan and then delete each item. This is the fastest and simplest method for the given use case since all the items for the previous day can be deleted from the table for that day, without the need to scan and delete each item. You can configure a process to automatically create a new table daily for handling that day's data.

DynamoDB Time to Live (TTL)

  • Amazon DynamoDB Time to Live (TTL) allows you to define a per-item timestamp to determine when an item is no longer needed.
  • Shortly after the date and time of the specified timestamp, DynamoDB deletes the item from your table without consuming any write throughput.
  • TTL is provided at no extra cost as a means to reduce stored data volumes by retaining only the items that remain current for your workload’s needs.
  • Using Amazon Managed Streaming for Apache Kafka (MSK) to ingest and store the records, with a custom 120-day retention period on the Kafka topic and the streamed data also sent to an Amazon S3 bucket for added durability, is incorrect because it may not be the most efficient or cost-effective solution. While Amazon MSK can ingest and store large volumes of data in real time, it is not optimized for low-latency data retrieval, which is a key requirement in the scenario. Moreover, the additional step of sending the streamed data to S3 for added durability might be unnecessary and could incur additional costs.
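DynamoDB expects the TTL attribute to hold a Unix epoch timestamp in seconds. A sketch of building such an item (table and attribute names are placeholders):

```python
import time

def build_item(session_id, ttl_days, now=None):
    """Build a DynamoDB item with a TTL attribute (epoch seconds).
    'expires_at' is a placeholder attribute name -- it must match the
    attribute configured as the table's TTL attribute."""
    now = int(time.time()) if now is None else now
    return {
        "session_id": {"S": session_id},
        "expires_at": {"N": str(now + ttl_days * 86400)},  # epoch seconds
    }

item = build_item("sess-123", ttl_days=120, now=1_700_000_000)
# With boto3: boto3.client("dynamodb").put_item(TableName="sessions", Item=item)
```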

DynamoDB Auto Scaling

  • Auto Scaling for DynamoDB helps automate capacity management for your tables and global secondary indexes.
  • You simply specify the desired target utilization and provide upper and lower bounds for read and write capacity.
  • DynamoDB will then monitor throughput consumption using Amazon CloudWatch alarms and then will adjust provisioned capacity up or down as needed.
  • Amazon DynamoDB auto scaling uses the AWS Application Auto Scaling service to dynamically adjust provisioned throughput capacity on your behalf, in response to actual traffic patterns.
  • This enables a table or a global secondary index to increase its provisioned read and write capacity to handle sudden increases in traffic, without throttling.
  • When the workload decreases, Application Auto Scaling decreases the throughput so that you don't pay for unused provisioned capacity.
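The mechanism above maps to two Application Auto Scaling calls: register the table capacity as a scalable target, then attach a target-tracking policy. A parameter sketch (table name, bounds, and target value are placeholders):

```python
# Scalable target: the table's read capacity, with illustrative bounds.
scalable_target = {
    "ServiceNamespace": "dynamodb",
    "ResourceId": "table/example-table",                      # placeholder table name
    "ScalableDimension": "dynamodb:table:ReadCapacityUnits",
    "MinCapacity": 5,
    "MaxCapacity": 500,
}

# Target-tracking policy: hold read utilization near 70%.
scaling_policy = {
    "PolicyName": "read-target-tracking",
    "ServiceNamespace": "dynamodb",
    "ResourceId": scalable_target["ResourceId"],
    "ScalableDimension": scalable_target["ScalableDimension"],
    "PolicyType": "TargetTrackingScaling",
    "TargetTrackingScalingPolicyConfiguration": {
        "TargetValue": 70.0,  # target utilization (%)
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "DynamoDBReadCapacityUtilization"
        },
    },
}
# With boto3: client = boto3.client("application-autoscaling")
#             client.register_scalable_target(**scalable_target)
#             client.put_scaling_policy(**scaling_policy)
```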

EC2

  • When an EC2 instance is unhealthy, Auto Scaling terminates the instance and then launches a replacement in a later scaling activity
  • Global Accelerator is for ALBs, NLBs, EC2 instances, or Elastic IPs only
  • NLB does not support least outstanding requests routing; it uses flow hash routing
  • ALBs cannot have static IPs; for that, you would need to put an NLB with a static IP in front of the ALB
  • An Application Load Balancer does not support a TCP health check. ALB only supports HTTP and HTTPS target health checks.
  • Network ACLs affect all the subnets associated with it. If there is a misconfigured rule, the other subnets will be affected too
  • The NAT gateway should be placed in a public subnet because it needs a Public IP address and a direct route to the Internet Gateway (IGW). If it is placed on a private subnet, it will have the same routing limitation as those resources in the private subnet.

AWS Logging

  • CloudTrail is for review/monitoring of management activities across your AWS accounts; trails can deliver events to CloudWatch Logs and be encrypted with a KMS key.
  • Data events can also be logged but are not enabled by default, e.g., S3 object-level API calls and Lambda execution activity

CloudFront

  • Use Signed cookies to restrict access to multiple files
  • Use Signed URLs to restrict access to individual files
  • You cannot use signed cookies or signed URLs if your query strings already contain: Expires, Policy, Signature, or Key-Pair-Id
  • You can configure CloudFront to add custom headers to the requests that it sends to your origin.
  • These custom headers enable you to send and gather information from your origin that you don’t get with typical viewer requests.
  • These headers can even be customized for each origin.
  • CloudFront supports custom headers for both custom and Amazon S3 origins.
  • Caching API requests should be done on API Gateway, not on CloudFront. DynamoDB Accelerator (DAX) is used for caching requests when you need response times in microseconds, but it is very expensive.

Route 53 Health Checks

  • Amazon Route 53 health checks monitor the health and performance of your web applications, web servers, and other resources.
  • Each health check that you create can monitor one of the following:
    • The health of a specified resource, such as a web server.
    • The status of other health checks.
    • The status of an Amazon CloudWatch alarm.
  • Additionally, with Amazon Route 53 Application Recovery Controller, you can set up routing control health checks with DNS failover records to manage traffic failover for your application.
  • For the given use case, you need to create a CloudWatch metric that checks the status of the EC2 StatusCheckFailed metric, add an alarm to the metric, and then create a health check that is based on the data stream for the alarm.
  • To improve resiliency and availability, Route 53 doesn't wait for the CloudWatch alarm to go into the ALARM state. The status of a health check changes from healthy to unhealthy based on the data stream and the criteria in the CloudWatch alarm.
  • For EC2 instances, always use a Type A record without an alias.
  • For ELB, CloudFront, and S3, always use a Type A record with an alias; for RDS, always use a CNAME record with no alias.
  • From a shared services VPC (RAM), peered VPCs must be associated with the private hosted zone in the shared VPC to enable the peered VPCs to resolve hosts in the shared VPC.

Scenario: Migrating Web System to AWS

  • A company is migrating an interactive car registration web system from on-premises to AWS.
  • Current architecture: Single NGINX web server and MySQL database on a Fedora server.
  • New cloud architecture: Load balancer for traffic distribution, Route 53 for domain registration and management.
  • Question: Most efficient way to transfer the web application to AWS?
  • Multi-AZ deployment configuration uses multiple Availability Zones.
  • Incorrect options involve launching the MySQL Amazon RDS instance in one Availability Zone only instead of a Multi-AZ deployment.
Solution:
  1. Launch two NGINX EC2 instances in two Availability Zones.
  2. Copy web files from the on-premises web server to each Amazon EC2 web server, using Amazon S3 as the repository.
  3. Migrate the database using the AWS Database Migration Service.
  4. Create an ELB to front your web servers.
  5. Use Route 53 and create an alias A record pointing to the ELB.
  • Alias record: Route 53 extension to DNS, similar to CNAME record but can be created for the root domain (example.com) and subdomains (www.example.com).
  • Take note as well that the AWS Application Migration Service (MGN) is primarily used to migrate virtual machines only, which can be from VMware vSphere and Windows Hyper-V to your AWS cloud.
  • In addition, the AWS Application Discovery Service simply helps you to plan migration projects by gathering information about your on-premises data centers, but this service is not a suitable migration service.
  • Web system being migrated is a non-static website, which cannot be hosted in S3
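Step 5 of the solution above (an alias A record at the zone apex pointing to the ELB) can be sketched as a Route 53 change batch; the hosted zone ID and DNS name are placeholders, and in practice the ELB's hosted zone ID comes from the ELB API:

```python
# Sketch of the Route 53 change_resource_record_sets change batch for an
# alias A record at the root domain. IDs and names are placeholders.
change_batch = {
    "Changes": [
        {
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": "example.com",  # alias records work even at the zone apex
                "Type": "A",
                "AliasTarget": {
                    "HostedZoneId": "Z35SXDOTRQ7X7K",  # placeholder: the ELB's hosted zone ID
                    "DNSName": "my-elb-1234567890.us-east-1.elb.amazonaws.com",  # placeholder
                    "EvaluateTargetHealth": True,
                },
            },
        }
    ]
}
# With boto3: boto3.client("route53").change_resource_record_sets(
#     HostedZoneId="Z-your-zone", ChangeBatch=change_batch)
```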

Scenario: Federated Access with LDAP

  • Company uses LDAP for employee authentication and authorization, planning a mobile app with federated access to AWS.
  • Requirements: Custom-built solution for user authentication, IAM roles for granting permissions.
  • Options to implement:
    • Build a custom SAML-compatible solution to handle authentication and authorization.
      • Configure the solution to use LDAP for user authentication and use SAML assertion to perform authorization to the IAM identity provider.
    • Build a custom OpenID Connect-compatible solution for the user authentication functionality.
      • Use Amazon Cognito Identity Pools for authorizing access to AWS resources.
  • A SAML assertion is needed to get authorization tokens from the IAM identity provider, granting IAM roles to users. A custom OpenID Connect-compatible solution will allow users to log in from their mobile application, much like single sign-on. An Amazon Cognito identity pool will provide temporary tokens to federated users for accessing AWS resources.

Scenario: Encrypting Direct Connect Traffic

  • Firm hosts its legacy application on Amazon EC2 in a private subnet of its Amazon VPC.
  • The application is accessed by the employees from their corporate laptops through a proprietary desktop program.
  • The company network is peered with the AWS Direct Connect (DX) connection to provide a fast and reliable connection to the private EC2 instances inside the VPC.
  • Requirement is to encrypt its network traffic that flows from the employees' laptops to the resources inside the VPC while maintaining the consistent network performance of Direct Connect
  • Solution: Using the current Direct Connect connection, create a new public virtual interface, and input the network prefixes that you want to advertise. Create a new site-to-site VPN connection to the VPC with the BGP protocol using the DX connection. Configure the company network to route employee traffic to this VPN.
    • Private virtual interface: Connect VPC resources on their private IP address or endpoint, connect to multiple VPCs in any AWS Region
    • Public virtual interface: Connect to all AWS public IP addresses globally, create public virtual interfaces in any DX location to receive Amazon’s global IP routes, access publicly routable Amazon services in any AWS Region (except for the AWS China Region).

Scenario: Multi-Account DNS Management

  • Company wants to implement a multi-account strategy across several research facilities with 50 teams needing their own AWS accounts.
  • Requirement: Simplify DNS management, allowing private DNS to be shared among VPCs in different AWS accounts.
  • Solution: The correct answer is: On AWS Resource Access Manager (RAM), set up a shared services VPC on your central account. Set up VPC peering from this VPC to each VPC on the other accounts. On Amazon Route 53, create a private hosted zone associated with the shared services VPC. Manage all domains and subdomains on this zone. Programmatically associate the VPCs from other accounts with this hosted zone.
  • You can simplify network topologies by interconnecting shared Amazon VPCs using connectivity features, such as AWS PrivateLink, AWS Transit Gateway, and Amazon VPC peering.
  • VPC sharing allows customers to share subnets with other AWS accounts within the same AWS Organization.

Scenario: Mitigating DDoS Attacks

  • Question: Which options help mitigate Distributed Denial of Service (DDoS) attacks for cloud infrastructure hosted in AWS
  • The following options are the correct answers in this scenario as they can help mitigate the effects of DDoS attacks:
  1. Use an Amazon CloudFront distribution for both static and dynamic content of your web applications. Add CloudWatch alerts to automatically look and notify the Operations team for high CPUUtilization and NetworkIn metrics, as well as to trigger Auto Scaling of your EC2 instances.

    • Amazon CloudFront is a content delivery network (CDN) service that can be used to deliver your entire website, including static, dynamic, streaming, and interactive content.
    • Persistent TCP connections and variable time-to-live (TTL) can be used to accelerate delivery of content, even if it cannot be cached at an edge location. This allows you to use Amazon CloudFront to protect your web application, even if you are not serving static content.
    • Amazon CloudFront only accepts well-formed connections to prevent many common DDoS attacks like SYN floods and UDP reflection attacks from reaching your origin.
  2. Use an Application Load Balancer (ALB) to reduce the risk of overloading your application by distributing traffic across many backend instances. Integrate AWS WAF and the ALB to protect your web applications from common web exploits that could affect application availability.

    • With Elastic Load Balancing (ELB), you can reduce the risk of overloading your application by distributing traffic across many backend instances. ELB can scale automatically, allowing you to manage larger volumes of unanticipated traffic, like flash crowds or DDoS attacks
    • In the case of web applications, you can use ELB to distribute traffic to many Amazon EC2 instances that are overprovisioned or configured to auto scale for the purpose of serving surges of traffic, whether it is the result of a flash crowd or an application layer DDoS attack.
    • Larger DDoS attacks can exceed the size of a single Amazon EC2 instance. To mitigate these attacks, you will want to consider options for load balancing excess traffic.
  • Amazon CloudWatch alarms are used to initiate Auto Scaling, which automatically scales the size of your Amazon EC2 fleet in response to events that you define.

Scenario: Optimizing Serverless Application Performance

  • Company has a serverless forex trading application built using AWS SAM, hosted on AWS Serverless Application Repository.
  • User complaints: Slow logins, occasional HTTP 504 errors.
  • You are tasked to optimize the system and to significantly reduce the time to log in to improve the customers' satisfaction
  • Solution: Implement these options in order to improve the performance of the application with minimal cost
  1. Configure an origin failover by creating an origin group with two origins with one as the primary origin and the other as the second origin which CloudFront automatically switches to when the primary origin fails. This will alleviate the occasional HTTP 504 errors that users are experiencing.

  2. Use Lambda@Edge to allow your Lambda functions to customize the content that CloudFront delivers and to execute the authentication process in AWS locations closer to the users

Scenario: Scalable HPC for Image Processing

  • Company processes petabytes of images monthly using an on-premises HPC cluster.
  • Data center nearing capacity, jobs spread throughout the month.
  • Task: Design a scalable solution that can exceed current capacity with least overhead, maintaining durability, while being cost-effective.
  • Solution: Utilize AWS Batch with Managed Compute Environments to create a fleet using Spot Instances. Store the raw data on an Amazon S3 bucket. Create jobs on AWS Batch Job Queues that will pull objects from the Amazon S3 bucket and temporarily store them to the EC2 EBS volumes for processing. Send the processed images back to another Amazon S3 bucket.
  • AWS Batch: Enables developers, scientists, and engineers to run hundreds of thousands of batch computing jobs on AWS easily and efficiently.

Scenario: Improving Web Application Availability

  • Company hosts a web application on an Auto Scaling group of Amazon EC2 instances deployed across multiple Availability Zones.
  • A recent outage caused a major loss to the company's revenue. Upon investigation, it was found that the web server metrics are within the normal range, but the database CPU usage is very high, causing the EC2 health checks to time out and resulting in the downtime
  • Question: Which of the following options should the Solution Architect implement to prevent this from happening again and allow the application to handle more traffic in the future?
  • Solution:
    Change the target group health check to a simple HTML page instead of a page that queries the database. Create an Amazon Route 53 health check for a dummy-item web page that exercises the database, to ensure that the application works as expected. Set up an Amazon CloudWatch alarm to notify the Admins when the health check fails.
    Since this is a retail web application, most of the queries will be read-intensive as customers search for products. Adding ElastiCache to cache frequent requests improves overall application response time and reduces database queries, which helps the application handle more traffic in the future.
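The Route 53 database-canary check can be sketched as the `HealthCheckConfig` passed to `route53.create_health_check`. The domain and path are hypothetical; the page at that path would perform a trivial dummy-item database query.

```python
# Minimal sketch of a Route 53 HealthCheckConfig, assuming a hypothetical
# /db-health.html page that runs a trivial database query.

def db_canary_health_check(domain, path="/db-health.html"):
    """HealthCheckConfig for route53.create_health_check(): probe a page
    that exercises the database without hammering it."""
    return {
        "Type": "HTTPS",
        "FullyQualifiedDomainName": domain,
        "ResourcePath": path,
        "RequestInterval": 30,   # seconds between probes
        "FailureThreshold": 3,   # consecutive failures before "unhealthy"
    }

config = db_canary_health_check("shop.example.com")
```

A CloudWatch alarm on the health check's `HealthCheckStatus` metric would then notify the Admins via SNS, while the ELB target group keeps its simple static-page check.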

Scenario: Troubleshooting Third-Party API Calls

  • Users report that when the website's "find a nearby store" function is clicked, the map loads only about 50% of the time when the page is refreshed. This function makes an outbound call to a third-party RESTful maps API, and Amazon EC2 NAT instances are used for these outbound API calls.
  • Error likely caused by a failed NAT instance in one of the public subnets. Use NAT Gateways instead of EC2 NAT instances to ensure availability and scalability. This is very likely as we have two subnets in the scenario and NAT instances reside in only one AZ.
  • NAT instances are managed by the customer so if the instance goes down, there could be a potential impact on the availability of your application.
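The fix can be sketched as the two boto3 calls that replace the NAT instance: create a NAT gateway in the public subnet, then point the private subnet's default route at it. The resource IDs below are hypothetical placeholders.

```python
# Sketch of the route-table change behind a NAT gateway migration;
# all IDs are hypothetical assumptions, not real resources.

def nat_route(route_table_id, nat_gateway_id):
    """Arguments for ec2.create_route(): send all internet-bound traffic
    from the private subnet (e.g. the maps API calls) to the NAT gateway."""
    return {
        "RouteTableId": route_table_id,
        "DestinationCidrBlock": "0.0.0.0/0",  # all outbound traffic
        "NatGatewayId": nat_gateway_id,
    }

route = nat_route("rtb-0a1b2c3d", "nat-0e4f5a6b")
```

For full availability, one NAT gateway per Availability Zone is the usual pattern, so an AZ outage does not take down outbound connectivity for the surviving AZ.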

Scenario: Automating Catalog System with Facial Recognition

  • Company wants to implement an automated catalog system for over 50-TB digital videos and media files which are stored on their on-premises tape library and are used by their Media Asset Management (MAM) system.
  • This would enable them to search their files using facial recognition.
  • They also want to migrate media files to AWS including the MAM video contents.
  • Solution: Integrate your local data center's file system with AWS Storage Gateway by setting up a file gateway appliance on-premises. Use the MAM solution to extract the media files from the current data store and send them to the file gateway. Build a collection in Amazon Rekognition to populate a catalog of faces from the processed media files. Use an AWS Lambda function that invokes the Amazon Rekognition JavaScript SDK to fetch each media file from the S3 bucket backing the file gateway, retrieve the needed metadata, and persist the information into the MAM solution.
  • You can use the facial information that's stored in a collection to search for known faces in images, stored videos, and streaming videos
  • Amazon Rekognition supports the IndexFaces operation. You can use this operation to detect faces in an image and persist information about facial features that are detected in a collection
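The `IndexFaces` step can be sketched as the request parameters for `rekognition.index_faces` against an S3-hosted frame. The collection, bucket, and key names are hypothetical.

```python
# Hedged sketch of the rekognition.index_faces() request that populates
# the face collection; collection/bucket/key names are assumptions.

def index_faces_params(collection_id, bucket, key):
    """Arguments for rekognition.index_faces(): detect faces in an
    S3-hosted image and persist their feature vectors in the collection."""
    return {
        "CollectionId": collection_id,
        "Image": {"S3Object": {"Bucket": bucket, "Name": key}},
        "ExternalImageId": key.replace("/", ":"),  # ties faces back to the file
        "DetectionAttributes": ["DEFAULT"],
    }

params = index_faces_params("media-faces", "mam-media-bucket", "films/clip01.png")
```

Once the collection is populated, `SearchFacesByImage` (or `SearchFaces`) can match a known face against the catalog, and the Lambda function writes the match metadata back into the MAM system.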

Scenario: Modernizing a Scanned Newspaper Image Catalog

  • Company wants to move the application to AWS to produce a scalable, durable, and highly available architecture while keeping costs low.
  • Question: Which of the following is the single best option to achieve this requirement?
  • Answer: Create a new S3 bucket to store and serve the scanned image files through a CloudFront web distribution. Launch a new Elastic Beanstalk environment to host the website across multiple Availability Zones, and set up Amazon OpenSearch Service for query processing. Use Amazon Textract to detect and recognize text from the scanned old newspapers. This satisfies the requirements: S3 stores the images instead of the commercial product that will be decommissioned soon, Amazon OpenSearch Service handles query processing, and the Multi-AZ deployment provides high availability.
  • Amazon OpenSearch Service enables you to quickly add rich search capabilities to your website or application. You don't need to become a search expert or worry about hardware provisioning, setup, and maintenance
  • Amazon Textract is a machine learning (ML) service that automatically extracts text, handwriting, and data from scanned documents.
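The Textract step returns `Blocks` of type `PAGE`, `LINE`, and `WORD`; the lines are what the indexing pipeline would feed into OpenSearch. A minimal sketch of that extraction, using a hand-written sample response rather than a real Textract call:

```python
# Sketch of post-processing a Textract detect_document_text response;
# the sample response below is fabricated for illustration only.

def extract_lines(textract_response):
    """Pull the recognized text lines out of a detect_document_text response."""
    return [
        block["Text"]
        for block in textract_response.get("Blocks", [])
        if block["BlockType"] == "LINE"
    ]

sample = {
    "Blocks": [
        {"BlockType": "PAGE"},
        {"BlockType": "LINE", "Text": "DAILY HERALD, 4 MAY 1921"},
        {"BlockType": "WORD", "Text": "DAILY"},
    ]
}
lines = extract_lines(sample)  # only the LINE blocks survive
```

Each extracted line, along with the S3 key of its source image, would then be indexed as a document in OpenSearch so the website can search the archive.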

Scenario: Data Storage and Retrieval Solution for Receiving Application

  • Tech company builds a new smartwatch that collects usage statistics and information from a worldwide user base.
  • Question: Given the following requirements, which is the recommended and most cost-effective solution?
    - Each record is less than 4 KB in size.
    - Data must be stored durably.
    - Data must be stored for 120 days only, then it can be deleted.
    - Data must have low-latency retrieval time.
    - For running the application for a year, the estimated storage requirement is around 10–15 TB.
  • Answer: Configure the application to receive the records and set the storage to a DynamoDB table. Configure proper scaling on the DynamoDB table and enable the DynamoDB table Time to Live (TTL) setting to delete records after 120 days.
  • DynamoDB's Time to Live (TTL) feature deletes items after a defined timestamp. It is cost-effective because TTL deletions do not consume any write throughput.

Scenario: Fargate task issue

  • Question: Given the issue below, what is the fix?
    "CannotPullContainerError: API error (500): Get https://111122223333.dkr.ecr.us-east-1.amazonaws.com/v2/: net/http: request canceled while waiting for connection"
  • Answer: Update the AWS Fargate task definition and set the auto-assign public IP option to DISABLED. Launch a NAT gateway in the public subnet of the VPC and update the route table of the private subnet to route requests to the internet.
    The NAT gateway on the public subnet should have a public IP address and a route to the