What are the 6 advantages of AWS Cloud Computing?
1. Trade capital expense for variable expense: CAPEX -> OPEX. Instead of purchasing hardware upfront, you can pay for compute capacity by the hour.
2. Benefit from massive economies of scale: AWS has a large and diverse customer base, allowing it to achieve lower costs and pass those savings on to customers.
3. Stop guessing capacity: with AWS, you can quickly scale up or down based on demand, eliminating the need to over-provision resources.
4. Increase speed and agility: AWS allows you to quickly deploy and manage applications, reducing the time it takes to bring new products and services to market. Speed -> faster deployment; agility -> the ability to react to change.
5. Stop spending money running and maintaining data centers: AWS manages the infrastructure, allowing you to focus on your core business.
6. Go global in minutes: AWS has a global network of data centers, allowing you to deploy applications and services in multiple regions around the world with just a few clicks.
What are the different cloud deployment models?
Cloud deployment models determine how cloud services are made available to users. Each model comes with its own advantages and limitations depending on the use case.
🏠 1. Private Cloud
📌 Used exclusively by a single organization
Hosted either on-premises or by a third-party provider
Offers greater control, security, and customization
Commonly used by organizations that must meet regulatory or data sovereignty requirements
💡 Think: A custom-built house with tight security — all for one family (your company).
🌐 2. Public Cloud
📌 Services offered over the public internet by third-party providers like AWS, Azure, or Google Cloud.
Shared infrastructure across multiple customers (tenants)
Offers benefits such as:
Scalability
Elasticity
Cost-effectiveness
Reliability
Security
Global reach
💡 Think: Renting a modern apartment — everything’s ready and maintained for you, and you only pay for what you use.
🔄 3. Hybrid Cloud
📌 Combines private and public cloud infrastructure
Organizations use private cloud for sensitive workloads
Leverage public cloud for scalable and cost-efficient resources
Enables flexibility in managing data and applications
💡 Best of both worlds: Store valuables at home, but shop from the cloud when you need scale.
What are the different Cloud Service Models?
These models define what level of control and responsibility you have over cloud-based resources.
1⃣ Infrastructure as a Service (IaaS)
Offers basic infrastructure resources such as:
Virtual machines (compute)
Storage
Networking
Gives users maximum flexibility and control
You are responsible for managing:
Operating systems
Applications
Runtime environments
🛠 Examples: Amazon EC2, Azure Virtual Machines, Google Compute Engine
2⃣ Platform as a Service (PaaS)
Offers a platform and environment to build, test, and deploy applications
Abstracts away infrastructure management
Developers can focus on application logic
🛠 Examples: AWS Elastic Beanstalk, Azure App Services, Google App Engine
3⃣ Software as a Service (SaaS)
Fully managed software that is delivered over the internet
Users access it via web or apps
No infrastructure or platform concerns
🛠 Examples: Google Workspace, Microsoft 365, Salesforce
💡 You're just a user. Log in and go.
What are the AWS Pricing Models?
Control your cost, scale your success.
AWS uses a pay-as-you-go pricing model that gives customers the flexibility to scale resources and pay for only what they use.
💸 1. Pay as You Go
No upfront investment
Pay only for individual services used
Pricing is based on actual resource consumption
🧾 Example: You launch an EC2 instance for 3 hours → you’re billed for 3 hours, not a full month.
📆 2. Save When You Reserve
Commit to 1 or 3 years of usage
Up to 75% savings compared to on-demand pricing
Ideal for predictable workloads
🛠 Examples of reserved services:
EC2 Reserved Instances
DynamoDB Reserved Capacity
RDS Reserved Instances
Redshift Reserved Instances
ElastiCache Reserved Nodes
📌 Think of it like buying a metro pass instead of individual tickets — cheaper if you’re a regular user.
📈 3. Pay Less by Using More
Volume-based discounts
The more resources you consume, the lower the cost per unit
Common with services like S3, EC2, and CloudFront
🔁 4. Pay Less as AWS Grows
AWS passes on operational efficiencies to customers
Price reductions over time due to improved economies of scale and tech innovation
💡 You're benefiting from Amazon’s growth, even if your usage stays the same.
What are AWS Regions? How do you choose one?
AWS Regions
A region is a cluster of data centers.
AWS has Regions all around the world. Names can be us-east-1, eu-west-3, etc.
Most AWS services are region-scoped.
Choosing an AWS Region depends on:
Compliance - data governance requirements are ensured; data stays within a region unless explicitly permitted to leave.
Proximity - to customers minimizes latency.
Service availability - varies by region; not all services are available in every region.
Pricing - varies by region and is transparent on the service pricing page.
What are AWS Availability Zones? AWS Data Centers?
AWS Availability Zones
Each region consists of multiple Availability Zones (usually 3, min is 3, max is 6).
Example: ap-southeast-2a, ap-southeast-2b, ap-southeast-2c.
Each AZ comprises one or more discrete data centers with redundant power, networking, and connectivity.
AZs are isolated from each other to ensure resilience against disasters.
Connected by high-bandwidth, low-latency networking for seamless communication
AWS Data Centers
Located in regions worldwide to reduce latency and improve performance.
Security: Equipped with advanced security measures to protect data and infrastructure.
Scalability: Designed to easily scale resources to meet growing demands.
Redundancy: Built with redundant power and cooling systems to ensure uptime.
Compliance: Adhere to strict compliance standards for data privacy and security.
Energy Efficiency: Utilize green technologies to minimize environmental impact.
What do we mean by the Shared Responsibility Model?
The Shared Responsibility Model describes how responsibility for protecting your cloud computing resources is shared between you (the customer) and AWS.
AWS is responsible for:
Security of the Cloud: AWS is responsible for protecting the infrastructure that runs its services, including hardware, software, networking, and facilities.
Includes physical security, environmental controls, and maintaining core infrastructure services.
The Customer is responsible for:
Security in the Cloud: Customers are responsible for securing their data, applications, and identity management within the cloud environment.
Includes securing data, managing access controls, and configuring security settings for individual services.
Collaboration:
Both AWS and customers must work together to ensure a secure and compliant cloud environment.
What’s the Cloud Adoption Framework (CAF)?
The AWS Cloud Adoption Framework (AWS CAF) = a framework that assists in creating and executing a comprehensive plan for digital transformation using AWS.
Developed by AWS professionals, it incorporates AWS best practices and insights from thousands of customers.
AWS CAF identifies key organizational capabilities essential for successful cloud transformations.
It organizes these capabilities into six perspectives:
Business, People, Governance, Platform, Security, and Operations.
How do you control access to your serverless API?
Controlling access to your serverless API:
APIs are common attack targets due to the valuable operations and data they expose. To secure access:
Use Amazon Cognito user pools for user authentication.
Implement API Gateway Lambda authorizers to control authorization.
Apply API Gateway resource policies to define who can access the API. Understanding and configuring these mechanisms correctly is essential to secure your API.
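A minimal sketch of a Lambda (TOKEN-type) authorizer for API Gateway. The token check is a placeholder for real validation, e.g. verifying a JWT issued by a Cognito user pool:

```python
# Hypothetical stand-in for real token validation (e.g. JWT verification).
VALID_TOKEN = "allow-me"

def handler(event, context):
    # For TOKEN authorizers, API Gateway passes the caller's token here.
    token = event.get("authorizationToken", "")
    effect = "Allow" if token == VALID_TOKEN else "Deny"
    # The authorizer must return an IAM policy for the invoked method.
    return {
        "principalId": "user",  # identifier for the caller
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [{
                "Action": "execute-api:Invoke",
                "Effect": effect,
                "Resource": event["methodArn"],  # the API method being invoked
            }],
        },
    }
```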
How do you control access to your serverless application? What are best practices for Data Protection, Failure Management, Cost-effective Resources, and Optimization?
When configuring AWS Lambda, follow least-privileged access principles. Grant only the permissions required for each function's operation to avoid unnecessary risks.
Smaller, well-scoped functions are also recommended to enhance security and maintain a well-architected application.
In this course, you learned about a number of AWS services that support these best practices, including AWS Lambda and API Gateway; later you will learn about Amazon Cognito.
Best practice for "Data Protection"
Protect data at all times to prevent unauthorized access. Follow these Well-Architected Framework recommendations:
1. Encrypt Data in Transit and at Rest
API Gateway: Encrypt sensitive data client-side before sending HTTP requests. Avoid exposing sensitive information in request headers.
Lambda & API Integrations: Encrypt data before processing to prevent exposure in logs (Amazon CloudWatch Logs) or persistent storage.
Storage Services (S3, DynamoDB, OpenSearch): Enable encryption at rest. Avoid logging or storing sensitive data in unencrypted form.
2. Implement Application Security
Input Validation: Validate inbound events to prevent malicious input.
API Gateway Validation: Use built-in request validation for JSON schema, URL parameters, and headers.
Advanced Validation: Use Lambda functions, libraries, or external services for deeper security checks.
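As a sketch of the "advanced validation" idea, here is a hypothetical Lambda handler that rejects malformed input before any business logic runs (the field names are made up for illustration):

```python
import json

REQUIRED_FIELDS = {"orderId", "amount"}  # hypothetical schema for illustration

def handler(event, context):
    body = json.loads(event.get("body") or "{}")
    missing = REQUIRED_FIELDS - body.keys()
    if missing:
        # Reject malformed input before any business logic runs.
        return {"statusCode": 400,
                "body": json.dumps({"error": f"missing fields: {sorted(missing)}"})}
    if not isinstance(body["amount"], (int, float)) or body["amount"] <= 0:
        return {"statusCode": 400, "body": json.dumps({"error": "invalid amount"})}
    # ... process the validated order ...
    return {"statusCode": 200, "body": json.dumps({"ok": True})}
```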
Best practice for "Failure Management"
1. Use a Dead-Letter Queue (DLQ) Mechanism
AWS Lambda: Configure a dead-letter queue (DLQ) in Amazon SQS to capture failed transactions for analysis and retries.
Amazon Kinesis & DynamoDB Streams: When a batch processing error occurs, the entire batch is retried, which may block processing. Configure Lambda to isolate "poison-pill" messages by forwarding them to a DLQ to prevent system-wide failures.
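A minimal boto3 sketch of attaching an SQS dead-letter queue to a Lambda function; the function name and queue ARN are placeholders:

```python
import boto3

lambda_client = boto3.client("lambda")

# Route failed asynchronous invocations to an SQS queue
# so they can be analyzed and retried later.
lambda_client.update_function_configuration(
    FunctionName="my-function",  # placeholder function name
    DeadLetterConfig={"TargetArn": "arn:aws:sqs:us-east-1:123456789012:my-dlq"},
)
```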
2. Roll Back Failed Transactions
AWS Step Functions: Use state machines to automate rollback procedures when a failure occurs.
Transaction Consistency: Ensure workflows either fully complete or properly revert to maintain data integrity.
Best practice for "Cost-effective Resources"
Use AWS Lambda efficiently: Since Lambda's CPU, network, and storage IOPS scale with memory allocation, optimize memory settings to balance cost and performance.
Reduce cold starts: Optimize function initialization to lower execution time, as Lambda charges in 1-ms increments.
Minimize unnecessary function calls: Use direct integrations with AWS services to cut costs.
Best practice for "Optimization"
Use Direct AWS Service Integrations
Avoid unnecessary Lambda functions: If a function only forwards data, replace it with a direct service integration (e.g., API Gateway → Kinesis Data Firehose).
Reduce operational overhead: Leverage built-in AWS Step Functions, EventBridge, and Lambda Destinations instead of writing custom logic.
What's Amazon EC2? What sizing & configuration options are common?
Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides resizable compute capacity in the cloud. It allows users to rent virtual servers, known as instances, to run applications and services.
EC2 offers a variety of instance types, operating systems, and configurations to meet different computing needs. Users can easily scale their compute resources up or down based on demand, and they only pay for the compute capacity they use. EC2 also integrates with other AWS services, making it a flexible and powerful option for deploying applications in the cloud.
it's considered IaaS (Infrastructure as a Service).
common sizing & configurations:
1. Operating system: Linux, Windows, macOS
2. How much power & cores (CPU)
3. How much Random Access Memory (RAM)
4. How much storage space & type:
- Network-attached (EBS, EFS)
- Hardware-attached (EC2 Instance Store)
5. Network Card: speed of the card, Public IP Address
6. Firewall rules: security group.
7. Bootstrap script: runs on first launch (EC2 User Data)
Are there instances from AWS for the Free Tier?
Yes, t2.micro and t3.micro instances are available for Free Tier usage (750 hours per month for the first 12 months).
What are Amazon EC2 Instances?
Amazon EC2 Instances = a virtual server in the AWS Cloud. When you launch an EC2 instance, the instance type that you specify determines the hardware available to your instance.
Allows users to rent virtual servers (known as instances) on which they can run their own applications.
Each instance type offers a different balance of compute, memory, network, and storage resources.
Infrastructure as a Service
Amazon EC2 Capabilities?
EC2 Capabilities
Offers a wide range of instance types optimized for various use cases, such as general-purpose computing, memory-intensive workloads, and GPU processing.
Renting virtual machines (EC2)
Storing data on virtual drives (EBS)
Distributing load across machines (ELB)
Scaling the services using an auto-scaling group (ASG)
Don’t worry, you will learn all of the above in the upcoming lectures 😄
Provides flexibility to scale compute capacity up or down based on demand, enabling cost optimization and performance scalability.
Supports a variety of operating systems and software configurations, allowing users to customize their computing environment according to their requirements.
Widely used for hosting websites, running applications, performing data processing tasks, and more, making it a fundamental component of cloud computing.
What’s a User Data Script?
EC2 instances can be configured at launch using User Data scripts.
Bootstrapping = refers to executing commands upon a machine's startup.
The User Data script executes only once during the initial startup of the instance.
EC2 User Data is utilized for automating startup tasks, including:
Installing updates
Installing software
Downloading files from the internet
Any other tasks you might require
A User Data script can, for example, install a web server and create a static web page when the EC2 instance is first launched, as in the sketch below.
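A minimal boto3 sketch, assuming a placeholder AMI ID; the bash commands passed as UserData run once, on first boot:

```python
import boto3

ec2 = boto3.client("ec2")

# Bootstrap commands: install and start a web server, write a static page.
user_data = """#!/bin/bash
yum update -y
yum install -y httpd
systemctl start httpd
systemctl enable httpd
echo "<h1>Hello from $(hostname -f)</h1>" > /var/www/html/index.html
"""

ec2.run_instances(
    ImageId="ami-xxxxxxxx",    # placeholder AMI ID
    InstanceType="t2.micro",   # free-tier eligible
    MinCount=1,
    MaxCount=1,
    UserData=user_data,        # boto3 base64-encodes this for you
)
```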
What are EC2 Instance Types - General Purpose suitable for? Compute Optimized? Memory Optimized? Storage Optimized?
EC2 Instance Types - General Purpose
Great for diverse workloads like web servers and code repositories
EC2 Instance Types - Compute Optimized
Ideal for compute-intensive tasks demanding high-performance processors
EC2 Instance Types - Memory Optimized
Provides fast performance for workloads processing large data sets in memory
EC2 Instance Types - Storage Optimized
Ideal for storage-intensive tasks with high, sequential read and write access to large data sets on local storage
What are Amazon EC2 pricing options?
1. On-Demand Instances: Pay for compute capacity by the hour or second, with no long-term commitments. This option is ideal for applications with unpredictable workloads or short-term needs.
2. Reserved Instances: Commit to using a specific instance type in a specific region for a one- or three-year term, in exchange for a significant discount compared to on-demand pricing. This option is ideal for applications with steady-state usage.
3. Spot Instances: Use spare EC2 capacity at a steep discount (up to 90% off On-Demand prices); AWS can reclaim the capacity with a two-minute warning. This option is ideal for applications with flexible start and end times that can tolerate interruptions.
4. Dedicated Instances: Run instances on hardware that is dedicated to a single customer, providing additional isolation and security. This option is ideal for applications with strict compliance requirements.
5. Dedicated Hosts: Physical servers dedicated to your use, providing visibility and control over how instances are placed on the server. This option is ideal for applications with specific licensing requirements.
6. Savings Plans: Flexible pricing model that offers lower prices on compute usage in exchange for a commitment to a consistent amount of usage (measured in $/hour) for a one- or three-year term. This option is ideal for applications with predictable workloads.
How is Linux billing different from Windows billing? What about volumes?
Linux instances are typically billed at a lower rate than Windows instances due to the licensing costs associated with Windows. Additionally, Linux instances may offer more flexibility in terms of customization and configuration, which can impact billing based on usage patterns.
Amazon Linux, Ubuntu, and Windows instances are billed per second, with a 60-second minimum. Some commercial Linux distributions are billed per hour, with a 1-hour minimum.
EBS volumes are billed per second, with a 1-minute minimum.
What are the 2 types of Reserved Instances? What's the difference? What payment models are available, and how do they differ? When does one get the RI discount? What do you know about Scheduled Reserved Instances?
2 types of reserved instance:
1. Standard Reserved Instances: These offer the highest discount (up to 72%) compared to on-demand pricing, in exchange for a one- or three-year commitment. They are best suited for steady-state workloads where you can commit to using a specific instance type in a specific region.
2. Convertible Reserved Instances: These offer more flexibility than Standard Reserved Instances, allowing you to change the instance type, operating system, or tenancy over the term of the reservation. They provide a lower discount (up to 54%) compared to Standard Reserved Instances but are ideal for applications with changing requirements.
Payment models available:
1. All Upfront: Pay for the entire reservation term (one or three years) at the time of purchase, in exchange for the highest discount.
2. Partial Upfront: Pay a portion of the reservation cost upfront and the remaining balance in monthly installments over the term.
3. No Upfront: Pay for the reservation on a monthly basis over the term, with no upfront payment required.
The main difference between these payment models is the timing of the payments and the associated discounts. All Upfront offers the highest discount, while No Upfront provides the most flexibility in terms of cash flow.
one gets RI discount when attributes match (instance type, region, tenancy, etc.)
Scheduled Reserved Instances: These allow you to reserve capacity for specific time periods, such as certain days of the week or hours of the day, over a one-year term. They are ideal for workloads with predictable usage patterns, such as batch processing or scheduled tasks. Scheduled Reserved Instances provide a discount compared to on-demand pricing, but they require a commitment to use the reserved capacity during the specified time periods, with a minimum of 1,200 hours per year to qualify for the discount.
What do we mean by EC2 Savings Plans?
1. Compute Savings Plans: These offer the most flexibility, allowing you to change instance types, regions, and operating systems while still benefiting from the savings plan. They provide savings of up to 66% compared to on-demand pricing.
2. EC2 Instance Savings Plans: These are more restrictive, providing savings of up to 72% compared to on-demand pricing, but they require you to commit to a specific instance family within a specific region.
What is meant by Spot Instances?
1. Spot Instance: A Spot Instance is a type of Amazon EC2 instance that lets you use spare EC2 capacity at a steep discount compared to On-Demand prices (you can optionally set the maximum price you are willing to pay). Spot Instances are ideal for workloads that are flexible about when they run and can tolerate interruptions.
2. Spot Fleet: A Spot Fleet is a collection of spot instances that are managed as a single entity. It allows you to specify the desired capacity and instance types, and it automatically provisions and manages the spot instances to meet your requirements.
3. EC2 Fleet: An EC2 Fleet is a collection of EC2 instances that can include a mix of on-demand, reserved, and spot instances. It allows you to manage and scale your instances more easily by treating them as a single resource pool.
What are security groups? Does blocked traffic reach the instance? By default, is inbound traffic blocked? Is all outbound traffic allowed?
Security groups serve as a "firewall" for EC2 instances.
They regulate:
Access to specific ports
Authorized IP ranges for both IPv4 and IPv6
Control of inbound traffic (from others to the instance)
Control of outbound traffic (from the instance to others)
Can be attached to multiple instances.
Restricted to a specific region and VPC combination.
Operates externally to the EC2 instance; blocked traffic does not reach the instance.
Advisable to maintain a separate security group specifically for SSH access.
If your application times out and is not accessible, it's likely a security group configuration issue.
If you receive a "connection refused" error, the issue may be with the application itself or it may not have been started.
By default, all inbound traffic is blocked.
By default, all outbound traffic is allowed.
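A minimal boto3 sketch that creates a security group and opens only the ports it needs; the VPC ID and CIDR ranges are placeholders:

```python
import boto3

ec2 = boto3.client("ec2")

sg = ec2.create_security_group(
    GroupName="web-sg",
    Description="Allow HTTP and SSH",
    VpcId="vpc-xxxxxxxx",  # placeholder VPC ID
)

# Inbound is blocked by default, so each allowed port must be opened explicitly.
ec2.authorize_security_group_ingress(
    GroupId=sg["GroupId"],
    IpPermissions=[
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
         "IpRanges": [{"CidrIp": "203.0.113.0/24"}]},   # SSH from a known range only
        {"IpProtocol": "tcp", "FromPort": 80, "ToPort": 80,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},        # HTTP from anywhere
    ],
)
# Outbound traffic is already allowed by default, so no egress rule is needed.
```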
do you know common ports that are used? (will not be in Test)
🗣 Please note: this is good information to know, but you won’t be tested on these ports during your exam!
22: SSH (Secure Shell) - used for logging into a Linux instance.
21: FTP (File Transfer Protocol) - used for uploading files to a file share.
22: SFTP (Secure File Transfer Protocol) - used for securely uploading files via SSH.
80: HTTP - used for accessing unsecured websites.
443: HTTPS - used for accessing secured websites.
3389: RDP (Remote Desktop Protocol) - used for logging into a Windows instance.
What are Elastic Block Store (EBS) Volumes? Are they physical drives? Can I attach one to any instance (in zone, out of zone)?
Elastic Block Store (EBS) Volumes
An EBS (Elastic Block Store) Volume is a network drive that can be attached to instances while they run.
It allows instances to persist data even after termination.
At the CCP level, they can only be mounted to one instance at a time.
EBS volumes are bound to a specific availability zone.
They can be thought of as a "network USB stick."
The free tier includes 30 GB of free EBS storage of type General Purpose (SSD) or Magnetic per month.
EBS (Elastic Block Store) is a network drive, meaning it's not a physical drive and uses the network to communicate with the instance, which may result in some latency.
It can be detached from one EC2 instance and attached to another quickly.
EBS volumes are locked to an Availability Zone (AZ); for example, a volume in us-east-1a cannot be attached to an instance in us-east-1b.
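A minimal boto3 sketch with placeholder IDs: the volume is created in us-east-1a and can only be attached to an instance in that same AZ:

```python
import boto3

ec2 = boto3.client("ec2")

# The volume must be created in the same AZ as the instance it will attach to.
vol = ec2.create_volume(
    AvailabilityZone="us-east-1a",
    Size=30,            # GiB (the free tier includes 30 GB)
    VolumeType="gp2",
)

# Wait until the volume leaves the "creating" state before attaching.
ec2.get_waiter("volume_available").wait(VolumeIds=[vol["VolumeId"]])

ec2.attach_volume(
    VolumeId=vol["VolumeId"],
    InstanceId="i-xxxxxxxxxxxxxxxxx",  # must also be in us-east-1a
    Device="/dev/sdf",
)
```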
What’s an Amazon Machine Image (AMI)?
Amazon Machine Image (AMI) = customization of an EC2 instance
Can add your own software, configuration, operating system, monitoring…
Faster boot / configuration time because all your software is pre-packaged
AMIs are built for a specific region (and can be copied across regions)
You can launch EC2 instances from:
A Public AMI: AWS provided
Your own AMI: you make and maintain them yourself
An AWS Marketplace AMI: an AMI someone else made (and potentially sells)
Community AMI: AMIs that come from AWS users and are not verified by AWS
To build your own AMI:
Start an EC2 instance and customize it according to your needs.
Stop the instance to ensure data integrity before creating an image.
Build an AMI (Amazon Machine Image) from the stopped instance, which will also create EBS snapshots.
Launch new instances using the custom AMI or other available AMIs.
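A minimal boto3 sketch of those steps, with a placeholder instance ID and AMI name:

```python
import boto3

ec2 = boto3.client("ec2")

# Stop the instance first so the snapshot is consistent.
ec2.stop_instances(InstanceIds=["i-xxxxxxxxxxxxxxxxx"])
ec2.get_waiter("instance_stopped").wait(InstanceIds=["i-xxxxxxxxxxxxxxxxx"])

# Building the AMI also creates EBS snapshots of the attached volumes.
image = ec2.create_image(
    InstanceId="i-xxxxxxxxxxxxxxxxx",
    Name="my-custom-ami",  # placeholder AMI name
)
print(image["ImageId"])  # launch new instances from this AMI
```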
What’s EC2 Instance Store? EFS - Elastic File System? EFS Infrequent Access (EFS-IA)?
EC2 Instance Store
EBS volumes are network drives that offer good but "limited" performance.
For high-performance hardware disks, use EC2 Instance Store, which provides better I/O performance.
EC2 Instance Store volumes are ephemeral, meaning they lose their storage if the instance is stopped.
They are suitable for buffer, cache, scratch data, or temporary content.
There is a risk of data loss if the hardware fails.
Managing backups and replication is your responsibility.
EFS - Elastic File System
Managed NFS (network file system) that can be mounted on 100s of EC2 instances
EFS works with Linux EC2 instances in multi-AZ
Highly available, scalable, expensive (3x gp2), pay per use, no capacity planning
EFS Infrequent Access (EFS-IA)
EFS Infrequent Access (EFS-IA) is a storage class optimized for cost, designed for files not accessed daily.
Offers up to 92% lower cost compared to EFS Standard.
EFS automatically moves files to EFS-IA based on the last access time.
Enable EFS-IA with a Lifecycle Policy, such as moving files not accessed for 60 days to EFS-IA.
The transition to EFS-IA is transparent to applications accessing EFS.
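A minimal boto3 sketch of such a Lifecycle Policy; the file system ID is a placeholder:

```python
import boto3

efs = boto3.client("efs")

# Move files not accessed for 60 days to the cheaper EFS-IA storage class.
efs.put_lifecycle_configuration(
    FileSystemId="fs-xxxxxxxx",  # placeholder file system ID
    LifecyclePolicies=[{"TransitionToIA": "AFTER_60_DAYS"}],
)
```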
Scalability vs Elasticity (vs Agility)
Scalability vs Elasticity (vs Agility)
Scalability: The ability to accommodate a larger load by either enhancing the hardware (scale up) or adding nodes (scale out).
Elasticity: In a scalable system, elasticity refers to the automatic scaling based on the load, enabling a pay-per-use model, demand matching, and cost optimization.
Agility: Unrelated to scalability, agility implies that new IT resources can be provisioned quickly, reducing the time to availability from weeks to minutes.
What are the types of Load Balancers?
1. Classic Load Balancer (CLB): The Classic Load Balancer is the original load balancer offered by AWS. It operates at both the request level and connection level, distributing incoming traffic across multiple EC2 instances. It is suitable for applications that were built within the EC2-Classic network.
2. Application Load Balancer (ALB): The Application Load Balancer operates at the application layer (Layer 7) and is designed to handle HTTP and HTTPS traffic. It provides advanced routing capabilities, allowing you to route requests based on content, such as URL paths or HTTP headers. ALBs are ideal for microservices architectures and containerized applications.
3. Network Load Balancer (NLB): The Network Load Balancer operates at the transport layer (Layer 4) and is designed to handle TCP traffic. It provides ultra-low latency and can scale to millions of requests per second. NLBs are ideal for applications that require high performance and can tolerate some level of connection interruption.
4. Gateway Load Balancer (GWLB): The Gateway Load Balancer is designed to handle traffic for virtual appliances, such as firewalls and intrusion detection systems. It operates at the network layer (Layer 3) and provides a transparent way to insert and manage these appliances into your network traffic flow.
What if we need to balance load across multiple applications on the same machine?
Does load balancing direction happen in the application? Does ALB support HTTPS? WebSockets? How can an application get the client IP?
What's the difference between ALB and NLB?
Can NLB see the client IP directly?
We use an Application Load Balancer (ALB) because it can route traffic based on URL paths or hostnames, allowing you to direct requests to different applications running on the same EC2 instance.
Generally, it's great for microservices architectures and containerized applications (Docker, Amazon ECS).
Load balancing direction happens at the load balancer level, not within the application itself. The load balancer distributes incoming traffic across multiple instances based on the configured routing rules.
Yes, ALB supports HTTP, HTTPS, and WebSockets. Applications can get the client IP address by examining the X-Forwarded-For header, which is added by the load balancer.
The main difference between ALB and NLB is the layer at which they operate. ALB operates at the application layer (Layer 7) and is designed for HTTP/HTTPS traffic, while NLB operates at the transport layer (Layer 4) and is designed for TCP traffic. ALB provides advanced routing capabilities based on application-level content, while NLB is optimized for high performance and low latency.
Yes, NLB can see the client IP address because it operates at the transport layer (Layer 4) and forwards the original client IP address to the target instances. This is in contrast to ALB, which uses the X-Forwarded-For header to pass the client IP information.
What’s the goal of an ASG (Auto Scaling Group)? What are the scaling strategies?
The goal of an Auto Scaling Group (ASG) is to:
Scale out (add EC2 instances) to handle increased load.
Scale in (remove EC2 instances) to reduce excess capacity.
Maintain a specified minimum and maximum number of instances.
Automatically register new instances with a load balancer.
Replace unhealthy instances to ensure reliability.
ASGs help to optimize capacity and reduce costs by running only the necessary number of instances.
Manual Scaling: Adjust the size of an Auto Scaling Group (ASG) manually as needed.
Dynamic Scaling: Automatically respond to changing demand with different strategies:
Simple / Step Scaling:
Add or remove a specific number of instances when a CloudWatch alarm is triggered (e.g., add 2 units if CPU > 70%, remove 1 unit if CPU < 30%).
Target Tracking Scaling:
Automatically adjust the number of instances to maintain a target metric (e.g., keep the average ASG CPU utilization around 40%).
Scheduled Scaling:
Plan for scaling actions based on known usage patterns (e.g., increase the minimum capacity to 10 at 5 pm on Fridays).
Predictive Scaling:
Utilizes Machine Learning to forecast future traffic.
Automatically provisions the appropriate number of EC2 instances in advance to meet predicted demand.
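A minimal boto3 sketch of the Target Tracking strategy described above; the ASG and policy names are placeholders:

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Keep the ASG's average CPU utilization around 40%; the ASG adds or
# removes instances automatically to stay near the target.
autoscaling.put_scaling_policy(
    AutoScalingGroupName="my-asg",   # placeholder ASG name
    PolicyName="keep-cpu-at-40",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization",
        },
        "TargetValue": 40.0,
    },
)
```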
What’s VPC? Subnets?
VPC = Virtual Private Cloud; private network to deploy your resources (regional resource)
Subnets = allow you to partition your network inside your VPC (Availability Zone resource)
A public subnet is a subnet that is accessible from the internet
A private subnet is a subnet that is not accessible from the internet
To define access to the internet and between subnets, we use route tables.
Network ACL vs Security Groups?
NACL (Network Access Control List):
Acts as a firewall controlling traffic to and from a subnet.
Supports both ALLOW and DENY rules.
Attached at the subnet level.
Rules are based solely on IP addresses.
Security Groups:
A firewall that controls traffic to and from an Elastic Network Interface (ENI) or an EC2 instance.
Only ALLOW rules are supported.
Rules can include IP addresses and other security groups.
How do you monitor the network?
VPC Flow Logs, Subnet Flow Logs, and Elastic Network Interface Flow Logs capture information about IP traffic going into your interfaces, aiding in monitoring and troubleshooting connectivity issues, such as:
Subnets to the internet.
Subnets to subnets.
Internet to subnets.
They also capture network information from AWS managed interfaces like Elastic Load Balancers, ElastiCache, RDS, and Aurora.
VPC Flow Logs data can be sent to S3, CloudWatch Logs, and Kinesis Data Firehose for analysis and storage.
What’s VPC (Virtual Private Cloud)? Subnets? Internet Gateway? NAT Gateway/Instances? NACL (Network Access Control List)? Security Groups? VPC Peering? Elastic IP? VPC Endpoints? PrivateLink? VPC Flow Logs? Site to Site VPN? Client VPN? Direct Connect? Transit Gateway?
VPC (Virtual Private Cloud): A private network within the AWS cloud.
Subnets: Partitions of a VPC tied to an Availability Zone (AZ).
Internet Gateway: Provides internet access at the VPC level.
NAT Gateway/Instances: Allow private subnets to access the internet.
NACL (Network Access Control List): Stateless rules for inbound and outbound traffic at the subnet level.
Security Groups: Stateful rules that operate at the EC2 instance level or Elastic Network Interface (ENI).
VPC Peering: Connects two VPCs with non-overlapping IP ranges; non-transitive.
Elastic IP: A fixed public IPv4 address with ongoing costs if not in use.
VPC Endpoints: Provide private access to AWS Services within VPC
PrivateLink: Privately connect to a service in a 3rd party VPC
VPC Flow Logs: Network traffic logs
Site to Site VPN: VPN over public internet between on-premises DC and AWS
Client VPN: OpenVPN connection from your computer into your VPC
Direct Connect: Direct private connection to AWS
Transit Gateway: Connect thousands of VPC and on-premises networks together
What’s Amazon S3 (Simple Storage Service)? What are S3 Bucket Policies written in?
Amazon S3 (Simple Storage Service):
A highly scalable, durable, and secure object storage service for storing and retrieving any amount of data.
Often called ‘infinitely scaling’ storage
Offers features like versioning, lifecycle policies, and fine-grained access controls for managing data.
Supports data transfer acceleration and integration with other AWS services for data processing and analytics.
Commonly used for backup and recovery, data archiving, content distribution, and as a data lake for big data analytics.
S3 Bucket Policies are written in JSON and define:
Resources: the buckets and objects the policy applies to
Effect: permit or restrict (Allow / Deny)
Actions: the set of APIs permitted or restricted
Principal: the user or account affected by the policy
Use the S3 bucket policy to:
Enable public access to the bucket
Mandate encryption on uploaded objects
Provide access permissions to another account (Cross-Account Access)
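A minimal sketch of a bucket policy that enables public reads, applied with boto3 (the bucket name is a placeholder); note how the JSON fields line up with the elements listed above:

```python
import json
import boto3

s3 = boto3.client("s3")

# Allow public read of all objects in a (hypothetical) bucket.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",                      # permit or restrict
        "Principal": "*",                       # who the policy applies to
        "Action": "s3:GetObject",               # which APIs are allowed
        "Resource": "arn:aws:s3:::my-bucket/*", # buckets and objects
    }],
}

s3.put_bucket_policy(Bucket="my-bucket", Policy=json.dumps(policy))
```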
What’s Amazon S3 Durability and Availability?
Durability:
Amazon S3 offers high durability of 99.999999999% (11 9's) for objects, ensuring data is protected across multiple Availability Zones.
With this level of durability, if you store 10 million objects with Amazon S3, you can expect to lose a single object once every 10,000 years.
This durability applies to all storage classes.
Availability:
Refers to how readily available a service is for use.
Availability varies depending on the storage class.
For example, S3 Standard has 99.99% availability, which equates to potential unavailability of approximately 53 minutes per year.
What’s Amazon S3 file versioning?
File versioning is available in Amazon S3.
It must be activated at the bucket level.
Overwriting a file with the same key will result in incrementing the version number: 1, 2, 3, etc.
It is considered best practice to enable versioning on your buckets.
Versioning helps guard against accidental deletions by allowing you to restore previous versions.
It simplifies the process of reverting to an earlier version of a file.
Additional details:
Any file not versioned before versioning is enabled will be assigned the version "null".
Suspending versioning will not remove any previously stored versions.
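Enabling versioning at the bucket level is a one-call operation; a minimal boto3 sketch with a placeholder bucket name:

```python
import boto3

s3 = boto3.client("s3")

# Versioning is enabled at the bucket level.
s3.put_bucket_versioning(
    Bucket="my-bucket",  # placeholder bucket name
    VersioningConfiguration={"Status": "Enabled"},  # "Suspended" keeps old versions
)
```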
What are the different S3 Storage Classes? Does the server automatically encrypt files?
How often can you expect to lose a single object if you store 10 million objects with Amazon S3?
S3 Storage Classes Overview:
Designed for varying use cases based on access frequency, durability, and cost.
Includes options like Standard, Intelligent-Tiering, Standard-IA, and One Zone-IA.
Enables cost optimization through diverse pricing structures.
Supports automatic data migration with lifecycle policies.
Provides global data storage with high availability.
Allows easy transition between classes to meet changing needs.
All S3 Storage Classes:
Amazon S3 Standard - General Purpose
Amazon S3 Standard-Infrequent Access (IA)
Amazon S3 One Zone-Infrequent Access
Amazon S3 Glacier Instant Retrieval
Amazon S3 Glacier Flexible Retrieval
Amazon S3 Glacier Deep Archive
Amazon S3 Intelligent Tiering
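Lifecycle policies (mentioned above) automate transitions between these classes; a minimal boto3 sketch, where the bucket name, day counts, and rule ID are placeholders:

```python
import boto3

s3 = boto3.client("s3")

# Transition objects to cheaper classes as they age, then expire them.
s3.put_bucket_lifecycle_configuration(
    Bucket="my-bucket",  # placeholder bucket name
    LifecycleConfiguration={
        "Rules": [{
            "ID": "archive-old-objects",
            "Status": "Enabled",
            "Filter": {"Prefix": ""},  # apply to every object
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            "Expiration": {"Days": 365},
        }],
    },
)
```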
Server-Side Encryption (Default):
The server encrypts the file after receiving it.
Answer: once every 10,000 years.
How can we adopt a hybrid cloud for storage? How do we connect to S3 data on-premises?
Hybrid Cloud for Storage
Hybrid Cloud Approach: AWS promotes a hybrid cloud model where part of your infrastructure is on-premises and part is on the cloud.
This approach may be due to:
Long cloud migration processes.
Specific security requirements.
Compliance requirements.
Overall IT strategy.
S3 and On-Premises Integration: Given that S3 is a proprietary storage technology, unlike EFS or NFS, the question arises: How do you expose S3 data on-premises?
Solution: AWS Storage Gateway provides a seamless way to connect on-premises environments with S3 storage, enabling hybrid cloud storage solutions!
What are the AWS Storage Gateway types?
AWS Storage Gateway Types:
File Gateway: Integrates on-premises environments with cloud-based storage for file data.
Volume Gateway: Uses block-based storage interfaces compatible with existing applications.
Tape Gateway: Simulates a physical tape library with virtual tape storage in AWS.
What’s the Snow Family? OpsHub? Storage Gateway?
Snow Family: A set of physical devices (like Snowball and Snowcone) for importing data onto S3 and performing edge computing.
OpsHub: A desktop application for managing Snow Family devices, simplifying operations like data transfer and device configuration.
Storage Gateway: A hybrid storage solution that extends on-premises storage to S3, facilitating seamless integration between local storage systems and the cloud.
What’s the difference between relational and non-relational databases?
🟢 Relational Databases
Structure: Data stored in tables with rows and columns.
Schema: Fixed schema with predefined structure.
Relationships: Supports complex relationships using foreign keys.
Query Language: Uses SQL.
Examples in AWS: Amazon RDS (supports MySQL, PostgreSQL, Oracle, SQL Server).
🟠 Non-Relational Databases
Structure: Data stored in various formats like key-value pairs, documents, or graphs.
Schema: Dynamic schema allowing flexibility.
Relationships: Generally does not enforce relationships; handles large volumes of unstructured data.
Query Language: Various, depending on type (e.g., MongoDB, Cassandra).
Examples in AWS: Amazon DynamoDB (key-value store), Amazon DocumentDB (document store), Amazon Neptune (graph database).
What are the most important types of relational databases? What are the advantages of using a managed DB service instead of deploying on EC2?
Amazon RDS
Managed database service employing SQL for queries
Enables creation of cloud databases managed by AWS
Supports various SQL-based databases:
PostgreSQL
MySQL
MariaDB
Oracle
Microsoft SQL Server
IBM DB2
Aurora (AWS proprietary database)
Amazon Aurora
Amazon Aurora = a global-scale relational database service built for the cloud with full MySQL and PostgreSQL compatibility.
provides built-in security, continuous backups, serverless compute, up to 15 read replicas, automated multi-Region replication, and integrations with other AWS services.
It supports both PostgreSQL and MySQL databases.
Aurora is "AWS cloud-optimized" and boasts significant performance enhancements,
claiming a 5x improvement over MySQL on RDS and over 3x the performance of Postgres on RDS.
Aurora's storage automatically scales in increments of 10GB, up to 128 TB.
While Aurora is more efficient, it also costs 20% more than RDS.
Aurora is not included in the free tier.
Amazon Aurora Serverless
Automated database provisioning and dynamic scaling in response to actual usage.
Both PostgreSQL and MySQL are compatible with Aurora Serverless DB.
Eliminates the need for capacity planning and reduces management overhead.
Pay-per-second billing offers potential cost savings.
Suitable for irregular, sporadic, or unpredictable workloads.
Advantages of RDS compared to deploying a database on EC2:
RDS is a managed service, offering:
Automated provisioning and OS patching
Continuous backups with point-in-time restore
Monitoring dashboards
Read replicas for enhanced read performance
Multi-AZ setup for disaster recovery
Scheduled maintenance windows for upgrades
Scaling capabilities (vertical and horizontal)
Storage backed by EBS
Note: SSH access to instances is not available with RDS.
What are common types of NoSQL databases?
Amazon DynamoDB
Fully managed key/value (NoSQL) database with automated provisioning and dynamic scaling in response to actual usage.
Eliminates the need for capacity planning and reduces management overhead.
On-demand mode offers pay-per-request billing, with potential cost savings for irregular, sporadic, or unpredictable workloads.
A key-value database = a type of non-relational database, also known as a NoSQL database, that uses a simple key-value method to store data. It stores data as a collection of key-value pairs in which a key serves as a unique identifier. Both keys and values can be anything, ranging from simple objects to complex compound objects.
Amazon DynamoDB - Global Tables
Make a DynamoDB table accessible with low latency in multiple-regions
Active-Active replication (read/write to any AWS Region)
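A minimal boto3 sketch of the key/value model, assuming a hypothetical table named "Users" with partition key user_id:

```python
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("Users")  # hypothetical table with partition key "user_id"

# Store and retrieve an item by its key.
table.put_item(Item={"user_id": "u-123", "name": "Alice", "plan": "free"})

resp = table.get_item(Key={"user_id": "u-123"})
print(resp["Item"])  # {'user_id': 'u-123', 'name': 'Alice', 'plan': 'free'}
```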
Amazon DocumentDB (with MongoDB compatibility)
A document database; MongoDB is used to store, query, and index JSON data.
Has similar "deployment concepts" as Amazon Aurora.
Fully managed and highly available, with replication across 3 Availability Zones (AZs).
DocumentDB storage automatically grows in increments of 10GB.
Automatically scales to handle workloads with millions of requests per second.
Amazon ElastiCache
Similar to how RDS provides managed relational databases…
ElastiCache offers managed Redis or Memcached services
Caches are high-performance, low-latency in-memory databases
They help alleviate the load on databases for workloads with heavy read operations
AWS handles operating system maintenance and patching, optimizations, setup, configuration, monitoring, failure recovery, and backups
DynamoDB Accelerator - DAX
Fully managed in-memory cache for DynamoDB.
Offers up to a 10x performance improvement, reducing latency from single-digit milliseconds to microseconds.
Provides a secure, highly scalable, and highly available solution.
Difference with ElastiCache at the CCP level:
DAX is exclusively used with and integrated into DynamoDB. ElastiCache can be used with other databases.
What advanced database and analytics services does Amazon offer?
Amazon Redshift:
Redshift is based on PostgreSQL but is not used for Online Transaction Processing (OLTP).
Provides a SQL interface for performing queries.
It is designed for Online Analytical Processing (OLAP) - suitable for analytics and data warehousing.
Typically, data is loaded once every hour, not every second.
Offers 10x better performance than other data warehouses and can scale to petabytes of data.
Amazon Redshift Serverless:
Automatically provisions and scales the underlying capacity of the data warehouse.
Allows running analytics workloads without managing data warehouse infrastructure.
Amazon EMR (Elastic MapReduce):
A cloud big data platform for processing massive amounts of data.
Supports open-source tools such as Apache Hadoop, Apache Spark, HBase, Flink, and Presto.
Simplifies running big data frameworks for processing and analyzing large datasets.
Designed to be cost-effective, scalable, and secure.
Commonly used for data transformation, data processing, and data analytics tasks.
Allows quick setup and configuration of clusters of virtual servers for data processing.
Amazon Athena:
Serverless query service to analyze data stored in Amazon S3.
Uses standard SQL language to query files.
Supports formats such as CSV, JSON, ORC, Avro, and Parquet (built on Presto).
Pricing: $5.00 per TB of data scanned.
Optimize cost by using compressed or columnar data formats (reduces data scanned).
Use cases include business intelligence, analytics, reporting, and analyzing logs such as VPC Flow Logs, ELB Logs, and CloudTrail trails.
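A minimal boto3 sketch of running an Athena query; the database, table, and S3 output location are placeholders:

```python
import boto3

athena = boto3.client("athena")

# Query CSV/JSON/Parquet files in S3 with standard SQL; results land in S3.
resp = athena.start_query_execution(
    QueryString="SELECT status, COUNT(*) FROM elb_logs GROUP BY status",
    QueryExecutionContext={"Database": "logs_db"},           # hypothetical database
    ResultConfiguration={"OutputLocation": "s3://my-query-results/"},
)
print(resp["QueryExecutionId"])  # poll this ID to fetch results when finished
```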
Amazon QuickSight:
Serverless machine learning-powered business intelligence service for creating interactive dashboards.
Fast, automatically scalable, and embeddable, with per-session pricing.
Amazon Neptune:
Fully managed graph database service.
Popular for datasets like social networks, where users have friends, posts have comments, comments have likes from users, and users share and like posts.
Highly available across three Availability Zones (AZs), with up to 15 read replicas.
Designed to build and run applications working with highly connected datasets, optimized for complex and challenging queries.
Capable of storing up to billions of relationships and querying the graph with millisecond latency.
Highly available with replications across multiple AZs.
Great for use cases such as knowledge graphs (like Wikipedia), fraud detection, recommendation engines, and social networking.
Amazon TimeStream:
Fully managed, fast, scalable, serverless time series database.
Automatically scales up and down to adjust capacity as needed. Capable of storing and analyzing trillions of events per day.
Offers performance that is 1000s of times faster and costs 1/10th that of traditional relational databases.
Provides built-in time series analytics functions to help identify patterns in data in near real-time.
Amazon Quantum Ledger Database (QLDB):
Amazon Quantum Ledger Database (Amazon QLDB) is designed to record financial transactions in a transparent, immutable, and cryptographically verifiable manner.
Fully managed, serverless, highly available, with replication across three Availability Zones (AZs).
Used to review the history of all changes made to your application data over time.
Amazon Managed Blockchain:
Blockchain technology enables the development of applications where multiple parties can execute transactions without requiring a trusted, central authority.
Amazon Managed Blockchain is a managed service for joining public blockchain networks or creating your own scalable private network.
AWS Glue:
Managed extract, transform, and load (ETL) service.
Useful for preparing and transforming data for analytics.
Fully serverless service.