Unmanaged services
Scaling, fault tolerance, and availability are managed by you
Unmanaged services (provisioned in…)
Provisioned in discrete portions, as specified by the user
Manage how the service responds to changes in load, errors, and situations where resources become unavailable
Benefit to Unmanaged Services
You have more fine-tuned control over how your solution handles changes in load, errors, and situations where resources become unavailable
Managed Services Example
Scaling, fault tolerance, and availability are typically built into the service; for example, Amazon S3 handles them automatically and internally
Managed Services
Still require the user to configure them
Typically require less configuration than unmanaged services
Challenges of Relational Databases
Server maintenance and energy footprint
Software installation and patches
Database backups and high availability
Limits on scalability
Data security
Operating system (OS) installation and patches
What are you responsible for when you run your own relational databases?
You are responsible for administrative tasks, such as server maintenance and energy footprint, software installation and patching, and database backups.
What are you also responsible for ensuring?
High availability, planning for scalability, data security, and operating system (OS) installation and patching
Amazon RDS
Managed service that sets up and operates a relational database in the cloud
RDS addresses the challenges of running an unmanaged, standalone relational database through…
Service that sets up, operates, and scales the relational database without any ongoing administration
Provides cost-efficient and resizable capacity, while automating time-consuming administrative tasks
What does RDS enable?
You can focus on your applications
You can give the applications the performance, high availability, security, and compatibility they need
Your primary focus is your data and optimizing your application
On-premises database
Application optimization
Scaling
High Availability
Database backups
Database software patches
Database software installs
Operating system patches
Operating system install
Server maintenance
Rack and stack servers
Power, HVAC, network
Database in Amazon EC2 → AWS Provides
Operating system install
Server maintenance
Rack and stack servers
Power, HVAC, network
Database in Amazon RDS or Amazon Aurora → AWS Provides
Scaling
High Availability
Database backups
Database software patches
Database software installs
Operating system patches
Operating system install
Server maintenance
Rack and stack servers
Power, HVAC, network
Database is on prem…
Administrator is responsible for everything (all tasks, none provided by AWS)
Database runs on Amazon EC2…
No longer need to manage the underlying hardware or handle data center operations
Still responsible for patching OS and handling all software and backup operations
Database on Amazon RDS or Amazon Aurora…
Reduce your administrative responsibilities
Automatically scale your database
Enable high availability, manage backups, and perform patching
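The three deployment cards above can be sketched as a small lookup. This is only an illustration of the task lists in these notes: the deployment labels (`on_premises`, `ec2`, `rds`) and the function name are made up for the sketch and are not AWS terms or APIs.

```python
# Tasks copied from the cards above, ordered from the application down to the facility.
TASKS = [
    "Application optimization",
    "Scaling",
    "High availability",
    "Database backups",
    "Database software patches",
    "Database software installs",
    "Operating system patches",
    "Operating system install",
    "Server maintenance",
    "Rack and stack servers",
    "Power, HVAC, network",
]

# Index into TASKS at which AWS takes over; everything before it stays with you.
# The keys are hypothetical labels chosen for this sketch.
AWS_STARTS_AT = {
    "on_premises": len(TASKS),  # AWS provides nothing
    "ec2": 7,                   # AWS provides the OS install and everything below it
    "rds": 1,                   # you keep only application optimization
}


def split_responsibilities(deployment):
    """Return which tasks you manage and which AWS manages for a deployment."""
    cut = AWS_STARTS_AT[deployment]
    return {"you": TASKS[:cut], "aws": TASKS[cut:]}


print(split_responsibilities("ec2")["you"])
```

Calling `split_responsibilities("rds")` leaves only application optimization on your side, which matches the "What do you manage? (with RDS)" card.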
What do you manage? (with RDS)
Application optimization
What does AWS manage? (with RDS)
OS installation and patches
Database software installation and patches
Database backups
High Availability
Scaling
Power and racking/stacking servers
Server Maintenance
What does RDS PRIMARILY do that helps with the operational side of your workload?
RDS reduces your operational workload and the costs that are associated with your relational database
Amazon RDS DB Instances (2 major parts)
DB Instance Class
DB Instance Storage
DB Instance Class
CPU
Memory
Network performance
DB Instance Storage
Magnetic
General Purpose (SSD)
Provisioned IOPS
What is the basic building block of Amazon RDS?
Database instance
What is a database instance?
Isolated database environment that can contain multiple user-created databases
Can be accessed by using the same tools and applications that you use with a standalone database instance
Resources in a database instance are determined by…
Database instance class
Type of storage is dictated by the…
Type of disks
Database instances and storage differ in performance characteristics and price, enabling you…
To customize your performance and cost to the needs of your database
When you choose to create a database instance…
Must first specify which database engine to run
Which six database engines does Amazon RDS currently support?
MySQL
Amazon Aurora
Microsoft SQL Server
PostgreSQL
MariaDB
Oracle
How can you run an instance (what do you use) and what do you get control over?
Amazon VPC
Control over your virtual networking environment
Process for database instances through VPC
The instance is isolated in a private subnet and is made directly accessible only to the application instances that you indicate
Subnets in a VPC are associated with a single AZ, so when you select the subnet, you are also choosing an AZ for the database instance
Most powerful feature of Amazon RDS
Ability to configure your database instance for high availability with a Multi-AZ deployment
Steps for configuring the database instance
After a Multi-AZ deployment is configured, Amazon RDS automatically generates a standby copy of the database instance in another AZ within the same VPC
After seeding the database copy, transactions are synchronously replicated to the standby copy
Running a DB instance in a Multi AZ deployment can enhance availability during planned system maintenance and it can help protect your databases against database instance failure and AZ disruption
And so, if the main database instance fails in a Multi-AZ deployment…
RDS automatically brings the standby instance online as the new main instance
This minimizes the potential for data loss
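A toy model of the Multi-AZ behavior described above, assuming two in-memory copies stand in for the main and standby instances. The class, method, and AZ names are hypothetical; real failover is performed by Amazon RDS itself, not by application code like this.

```python
class MultiAZDeployment:
    """Toy model: a main instance plus a synchronously replicated standby."""

    def __init__(self):
        # AZ names are placeholders for this sketch.
        self.primary = {"az": "az-a", "data": []}
        self.standby = {"az": "az-b", "data": []}

    def write(self, record):
        # Synchronous replication: the write reaches the standby
        # before it is acknowledged to the caller.
        self.primary["data"].append(record)
        self.standby["data"].append(record)
        return "committed"

    def fail_over(self):
        # The standby comes online as the new main instance.
        self.primary, self.standby = self.standby, None
        return self.primary["az"]


deployment = MultiAZDeployment()
deployment.write("order-1")
deployment.write("order-2")
new_az = deployment.fail_over()
print(new_az, deployment.primary["data"])
```

Because every acknowledged write already exists on the standby, the promoted instance holds the same records, which is why synchronous replication keeps the potential for data loss small.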
Amazon RDS read replicas Features
Offers asynchronous replication
Reduce the load on your source database instance by routing read queries from your applications to the read replica
Scale out beyond the capacity constraints of a single database instance
Can be promoted to primary if needed (requires manual action)
Amazon RDS read replicas Functionality
Use for read-heavy database workloads
Offload read queries
Help satisfy disaster recovery requirements
Reduce latency by directing reads to a read replica that is closer to the user
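The read replica cards can be illustrated with a small in-memory router. This is a sketch under simplifying assumptions: replication happens only when `replicate()` is called, to make the asynchronous lag visible, and all names are invented for the example.

```python
import itertools


class ReplicatedDatabase:
    """Toy model: writes go to the source, reads are spread over replicas."""

    def __init__(self, replica_count):
        self.source = {}
        self.replicas = [{} for _ in range(replica_count)]
        self._next_replica = itertools.cycle(range(replica_count))

    def write(self, key, value):
        # Only the source is updated here; replicas catch up later.
        self.source[key] = value

    def replicate(self):
        # Asynchronous replication, modeled as an explicit catch-up step.
        for replica in self.replicas:
            replica.update(self.source)

    def read(self, key):
        # Round-robin across replicas to offload reads from the source.
        replica = self.replicas[next(self._next_replica)]
        return replica.get(key)


db = ReplicatedDatabase(replica_count=2)
db.write("sku-1", 10)
print(db.read("sku-1"))  # the replica may not have the row yet
db.replicate()
print(db.read("sku-1"))
```

The first read can miss the new value because replication is asynchronous; applications that route reads to replicas need to tolerate that lag.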
Use cases of RDS
Web and mobile applications
Ecommerce Applications
Mobile and online games
Web and Mobile applications
RDS does not have any licensing constraints, so it fits the variable usage pattern of these applications
High throughput
Massive storage scalability
High availability
Ecommerce Applications
Provides a flexible, secure, and low-cost database solution for online sales and retailing
Low-cost database
Data security
Fully managed solution
Mobile and online games
Requires a database platform with high throughput and availability
Rapidly grow capacity
Automatic Scaling
Database monitoring
RDS manages the database infrastructure, so game devs do not need to worry about provisioning, scaling, or monitoring database servers
Use Amazon RDS when your application requires:
Complex transactions or complex queries
A medium to high query or write rate: up to 30,000 IOPS (15,000 reads + 15,000 writes)
No more than a single worker node or shard
High durability
Do not use Amazon RDS when your application requires:
Massive read/write rates (for example, 150,000 writes per second)
Sharding due to high data size or throughput demands
Simple GET or PUT requests and queries that a NoSQL database can handle
Relational database management system (RDBMS) customization
What should be considered for situations where Amazon RDS is not used?
Using a NoSQL database solution (DynamoDB)
Running your relational database engine on Amazon EC2 instances instead of RDS
What can be used to estimate the cost of Amazon RDS?
Clock-hour billing
Database characteristics
DB purchase type
Number of DB Instances
Provisioned Storage
Additional Storage
Requests
Deployment Type
Data Transfer
Amazon RDS: Clock-hour billing
Clock hours of service time: resources incur charges while they are running
Example: the time from when you launch a database instance until you terminate it
The physical capacity of the database that you choose affects how much you are charged. Which characteristics define it?
Database characteristics
Physical capacity of database:
Engine
Size
Memory Class
Amazon RDS: DB Purchase Type
On-Demand Instances
Compute capacity by the hour, with no required minimum commitments
Reserved Instances
Low, one-time, upfront payment for database instances that are reserved with a 1-year or 3-year term
Number of DB Instances
Provision multiple DB instances to handle peak loads
Provisioned Storage
No Charge
Backup storage of up to 100% of provisioned database storage for an active database
Charge (GB per month)
Backup storage for terminated DB Instances
Additional Storage
Backup storage in addition to the provisioned storage amount, billed per GB, per month
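The backup storage rules above reduce to a short calculation. In this sketch the function names and the per-GB price are placeholders, not published AWS rates; only the rule itself (free up to 100% of provisioned storage while the instance is active, charged per GB per month otherwise) comes from these notes.

```python
def billable_backup_gb(provisioned_gb, backup_gb, instance_active=True):
    """Backup storage (GB) that is charged, following the cards above."""
    if not instance_active:
        # Backups kept for a terminated DB instance are charged in full.
        return backup_gb
    # Active instance: backup storage up to 100% of provisioned storage is free.
    return max(0, backup_gb - provisioned_gb)


def monthly_backup_charge(provisioned_gb, backup_gb, price_per_gb_month,
                          instance_active=True):
    """Monthly charge; price_per_gb_month is an assumed input, not a real rate."""
    billable = billable_backup_gb(provisioned_gb, backup_gb, instance_active)
    return billable * price_per_gb_month


# 100 GB provisioned, 150 GB of backups: only the extra 50 GB is billable.
print(billable_backup_gb(100, 150))
```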
Amazon RDS: Deployment type and data transfer
What are the 3 parts?
Requests
Deployment Type
Data Transfer
Requests
Number of input and output requests that are made to the database
Deployment Type
Single Availability Zone
Multiple AZs
Deploy your DB instance to a single AZ (standalone data center) or to multiple AZs (analogous to a secondary data center for enhanced availability and durability)
Storage and I/O charges vary, depending on the number of AZs that you deploy to
Data Transfer
Inbound data transfer is free
Outbound data transfer costs are tiered
Reserved Instances
Help in optimizing the costs for Amazon RDS database instances by purchasing these instances
What is Reserved Instances payment style?
Make a low, one-time payment for each instance that you want to reserve
Receive a significant discount on the hourly usage charge for that instance
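To see when a Reserved Instance pays off compared with On-Demand clock-hour billing, a rough comparison can be sketched as follows. All rates and the upfront payment are hypothetical numbers chosen for easy arithmetic, and the function names are made up for this example.

```python
def on_demand_cost(hours, hourly_rate):
    """Clock-hour billing: pay the full hourly rate for every running hour."""
    return hours * hourly_rate


def reserved_cost(hours, upfront_payment, discounted_hourly_rate):
    """One-time upfront payment plus a discounted hourly usage charge."""
    return upfront_payment + hours * discounted_hourly_rate


def break_even_hours(hourly_rate, upfront_payment, discounted_hourly_rate):
    """Running hours after which the reserved option becomes cheaper."""
    return upfront_payment / (hourly_rate - discounted_hourly_rate)


# Hypothetical prices, not real AWS rates.
HOURS_PER_YEAR = 8760
print(on_demand_cost(HOURS_PER_YEAR, 0.25))
print(reserved_cost(HOURS_PER_YEAR, 500, 0.125))
print(break_even_hours(0.25, 500, 0.125))
```

With these assumed numbers, an instance that runs all year costs less when reserved, while one that runs for only a few hundred hours is cheaper On-Demand.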
Relational Database (RDB)
Works with structured data that is organized by tables, records, and columns.
RDBs establish a well defined relationship between database tables.
Can be difficult to scale out horizontally or to use with semistructured data, and might require many joins for normalized data
What language do RDBs use?
SQL (structured query language)
Standard user application that provides a programming interface for database interaction
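A minimal illustration of related tables and a join, using Python's built-in `sqlite3` module as a stand-in for a relational engine. The `customers` and `orders` tables and their rows are invented for the example.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total       REAL NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 15.0), (12, 2, 40.0);
""")

# Normalized data: totals per customer require a join across the two tables.
rows = conn.execute("""
    SELECT c.name, SUM(o.total)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()

print(rows)
```

The relationship between the tables lives in `orders.customer_id`; answering even a simple question means joining on it, which is the trade-off the relational database card mentions.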
Non-Relational Database
Any database that does not follow the relational model that is provided by traditional relational database management systems (RDBMSs).
Grown in popularity because they were designed to overcome the limitations of relational databases
Handle the demands of variable structured data
Are able to scale out horizontally
Can work with unstructured and semistructured data
Amazon DynamoDB
Fast and flexible NoSQL database service for all applications that need consistent, single-digit millisecond latency at any scale
Amazon DynamoDB Benefits
NoSQL Database tables
Virtually unlimited storage
Items can have differing attributes
Low-latency queries
Scalable read/write throughput
Review slide 36
DynamoDB info
Amazon DynamoDB core components
Tables, items, and attributes are the core DynamoDB components
DynamoDB supports two different kinds of primary keys:
Partition key
Partition and sort key
Table, Items, Attributes
A table is a collection of data
An item is a group of attributes that is uniquely identifiable among all the other items
An attribute is a fundamental data element, something that does not need to be broken down any further
Partition Key
Simple primary key
Composed of one attribute called the partition key
Uniquely identifies each record
Compound Key - Partition key and sort key
Composite primary key
Composed of two attributes
Used for sorting data
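The composite primary key can be modeled with a plain dictionary. This is a pure-Python sketch, not the DynamoDB API; `customer_id` (partition key) and `order_date` (sort key) are hypothetical attribute names.

```python
class CompositeKeyTable:
    """Toy table whose primary key is (partition key, sort key)."""

    def __init__(self):
        self._items = {}

    def put_item(self, item):
        # The pair of key attributes uniquely identifies the item;
        # writing the same pair again replaces the earlier item.
        key = (item["customer_id"], item["order_date"])
        self._items[key] = item

    def get_item(self, customer_id, order_date):
        return self._items.get((customer_id, order_date))

    def items_for_partition(self, customer_id):
        # Items that share a partition key come back ordered by sort key.
        matches = [item for (pk, _), item in self._items.items()
                   if pk == customer_id]
        return sorted(matches, key=lambda item: item["order_date"])


table = CompositeKeyTable()
table.put_item({"customer_id": "c1", "order_date": "2024-02-01", "total": 30})
# Items can have differing attributes: only this one has "gift".
table.put_item({"customer_id": "c1", "order_date": "2024-01-15", "total": 12,
                "gift": True})
print([item["order_date"] for item in table.items_for_partition("c1")])
```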
Partitioning
As data grows, table data is partitioned and indexed by the primary key
You can retrieve data from a DynamoDB table in two different ways:
Query operation takes advantage of partitioning to effectively locate items by using the primary key
Scan → Enables you to locate items in the table by matching conditions on non-key attributes.
Gives you the flexibility to locate items by other attributes
Operation is less efficient because DynamoDB will scan through all the items in the table to find ones that match your criteria
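The difference between the two retrieval operations can be shown by counting how many items each one examines. The sketch below is a pure-Python model, not the DynamoDB API; the `pk` and `status` attribute names and the `items_examined` counter are assumptions added for illustration.

```python
from collections import defaultdict


class PartitionedTable:
    """Toy table that groups items by partition key."""

    def __init__(self):
        self._partitions = defaultdict(list)
        self.items_examined = 0  # counter added only to compare the operations

    def put_item(self, item):
        self._partitions[item["pk"]].append(item)

    def query(self, pk):
        # Query: go straight to one partition by using the key.
        items = self._partitions.get(pk, [])
        self.items_examined += len(items)
        return list(items)

    def scan(self, predicate):
        # Scan: visit every item in the table and filter on any attribute.
        results = []
        for items in self._partitions.values():
            for item in items:
                self.items_examined += 1
                if predicate(item):
                    results.append(item)
        return results


table = PartitionedTable()
table.put_item({"pk": "user-1", "status": "shipped"})
table.put_item({"pk": "user-1", "status": "pending"})
table.put_item({"pk": "user-2", "status": "shipped"})
table.put_item({"pk": "user-3", "status": "pending"})

table.query("user-1")
print("examined by query:", table.items_examined)
table.items_examined = 0
table.scan(lambda item: item["status"] == "shipped")
print("examined by scan:", table.items_examined)
```

The query touches only the items in one partition, while the scan touches all of them, even though both return two items here.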
For accessibility, partitioning allows…
Large tables to be scanned and queried quickly
As data grows, table data is partitioned and indexed by the primary key. Query by key to find items efficiently; scan to find items by any attribute
Globally Unique Identifier (GUID)
Simple primary key that is based on a single attribute of the data values with a uniform distribution
Used to uniquely identify items in the DynamoDB Table
Key Info about DynamoDB
Runs exclusively on SSDs
Can replicate your table across AWS Regions (with global tables)
Works well for Mobile, web, gaming, AdTech
Accessible via console, CLI, API
Has no limits on table size and has low latency
DynamoDB Global Tables
Reduces the work of replicating data between regions and resolves update conflicts
Helps apps stay available and replicates the table across AWS Regions
Amazon Redshift
Fast, fully managed data warehouse that makes it simple and cost-effective to analyze all your data by using standard SQL and your existing business intelligence (BI) tools
What does Amazon Redshift enable?
Run complex analytic queries against petabytes of structured data by using sophisticated query optimization, columnar storage on high-performance local disks, and massively parallel data processing.
Parallel Processing architecture
Leader node manages communications with client programs and all communications with compute nodes
Parses and develops plans to carry out database operations
a) Series of steps that are needed to obtain results for complex queries
Compiles code for individual elements of the plan and assigns the code to individual compute nodes
Compute nodes run the compiled code and send intermediate results back to the leader node for final aggregation
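The leader node and compute node roles can be sketched as a split-then-aggregate pattern. This simplified model runs the "nodes" one after another in a single process and only sums numbers; the function names and the way rows are sliced are assumptions for illustration, not how Redshift distributes data.

```python
def compute_node_sum(rows):
    """Work done by one compute node: an intermediate result for its slice."""
    return sum(rows)


def leader_node_total(all_rows, node_count):
    """Leader node: split the work, collect intermediate results, aggregate."""
    # Give each compute node every node_count-th row.
    slices = [all_rows[i::node_count] for i in range(node_count)]
    # In a real cluster these run in parallel; here they run sequentially.
    partial_results = [compute_node_sum(node_rows) for node_rows in slices]
    # Final aggregation happens on the leader node.
    return sum(partial_results)


sales = list(range(1, 101))
print(leader_node_total(sales, node_count=4))
```

The result is the same for any number of nodes; adding nodes only changes how much work each one does, which is the point of the parallel architecture.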
What is the pricing model?
What is the lowest rate/get started for Amazon Redshift?
At what cost can Redshift deliver storage and processing at scale?
Pay for what you use
$0.25 per hour
Approximately $1,000 per TB per year (3-year Upfront Reserved Instance pricing)
Things that can be automated for the Amazon Redshift cluster:
Manage
Monitor
Scale
These help you focus on your data and your business.
Scalability and Security
Scalability is intrinsic in Amazon Redshift
Your cluster can be scaled up and down as your needs change with a few clicks in the console
Security is the highest priority for AWS
With Redshift, security is built in, and it is designed to provide strong encryption of your data both at rest and in transit
Compatibility
Compatible with the tools that you already know and use.
Amazon Redshift supports standard SQL
Provides high-performance Java Database Connectivity (JDBC) and Open Database Connectivity (ODBC) connectors
These enable you to use the SQL clients and BI tools of your choice
Amazon Redshift use cases
Enterprise data warehouse (EDW)
Big data
Software as a Service (SaaS)
Enterprise Data Warehouse
Migrate at a pace that customers are comfortable with
Experiment without large upfront cost or commitment
Respond faster to business needs
Big Data
Low price point for small customers
Managed service for ease of deployment and maintenance
Focus more on data and less on database management
Software as a Service (SaaS)
Scale the data warehouse capacity as demand grows
Add analytic functionality to applications
Reduce hardware and software costs
Key benefits of Amazon Redshift
Easily scale with no downtime
Columnar storage and parallel processing architectures
Parallelize and distribute data and queries across multiple nodes
Automatically and continuously monitors clusters
Built-in encryption
What is Amazon Aurora?
MySQL- and PostgreSQL-compatible relational database that is built for the cloud
Features/Benefits of Amazon Aurora
Enterprise-class relational database
Compatible with MySQL or PostgreSQL
Automate time-consuming tasks (such as provisioning, patching, backup, recovery, failure detection, and repair)
What does Aurora combine and what other features does it have?
Combines the performance and availability of high-end commercial databases with the simplicity and cost-effectiveness of open-source databases
Can reduce database costs while improving the reliability and availability of the database
Amazon Aurora Service Benefits
Fast and Available
a) Highly available and it offers a fast distributed storage subsystem
Managed Service
a) Integrates with features such as AWS Database Migration Service (AWS DMS) and AWS Schema Conversion Tool
b) These features help you move your dataset into Amazon Aurora
Simple
Compatible
a) Has compatibility with MySQL and PostgreSQL database engines so that you can use most of your existing database tools with little or no change
Pay-as-you-go
a) You only pay for the services and features that you use
Major reason to use Amazon Aurora over other options (like SQL with RDS) is…
High Availability (resilient design)
High Availability
Stores multiple copies of your data across multiple AZs with continuous backups to Amazon S3
Up to 15 read replicas can be used to reduce the possibility of losing your data
Designed for instant crash recovery if your primary database becomes unhealthy
Resilient Design
Does not need to replay the redo log from the last database checkpoint
Instead, performs the redo on every read operation
Reduces the restart time after a database crash to less than 60 seconds in most crashes
How does Aurora make the data so resilient?
Buffer cache is moved out of the database process, which makes it available immediately at restart
Reduces the need for you to throttle access until the cache is repopulated to avoid brownouts
More Benefits of Amazon Aurora
High performance and high scalability
High availability and durability
Fault-tolerant and self-healing storage
Multiple layers of security
Network Isolation, Encryption at rest, encryption of data in transit
Compatible with MySQL and PostgreSQL
Fully managed by RDS
Right Tool for the job
Amazon RDS
Amazon DynamoDB
Databases on Amazon EC2
AWS purpose-built database services
Amazon RDS → Enterprise-class relational database
Amazon DynamoDB → Fast and flexible NoSQL database service for any scale
Databases on Amazon EC2 → Operating system access or application features that are not supported by AWS Database services
AWS purpose-built database services → Specific case-driven requirements (ML, data warehouse, graphs)