NoSQL Databases and DynamoDB



25 Terms

1

DynamoDB Architecture

  • NoSQL database-as-a-service product, a public database as a service (DBaaS): wide-column, key/value, and document. No self-managed servers or infrastructure. Supports a range of scaling options: manual/automatic provisioned performance IN/OUT, or on-demand. Can also be highly resilient across AZs and optionally global. It’s really fast. Supports backups, point-in-time recovery, and encryption at rest. Supports event-driven integration (do things when data changes)

    • Tables: the base entity of DynamoDB, a grouping of items that share the same primary key. No limit to the number of items in a table. A primary key can be simple (partition) or composite (partition and sort). Each item must have a unique primary key value. Items can have none, all, a mixture, or different attributes (no rigid schema). Item max 400KB (this limit is about speed, not space)

2

DynamoDB backups

  • On demand: a full copy of the table, retained until removed. Can be used for same-region or cross-region restoration. When restoring, you can adjust encryption and choose whether indexes are included

    • You're responsible for performing backups and removing old ones

  • Point-in-time recovery: not enabled by default. When enabled, it results in a continuous stream of backups over a 35-day window. You can restore to any point in that window with one-second granularity

3

DynamoDB key points

  • NoSQL == DynamoDB (NEVER relational data)

  • Key/value == prefer DynamoDB

  • Accessed via console, CLI, API, NEVER SQL

  • Billing is based on RCU, WCU, storage and features

4

DynamoDB- Reading and writing: 

  • On demand: for unknown or unpredictable workloads, or when you want low admin overhead for the table. No need to set specific capacity settings; you pay per request for read or write units (typically more expensive)

  • Provisioned: you set RCU and WCU on a per-table basis.

  • > every operation consumes at least 1 RCU or WCU

  • > 1 RCU is 1 x 4KB read operation per second 

  • > 1 WCU is 1 x 1KB write operation per second

  • > Every table has an RCU and WCU burst pool (300 seconds)
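The RCU/WCU arithmetic above can be sketched in a few lines of Python. This is a simplified model of the rounding rule (each operation rounds up to whole 4 KB read units or 1 KB write units), not DynamoDB's exact billing engine:

```python
import math

def read_capacity_units(item_size_bytes, strongly_consistent=True):
    # 1 RCU = one strongly consistent read of up to 4 KB per second;
    # eventually consistent reads cost half as much.
    units = math.ceil(item_size_bytes / 4096)
    return units if strongly_consistent else units / 2

def write_capacity_units(item_size_bytes):
    # 1 WCU = one write of up to 1 KB per second.
    return math.ceil(item_size_bytes / 1024)

print(read_capacity_units(6000))                             # 2 (6 KB rounds up to 2 x 4 KB)
print(read_capacity_units(6000, strongly_consistent=False))  # 1.0 (half price)
print(write_capacity_units(2500))                            # 3 (2.5 KB rounds up to 3 x 1 KB)
```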

5

DynamoDB- Operations: Query

  • Query: the way to retrieve data from a table

    • You need to pick a partition key. Query accepts a single PK value and optionally an SK value or range. Capacity consumed is the size of all returned items. Further filtering discards data, but capacity is still consumed!! Can ONLY query on PK, or PK and SK

      • Best to combine operations into a single operation (e.g., if you split one query for PK==1 into two separate queries, each rounds up to at least 1 RCU, totaling 2 RCU instead of 1)
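A sketch of the low-level Query request shape. The table name "Orders" and the key/attribute names are hypothetical, chosen only to illustrate the single-PK-value rule and the filter caveat:

```python
# Low-level DynamoDB Query request (would be passed to client.query(**query_request)
# with boto3). Table "Orders": PK "customer_id", SK "order_date" -- both assumed.
query_request = {
    "TableName": "Orders",
    # Query takes exactly ONE partition key value, plus an optional
    # sort-key condition (equality, range, begins_with, ...):
    "KeyConditionExpression": "customer_id = :pk AND order_date BETWEEN :lo AND :hi",
    "ExpressionAttributeValues": {
        ":pk": {"S": "cust-1"},
        ":lo": {"S": "2024-01-01"},
        ":hi": {"S": "2024-06-30"},
        ":min": {"N": "100"},
    },
    # The filter trims what is RETURNED, but capacity is still charged
    # for everything the key condition matched:
    "FilterExpression": "amount > :min",
}
```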

6

DynamoDB- Operations: Scan

  • Scan: the least efficient way to get data, but the most flexible. It moves through the table consuming capacity for every ITEM. You have control over what data is selected: any attributes can be used and filters applied, but Scan consumes capacity for every item scanned.
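The cost difference can be illustrated with a toy calculation. This is a simplified sketch of the principle (Scan pays for everything it touches), not DynamoDB's exact per-page accounting:

```python
import math

def scan_consumed_rcu(item_sizes_bytes, strongly_consistent=False):
    # Scan touches every item, so consumed capacity reflects the whole
    # table, even if a filter later discards most of the results.
    total = sum(item_sizes_bytes)
    units = math.ceil(total / 4096)
    return units if strongly_consistent else units / 2

# A 100-item table of 2 KB items costs the same whether the filter
# keeps all 100 items or just 1:
print(scan_consumed_rcu([2048] * 100))  # 25.0
```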

7

DynamoDB- Operations: Consistency model: 

  • When data is updated or new data is written to the database and then immediately read, is that data immediately the same, or only eventually the same?

    • In DynamoDB all data is replicated to separate AZs. Each replica is a “storage node”; one of them is the leader node. Writes are always directed to the leader node, which is always consistent. The leader node then starts replicating the data to the other nodes

    • Eventually consistent read: the read is directed to one of the nodes at random, which may not yet have the latest data. You pay less for the read (half the price)

    • Strongly consistent read: the read is served by the most up-to-date copy (the leader node)
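In the API this choice is a single flag. A sketch of a low-level GetItem request (table and key names hypothetical):

```python
# Low-level DynamoDB GetItem request (would be passed to
# client.get_item(**get_request) with boto3). Names are assumptions.
get_request = {
    "TableName": "Orders",
    "Key": {"customer_id": {"S": "cust-1"}, "order_date": {"S": "2024-01-01"}},
    # True  -> strongly consistent: served by the leader node, full RCU price.
    # False -> eventually consistent (the default): any node may answer,
    #          half the RCU price.
    "ConsistentRead": True,
}
```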

8

DynamoDB- operations cost issue

  • Indexes: improve the efficiency of querying data.

  • Queries are the most efficient operation in DDB, but a query can only work on 1 PK value at a time (optionally a single SK value or a range). Indexes are alternative views on the table: you can get a view using an alternative SK (LSI) or a different PK and SK (GSI). When creating either index type, you choose which attributes are projected (some/all).

9

DynamoDB- Local secondary indexes (LSI):

  • An alternative view for a table. It MUST be created with the table; it cannot be added after the table is made. You can have 5 LSIs per base table. It has the SAME PK but an alternative SK on the same data. It shares the RCU and WCU with the table. When picking projected attributes, you can choose all, keys only, or include.

    • If you want ONLY a specific attribute, that attribute can be used as the SK. 

    • Capacity shared with the table

10

DynamoDB- Global secondary index (GSI):

  • Can be created at any time after the table's creation. Default limit of 20 per base table. You can choose both an alternate PK and SK. GSIs have their own RCU and WCU allocations. You can choose which attributes are projected (same options as LSI)

    • Always eventually consistent; replication between the base table and a GSI is asynchronous

    • Own capacity allocation
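The GSI properties above show up directly in the request that creates one. A sketch of the relevant portion of an UpdateTable call (index, attribute, and capacity values are all hypothetical; a real call would also name the table):

```python
# GSI section of a DynamoDB UpdateTable request. Unlike an LSI, this can
# run against an EXISTING table. All names/values below are assumptions.
gsi_update = {
    "AttributeDefinitions": [
        {"AttributeName": "status", "AttributeType": "S"},
        {"AttributeName": "order_date", "AttributeType": "S"},
    ],
    "GlobalSecondaryIndexUpdates": [{
        "Create": {
            "IndexName": "status-date-index",
            # Alternate PK and SK, different from the base table's keys:
            "KeySchema": [
                {"AttributeName": "status", "KeyType": "HASH"},
                {"AttributeName": "order_date", "KeyType": "RANGE"},
            ],
            # Project only chosen attributes (ALL / KEYS_ONLY / INCLUDE):
            "Projection": {"ProjectionType": "INCLUDE",
                           "NonKeyAttributes": ["amount"]},
            # GSIs carry their own capacity, separate from the table:
            "ProvisionedThroughput": {"ReadCapacityUnits": 5,
                                      "WriteCapacityUnits": 5},
        }
    }],
}
```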

11

DynamoDB- LSI and GSI exam points

  • Careful with projections (all, keys only, include): you pay for the projected capacity

  • Queries on attributes not projected are expensive

  • GSI as the default over LSI (LSI is better when strong consistency is required)

  • Use indexes for alternative access patterns 

12

DynamoDB stream

  • A time-ordered list of item changes in a table, kept in a 24-hour rolling window. You need to enable it on a per-table basis. Records are inserts, updates, or deletes. Different view types influence what is in the stream

    • Streams can be configured with the following view types: 

      • Keys Only: the stream records only the PK and any applicable SK of changed items

      • New image: stores the entire item AFTER the change

      • Old image: stores the entire item PRIOR to the change

      • New and old images: full visibility, both the pre- and post-change item
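Enabling a stream is a small per-table setting. A sketch of the relevant request fragment (table name is hypothetical):

```python
# Stream section of a DynamoDB UpdateTable request; "Orders" is assumed.
stream_update = {
    "TableName": "Orders",
    "StreamSpecification": {
        "StreamEnabled": True,
        # One of: KEYS_ONLY, NEW_IMAGE, OLD_IMAGE, NEW_AND_OLD_IMAGES
        "StreamViewType": "NEW_AND_OLD_IMAGES",
    },
}
```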

13

DynamoDB Trigger

  • allows actions to take place in the event of a change in data. 

    • The event contains the data which changed. An action is taken using that data. In AWS: Streams + Lambda (trigger)

    • DynamoDB global tables: provide multi-master cross-region replication (read and write in all regions). Tables are created in multiple regions and added to the same global table (becoming replica tables). Last-writer-wins conflict resolution (the most recent write overwrites). Reads and writes can occur in any region, with sub-second replication between regions. Strongly consistent reads are only possible in the same region as writes (other regions are eventually consistent)

  • Provides global HA and Global DR/BC
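The streams + Lambda trigger pattern above boils down to a handler that receives a batch of change records. A minimal sketch, assuming the stream uses the NEW_AND_OLD_IMAGES view type:

```python
def handler(event, context):
    # Lambda trigger invoked from a DynamoDB stream. Each record describes
    # one item change; images may be absent depending on the event type.
    changes = []
    for record in event["Records"]:
        changes.append({
            "action": record["eventName"],              # INSERT / MODIFY / REMOVE
            "keys": record["dynamodb"]["Keys"],
            "new": record["dynamodb"].get("NewImage"),  # absent on REMOVE
            "old": record["dynamodb"].get("OldImage"),  # absent on INSERT
        })
    return changes
```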

14

DynamoDB accelerator (DAX): 

  • An in-memory cache for DynamoDB, integrated with DynamoDB.

    • Traditional cache vs DAX

      • Traditional: the application checks the cache first; on a miss it must go to the DB, fetch the data, and update the cache, so the retrieved data becomes a hit on later reads

      • DAX: removes that admin overhead. The app makes a single call to DAX; on a miss, DAX does all the work of retrieving the data from the DB and returning it to the application.
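The difference is who handles the miss. A pure-Python sketch of the DAX-style read-through pattern (a plain dict stands in for the DynamoDB table):

```python
class ReadThroughCache:
    """DAX-style read-through: the app only ever talks to the cache;
    on a miss the cache itself fetches from the backing table."""

    def __init__(self, table):
        self.table = table   # stand-in for DynamoDB (a plain dict here)
        self.store = {}

    def get(self, key):
        if key not in self.store:              # cache miss
            self.store[key] = self.table[key]  # the CACHE does the DB read
        return self.store[key]                 # hit, or freshly cached

table = {"item-1": {"price": 10}}
cache = ReadThroughCache(table)
cache.get("item-1")   # miss: fetched from the table, then cached
cache.get("item-1")   # hit: served from memory, no table read
```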

15

DAX is a cluster service

Nodes are placed in multiple AZs, one being the primary and the others being read replicas. The item cache holds the results of (Batch)GetItem calls.

16

DAX Exam points

  • Primary node (supports writes), replicas (read)

  • Nodes are highly available; if the primary fails, it's replaced

  • In-memory cache: reads are much faster, at lower cost

  • Can scale up and scale out (bigger or more)

  • Supports write-through (store data in cache too)

  • DAX deployed within a VPC

  • Good for workloads with heavy reads, want low response time

  • Not ideal for applications that need strong consistency

17

DynamoDB TTL

  • TTL lets you define a timestamp for automatic deletion of items. You specify a date and time; once it passes, the item is treated as expired. You configure TTL on a specific attribute.

    • A per-partition process runs periodically, comparing the current time (in seconds since epoch) to the value in the TTL attribute; items past it are marked 'expired'. A second process then runs and actually deletes the items marked expired
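The expiry check itself is a simple epoch comparison. A sketch (the attribute name `expires_at` is hypothetical; you choose it when enabling TTL):

```python
import time

def is_expired(item, ttl_attribute="expires_at", now=None):
    # The TTL attribute holds seconds since epoch. The per-partition
    # process compares it to the current time; a later sweep deletes
    # items flagged as expired. Items without the attribute never expire.
    now = time.time() if now is None else now
    return ttl_attribute in item and item[ttl_attribute] <= now

print(is_expired({"id": "a", "expires_at": 100}, now=200))  # True
print(is_expired({"id": "b"}, now=200))                     # False: no TTL attribute
```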

18

Athena

  • Serverless interactive query service. Allows you to perform ad-hoc queries on data; you pay only for the data consumed. Athena uses schema-on-read: the data stored on S3 never changes; the schema translates it into a table-like structure (relational-like) when read. Output can be sent to other AWS services.

    • You have the source data, then you define the schema (the tables): how you want to map the source data into table form.
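A sketch of what an Athena query submission looks like at the API level (database, table, and output bucket names are hypothetical; with boto3 this would go to `client.start_query_execution(**athena_request)`):

```python
# Athena StartQueryExecution request shape. The data stays in S3; the
# schema is applied on read. All names below are assumptions.
athena_request = {
    "QueryString": (
        "SELECT action, COUNT(*) AS hits "
        "FROM vpc_flow_logs GROUP BY action"
    ),
    "QueryExecutionContext": {"Database": "logs_db"},
    # Athena writes result files to an S3 location you own:
    "ResultConfiguration": {"OutputLocation": "s3://example-athena-results/"},
}
```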

19

Athena key points

  • Athena has no infrastructure and no need to load data in advance. Best if you don't want to load/transform data.

  • Best for occasional queries on data in S3

  • Great if cost-conscious, and for serverless querying scenarios

  • Best for querying AWS logs (VPC Flow Logs, CloudTrail, ELB logs, cost reports, etc.)

  • Can also query data from the AWS Glue Data Catalog and web server logs

  • Feature: Athena Federated Query: a data source connector is code that translates between Athena and a data source that isn't S3

20

ElastiCache

An in-memory database for apps that require high performance. ElastiCache delivers managed Redis or Memcached as a service. Can be used to cache data for read-heavy workloads with low-latency requirements, reducing database load (databases are expensive). Can also be used to store session data (keeping servers stateless). Using ElastiCache means you need to make changes to application code! The app must know to check and write to the cache (NOT FREE).
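The "changes to application code" the card warns about are the cache-aside pattern: the app itself checks the cache, falls back to the database, and populates the cache on a miss. A pure-Python sketch with dicts standing in for ElastiCache and the database:

```python
def get_user(user_id, cache, db):
    # Cache-aside: the APPLICATION owns the cache logic -- this is the
    # code change ElastiCache requires.
    if user_id in cache:
        return cache[user_id]   # hit: the database is never touched
    record = db[user_id]        # miss: expensive database read
    cache[user_id] = record     # write back so the next read is a hit
    return record

cache, db = {}, {"u1": {"name": "Ada"}}
get_user("u1", cache, db)   # miss: reads db, fills cache
get_user("u1", cache, db)   # hit: served from cache
```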

21

ElastiCache- session state data

When a user connects to an instance, the session state is written to ElastiCache. ElastiCache keeps the session state up to date, so if the connection moves to another instance, the session data is maintained (the instances stay stateless!!)

22

ElastiCache- Engines

  • both offer sub-millisecond access to data, both support many programming languages

    • Memcached: 

      • simple data structures

      • No replication

      • Multiple nodes (sharding)

      • No backups

      • Multi threaded (better performance)

    • Redis: 

      • Advanced structures (e.g., can store ordered data)

      • Multi AZ

      • Replication (scale reads)

      • Backup and restore 

      • Transactions (all operations succeed, or none do)

23

Amazon Redshift architecture:

  • Petabyte-scale data warehouse (you can pump data from databases across your business into it for analysis). It's OLAP (column-based), not OLTP (row/transaction-based). Redshift is pay-as-you-go, similar to RDS.

    • Can be used to query S3 using Redshift Spectrum

    • Can directly query other DBs using federated query

    • Integrates with AWS tooling such as QuickSight

    • SQL-like interface; JDBC/ODBC connections

24

Amazon Redshift architecture

  • Server based, not serverless (unlike Athena) 

  • NOT used ad hoc like Athena, since it needs provisioning

  • A Redshift cluster runs its nodes privately, in one AZ. There is a leader node that handles query input, planning, and aggregation; compute nodes perform the querying of data.

  • Since it is a VPC service, you manage it as such: VPC security, IAM permissions, KMS at-rest encryption, CloudWatch monitoring

  • Feature: Redshift enhanced VPC routing (VPC networking)

    • By default traffic uses public routing, but enhanced VPC routing lets you configure specific VPC routing: customizable networking

25

Redshift resilience and recovery

  • We know Redshift runs in one AZ, but there are some recovery features:

    • Backups: can be automatic incremental snapshots roughly every 8 hours (changed data is written to S3; retention 1-35 days). You can also take manual backups (you manage deletion). Snapshots are stored in S3 (resilient across AZs) and can also be used to restore the data in another region if needed

