Ch. 3 Streaming Data Collection

29 Terms

1
What are the different ways you can upload data into S3?
AWS Management Console, S3 API (directly or through an AWS SDK), AWS CLI
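As a rough sketch of the API route, the snippet below wraps the `upload_file` call that boto3's S3 client provides. The `upload_to_s3` helper, the bucket name, and the key are hypothetical, and the client is passed in so the function can be tried without AWS credentials.

```python
def upload_to_s3(s3_client, local_path, bucket, key):
    """Upload one local file to S3 and return its s3:// URI.

    `s3_client` is expected to behave like boto3.client("s3").
    The helper name and its arguments are illustrative assumptions.
    """
    # boto3's S3 client exposes upload_file(Filename, Bucket, Key).
    s3_client.upload_file(local_path, bucket, key)
    return f"s3://{bucket}/{key}"


class FakeS3Client:
    """Stand-in that records calls instead of contacting AWS."""

    def __init__(self):
        self.calls = []

    def upload_file(self, Filename, Bucket, Key):
        self.calls.append((Filename, Bucket, Key))
```

With real credentials, the same helper could be called as `upload_to_s3(boto3.client("s3"), "data.csv", "my-example-bucket", "raw/data.csv")`. The CLI route for the same upload is `aws s3 cp data.csv s3://my-example-bucket/raw/data.csv`.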
2
Kinesis Streams
collects data from data producers & stores the data in shards
3
Shards
container that holds an ordered sequence of streaming data records; a stream is made up of one or more shards
4
Shard composition train analogy
train id = partition key, train car = sequence number, passengers = data blob
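The three parts of a record, and how the partition key decides which shard a record lands in, can be sketched as follows. Kinesis hashes the partition key with MD5 into a 128-bit number and matches it against each shard's hash key range. The `Record` class, the `shard_for_key` helper, and the assumption that the hash space is split evenly across shards are illustrative only.

```python
import hashlib
from dataclasses import dataclass

HASH_SPACE = 2 ** 128  # MD5 produces a 128-bit value


@dataclass
class Record:
    partition_key: str    # the "train id"
    sequence_number: int  # the "train car"
    data: bytes           # the "passengers" (data blob)


def shard_for_key(partition_key: str, shard_count: int) -> int:
    """Return the index of the shard that would own this partition key.

    Assumes the hash space is divided evenly across `shard_count` shards;
    real streams can have uneven ranges after splits and merges.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest, "big")
    range_size = HASH_SPACE // shard_count
    # Clamp so the remainder at the top of the hash space maps to the last shard.
    return min(hash_value // range_size, shard_count - 1)
```

Records that share a partition key always map to the same shard, which is why ordering is preserved per key rather than across the whole stream.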
5
Retention period of data in shards?
24 hrs by default, extendable up to 365 days (older study material cites a 7-day maximum)
6
Where does the data in Kinesis Streams shards go after collection?
to data consumers, which read the records from the shards
7
Data consumers?
data processing tools like EC2, Lambda, Kinesis Data Analytics, EMR w/ Apache Spark
8
Kinesis Producer Library (KPL)
producer-side library for writing to a Kinesis Stream; it handles batching, aggregation & retries for you
9
Kinesis Client Library (KCL)
consumer-side library for apps that consume & process data from a Kinesis Stream; it can de-aggregate records written by the KPL
10
Kinesis APIs
serve the same purpose as KPL & KCL, but through lower-level API calls (via the AWS SDK) that you configure & manage manually
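To make "lower level" concrete, here is a minimal sketch of reading one shard with the `get_shard_iterator` and `get_records` calls that boto3's Kinesis client provides. The `read_shard` helper, the `max_batches` limit, and the fake client are assumptions for illustration. A KCL application would also take care of checkpointing and of coordinating multiple shards, which this sketch leaves out.

```python
def read_shard(kinesis_client, stream_name, shard_id, max_batches=10):
    """Read records from one shard, oldest first, using low-level API calls."""
    # TRIM_HORIZON starts at the oldest record still retained in the shard.
    iterator = kinesis_client.get_shard_iterator(
        StreamName=stream_name,
        ShardId=shard_id,
        ShardIteratorType="TRIM_HORIZON",
    )["ShardIterator"]

    records = []
    for _ in range(max_batches):
        response = kinesis_client.get_records(ShardIterator=iterator, Limit=100)
        records.extend(response["Records"])
        iterator = response.get("NextShardIterator")
        # Stop when the shard is closed or the reader has caught up.
        if iterator is None or response.get("MillisBehindLatest", 0) == 0:
            break
    return records


class FakeKinesisReader:
    """Stand-in client that serves pre-built batches instead of calling AWS."""

    def __init__(self, batches):
        self._batches = batches

    def get_shard_iterator(self, StreamName, ShardId, ShardIteratorType):
        return {"ShardIterator": "0"}

    def get_records(self, ShardIterator, Limit):
        index = int(ShardIterator)
        is_last = index >= len(self._batches) - 1
        return {
            "Records": self._batches[index],
            "NextShardIterator": str(index + 1),
            "MillisBehindLatest": 0 if is_last else 1000,
        }
```

Everything in the loop (iterators, paging, deciding when to stop) is the manual work that the higher-level libraries hide.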
11
What is the difference in latency of KPL vs Kinesis API?
KPL can add latency compared to the Kinesis API because it buffers & batches records behind the scenes (in exchange for higher throughput)
12
What is the language difference of KPL vs Kinesis API?
KPL is a Java library, while the Kinesis API can be used from any language that has an AWS SDK
13
What is the difference in automation of KPL vs Kinesis API?
KPL has some automated API calls while Kinesis API is all manual
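The kind of work the KPL automates can be sketched by hand on top of the `put_records` API call, which accepts at most 500 records per request. The `BatchingProducer` class and the fake client below are hypothetical, and this is not how the KPL itself is implemented. A real producer would also check `FailedRecordCount` in the response and retry the failed records.

```python
class BatchingProducer:
    """Buffer records and send them in batches with put_records."""

    MAX_BATCH = 500  # PutRecords accepts at most 500 records per call

    def __init__(self, kinesis_client, stream_name, batch_size=MAX_BATCH):
        self._client = kinesis_client
        self._stream = stream_name
        self._batch_size = min(batch_size, self.MAX_BATCH)
        self._buffer = []

    def add(self, partition_key, data):
        """Queue one record; flush automatically when the batch is full."""
        self._buffer.append({"PartitionKey": partition_key, "Data": data})
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self):
        """Send whatever is buffered in a single put_records call."""
        if not self._buffer:
            return
        self._client.put_records(StreamName=self._stream, Records=self._buffer)
        self._buffer = []


class FakeKinesisWriter:
    """Stand-in client that records batches instead of calling AWS."""

    def __init__(self):
        self.batches = []

    def put_records(self, StreamName, Records):
        self.batches.append(list(Records))
        return {"FailedRecordCount": 0}
```

Buffering is also where the extra latency comes from: a record waits in the buffer until the batch fills or `flush()` is called.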
14
When should you use Kinesis API?
when you need your data available in the stream within milliseconds, without the KPL's buffering delay
15
When should you use Kinesis Data Streams consumer wise?
when the data needs to be processed directly by the consumers, instead of being delivered to a data store
16
When should you use Kinesis Data Streams storage wise?
when storing the data in a final destination is optional, or when retaining data in the stream is important
17
Kinesis Data Streams Real World Uses
process & evaluate logs immediately, real time data analytics
18
Kinesis Firehose
easily stream data to a final data store
19
When should you use Kinesis Firehose collection wise?
when you want to collect streaming data & send to final destination like S3
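A minimal sketch of the producer side, assuming a delivery stream that already exists and is configured to deliver to S3. It uses the `put_record` call from boto3's Firehose client; the `send_event` helper, the stream name, and the fake client are illustrative assumptions.

```python
import json


def send_event(firehose_client, delivery_stream, event):
    """Send one event to a Firehose delivery stream as a JSON line."""
    # A trailing newline keeps records separable after Firehose
    # concatenates them into objects at the destination.
    payload = (json.dumps(event) + "\n").encode("utf-8")
    firehose_client.put_record(
        DeliveryStreamName=delivery_stream,
        Record={"Data": payload},
    )
    return payload


class FakeFirehoseClient:
    """Stand-in client that records payloads instead of calling AWS."""

    def __init__(self):
        self.sent = []

    def put_record(self, DeliveryStreamName, Record):
        self.sent.append((DeliveryStreamName, Record["Data"]))
```

There is no consumer code to write in this setup: delivery to the destination is handled by the service, which is the main contrast with Kinesis Data Streams.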
20
When should you use Kinesis Firehose processing wise?
when processing the data is optional or retaining data in the stream is not important
21
Kinesis Firehose Real World Uses
stream & store data from devices, create ETL jobs on streaming data
22
Kinesis Video Stream
stream videos into AWS
23
What can you do with Kinesis Video Stream?
build real time video processing apps, stream images & audio in real time
24
When should you use Kinesis Video Stream processing wise?
when processing real time streaming video, audio, image, radar
25
When should you use Kinesis Video Stream batch wise?
when you need to batch process & store streaming video, or feed streaming data into other AWS services
26
Kinesis Data Analytics
continuously reads & processes streaming data in real time, using SQL queries to transform incoming data & produce output data
27
When should you use Kinesis Data Analytics SQL wise?
when you want to run SQL queries on streaming data & output the results to a destination such as S3 (typically through Kinesis Firehose)
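The idea of running a query over a stream can be illustrated with a tumbling-window count, which is roughly what a windowed `GROUP BY` computes. The sketch below is plain Python rather than Kinesis Data Analytics SQL, and the `tumbling_window_counts` helper and its event format are assumptions for illustration.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_seconds):
    """Count events per (window_start, key).

    `events` is an iterable of (timestamp_seconds, key) pairs.
    Windows are fixed-size and non-overlapping (tumbling).
    """
    counts = defaultdict(int)
    for timestamp, key in events:
        # Round the timestamp down to the start of its window.
        window_start = (timestamp // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)
```

For example, with 60-second windows, events at seconds 0 and 30 fall in the window starting at 0, while an event at second 60 opens the next window.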
28
When should you use Kinesis Data Analytics creation-wise?
create metrics, dashboards, monitoring, notifications, alarms, or apps that provide data insight
29
Kinesis Data Analytics Real World Uses
real time analytics, stream ETL jobs