Chapter 9
Q: What are the primary source types for data acquisition in big data systems?
A: Sources that publish data in batches, in micro-batches, or as real-time streams.
Q: How is velocity defined in the context of data acquisition?
A: The speed at which data is generated and how frequently it is produced.
Q: What characterizes high velocity data?
A: Real-time or streaming data generated and processed continuously.
Q: What are the two main ingestion mechanisms for data?
A: Push and pull mechanisms: push is driven by the data producer, which sends data to the ingestion system, while pull is driven by the data consumer, which requests data from the source.
Q: What is Kafka?
A: A high-throughput distributed messaging system used for building real-time data pipelines and streaming applications.
Q: In Kafka, what is a broker?
A: A server that manages topics and handles the persistence, partitioning, and replication of data.
Q: What is a topic in Kafka?
A: A stream of messages of a particular type, similar to tables in databases.
Q: How does Kafka store messages?
A: On disk using partitioned commit logs.
Q: What roles do producers and consumers play in Kafka?
A: Producers publish messages to topics, while consumers subscribe to topics and process messages.
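Example: a minimal producer/consumer sketch using the third-party kafka-python client (pip install kafka-python). The broker address and topic name are illustrative assumptions, not values from this chapter.

```python
from kafka import KafkaProducer, KafkaConsumer

# Producer: publish one message to the "sensor-readings" topic.
producer = KafkaProducer(bootstrap_servers="localhost:9092")
producer.send("sensor-readings", b'{"device": "d1", "temp": 22.5}')
producer.flush()  # block until buffered messages are actually sent

# Consumer: subscribe to the topic and process messages as they arrive.
consumer = KafkaConsumer(
    "sensor-readings",
    bootstrap_servers="localhost:9092",
    auto_offset_reset="earliest",  # start from the oldest retained message
)
for record in consumer:
    print(record.partition, record.offset, record.value)
```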
Q: What is a partition in Kafka?
A: A division of a Kafka topic that allows messages to be consumed in parallel, maintaining an ordered and immutable sequence of messages.
Q: How does Kafka achieve parallel consumption of messages?
A: By dividing topics into multiple partitions and allowing multiple consumers to read from different partitions simultaneously.
Q: What is the role of the leader in Kafka?
A: For each partition, one server acts as the leader and handles all read and write operations for that partition.
Q: What are replicas in Kafka?
A: Followers that replicate the data of the leader to ensure fault tolerance.
Q: Describe Kafka's publish-subscribe messaging framework.
A: Producers publish messages to topics, and consumers subscribe to those topics to receive messages.
Q: What is an offset in Kafka?
A: A unique sequence ID assigned to each message within a partition.
Q: What is a consumer group in Kafka?
A: A group of consumers that work together to consume messages from one or more topics, ensuring each message is processed by only one consumer in the group.
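Example: a consumer-group sketch with kafka-python. Every consumer started with the same group_id shares the topic's partitions, so each message is handled by exactly one group member; all names are illustrative.

```python
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "sensor-readings",
    group_id="analytics-workers",       # same group_id = share the partitions
    bootstrap_servers="localhost:9092",
    enable_auto_commit=True,            # periodically commit offsets to Kafka
)
# Run a second copy of this script: Kafka automatically rebalances the
# topic's partitions between the two group members.
for record in consumer:
    print("processing", record.value)
```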
Q: How does Kafka handle log storage?
A: Messages are stored in append-only, ordered, and immutable logs.
Q: What is log compaction in Kafka?
A: A process that cleans out obsolete records by retaining only the latest message for each key in a topic partition's log.
Q: What are log segments in Kafka?
A: Portions of a topic partition's log; each partition is stored as a directory of segment files.
Q: What determines when log segments are deleted in Kafka?
A: Reaching size or time limits as defined by the delete policy.
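Example: a sketch of setting these cleanup policies per topic with kafka-python's admin client. cleanup.policy, segment.bytes, retention.ms, and retention.bytes are standard Kafka topic configs; the topic name and values are illustrative.

```python
from kafka.admin import KafkaAdminClient, NewTopic

admin = KafkaAdminClient(bootstrap_servers="localhost:9092")
admin.create_topics([
    NewTopic(
        name="device-state",
        num_partitions=3,
        replication_factor=1,
        topic_configs={
            "cleanup.policy": "compact",    # keep only the latest message per key
            "segment.bytes": "1073741824",  # roll a new segment file at 1 GiB
            # with "cleanup.policy": "delete", retention.ms and retention.bytes
            # set the time and size limits after which segments are deleted
        },
    )
])
```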
Q: What is Amazon Kinesis?
A: A managed service for ingesting, processing, and analyzing real-time streaming data on AWS.
Q: What are Kinesis Data Streams?
A: A service for ingesting and processing streaming data in real time.
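Example: writing a single record to a data stream with boto3 (pip install boto3; AWS credentials are assumed to be configured). The stream name is illustrative.

```python
import json
import boto3

kinesis = boto3.client("kinesis", region_name="us-east-1")
kinesis.put_record(
    StreamName="clickstream",
    Data=json.dumps({"user": "u42", "action": "click"}).encode(),
    PartitionKey="u42",  # records with the same key land on the same shard
)
```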
Q: What are Firehose Delivery Streams in Kinesis?
A: Services that collect, transform, and load (ETL) streaming data into destinations such as S3, Redshift, and Splunk.
Q: What does Kinesis Analytics do?
A: Runs continuous SQL queries on streaming data from Kinesis Data Streams and Firehose Delivery Streams.
Q: What are Kinesis Video Streams used for?
A: Streaming live video from devices to the AWS cloud for real-time video processing and batch-oriented analytics.
Q: What is AWS IoT?
A: A service for collecting and managing data from Internet of Things (IoT) devices.
Q: What is the Device Gateway in AWS IoT?
A: Enables IoT devices to securely communicate with AWS IoT.
Q: What is the Device Registry in AWS IoT?
A: Maintains resources and information associated with each IoT device.
Q: What is a Device Shadow in AWS IoT?
A: Maintains the state of a device as a JSON document, allowing applications to interact with devices even when they are offline.
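Example: setting a shadow's desired state with boto3's iot-data client. The thing name and state fields are illustrative; credentials and an AWS IoT endpoint are assumed to be configured.

```python
import json
import boto3

iot = boto3.client("iot-data", region_name="us-east-1")
iot.update_thing_shadow(
    thingName="thermostat-01",
    payload=json.dumps({"state": {"desired": {"setpoint": 21}}}).encode(),
)
# When the offline device reconnects, it reads the desired state from its
# shadow, applies it, and reports its actual state back.
```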
Q: What does the Rules Engine in AWS IoT do?
A: Applies user-defined rules to incoming device messages and routes matching messages to other AWS services for processing.
Q: What is Apache Flume?
A: A distributed system for collecting, aggregating, and moving large amounts of data from various sources to a centralized data store.
Q: What is a checkpoint file in Flume?
A: A file that records the last committed transactions, acting as a snapshot of the file channel's state so that data is not lost if the agent restarts.
Q: What are the main components of Flume Architecture?
A: Source, Channel, Sink, and Agent.
Q: What is a Source in Flume?
A: The component that receives or polls data from external sources.
Q: What is a Channel in Flume?
A: A buffer that holds events between the source and the sink until the sink consumes them.
Q: What is a Sink in Flume?
A: Drains data from the channel to the final data store.
Q: What is an Agent in Flume?
A: A collection of sources, channels, and sinks that moves data from external sources to destinations.
Q: What is an Event in Flume?
A: A unit of data flow, consisting of a payload and optional attributes.
Q: Name the types of Flume Channels.
A: Memory channel, File channel, JDBC channel, and Spillable Memory channel.
Q: What is a Memory Channel in Flume?
A: Stores events in an in-memory queue, offering high throughput at the cost of losing events if the agent process fails.
Q: What is a File Channel in Flume?
A: Stores events in files on the local filesystem for durability.
Q: What is a JDBC Channel in Flume?
A: Stores events in an embedded Derby database for durable storage.
Q: What is a Spillable Memory Channel in Flume?
A: Stores events in an in-memory queue and spills to disk when the queue is full.
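Example: a Flume agent is wired together in a properties file. A minimal sketch with a netcat source, a durable file channel, and a logger sink; all component names are illustrative.

```properties
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Source: listen for newline-separated events on a TCP port.
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
a1.sources.r1.channels = c1

# Channel: buffer events durably on the local filesystem.
a1.channels.c1.type = file

# Sink: drain events from the channel to the agent's log.
a1.sinks.k1.type = logger
a1.sinks.k1.channel = c1
```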
Q: What is Apache Sqoop?
A: A tool for importing data from relational databases into Hadoop Distributed File System (HDFS), Hive, or HBase, and exporting data back to RDBMS.
Q: How does Sqoop import data?
A: By launching multiple map tasks to transfer data as delimited text files, binary Avro files, or Hadoop sequence files.
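Example: a representative import command; the connection details and table name are illustrative. Four parallel map tasks copy the table into HDFS as Avro files.

```sh
sqoop import \
  --connect jdbc:mysql://dbhost:3306/shop \
  --username etl_user -P \
  --table orders \
  --num-mappers 4 \
  --as-avrodatafile \
  --target-dir /data/orders
```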
Q: What are Hadoop Sequence Files?
A: A binary file format specific to Hadoop for storing sequences of key-value pairs.
Q: What is Apache Avro?
A: A serialization framework that provides rich data structures, a compact binary data format, and container files for data storage and processing.
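Example: writing and reading an Avro container file with the third-party fastavro library (pip install fastavro); the schema and record are illustrative.

```python
from fastavro import writer, reader, parse_schema

schema = parse_schema({
    "type": "record",
    "name": "Reading",
    "fields": [
        {"name": "device", "type": "string"},
        {"name": "temp", "type": "float"},
    ],
})

# Write records to a container file; the schema travels with the data.
with open("readings.avro", "wb") as out:
    writer(out, schema, [{"device": "d1", "temp": 22.5}])

# Read them back; no external schema is needed.
with open("readings.avro", "rb") as inp:
    for record in reader(inp):
        print(record)
```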
Q: What is RabbitMQ?
A: A message broker that implements the Advanced Message Queuing Protocol (AMQP) for exchanging messages between systems.
Q: What is the Advanced Message Queuing Protocol (AMQP)?
A: A protocol that defines the exchange of messages between systems, specifying roles like producers, consumers, and brokers.
Q: In RabbitMQ, what are producers and consumers?
A: Producers publish messages to exchanges, and consumers receive messages from queues based on bindings and routing rules.
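Example: a sketch using the pika client (pip install pika). The producer publishes to an exchange, and a queue bound with a matching routing key receives the message; all names are illustrative.

```python
import pika

conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
ch = conn.channel()

# Declare an exchange and a queue, and bind them with a routing key.
ch.exchange_declare(exchange="events", exchange_type="direct")
ch.queue_declare(queue="orders")
ch.queue_bind(queue="orders", exchange="events", routing_key="order.created")

# Producer side: publish to the exchange, never directly to the queue.
ch.basic_publish(exchange="events", routing_key="order.created",
                 body=b'{"order_id": 7}')

# Consumer side: fetch one message from the bound queue.
method, properties, body = ch.basic_get(queue="orders", auto_ack=True)
print(body)
conn.close()
```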
Q: What is ZeroMQ?
A: A high-performance messaging library that provides tools to build custom messaging systems without requiring a message broker.
Q: What messaging patterns does ZeroMQ support?
A: Request-Reply, Publish-Subscribe, Push-Pull, and Exclusive Pair.
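Example: the Publish-Subscribe pattern with pyzmq (pip install pyzmq). There is no broker: the publisher binds a socket and subscribers connect to it directly; the port and topic prefix are illustrative.

```python
import time
import zmq

ctx = zmq.Context()

pub = ctx.socket(zmq.PUB)
pub.bind("tcp://*:5556")

sub = ctx.socket(zmq.SUB)
sub.connect("tcp://localhost:5556")
sub.setsockopt(zmq.SUBSCRIBE, b"weather")  # prefix filter on message bytes

time.sleep(0.5)  # let the subscription propagate (ZeroMQ's "slow joiner")
pub.send(b"weather 18.5C")  # pub and sub would normally be separate processes
print(sub.recv())
```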
Q: What is RestMQ?
A: A message queue based on a simple JSON-based protocol using HTTP as the transport, organized as REST resources.
Q: How do producers interact with RestMQ?
A: By making HTTP POST requests with data payloads to publish messages to queues.
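Example: publishing with the requests library. The host, port, and /q/<queue> path are illustrative assumptions about a local RestMQ deployment, not values from this chapter.

```python
import requests

resp = requests.post(
    "http://localhost:8888/q/orders",   # hypothetical queue resource
    data={"value": '{"order_id": 7}'},  # message payload as form data
)
print(resp.status_code)
```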
Q: What is Amazon SQS?
A: A scalable and reliable hosted queue service that stores messages for distributed applications.
Q: What are the two types of queues in Amazon SQS?
A: Standard queues and FIFO (First-In-First-Out) queues.
Q: What are the characteristics of Standard Queues in Amazon SQS?
A:
Guarantees message delivery but only best-effort ordering; messages may arrive out of order.
Supports a nearly unlimited number of transactions per second.
Operates on an at-least-once delivery model, so duplicate messages are occasionally delivered.
Q: What are the characteristics of FIFO Queues in Amazon SQS?
A:
Ensures messages are received in the exact order they were sent.
Supports up to 3,000 messages per second with batching or 300 messages per second without batching.
Provides exactly-once processing.
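Example: sending to and receiving from a FIFO queue with boto3; the queue URL is illustrative. MessageGroupId sets the ordering scope, and MessageDeduplicationId is what enables exactly-once processing.

```python
import boto3

sqs = boto3.client("sqs", region_name="us-east-1")
queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/orders.fifo"

sqs.send_message(
    QueueUrl=queue_url,
    MessageBody='{"order_id": 7}',
    MessageGroupId="store-42",         # messages in a group arrive in order
    MessageDeduplicationId="order-7",  # duplicates suppressed for 5 minutes
)

resp = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=1)
for msg in resp.get("Messages", []):
    print(msg["Body"])
    # A message must be deleted explicitly, or it reappears after the
    # visibility timeout expires.
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])
```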
Q: What are Connectors in the context of messaging systems?
A: Interfaces that allow data to be published to and consumed from messaging queues, often exposing REST web services or other protocols.
Q: How does a REST-based Connector work?
A: Producers publish data using HTTP POST requests with data payloads, and the connector processes the requests and stores the data to the sink.
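Example: a minimal sketch of such a connector using Flask (pip install flask). The endpoint path is an assumption, and the in-memory list stands in for whatever sink a real connector would write to.

```python
from flask import Flask, request

app = Flask(__name__)
messages = []  # stand-in sink; a real connector writes to a queue or store

@app.route("/messages", methods=["POST"])
def publish():
    messages.append(request.get_json())  # parse the POSTed JSON payload
    return {"status": "queued"}, 202

if __name__ == "__main__":
    app.run(port=8080)
```

A producer would then publish with an HTTP POST, e.g. requests.post("http://localhost:8080/messages", json={"temp": 22.5}).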
Q: What is a WebSocket-based Connector?
A: A connector that uses full-duplex communication, allowing continuous data exchange without setting up new connections for each message.
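Example: a WebSocket producer sketch using the third-party websockets library (pip install websockets). One connection is opened and reused for every message, which is what distinguishes WebSockets from per-request HTTP; the URI is illustrative, and a connector listening there that echoes an acknowledgment is assumed.

```python
import asyncio
import websockets

async def produce():
    # Open one full-duplex connection and reuse it for all messages.
    async with websockets.connect("ws://localhost:8765/ingest") as ws:
        for i in range(3):
            await ws.send(f'{{"seq": {i}}}')  # send over the same connection
            print(await ws.recv())            # read the connector's reply

asyncio.run(produce())
```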
Q: What is an MQTT-based Connector?
A: A connector built on MQTT, a lightweight publish-subscribe messaging protocol designed for constrained devices, which makes it well suited to IoT applications.
Q: In MQTT-based systems, what are the main entities?
A: Publisher, Broker/Server, and Subscriber.
Q: What is the role of a Publisher in MQTT?
A: Publishes data to topics managed by the broker.
Q: What does the Broker/Server do in MQTT?
A: Manages topics and forwards published messages to the clients subscribed to those topics.
Q: What is the role of a Subscriber in MQTT?
A: Receives data from topics to which it has subscribed.
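Example: an MQTT publisher and subscriber sketch using paho-mqtt 1.x (pip install "paho-mqtt<2"). The broker address and topic are illustrative, and a broker such as Mosquitto is assumed to be running locally.

```python
import time
import paho.mqtt.client as mqtt
import paho.mqtt.publish as publish

def on_message(client, userdata, msg):
    print(msg.topic, msg.payload)

# Subscriber: receives data from the topic it subscribes to.
sub = mqtt.Client()
sub.on_message = on_message
sub.connect("localhost", 1883)
sub.subscribe("sensors/temperature", qos=1)
sub.loop_start()  # handle network traffic on a background thread

# Publisher: the one-shot helper connects, publishes, and disconnects.
publish.single("sensors/temperature", payload="22.5", qos=1,
               hostname="localhost")

time.sleep(1)  # give the broker time to forward the message
sub.loop_stop()
```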