ADB Lecture 7

11 Terms

1

What is Big Data?

It refers to datasets that are too large or complex for traditional databases to handle efficiently.

2

What are the Key Characteristics of Big Data? (5 Vs)

  • Volume → Huge size (TBs–PBs)

  • Velocity → Fast data generation (real-time streams)

  • Variety → Structured + unstructured (text, video, logs)

  • Veracity → Data quality and trustworthiness (uncertain, noisy, or inconsistent data)

  • Value → Extracting useful insights

3

What is an example of a key-value database?

Redis (key → value)

4

What is an example of a document database?

MongoDB (Stores JSON-like documents)

5

What is an example of a column-family database?

Apache Cassandra (stores data in columns grouped into column families)

6

What is an example of a graph database?

Neo4j (nodes represent entities, edges represent relationships)
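
The four NoSQL data models in cards 3 to 6 can be compared with plain Python structures. This is only a rough sketch of the shape of the data, not how Redis, MongoDB, Cassandra, or Neo4j store it internally. All keys, names, and values are made up for illustration, and the helper neighbours() is hypothetical.

```python
# Key-value (Redis-style): an opaque value looked up by a single key.
key_value = {"session:42": "alice"}

# Document (MongoDB-style): a JSON-like record that can nest lists and objects.
document = {
    "_id": 1,
    "name": "Alice",
    "tags": ["admin", "dev"],
    "address": {"city": "Cairo"},
}

# Column family (Cassandra-style, simplified):
# row key -> column family -> individual columns.
column_family = {
    "user:1": {
        "profile": {"name": "Alice"},
        "activity": {"last_login": "2026-01-01"},
    }
}

# Graph (Neo4j-style): nodes are entities, edges are labelled relationships.
nodes = {"alice": {"label": "Person"}, "bob": {"label": "Person"}}
edges = [("alice", "FOLLOWS", "bob")]


def neighbours(node, relation):
    """Return every node reachable from `node` over edges labelled `relation`."""
    return [dst for src, rel, dst in edges if src == node and rel == relation]


print(key_value["session:42"])              # alice
print(document["address"]["city"])          # Cairo
print(column_family["user:1"]["profile"])   # {'name': 'Alice'}
print(neighbours("alice", "FOLLOWS"))       # ['bob']
```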

7

What are the Limitations of Traditional RDBMS?

Mainly vertical scaling (scaling up is expensive; scaling out across machines is hard)

Rigid schema (not flexible for changing data)

Poor performance with unstructured data

Join-heavy queries become slow

Not designed for distributed environments

8

CAP Theorem: a distributed system can guarantee only 2 of these 3 properties at the same time:

Consistency (C) → All nodes see the same data

Availability (A) → Every request gets a response

Partition Tolerance (P) → System keeps working despite network failures between nodes

9

What are the key distributed database concepts?

Sharding → Split data across nodes

Replication → Copy data for fault tolerance

Fault Tolerance → System continues if nodes fail

Consistency Models

  • Strong consistency

  • Eventual consistency
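
Sharding, replication, and fault tolerance from the card above can be sketched together in a few lines. This is a minimal illustration under stated assumptions: the node names, the replication factor of 2, and the "hash the key, then take the next node as a replica" placement rule are all hypothetical choices, not the scheme of any particular database.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c"]  # hypothetical cluster
REPLICATION_FACTOR = 2                  # assumed: each key is stored on 2 nodes


def shard_index(key):
    """Sharding: hash the key to pick the node that owns it."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % len(NODES)


def replicas_for(key):
    """Replication: the owning node plus the next node(s) in the list."""
    start = shard_index(key)
    return [NODES[(start + i) % len(NODES)] for i in range(REPLICATION_FACTOR)]


def readable_replicas(key, failed):
    """Fault tolerance: replicas of `key` that are still up."""
    return [node for node in replicas_for(key) if node not in failed]


owners = replicas_for("user:42")
print(owners)

# Even if the primary owner fails, one copy is still readable.
print(readable_replicas("user:42", failed={owners[0]}))
```

With a replication factor of 2, any single node failure still leaves one copy of each key available. How quickly the copies agree after a write is what the strong and eventual consistency models describe.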

10

The MapReduce model (a distributed processing model) has two phases:

Map: converts input data into key-value pairs

Reduce: combines all values that share the same key (e.g., sums the counts for each word)
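
The two phases can be sketched as a single-machine word count. This is a toy illustration, not Hadoop's API: the function names map_phase, shuffle, reduce_phase, and word_count are hypothetical, and the shuffle step stands in for the grouping by key that a real framework performs between Map and Reduce.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, as the framework does between the two phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


counts = word_count(["big data is big", "data is everywhere"])
print(counts)  # {'big': 2, 'data': 2, 'is': 2, 'everywhere': 1}
```

In a real cluster, each map_phase call could run on a different node against a different block of the input, and each key's reduce_phase could run on a different node as well.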

11

What are the core components of the Hadoop ecosystem?

HDFS → Distributed storage

MapReduce → Processing engine

YARN → Resource management