NoSQL Databases & Distributed Systems – Exam Review

Question-and-Answer flashcards covering NoSQL motivations, CAP theorem, security, sharding, replication, scaling, and implementation best practices.

46 Terms

1
What are the five primary reasons modern applications adopt NoSQL databases?

Scalability, flexible schema, quicker/cheaper setup, better performance, and better availability.

2
Why are NoSQL and relational databases considered complementary rather than mutually exclusive?

Because each excels in different use-cases; combining them lets architects handle diverse workloads more effectively.

3
What database property is simplified by aggregates in document stores?

Data retrieval: related data is stored and fetched as a single unit, so a query can read the whole aggregate without joins.

4
Define an aggregate in the context of NoSQL databases.

A collection of related data that can be treated and queried as a single unit (e.g., a document).
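
As a rough illustration of the definition above, the sketch below models one aggregate as a nested Python dictionary. The order document, its field names, and the `order_total` helper are hypothetical and exist only for this example.

```python
# Hypothetical order aggregate: customer, line items, and shipping details
# live together in one document instead of being spread over several tables.
order = {
    "_id": "order-1001",
    "customer": {"name": "Ada", "email": "ada@example.com"},
    "items": [
        {"sku": "A-1", "qty": 2, "price": 9.50},
        {"sku": "B-7", "qty": 1, "price": 20.00},
    ],
    "shipping": {"city": "Perth", "method": "express"},
}


def order_total(doc):
    """Compute the order total from the aggregate alone, with no joins."""
    return sum(item["qty"] * item["price"] for item in doc["items"])
```

Everything needed to answer "what does this order cost?" arrives in a single read of `order`.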

5
What does the CAP theorem state?

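A distributed system can guarantee at most two of Consistency, Availability, and Partition Tolerance at the same time. Because network partitions cannot be ruled out in practice, the real trade-off during a partition is between consistency and availability.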

6
Which two CAP properties are prioritized in a CP system and give an example?

Consistency and Partition Tolerance; example: MongoDB configured for strong consistency.

7
Which two CAP properties are prioritized in an AP system and give an example?

Availability and Partition Tolerance; example: Cassandra with eventual consistency.

8
Which two CAP properties are achievable only in single-node or non-partitioned systems, and what is that combination called?

Consistency and Availability (CA).

9
In the hotel-room example, what happens if the network link between Perth and Vancouver fails while keeping Perth as master?

Perth can still book the room (A succeeds) but Vancouver cannot; availability for B is lost to maintain consistency.
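
The hotel-room scenario can be sketched as a toy model, assuming Perth holds the master copy of the bookings. The `book` function, its parameters, and the room and guest names are hypothetical; real systems detect a failed link through timeouts rather than a flag.

```python
def book(bookings, site, room, guest, link_up):
    """CP-style booking: only the master (Perth), or a site that can
    currently reach it, is allowed to change the bookings."""
    if site != "Perth" and not link_up:
        return False  # cannot reach the master, so the request is refused
    if room in bookings:
        return False  # room already taken
    bookings[room] = guest
    return True


bookings = {}  # state held by the master in Perth
perth_ok = book(bookings, "Perth", "room-101", "guest A", link_up=False)
vancouver_ok = book(bookings, "Vancouver", "room-101", "guest B", link_up=False)
```

With the link down, the Perth request succeeds and the Vancouver request is refused, so the room is never double-booked.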

10
What is horizontal scaling (scale out) in NoSQL systems?

Adding more nodes to distribute load across the cluster.

11
How does vertical scaling (scale up) differ from horizontal scaling?

It adds more resources (CPU/RAM/SSD) to a single machine rather than adding nodes.

12
Name three MongoDB deployment architectures.

Standalone instance, replica set, sharded cluster (which may itself use replica sets).

13
What is sharding in MongoDB?

Horizontal partitioning of a collection’s documents across multiple shards in a cluster.

14
Describe ranged sharding and its ideal workload.

Documents are placed on shards based on shard-key ranges; ideal when range-oriented queries are common.

15
Give one benefit and one drawback of ranged sharding.

Benefit: Efficient range queries. Drawback: Data skew can occur if key ranges are unevenly accessed.
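
A minimal sketch of range-based routing, assuming string shard keys and two made-up split points. It is not MongoDB's implementation; `SPLIT_POINTS`, `SHARDS`, and both functions are hypothetical names chosen for the illustration.

```python
from bisect import bisect_right

# Hypothetical split points on the shard key: keys below "h" go to shard0,
# keys from "h" up to (but not including) "p" go to shard1, the rest to shard2.
SPLIT_POINTS = ["h", "p"]
SHARDS = ["shard0", "shard1", "shard2"]


def shard_for(key):
    """Route a single key to the shard that owns its range."""
    return SHARDS[bisect_right(SPLIT_POINTS, key)]


def shards_for_range(low, high):
    """Shards that must be consulted for a query on low <= key <= high."""
    first = bisect_right(SPLIT_POINTS, low)
    last = bisect_right(SPLIT_POINTS, high)
    return SHARDS[first:last + 1]
```

A narrow range such as `shards_for_range("a", "c")` touches only one shard, which is why range queries are efficient. If most traffic hits keys in one range, that one shard carries most of the load, which is the skew drawback.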

16
Describe hashed sharding and its ideal workload.

Documents are assigned to shards using a hash of the shard key, giving uniform distribution; well-suited to time-series or event data with high write volume.

17
Give one benefit and one drawback of hashed sharding.

Benefit: Even data distribution and high scalability. Drawback: Inefficient range queries.
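
The idea behind hashed placement can be sketched as follows. SHA-256 and the shard count of 4 are assumptions made for the illustration; MongoDB uses its own internal hash function and chunk mapping.

```python
import hashlib

NUM_SHARDS = 4  # assumed cluster size for this sketch


def shard_for(key):
    """Map a shard key to a shard index via a hash of the key."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


# Consecutive keys (e.g. increasing event IDs) scatter across shards.
placement = [shard_for(event_id) for event_id in range(1000)]
counts = [placement.count(shard) for shard in range(NUM_SHARDS)]
```

Because neighbouring keys land on unrelated shards, writes spread evenly, but a range query over consecutive keys has to ask every shard.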

18
What is zoned sharding?

A sharding strategy where developers define geographic or business rules so data resides on specific shards close to application servers.

19
List a benefit and drawback of zoned sharding.

Benefit: Low latency / compliance-friendly data placement. Drawback: Potential data imbalance and increased management complexity.
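
Zone rules can be pictured as a lookup from a document attribute to the shards allowed to hold it. The region codes, shard names, and `allowed_shards` helper below are hypothetical.

```python
# Hypothetical zone rules: each region is pinned to shards hosted in that region.
ZONES = {
    "EU": ["shard-eu-1", "shard-eu-2"],
    "NA": ["shard-na-1"],
}


def allowed_shards(document):
    """Return the shards permitted to hold this document, based on its region."""
    return ZONES[document["region"]]
```

In this sketch the EU zone has two shards and the NA zone one, so an uneven user base translates directly into uneven shard sizes, which is the imbalance drawback noted above.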

20
What are the two main replication models discussed?

Master-slave (primary-secondary) replication and peer-to-peer replication.

21
In master-slave replication, which node handles writes?

The master (primary) node handles all write operations.
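
A toy model of primary-secondary replication, assuming synchronous, in-process copying for simplicity. Real systems replicate over the network and usually asynchronously; the `Primary` and `Secondary` classes are hypothetical.

```python
class Secondary:
    """Read-only replica that applies changes pushed by the primary."""

    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value

    def read(self, key):
        return self.data.get(key)


class Primary:
    """The only node that accepts writes; it forwards each one to the replicas."""

    def __init__(self, secondaries):
        self.data = {}
        self.secondaries = secondaries

    def write(self, key, value):
        self.data[key] = value
        for secondary in self.secondaries:
            secondary.apply(key, value)


replicas = [Secondary(), Secondary()]
primary = Primary(replicas)
primary.write("room-101", "booked")
```

Reads can then be served by any secondary, while all writes still pass through the single primary.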

22
What is a key characteristic of peer-to-peer replication?

All nodes are equal and can handle both reads and writes while synchronizing via a distributed protocol.

23
Why is indexing important in NoSQL databases?

Proper indexes accelerate frequent search queries and filters, improving performance.

24
What risk accompanies over-indexing?

Excessive storage overhead and slower writes.

25
What does ETL stand for, and why is it needed during NoSQL migration?

Extract, Transform, Load; used to restructure relational data into document or key-value formats.

26
Explain authentication as a security service for databases.

Verifies the identity of users, admins, developers, or software accessing the database.

27
Give two controls that strengthen password-based authentication.

Enforcing minimum password length/complexity and requiring re-authentication for sensitive actions.

28
What does access control determine in a database system?

Which entities can access what resources and in what manner (roles and privileges).

29
Define data confidentiality in a database context.

Ensuring data is not disclosed to unauthorized parties, typically via encryption of data at rest and in transit.

30
What is data integrity, and how can it be enforced?

Assurance that data remains correct and unaltered; enforced with checksums, message digests, or hashes.
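
One way to picture a hash-based integrity check is to store a digest next to the data and recompute it on read. SHA-256 is an assumption here, since the card does not name an algorithm, and `checksum` and `verify` are hypothetical helper names.

```python
import hashlib
import hmac


def checksum(data: bytes) -> str:
    """Hex digest stored alongside the data when it is written."""
    return hashlib.sha256(data).hexdigest()


def verify(data: bytes, expected: str) -> bool:
    """Recompute the digest and compare it with the stored one."""
    return hmac.compare_digest(checksum(data), expected)


record = b'{"room": 101, "guest": "Ada"}'
stored_digest = checksum(record)
tampered = b'{"room": 101, "guest": "Eve"}'
```

`verify(record, stored_digest)` passes, while `verify(tampered, stored_digest)` fails because any change to the bytes changes the digest. A plain hash detects accidental or unkeyed changes only; an attacker who can rewrite both data and digest would need to be countered with a keyed MAC or a signature.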

31
Describe non-repudiation as a security service.

Provides proof that a communication or transaction occurred, so that a party cannot later deny it; often implemented with digital signatures.

32
List two mechanisms that support database availability.

Data replication across nodes and infrastructure safeguards such as backups or redundant hardware.

33
How can threat modeling enhance database security?

By proactively identifying and mitigating potential breaches like leakage, denial of service, or unauthorized access.

34
Contrast document-oriented and key-value stores in terms of data retrieval flexibility.

Document stores support rich queries on fields; key-value stores retrieve data by exact key only.
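
The contrast can be sketched with plain Python structures standing in for the two store types. The sample data and the `find` helper are hypothetical and do not correspond to any particular database API.

```python
# Key-value store: the value is an opaque blob, reachable only by its exact key.
kv_store = {"user:1": '{"name": "Ada", "city": "Perth"}'}
blob = kv_store["user:1"]

# Document store: the database understands the fields and can filter on them.
documents = [
    {"_id": 1, "name": "Ada", "city": "Perth"},
    {"_id": 2, "name": "Lin", "city": "Vancouver"},
    {"_id": 3, "name": "Sam", "city": "Perth"},
]


def find(collection, **criteria):
    """Return every document whose fields match all the given criteria."""
    return [
        doc for doc in collection
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


perth_users = find(documents, city="Perth")
```

Finding all users in Perth is a single field query on the documents, whereas the key-value store would need the application to fetch and parse every value itself.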

35
Which store type, document or key-value, typically achieves higher availability according to the notes?

According to the notes, document-oriented databases offer high availability, while simple key-value stores may offer lower availability.

36
What scaling technique is common to both document and key-value databases?

Sharding (horizontal partitioning).

37
Why can strong consistency be challenging in document databases?

Because distributed writes and replication can introduce delays or conflicts, making immediate global consistency hard.

38
What tool can operators use to observe NoSQL cluster health and performance?

Monitoring systems (e.g., MongoDB Ops Manager, Atlas monitoring, Prometheus + Grafana).

39
Which two encryption states must be addressed to secure a cloud-hosted NoSQL database?

Encryption at rest (stored data) and encryption in transit (network traffic).

40
Why must global platforms consider data-location rules when designing sharding strategies?

To comply with regulations like GDPR and reduce latency by keeping data close to users.

41
What is horizontal scaling’s primary operational challenge?

Managing and monitoring a distributed set of nodes while ensuring balanced load and fault tolerance.

42
Name one conflict-resolution strategy for replicated data.

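Last-write-wins (keep the version with the latest timestamp), or vector clocks, which detect concurrent updates so they can be reconciled.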
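
Both strategies named in the answer can be sketched in a few lines. The data shapes are assumptions: a version is a `(timestamp, value)` tuple for last-write-wins, and a vector clock is a dict mapping a node name to a counter.

```python
def last_write_wins(version_a, version_b):
    """Each version is (timestamp, value); keep the one written later."""
    return version_a if version_a[0] >= version_b[0] else version_b


def compare_clocks(clock_a, clock_b):
    """Compare two vector clocks given as {node_name: counter} dicts."""
    nodes = set(clock_a) | set(clock_b)
    a_le_b = all(clock_a.get(n, 0) <= clock_b.get(n, 0) for n in nodes)
    b_le_a = all(clock_b.get(n, 0) <= clock_a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"      # a happened before b, so b supersedes a
    if b_le_a:
        return "after"       # b happened before a, so a supersedes b
    return "concurrent"      # neither saw the other: a genuine conflict
```

Last-write-wins always picks a winner but can silently drop an update. Vector clocks only report that two versions are concurrent, and the application (or a merge rule) still has to reconcile them.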

43
In quorum-based consistency, what is required for a successful write?

A majority of replica nodes must acknowledge the write operation.
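
The majority rule reduces to a small arithmetic check, sketched below. The function names are hypothetical, and real databases let operators configure the required number of acknowledgements.

```python
def majority(replicas):
    """Smallest number of nodes that is more than half of the replicas."""
    return replicas // 2 + 1


def write_succeeds(acks, replicas):
    """A write is accepted once a majority of replicas has acknowledged it."""
    return acks >= majority(replicas)
```

With 3 replicas a write needs 2 acknowledgements, and with 5 replicas it needs 3, so the cluster tolerates the loss of a minority of nodes.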

44
How does availability suffer in a CP system during a network partition?

Some requests are denied or delayed to ensure all nodes maintain consistent data.

45
What is the simplest NoSQL data model mentioned, and what does it store?

Key-value model; stores data as pairs of an identifier (key) and an opaque value.

46
Give one tool-based benefit of hosting MongoDB on Atlas.

Automatic backups, monitoring dashboards, or easy global cluster deployment.