Question-and-Answer flashcards covering NoSQL motivations, CAP theorem, security, sharding, replication, scaling, and implementation best practices.
What are the five primary reasons modern applications adopt NoSQL databases?
Scalability, flexible schema, quicker/cheaper setup, better performance, and better availability.
Why are NoSQL and relational databases considered complementary rather than mutually exclusive?
Because each excels in different use-cases; combining them lets architects handle diverse workloads more effectively.
What database property is simplified by aggregates in document stores?
Data retrieval—queries are faster because related data is stored and fetched as a single unit.
Define an aggregate in the context of NoSQL databases.
A collection of related data that can be treated and queried as a single unit (e.g., a document).
What does the CAP theorem state?
A distributed system can guarantee at most two of Consistency, Availability, and Partition Tolerance at the same time; when a network partition occurs, it must give up either consistency or availability.
Which two CAP properties are prioritized in a CP system and give an example?
Consistency and Partition Tolerance; example: MongoDB configured for strong consistency.
Which two CAP properties are prioritized in an AP system and give an example?
Availability and Partition Tolerance; example: Cassandra with eventual consistency.
Which two CAP properties are achievable only in single-node or non-partitioned systems, and what is that combination called?
Consistency and Availability; the combination is called CA.
In the hotel-room example, what happens if the network link between Perth and Vancouver fails while keeping Perth as master?
Perth can still book the room (A succeeds) but Vancouver cannot; availability for B is lost to maintain consistency.
What is horizontal scaling (scale out) in NoSQL systems?
Adding more nodes to distribute load across the cluster.
How does vertical scaling (scale up) differ from horizontal scaling?
It adds more resources (CPU/RAM/SSD) to a single machine rather than adding nodes.
Name three MongoDB deployment architectures.
Standalone instance, replica set, sharded cluster (which may itself use replica sets).
What is sharding in MongoDB?
Horizontal partitioning of a collection’s documents across multiple shards in a cluster.
Describe ranged sharding and its ideal workload.
Documents are placed on shards based on contiguous shard-key ranges; ideal when range queries on the shard key (for example, by region) are common.
Give one benefit and one drawback of ranged sharding.
Benefit: Efficient range queries. Drawback: Data skew can occur if key ranges are unevenly accessed.
Describe hashed sharding and its ideal workload.
Documents are assigned to shards using a hash of the shard key, giving uniform distribution; well-suited to time-series or event data with high write volume.
Give one benefit and one drawback of hashed sharding.
Benefit: Even data distribution and high scalability. Drawback: Inefficient range queries.
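The trade-off between ranged and hashed sharding can be seen in a small sketch. This is a simplified illustration, not MongoDB's actual chunk-assignment logic: the function names, the split points `[100, 200]`, and the use of MD5 modulo the shard count are all assumptions chosen for the example.

```python
import hashlib
from bisect import bisect_right
from collections import Counter


def ranged_shard(key: int, boundaries: list[int]) -> int:
    """Return the index of the shard whose key range contains `key`."""
    # boundaries [100, 200] means: shard 0 holds keys < 100,
    # shard 1 holds 100..199, shard 2 holds keys >= 200.
    return bisect_right(boundaries, key)


def hashed_shard(key: int, num_shards: int) -> int:
    """Return a shard index derived from a hash of `key` (illustrative only)."""
    digest = hashlib.md5(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


boundaries = [100, 200]
new_keys = range(200, 260)  # monotonically increasing keys, e.g. event IDs

ranged_counts = Counter(ranged_shard(k, boundaries) for k in new_keys)
hashed_counts = Counter(hashed_shard(k, 3) for k in new_keys)

print("ranged:", dict(ranged_counts))
print("hashed:", dict(hashed_counts))
```

With ranged placement every new key lands on the last shard, which is the skew drawback; with hashed placement the same keys spread across shards, but neighbouring keys no longer sit together, so a range query has to ask every shard.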
What is zoned sharding?
A sharding strategy where developers define geographic or business rules so data resides on specific shards close to application servers.
List a benefit and drawback of zoned sharding.
Benefit: Low latency / compliance-friendly data placement. Drawback: Potential data imbalance and increased management complexity.
What are the two main replication models discussed?
Master-slave (primary-secondary) replication and peer-to-peer replication.
In master-slave replication, which node handles writes?
The master (primary) node handles all write operations.
What is a key characteristic of peer-to-peer replication?
All nodes are equal and can handle both reads and writes while synchronizing via a distributed protocol.
Why is indexing important in NoSQL databases?
Proper indexes accelerate frequent search queries and filters, improving performance.
What risk accompanies over-indexing?
Excessive storage overhead and slower writes.
What does ETL stand for, and why is it needed during NoSQL migration?
Extract, Transform, Load; used to restructure relational data into document or key-value formats.
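A minimal sketch of the three ETL steps, assuming two hypothetical relational tables (`customers` and `orders`, with made-up rows) and a plain dict standing in for a document collection:

```python
from collections import defaultdict

# Extract: rows as they might come out of two relational tables (made-up data).
customers = [
    {"id": 1, "name": "Ada"},
    {"id": 2, "name": "Lin"},
]
orders = [
    {"id": 10, "customer_id": 1, "total": 25.0},
    {"id": 11, "customer_id": 1, "total": 40.0},
    {"id": 12, "customer_id": 2, "total": 15.5},
]


def transform(customers, orders):
    """Transform: embed each customer's orders inside one customer document."""
    orders_by_customer = defaultdict(list)
    for order in orders:
        orders_by_customer[order["customer_id"]].append(
            {"order_id": order["id"], "total": order["total"]}
        )
    return [
        {"_id": c["id"], "name": c["name"], "orders": orders_by_customer[c["id"]]}
        for c in customers
    ]


def load(documents, store):
    """Load: write each document into a dict standing in for a collection."""
    for doc in documents:
        store[doc["_id"]] = doc


collection = {}
load(transform(customers, orders), collection)
print(collection[1])
```

The foreign-key join disappears in the transform step: each customer document becomes an aggregate that carries its own orders.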
Explain authentication as a security service for databases.
Verifies the identity of users, admins, developers, or software accessing the database.
Give two controls that strengthen password-based authentication.
Enforcing minimum password length/complexity and requiring re-authentication for sensitive actions.
What does access control determine in a database system?
Which entities can access what resources and in what manner (roles and privileges).
Define data confidentiality in a database context.
Ensuring data is not disclosed to unauthorized parties, typically via encryption of data at rest and in transit.
What is data integrity, and how can it be enforced?
Assurance that data remains correct and unaltered; enforced with checksums, message digests, or hashes.
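As an illustration of a hash-based integrity check, the sketch below stores a SHA-256 digest alongside a record and recomputes it later. The helper names and the sample record are assumptions for the example.

```python
import hashlib
import hmac


def sha256_digest(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_unaltered(data: bytes, expected_digest: str) -> bool:
    """Recompute the digest and compare it with the one stored earlier."""
    return hmac.compare_digest(sha256_digest(data), expected_digest)


record = b'{"room": 101, "guest": "A"}'
stored_digest = sha256_digest(record)

print(is_unaltered(record, stored_digest))         # original bytes
print(is_unaltered(record + b" ", stored_digest))  # one extra byte
```

A plain digest detects accidental corruption; if an attacker could also replace the stored digest, a keyed MAC or a digital signature would be needed instead.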
Describe non-repudiation as a security service.
Provides proof that a communication or transaction occurred, so that a party cannot later deny it; often implemented with digital signatures.
List two mechanisms that support database availability.
Data replication across nodes and infrastructure safeguards such as backups or redundant hardware.
How can threat modeling enhance database security?
By proactively identifying and mitigating potential breaches like leakage, denial of service, or unauthorized access.
Contrast document-oriented and key-value stores in terms of data retrieval flexibility.
Document stores support rich queries on fields; key-value stores retrieve data by exact key only.
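The difference in retrieval flexibility can be sketched with in-memory stand-ins (a dict for the key-value store, a list of dicts for the document store). The data and the helper names `kv_get` and `doc_find` are hypothetical.

```python
# Key-value store: the value is opaque bytes, reachable only by its exact key.
kv_store = {
    "user:1": b'{"name": "Ada", "city": "Perth"}',
    "user:2": b'{"name": "Lin", "city": "Vancouver"}',
}

# Document store: the database understands the fields inside each document.
doc_store = [
    {"_id": 1, "name": "Ada", "city": "Perth"},
    {"_id": 2, "name": "Lin", "city": "Vancouver"},
]


def kv_get(key: str):
    """Exact-key lookup; returns None when the key is unknown."""
    return kv_store.get(key)


def doc_find(**criteria):
    """Return every document whose fields match all given criteria."""
    return [
        doc for doc in doc_store
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


print(kv_get("user:1"))
print(kv_get("Perth"))  # a field value is not a key, so nothing is found
print(doc_find(city="Vancouver"))
```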
Which store type—document or key-value—typically achieves higher availability according to the notes?
Document-oriented databases are noted as having high availability, while simple key-value stores may have lower availability.
What scaling technique is common to both document and key-value databases?
Sharding (horizontal partitioning).
Why can strong consistency be challenging in document databases?
Because distributed writes and replication can introduce delays or conflicts, making immediate global consistency hard.
What tool can operators use to observe NoSQL cluster health and performance?
Monitoring systems (e.g., MongoDB Ops Manager, Atlas monitoring, Prometheus + Grafana).
Which two encryption states must be addressed to secure a cloud-hosted NoSQL database?
Encryption at rest (stored data) and encryption in transit (network traffic).
Why must global platforms consider data-location rules when designing sharding strategies?
To comply with regulations like GDPR and reduce latency by keeping data close to users.
What is horizontal scaling’s primary operational challenge?
Managing and monitoring a distributed set of nodes while ensuring balanced load and fault tolerance.
Name one conflict-resolution strategy for replicated data.
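Last-write-wins (keep the version with the latest timestamp) or vector clocks (detect concurrent versions so they can be reconciled).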
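Both strategies can be sketched in a few lines. This is a simplified illustration: versions are `(timestamp, value)` tuples and vector clocks are dicts mapping a node name to a counter, both assumptions made for the example.

```python
def last_write_wins(a, b):
    """Keep the version with the later timestamp; each version is (timestamp, value)."""
    return a if a[0] >= b[0] else b


def compare_clocks(a: dict, b: dict) -> str:
    """Compare two vector clocks given as {node_name: counter} dicts."""
    nodes = a.keys() | b.keys()
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"      # a happened before b, so b supersedes a
    if b_le_a:
        return "after"       # b happened before a, so a supersedes b
    return "concurrent"      # neither saw the other: a real conflict


print(last_write_wins((5, "booked by A"), (7, "booked by B")))
print(compare_clocks({"perth": 2, "vancouver": 1}, {"perth": 1, "vancouver": 2}))
```

Last-write-wins always picks a winner but silently discards the other write; vector clocks only report that two versions are concurrent, leaving the merge to the application.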
In quorum-based consistency, what is required for a successful write?
A majority of replica nodes must acknowledge the write operation.
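The majority rule can be written as a short check. This is a sketch assuming each replica reports a simple boolean acknowledgement; the function name is hypothetical.

```python
def quorum_write(acks: list[bool]) -> bool:
    """Return True when a strict majority of replicas acknowledged the write."""
    needed = len(acks) // 2 + 1  # e.g. 2 of 3, 3 of 4, 3 of 5
    return sum(acks) >= needed


print(quorum_write([True, True, False]))   # 2 of 3 replicas acknowledged
print(quorum_write([True, False, False]))  # 1 of 3 replicas acknowledged
```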
How does availability suffer in a CP system during a network partition?
Some requests are denied or delayed to ensure all nodes maintain consistent data.
What is the simplest NoSQL data model mentioned, and what does it store?
Key-value model; stores data as pairs of an identifier (key) and opaque value.
Give one tool-based benefit of hosting MongoDB on Atlas.
Automatic backups, monitoring dashboards, or easy global cluster deployment.