These flashcards cover key terminology and concepts related to system design, focusing on scaling, databases, message queuing, caching, and consistency models.
Horizontal Scaling
Scaling Out by adding more machines (e.g., servers) in a distributed system.
Vertical Scaling
Scaling Up by adding more power (CPU, RAM) to a single machine.
Load Balancer
A component (hardware or software) that distributes incoming traffic across multiple servers to improve fault tolerance and performance.
Relational Database (SQL)
The type of database to choose when you need strong consistency, ACID transactions, and structured data.
NoSQL Database
A type of database suitable for high scalability and flexible schema, often used for unstructured data.
Message Queue
A service used to decouple services and enable asynchronous processing.
Dead Letter Queue (DLQ)
A queue that stores messages that failed processing so you can inspect or retry them later, improving fault tolerance.
Caching
A way to improve system performance by storing frequently accessed data in memory.
CAP Theorem
A principle stating that a distributed system can simultaneously provide at most two of three guarantees: Consistency, Availability, and Partition Tolerance.
Eventual Consistency
A model where data updates propagate over time without guaranteeing immediate consistency.
Sharding
The process of splitting a large database into smaller partitions ('shards') to enhance scalability and performance.
CDN (Content Delivery Network)
A network of servers that caches content to improve access speed and reduce latency for users.
API Gateway
A single entry point for managing multiple microservices, providing routing, authentication, and security functions.
Rate limiting – Protects services from abuse (e.g., 100 req/min/user).
Request/response transformation – Format or modify payloads between clients and services.
Load balancing – Distributes incoming traffic across healthy instances.
Caching – Reduces load on backend services by storing responses temporarily.
Monitoring/logging – Centralized logging and metrics (great for debugging and observability).
Service discovery integration – Works with service registries like Consul or Eureka to find service instances.
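The rate-limiting feature above is usually implemented as a token bucket. A minimal sketch in Python (the capacity and refill period are illustrative, not tied to any particular gateway):

```python
import time

class TokenBucket:
    """Token-bucket rate limiter: allows `capacity` requests per `refill_period` seconds."""
    def __init__(self, capacity, refill_period):
        self.capacity = capacity
        self.refill_rate = capacity / refill_period  # tokens added per second
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.refill_rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1   # spend one token for this request
            return True
        return False           # bucket empty: throttle the request

# Burst of 5 requests against a 3-per-minute limit:
bucket = TokenBucket(capacity=3, refill_period=60)
results = [bucket.allow() for _ in range(5)]
# First 3 pass; the rest are rejected until tokens refill.
```

Real gateways keep one bucket per user or API key, typically in a shared store like Redis so all gateway instances see the same counts.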
Strong Consistency
A consistency model where every read returns the most recent write, ensuring no stale data is returned.
Eventual Consistency
A consistency model where data can be temporarily stale but will be consistent over time.
CP (Consistency + Partition Tolerance)
Sacrifices availability
If the network is partitioned, the system will reject requests to avoid returning stale data.
✅ Used in systems like banking, inventory, financial trading.
AP (Availability + Partition Tolerance)
Sacrifices consistency
System always responds, but may serve stale data until partitions heal.
✅ Used in social media, DNS, caching systems.
CA (Consistency + Availability) — ⚠ Not achievable in distributed systems if a partition occurs.
In theory only — when no network failures exist.
In real-world distributed systems, partition tolerance is a must, so this combination doesn’t exist in practice.
Benefits of a message queue
Decoupling: Producer and consumer don’t need to be online at the same time.
Scalability: You can scale consumers horizontally to process large volumes.
Resilience: If a consumer goes down, messages are still stored in the queue.
Backpressure handling: Prevents services from being overwhelmed.
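The decoupling and backpressure benefits can be seen with Python's standard-library `queue` module, a stand-in for a real broker like RabbitMQ or SQS:

```python
import queue
import threading

q = queue.Queue(maxsize=100)   # bounded queue: producers block when full (backpressure)
processed = []

def consumer():
    while True:
        msg = q.get()
        if msg is None:        # sentinel message: shut the worker down
            break
        processed.append(msg.upper())   # stand-in for real work
        q.task_done()

worker = threading.Thread(target=consumer)
worker.start()

# The producer enqueues work without waiting for the consumer to be ready.
for msg in ["order-1", "order-2", "order-3"]:
    q.put(msg)

q.put(None)       # tell the worker to stop
worker.join()     # wait for it to drain the queue
```

Scaling consumers horizontally means starting more worker threads (or processes, or machines) reading from the same queue.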
Why use sharding
Improves scalability: Each shard handles a subset of data, preventing a single DB from becoming a bottleneck.
Enhances performance: Queries run faster because they search in a smaller dataset.
Supports high availability: If one shard fails, the others still function independently.
What is Range-Based Sharding
Data is partitioned based on ranges of values.
Example:
Users A-M go to Shard 1.
Users N-Z go to Shard 2.
✅ Pros: Simple to implement.
❌ Cons: Can lead to hotspots (uneven traffic distribution).
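A range-based shard lookup is just a sorted-boundary search. A minimal sketch (the A–M / N–Z split and shard names are the hypothetical example above):

```python
import bisect

# Upper bounds (exclusive) of each range: first letters below "n" -> shard-1, rest -> shard-2.
BOUNDARIES = ["n"]
SHARDS = ["shard-1", "shard-2"]

def range_shard(username):
    # bisect_right finds which range the first letter falls into, in O(log n).
    return SHARDS[bisect.bisect_right(BOUNDARIES, username[0].lower())]
```

Usage: `range_shard("alice")` returns `"shard-1"`, `range_shard("nancy")` returns `"shard-2"`. The hotspot risk is visible here: if most usernames start with A–M, shard-1 takes most of the traffic.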
Hash-Based Sharding
Uses a hash function to distribute data across shards.
Example:
hash(UserID) % NumberOfShards → assigns the user to a specific shard.
✅ Pros: Even distribution of data.
❌ Cons: Harder to reallocate shards when adding/removing servers.
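Both the even distribution and the reallocation problem are easy to demonstrate. A sketch using a stable hash (MD5 here, because Python's built-in `hash()` is salted per process and would give different shards on each run):

```python
import hashlib

NUM_SHARDS = 4

def hash_shard(user_id, num_shards=NUM_SHARDS):
    # Stable hash of the key, then modulo the shard count.
    digest = hashlib.md5(str(user_id).encode()).hexdigest()
    return int(digest, 16) % num_shards

# The cons in practice: growing from 4 to 5 shards remaps most keys,
# forcing a large data migration.
moved = sum(hash_shard(u, 4) != hash_shard(u, 5) for u in range(1000))
# `moved` is roughly 800 of 1000 keys.
```

Consistent hashing is the usual fix for this remapping problem: it keeps most keys on their original shard when nodes are added or removed.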
Geo-Based Sharding
Data is partitioned by geographical region.
Example:
Users in North America → Shard 1.
Users in Europe → Shard 2.
✅ Pros: Improves latency for global applications.
❌ Cons: May cause data imbalance if one region has significantly more users.
When is it acceptable to use Eventual Consistency
Social media (e.g., likes, comments, follows)
Notification systems
Analytics dashboards
Search indexes
These are use cases where data freshness isn’t critical to user experience, and availability + speed are more important.
What is the difference between strong consistency and eventual consistency?
Strong Consistency: Every read gets the most recent write. No stale data is ever returned. Ensures data correctness but may increase latency.
Example: A banking system where your balance must always be accurate.
Eventual Consistency: After a write, data propagates to all replicas over time. Reads may return stale data temporarily, but eventually, all replicas synchronize.
Example: Social media feeds (e.g., you may not see a new comment instantly, but it appears after a few seconds).
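The difference can be modeled as a toy primary/replica pair where replication happens later (all names here are illustrative, not a real database API):

```python
class ReplicatedStore:
    """Toy model: writes land on the primary; a replica catches up later."""
    def __init__(self):
        self.primary = {}
        self.replica = {}

    def write(self, key, value):
        self.primary[key] = value          # replica is now stale

    def read_from_replica(self, key):
        return self.replica.get(key)       # may return old (or missing) data

    def sync(self):
        self.replica.update(self.primary)  # "eventually" the replica converges

store = ReplicatedStore()
store.write("likes:post-1", 10)
before = store.read_from_replica("likes:post-1")   # stale read: write not yet propagated
store.sync()
after = store.read_from_replica("likes:post-1")    # replicas have converged
```

Strong consistency is the same picture with `sync()` forced inside `write()`: slower writes, but no window where a stale read is possible.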
What does consistency mean in the context of databases and distributed systems?
Consistency ensures that all nodes in a distributed system see the same data at the same time. If a user updates a record, the next read operation should return the updated value.
✅ Example: In a banking system, if you transfer money, the new balance should be immediately reflected everywhere to prevent overdrafts.
How do caching systems (e.g., Redis, Memcached) improve system performance, and what are their trade-offs?
Caching stores frequently requested data in memory (e.g., Redis, Memcached) to avoid expensive database queries.
Key-value storage (hashmaps) ensures fast lookups, often in O(1) time complexity.
Trade-offs include complexity—managing cache invalidation, expiration, and synchronization with the database.
Additional Considerations for Depth:
Cache Expiry (TTL - Time to Live): Cached data must be refreshed periodically to prevent serving stale data.
Cache Eviction Policies: Common strategies like LRU (Least Recently Used) help manage limited memory.
Write-through vs. Write-back caching:
Write-through: Writes go to both cache & DB (slower but more consistent).
Write-back: Writes go to cache first, then asynchronously to DB (faster, but riskier).
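The write-through strategy, which keeps cache and database in lockstep, can be sketched with a dict standing in for the database:

```python
class WriteThroughCache:
    """Write-through: every write hits both the cache and the backing store."""
    def __init__(self, store):
        self.store = store   # e.g. a database; a plain dict stands in here
        self.cache = {}

    def write(self, key, value):
        self.cache[key] = value   # cache and store always agree after a write
        self.store[key] = value

    def read(self, key):
        if key in self.cache:
            return self.cache[key]       # fast path: in-memory hit
        value = self.store.get(key)      # miss: fall back to the store
        if value is not None:
            self.cache[key] = value      # populate the cache for next time
        return value

db = {}
c = WriteThroughCache(db)
c.write("user:1", {"name": "Ada"})
```

Write-back would instead buffer the write in `self.cache` and flush to `self.store` asynchronously, which is faster per write but loses data if the cache dies before the flush.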
what are some common use cases for message queues
Order processing systems
Sending emails or SMS
Logging and analytics pipelines
Streaming systems (Kafka pipelines)
what are some use cases for the CAP Theorem
Explaining trade-offs between Consistency, Availability, and Partition tolerance in distributed systems.
what is a write-through
A caching strategy where writes go to both the cache and the database simultaneously, ensuring data consistency at the cost of write speed.
what is a write-back
A caching strategy where writes occur only in the cache first and are later flushed to the database. This improves write performance but risks data loss if the cache fails before the flush.
what are cache eviction policies
Strategies to remove stale data from cache in order to free up space and maintain performance. Common policies include LRU, FIFO, and LFU.
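Of the policies listed, LRU is the most common and fits in a few lines with `collections.OrderedDict`:

```python
from collections import OrderedDict

class LRUCache:
    """LRU eviction: when full, drop the least recently used entry."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()   # insertion order doubles as recency order

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)  # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)   # evict the least recently used entry

cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")       # touch "a" so it becomes most recent
cache.put("c", 3)    # cache is full: evicts "b"
```

FIFO would evict in pure insertion order (no `move_to_end` on reads), and LFU would track access counts instead of recency.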
what is a cache expiry
A mechanism that automatically invalidates or removes cache entries after a specified period, ensuring that data is kept fresh and up-to-date.
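A TTL can be implemented by storing an expiry timestamp alongside each value and checking it lazily on reads (the tiny TTL below is just to make the example fast; real caches use seconds to hours):

```python
import time

class TTLCache:
    """Entries expire `ttl` seconds after being written."""
    def __init__(self, ttl):
        self.ttl = ttl
        self.data = {}   # key -> (value, expiry timestamp)

    def put(self, key, value):
        self.data[key] = (value, time.monotonic() + self.ttl)

    def get(self, key):
        entry = self.data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self.data[key]   # lazy expiry: purge on read
            return None
        return value

cache = TTLCache(ttl=0.05)
cache.put("session", "abc123")
fresh = cache.get("session")   # within the TTL: value is returned
time.sleep(0.1)
stale = cache.get("session")   # past the TTL: entry is gone
```

This mirrors what `EXPIRE`/`SETEX` does in Redis, except Redis also expires keys in the background rather than only on reads.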
Questions for Traffic and Load
“What’s the expected traffic volume or peak QPS?”
Questions for Consistency Model
“Is strong consistency required, or is eventual consistency acceptable?”
Questions for Data Storage
“Is this more suitable for a relational or NoSQL database?”
“Do we need relational (ACID) or NoSQL (eventually consistent, flexible schema) storage?”
“How do we model our data — what are the write and read patterns?”
“Do we need sharding to scale the database?”
“Are there performance-critical paths we need to optimize?”
“Would it make sense to cache user sessions, profiles, or other frequently read data?”
Questions for Scalability
“Do we need to scale horizontally or vertically in this system?”
Questions for Availability
“What kind of uptime or fault tolerance does the system need?”
Questions for caching
“Are there specific endpoints or datasets that need to be cached for performance?”
Questions for Messaging / Queues
“Should parts of this system be asynchronous, or is everything synchronous?”
“Should any part of the system be async — for example, background processing, notifications, or retries?”
“Would using a queue help smooth traffic spikes?”
Questions for API Design
“What are the most critical APIs, and how should they be secured?”
“What are the main actions users can take in the system?”
“What are the most critical APIs we’ll need?”
Questions for Resilience
“Should we implement circuit breakers or rate limiting to handle failure or abuse?”
Questions for geographic concerns
“Will users be global? Do we need multi-region or CDN support?”
Questions to kick off the system design
“What are the core functional requirements?”
“What are the most important non-functional requirements — availability, latency, consistency?”
“Is this a read-heavy or write-heavy system?”
“What’s the expected scale — users per day, requests per second, data size?”
Questions for CAP Theorem
“Should we support horizontal scaling across services or databases?”
“Do we need active-active regions or failover between regions?”
“What happens if part of the system fails?”
Questions for Security & Observability
“How are we handling authentication and authorization?”
“Do we need to track metrics, set alerts, or monitor API latency/errors?”