These flashcards cover key terminology and concepts related to system design, focusing on scaling, databases, message queuing, caching, and consistency models.
Horizontal Scaling
Scaling Out by adding more machines (e.g., servers) in a distributed system.
Vertical Scaling
Scaling Up by adding more power (CPU, RAM) to a single machine.
Load Balancer
A component (hardware or software) that distributes incoming traffic across multiple servers to improve fault tolerance and performance.
Relational Database (SQL)
The type of database to choose when you need strong consistency, ACID transactions, and structured data.
NoSQL Database
A type of database suitable for high scalability and flexible schema, often used for unstructured data.
Message Queue
A service used to decouple services and enable asynchronous processing.
Dead Letter Queue (DLQ)
A queue that stores messages that failed processing so you can inspect or retry them later, improving fault tolerance.
Caching
A way to improve system performance by storing frequently accessed data in memory.
CAP Theorem
A principle stating that a distributed system can simultaneously provide at most two of three guarantees: Consistency, Availability, and Partition Tolerance.
Eventual Consistency
A model where data updates propagate over time without guaranteeing immediate consistency.
Sharding
The process of splitting a large database into smaller partitions ('shards') to enhance scalability and performance.
CDN (Content Delivery Network)
A network of servers that caches content to improve access speed and reduce latency for users.
API Gateway
A single entry point for managing multiple microservices, providing routing, authentication, and security functions.
Rate limiting – Protects services from abuse (e.g., 100 req/min/user).
Request/response transformation – Format or modify payloads between clients and services.
Load balancing – Distributes incoming traffic across healthy instances.
Caching – Reduces load on backend services by storing responses temporarily.
Monitoring/logging – Centralized logging and metrics (great for debugging and observability).
Service discovery integration – Works with service registries like Consul or Eureka to find service instances.
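The rate-limiting feature above is usually implemented as a token bucket. A minimal sketch in Python (the capacity and refill period are illustrative, not tied to any particular gateway):

```python
import time

class TokenBucket:
    """Token-bucket rate limiter: allows `capacity` requests per `refill_period` seconds."""
    def __init__(self, capacity, refill_period):
        self.capacity = capacity
        self.refill_rate = capacity / refill_period  # tokens added per second
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.refill_rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1   # spend one token for this request
            return True
        return False           # bucket empty: throttle the request

# Burst of 5 requests against a 3-per-minute limit:
bucket = TokenBucket(capacity=3, refill_period=60)
results = [bucket.allow() for _ in range(5)]
# First 3 pass; the rest are rejected until tokens refill.
```

Real gateways keep one bucket per user or API key, typically in a shared store like Redis so all gateway instances see the same counts.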
Strong Consistency
A consistency model where every read returns the most recent write, ensuring no stale data is returned.
Eventual Consistency
A consistency model where data can be temporarily stale but will be consistent over time.
CP (Consistency + Partition Tolerance)
Sacrifices availability
If the network is partitioned, the system will reject requests to avoid returning stale data.
✅ Used in systems like banking, inventory, financial trading.
AP (Availability + Partition Tolerance)
Sacrifices consistency
System always responds, but may serve stale data until partitions heal.
✅ Used in social media, DNS, caching systems.
CA (Consistency + Availability) — ⚠ Not achievable in distributed systems if a partition occurs.
In theory only — when no network failures exist.
In real-world distributed systems, partition tolerance is a must, so this combination doesn’t exist in practice.
Benefits of a message queue
Decoupling: Producer and consumer don’t need to be online at the same time.
Scalability: You can scale consumers horizontally to process large volumes.
Resilience: If a consumer goes down, messages are still stored in the queue.
Backpressure handling: Prevents services from being overwhelmed.
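The decoupling and backpressure benefits can be seen with Python's standard-library `queue` module, a stand-in for a real broker like RabbitMQ or SQS:

```python
import queue
import threading

q = queue.Queue(maxsize=100)   # bounded queue: producers block when full (backpressure)
processed = []

def consumer():
    while True:
        msg = q.get()
        if msg is None:        # sentinel message: shut the worker down
            break
        processed.append(msg.upper())   # stand-in for real work
        q.task_done()

worker = threading.Thread(target=consumer)
worker.start()

# The producer enqueues work without waiting for the consumer to be ready.
for msg in ["order-1", "order-2", "order-3"]:
    q.put(msg)

q.put(None)       # tell the worker to stop
worker.join()     # wait for it to drain the queue
```

Scaling consumers horizontally means starting more worker threads (or processes, or machines) reading from the same queue.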
Why use sharding
Improves scalability: Each shard handles a subset of data, preventing a single DB from becoming a bottleneck.
Enhances performance: Queries run faster because they search in a smaller dataset.
Supports high availability: If one shard fails, the others still function independently.
What is Range-Based Sharding
Data is partitioned based on ranges of values.
Example:
Users A-M go to Shard 1.
Users N-Z go to Shard 2.
✅ Pros: Simple to implement.
❌ Cons: Can lead to hotspots (uneven traffic distribution).
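A range-based shard lookup is just a sorted-boundary search. A minimal sketch (the A–M / N–Z split and shard names are the hypothetical example above):

```python
import bisect

# Upper bounds (exclusive) of each range: first letters below "n" -> shard-1, rest -> shard-2.
BOUNDARIES = ["n"]
SHARDS = ["shard-1", "shard-2"]

def range_shard(username):
    # bisect_right finds which range the first letter falls into, in O(log n).
    return SHARDS[bisect.bisect_right(BOUNDARIES, username[0].lower())]
```

Usage: `range_shard("alice")` returns `"shard-1"`, `range_shard("nancy")` returns `"shard-2"`. The hotspot risk is visible here: if most usernames start with A–M, shard-1 takes most of the traffic.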
Hash-Based Sharding
Uses a hash function to distribute data across shards.
Example:
hash(UserID) % NumberOfShards → assigns the user to a specific shard.
✅ Pros: Even distribution of data.
❌ Cons: Harder to reallocate shards when adding/removing servers.
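Both the even distribution and the reallocation problem are easy to demonstrate. A sketch using a stable hash (MD5 here, because Python's built-in `hash()` is salted per process and would give different shards on each run):

```python
import hashlib

NUM_SHARDS = 4

def hash_shard(user_id, num_shards=NUM_SHARDS):
    # Stable hash of the key, then modulo the shard count.
    digest = hashlib.md5(str(user_id).encode()).hexdigest()
    return int(digest, 16) % num_shards

# The cons in practice: growing from 4 to 5 shards remaps most keys,
# forcing a large data migration.
moved = sum(hash_shard(u, 4) != hash_shard(u, 5) for u in range(1000))
# `moved` is roughly 800 of 1000 keys.
```

Consistent hashing is the usual fix for this remapping problem: it keeps most keys on their original shard when nodes are added or removed.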
Geo-Based Sharding
Data is partitioned by geographical region.
Example:
Users in North America → Shard 1.
Users in Europe → Shard 2.
✅ Pros: Improves latency for global applications.
❌ Cons: May cause data imbalance if one region has significantly more users.
When is it acceptable to use Eventual Consistency
Social media (e.g., likes, comments, follows)
Notification systems
Analytics dashboards
Search indexes
These are use cases where data freshness isn’t critical to user experience, and availability + speed are more important.
What is the difference between strong consistency and eventual consistency?
Strong Consistency: Every read gets the most recent write. No stale data is ever returned. Ensures data correctness but may increase latency.
Example: A banking system where your balance must always be accurate.
Eventual Consistency: After a write, data propagates to all replicas over time. Reads may return stale data temporarily, but eventually, all replicas synchronize.
Example: Social media feeds (e.g., you may not see a new comment instantly, but it appears after a few seconds).
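The difference can be modeled as a toy primary/replica pair where replication happens later (all names here are illustrative, not a real database API):

```python
class ReplicatedStore:
    """Toy model: writes land on the primary; a replica catches up later."""
    def __init__(self):
        self.primary = {}
        self.replica = {}

    def write(self, key, value):
        self.primary[key] = value          # replica is now stale

    def read_from_replica(self, key):
        return self.replica.get(key)       # may return old (or missing) data

    def sync(self):
        self.replica.update(self.primary)  # "eventually" the replica converges

store = ReplicatedStore()
store.write("likes:post-1", 10)
before = store.read_from_replica("likes:post-1")   # stale read: write not yet propagated
store.sync()
after = store.read_from_replica("likes:post-1")    # replicas have converged
```

Strong consistency is the same picture with `sync()` forced inside `write()`: slower writes, but no window where a stale read is possible.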
What does consistency mean in the context of databases and distributed systems?
Consistency ensures that all nodes in a distributed system see the same data at the same time. If a user updates a record, the next read operation should return the updated value.
✅ Example: In a banking system, if you transfer money, the new balance should be immediately reflected everywhere to prevent overdrafts.
How do caching systems (e.g., Redis, Memcached) improve system performance, and what are their trade-offs?
Caching stores frequently requested data in memory (e.g., Redis, Memcached) to avoid expensive database queries.
Key-value storage (hashmaps) ensures fast lookups, often in O(1) time complexity.
Trade-offs include complexity—managing cache invalidation, expiration, and synchronization with the database.
Additional Considerations for Depth:
Cache Expiry (TTL - Time to Live): Cached data must be refreshed periodically to prevent serving stale data.
Cache Eviction Policies: Common strategies like LRU (Least Recently Used) help manage limited memory.
Write-through vs. Write-back caching:
Write-through: Writes go to both cache & DB (slower but more consistent).
Write-back: Writes go to cache first, then asynchronously to DB (faster, but riskier).
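The write-through strategy, which keeps cache and database in lockstep, can be sketched with a dict standing in for the database:

```python
class WriteThroughCache:
    """Write-through: every write hits both the cache and the backing store."""
    def __init__(self, store):
        self.store = store   # e.g. a database; a plain dict stands in here
        self.cache = {}

    def write(self, key, value):
        self.cache[key] = value   # cache and store always agree after a write
        self.store[key] = value

    def read(self, key):
        if key in self.cache:
            return self.cache[key]       # fast path: in-memory hit
        value = self.store.get(key)      # miss: fall back to the store
        if value is not None:
            self.cache[key] = value      # populate the cache for next time
        return value

db = {}
c = WriteThroughCache(db)
c.write("user:1", {"name": "Ada"})
```

Write-back would instead buffer the write in `self.cache` and flush to `self.store` asynchronously, which is faster per write but loses data if the cache dies before the flush.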
what are some common use cases for message queues
Order processing systems
Sending emails or SMS
Logging and analytics pipelines
Streaming systems (Kafka pipelines)
what are some use cases for the CAP Theorem
Explaining trade-offs between Consistency, Availability, and Partition tolerance in distributed systems.
what is a write-through
A caching strategy where writes go to both the cache and the database simultaneously, ensuring data consistency at the cost of write speed.
what is a write-back
A caching strategy where writes occur only in the cache first and are later flushed to the database. This improves write performance but risks data loss if the cache fails before the flush.
what are cache eviction policies
Strategies to remove stale data from cache in order to free up space and maintain performance. Common policies include LRU, FIFO, and LFU.
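Of the policies listed, LRU is the most common and fits in a few lines with `collections.OrderedDict`:

```python
from collections import OrderedDict

class LRUCache:
    """LRU eviction: when full, drop the least recently used entry."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()   # insertion order doubles as recency order

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)  # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)   # evict the least recently used entry

cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")       # touch "a" so it becomes most recent
cache.put("c", 3)    # cache is full: evicts "b"
```

FIFO would evict in pure insertion order (no `move_to_end` on reads), and LFU would track access counts instead of recency.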
what is a cache expiry
A mechanism that automatically invalidates or removes cache entries after a specified period, ensuring that data is kept fresh and up-to-date.
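A TTL can be implemented by storing an expiry timestamp alongside each value and checking it lazily on reads (the tiny TTL below is just to make the example fast; real caches use seconds to hours):

```python
import time

class TTLCache:
    """Entries expire `ttl` seconds after being written."""
    def __init__(self, ttl):
        self.ttl = ttl
        self.data = {}   # key -> (value, expiry timestamp)

    def put(self, key, value):
        self.data[key] = (value, time.monotonic() + self.ttl)

    def get(self, key):
        entry = self.data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self.data[key]   # lazy expiry: purge on read
            return None
        return value

cache = TTLCache(ttl=0.05)
cache.put("session", "abc123")
fresh = cache.get("session")   # within the TTL: value is returned
time.sleep(0.1)
stale = cache.get("session")   # past the TTL: entry is gone
```

This mirrors what `EXPIRE`/`SETEX` does in Redis, except Redis also expires keys in the background rather than only on reads.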
Questions for Traffic and Load
“What’s the expected traffic volume or peak QPS?”
Questions for Consistency Model
“Is strong consistency required, or is eventual consistency acceptable?”
Questions for Data Storage
“Is this more suitable for a relational or NoSQL database?”
“Do we need relational (ACID) or NoSQL (eventually consistent, flexible schema) storage?”
“How do we model our data — what are the write and read patterns?”
“Do we need sharding to scale the database?”
“Are there performance-critical paths we need to optimize?”
“Would it make sense to cache user sessions, profiles, or other frequently read data?”
Questions for Scalability
“Do we need to scale horizontally or vertically in this system?”
Questions for Availability
“What kind of uptime or fault tolerance does the system need?”
Questions for caching
“Are there specific endpoints or datasets that need to be cached for performance?”
Questions for Messaging / Queues
“Should parts of this system be asynchronous, or is everything synchronous?”
“Should any part of the system be async — for example, background processing, notifications, or retries?”
“Would using a queue help smooth traffic spikes?”
Questions for API Design
“What are the most critical APIs, and how should they be secured?”
“What are the main actions users can take in the system?”
“What are the most critical APIs we’ll need?”
Questions for Resilience
“Should we implement circuit breakers or rate limiting to handle failure or abuse?”
Questions for geographic concerns
“Will users be global? Do we need multi-region or CDN support?”
Questions to kick off the system design
“What are the core functional requirements?”
“What are the most important non-functional requirements — availability, latency, consistency?”
“Is this a read-heavy or write-heavy system?”
“What’s the expected scale — users per day, requests per second, data size?”
Questions for CAP Theorem
“Should we support horizontal scaling across services or databases?”
“Do we need active-active regions or failover between regions?”
“What happens if part of the system fails?”
Questions for Security & Observability
“How are we handling authentication and authorization?”
“Do we need to track metrics, set alerts, or monitor API latency/errors?”