1/13
Looks like no tags are added yet.
Name | Mastery | Learn | Test | Matching | Spaced |
---|
No study sessions yet.
What problem does consistent hashing solve in distributed systems?
It minimizes data redistribution when nodes (e.g., databases or servers) are added or removed.
What is sharding?
Sharding is the process of splitting data across multiple servers to handle more load.
What is a simple way to shard data using hashing?
Use hash(key) % number_of_servers
to determine which server stores the data.
What is the main issue with simple modulo hashing?
Adding or removing servers causes massive data reshuffling across all nodes.
What is a hash ring in consistent hashing?
A circular key space where both data and servers are mapped using a hash function.
How is data assigned to servers in consistent hashing?
Hash the data key, locate it on the ring, and move clockwise to the next server.
What happens when a new server is added to the hash ring?
Only data between the new server and its counter-clockwise neighbor needs to be moved.
What happens when a server is removed from the hash ring?
Only the data it handled needs to be reassigned to the next server clockwise.
What are virtual nodes in consistent hashing?
Multiple hashed points for a single physical node to better distribute load.
Why use virtual nodes in consistent hashing?
To evenly distribute data and avoid overloading a single node when others fail.
How do virtual nodes improve fault tolerance?
They allow the load of a failed server to be spread across multiple remaining servers.
What real-world systems use consistent hashing?
Redis Cluster, Apache Cassandra, Amazon DynamoDB, and CDNs.
When should you mention consistent hashing in a system design interview?
When asked to design distributed systems like a cache, database, or message broker.
What is the core idea behind consistent hashing’s efficiency?
Changes affect only a small portion of the data, not the whole system.