System Design Basics: Caching

Caching

Cache is a small, fast-access local memory that stores frequently accessed data.

Caching is a technique that improves data retrieval times, throughput, and compute costs by storing copies of frequently used application data in a smaller, faster memory layer.

Why Use Cache?

Cache is based on the principle of locality, meaning frequently accessed data is kept close to the system.

Types of Locality:

Temporal Locality: Data referenced recently is likely to be referenced again.
Space-Based Locality: Data items are stored close together in memory (e.g., data in an array).

How a Cache Works

When a request comes to the system, there can be two scenarios:

Cache Hit: The requested data is found in the cache.
Cache Miss: The requested data is not in the cache and must be fetched from the primary data store.

The performance of a cache is measured by the number of cache hits out of the total number of requests.

Types of Cache

Application Server Cache: A cache placed directly on an application server for local storage of response data. If the data exists locally on the server, it will return it. If not, it will query the data from the primary store.
- Can be located in memory (very fast) or on the node's local disk (faster than network storage).
- In distributed systems, this can lead to many cache misses because each instance has its own cache without information on other instances.
Distributed Cache: Each node has a part of the whole cache space. Each request can be routed to where the cache request could be found using a consistent hashing function.
- The cache is divided up using consistent hashing, and each request is routed to the node containing the response.
Global Cache: A single cache memory in front of the server instances, common for all instances. Fetches data from the primary store if not present in the cache.
- Adds latency but has better cache performance.
CDN (Content Distribution Network): Used for serving large amounts of static content (HTML, CSS, JavaScript, pictures, videos).
- Requests are initially directed to the CDN; if the data exists, it's returned. If not, the CDN queries the backend servers and caches the data locally.

Cache Writing Policies

A cache policy includes rules that define how the data will be loaded from cache memory.

Write-Through Caching: Data is written into the cache and the corresponding database simultaneously.
- Ensures fast retrieval and complete consistency.
- Every write operation is done twice which accounts for more latency.
Write-Around Caching: Data is written directly to the primary data store. The cache checks with the primary data store to keep itself in sync.
- The cache might be behind the primary data store.
- The primary data store is always consistent.
Write-Back Caching: Data is written first into the cache, and then into the primary data store.
- The primary storage is updated directly after the cache, or the cache memory is persisted after an entry removal (in case the cache memory was already full). In this scenario, the entry is tagged with a dirty bit to mark that the data is out of sync.
- Prone to data loss.
- Should only be used in write-heavy operations where write speed is very important.

Cache Eviction Policies

Cache eviction policies define rules that decide what data must be removed when the cache is full and a new entry is to be added.

A good replacement policy ensures the cached data is relevant to the application by optimizing for cache hits.

Cache eviction policy is defined on the basis of what data is stored in the cache.

Popular Cache Eviction Algorithms:

First In First Out (FIFO): Evicts the first entry in the cache, regardless of how many times it was called.
Least Recently Used (LRU): Evicts the least recently used items first.
Most Recently Used (MRU): Evicts the most recently used items first.
Least Frequently Used (LFU): Discards entries that are used least often, based on a count of how often an entry was read from cache.
Random Replacement (RR): Randomly selects a candidate item and discards it to make space when necessary.

Redis: Distributed Caching

Redis is an in-memory data store often used as a distributed cache.

Offers a variety of efficient data structures designed to allow fast access to your data.
Used as a caching mechanism in most distributed systems.
Redis has the option to also persist to a disk so the cache isn't lost if the server restarts and can be built in a cluster which spreads the cache across multiple servers.
Technically Redis can even be used as your primary database as well, although more often we use it as a cache to reduce the load on more feature-rich databases that are meant for persisting data.
Written in C, so it does not have multithreading capabilities like it's java based counterparts like Hazelcast.
Supports other data structures like sorted sets, hash sets, and a pub/sub mechanism. Extensible via Lua scripting.