What is the token bucket algorithm?
Bucket holds tokens up to max capacity, refilled at a fixed rate. Each request consumes one token. No tokens = request dropped. Pro: simple, memory efficient, allows short bursts. Con: hard to tune bucket size and refill rate correctly.
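The refill-then-consume logic can be sketched in a few lines of Python (class and parameter names are illustrative, not from the source):

```python
import time

class TokenBucket:
    """Hypothetical sketch: refills at `refill_rate` tokens/sec, capped at `capacity`."""
    def __init__(self, capacity, refill_rate):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.tokens = float(capacity)   # bucket starts full
        self.last_refill = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill proportionally to elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last_refill) * self.refill_rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1            # each request consumes one token
            return True
        return False                    # no tokens left: drop the request
```

Because the bucket starts full, a short burst up to `capacity` requests is allowed immediately — the "pro" noted above — while sustained traffic is held to `refill_rate`.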
What is the leaking bucket algorithm?
Requests are added to a FIFO queue and processed at a fixed outflow rate. Queue full = request dropped. Pro: stable outflow rate, memory efficient. Con: a burst fills the queue with old requests, so newer requests get dropped.
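A minimal sketch of the queue-and-drain behavior (names are illustrative assumptions; real implementations drain the queue from a separate worker rather than lazily on each call):

```python
import time
from collections import deque

class LeakyBucket:
    """Hypothetical sketch: FIFO queue drained at `outflow_rate` requests/sec."""
    def __init__(self, queue_size, outflow_rate):
        self.queue = deque()
        self.queue_size = queue_size
        self.outflow_rate = outflow_rate
        self.last_leak = time.monotonic()

    def allow(self, request):
        now = time.monotonic()
        # Lazily drain requests that would have been processed since last check.
        leaked = int((now - self.last_leak) * self.outflow_rate)
        if leaked:
            for _ in range(min(leaked, len(self.queue))):
                self.queue.popleft()
            self.last_leak = now
        if len(self.queue) < self.queue_size:
            self.queue.append(request)
            return True
        return False  # queue full: drop the new request
```

The sketch shows the con directly: during a burst, old requests occupy the queue, and the new request is the one rejected.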
What is the fixed window counter algorithm?
Divides time into fixed windows, each with its own counter. Each request increments the current window's counter; once the counter exceeds the threshold, requests are dropped. Pro: simple, memory efficient. Con: a burst at window edges can let through 2x the allowed requests.
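The per-window counter can be sketched as follows (a simplified illustration with assumed names; timestamps are passed in explicitly to make the windowing visible):

```python
from collections import defaultdict

class FixedWindowCounter:
    """Hypothetical sketch: one counter per fixed time window."""
    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.counters = defaultdict(int)

    def allow(self, now):
        key = int(now // self.window)       # which window this request falls in
        if self.counters[key] < self.limit:
            self.counters[key] += 1
            return True
        return False                        # window's quota exhausted
```

The edge-burst con follows from the integer division: requests at the end of one window and the start of the next hit two separate counters, so up to 2x the limit can pass in a span shorter than one window.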
What is the sliding window log algorithm?
Tracks exact timestamps of requests in a rolling window. Outdated timestamps removed on each new request. Pro: very accurate, no edge burst problem. Con: memory intensive — rejected request timestamps still stored.
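A minimal sketch of the log-based check (illustrative names; note the line that stores the timestamp even when the request is rejected, which is the memory cost called out above):

```python
from collections import deque

class SlidingWindowLog:
    """Hypothetical sketch: exact timestamps kept for a rolling window."""
    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.log = deque()

    def allow(self, now):
        # Evict timestamps that have fallen out of the rolling window.
        while self.log and self.log[0] <= now - self.window:
            self.log.popleft()
        self.log.append(now)   # stored even if this request is rejected
        return len(self.log) <= self.limit
```

Because the window slides continuously with each request, there is no fixed boundary to burst across — the accuracy "pro" — at the cost of one stored timestamp per request.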
What is the sliding window counter algorithm?
Hybrid of fixed window and sliding window log. Formula: current window requests + previous window requests × overlap percentage. Pro: smooths spikes, memory efficient. Con: approximate, not exact — assumes requests were evenly distributed across the previous window.
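The formula above can be sketched with just two counters (a simplified illustration under assumed names; `overlap` is the fraction of the previous window still inside the rolling window):

```python
class SlidingWindowCounter:
    """Hypothetical sketch: rolling-window estimate from two fixed counters."""
    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.prev_count = 0
        self.curr_count = 0
        self.curr_window = 0

    def allow(self, now):
        window_id = int(now // self.window)
        if window_id != self.curr_window:
            # Roll forward; if more than one full window passed, prev is 0.
            self.prev_count = self.curr_count if window_id == self.curr_window + 1 else 0
            self.curr_count = 0
            self.curr_window = window_id
        # Fraction of the previous window overlapping the rolling window.
        overlap = 1 - (now % self.window) / self.window
        estimated = self.curr_count + self.prev_count * overlap
        if estimated < self.limit:
            self.curr_count += 1
            return True
        return False
```

Weighting `prev_count` by the overlap is exactly where the even-distribution assumption enters: the estimate is exact only if the previous window's requests were spread uniformly.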
What might Step 1 clarifying questions look like for a rate limiter?
Client-side or server-side? Throttle by IP, user ID, or other properties? Distributed environment? Separate service or in application code? Do we inform users when throttled?
What are the requirements after Step 1 for a rate limiter?
Accurately limit excessive requests. Low latency — don't slow HTTP response. Memory efficient. Distributed — shared across servers. Clear exceptions shown to throttled users. Fault tolerant — cache failure shouldn't take down the system.
Give a brief overview of each step for designing a rate limiter.
Step 1: Clarify — client vs server, throttle by IP or user, distributed? Step 2: High-level — where to place the limiter, which algorithm, Redis for counters. Step 3: Deep dive — rules storage, what happens when the limit is hit, race conditions, synchronization, monitoring. Step 4: Wrap up — layer 4 vs layer 7, client best practices.
What are the two challenges of rate limiting in a distributed environment?
(1) Race condition — two servers read same counter simultaneously, both increment, final count is wrong. Fix with Redis atomic operations. (2) Synchronization — multiple rate limiter servers don't share state. Fix with centralized Redis store instead of local memory.
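To illustrate why the atomic fix works, here is an in-memory stand-in (hypothetical names; the lock plays the role of Redis executing each command atomically on a single thread — with real Redis you would use `INCR` or a Lua script instead):

```python
import threading

class AtomicCounterStore:
    """Hypothetical stand-in for a centralized Redis counter store."""
    def __init__(self):
        self._counts = {}
        self._lock = threading.Lock()

    def incr_and_check(self, key, limit):
        # Without the lock, two servers could both read 9, both write 10,
        # and let an 11th request through a limit of 10.
        with self._lock:
            self._counts[key] = self._counts.get(key, 0) + 1
            return self._counts[key] <= limit
```

Sharing one such store across all rate limiter servers also addresses the synchronization problem: every server reads and writes the same counters instead of keeping local state.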
How are rate limiting rules stored and accessed?
Rules are config files stored on disk. Background workers periodically pull rules from disk and load them into cache. Middleware reads from cache at request time — not disk — to keep latency low.
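As an illustration, such rules are often written as YAML config files — this shape is modeled on Lyft's open-source ratelimit service, and the field names and values here are assumptions, not from the source:

```yaml
# Hypothetical rule: a client may send at most 5 marketing messages per day.
domain: messaging
descriptors:
  - key: message_type
    value: marketing
    rate_limit:
      unit: day
      requests_per_unit: 5
```

Workers would reload files like this into the cache on a schedule, so rule changes take effect without redeploying the middleware.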