In-Region Traffic Optimization and API Design Principles for Distributed Systems

Last updated 6:30 PM on 7/5/26

22 Terms

Keep traffic in-region

Try to handle an entire user request inside a single region so you avoid slow, failure-prone cross-region network hops.

Why avoid cross-region calls?

Each cross-region call typically adds on the order of 50-150 ms of round-trip latency between distant regions, plus extra data-transfer cost and more failure modes, so p95/p99 latency and reliability get worse.

Good use of cross-region traffic

Use cross-region links mainly for async replication, backups, and rare control-plane operations, not for every user request.

Bad use of cross-region traffic

Designs where a single user request synchronously calls multiple services in another region, stacking latency and coupling regions.

Single-region hot path principle

The 'hot path' for a request (login, gameplay, purchase) should be satisfiable with only in-region services and data.

Async replication between regions

Writes are committed in the local region first, then replicated asynchronously to other regions for read locality and DR.
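A minimal sketch of the local-commit-then-replicate idea, assuming an in-memory store per region. The `Region` class, its `outbox`, and `replicate_once` are hypothetical names for illustration, not a real replication API; a real system would ship the outbox over the network from a background worker.

```python
from collections import deque


class Region:
    """Toy region: a local key-value store plus an outbox of pending replication."""

    def __init__(self, name):
        self.name = name
        self.store = {}        # region-local data, served to local readers
        self.outbox = deque()  # writes waiting to be shipped to peer regions
        self.peers = []        # other Region objects (hypothetical wiring)

    def write(self, key, value):
        # Commit locally first; the caller does not wait for remote regions.
        self.store[key] = value
        self.outbox.append((key, value))

    def replicate_once(self):
        """Ship all pending writes to peers; returns how many writes were sent."""
        sent = 0
        while self.outbox:
            key, value = self.outbox.popleft()
            for peer in self.peers:
                peer.store[key] = value
            sent += 1
        return sent


us, eu = Region("us"), Region("eu")
us.peers.append(eu)
us.write("player:1", {"level": 3})
# Until replication runs, only the local region can see the write.
print("player:1" in eu.store)
us.replicate_once()
print(eu.store["player:1"])
```

The gap between the local commit and `replicate_once` is the replication lag: remote readers can briefly see stale data, which is the price of keeping the write path in-region.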

Region-local read pattern

Each region serves reads from its own local data (or replicas) instead of reaching across the world for every request.

Chatty cross-region anti-pattern

When one region calls another many times for one user action, causing multiple high-latency hops and fragile dependencies.
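The cost of a chatty design can be estimated with simple arithmetic: calls made one after another add their round trips together. The sketch below uses assumed numbers (100 ms cross-region, 2 ms in-region) purely for illustration.

```python
def stacked_latency_ms(round_trip_ms, sequential_calls):
    """Network latency added when calls run strictly one after another."""
    return round_trip_ms * sequential_calls


# Assumed, illustrative round-trip times; real values depend on the regions involved.
CROSS_REGION_RTT_MS = 100
IN_REGION_RTT_MS = 2

chatty = stacked_latency_ms(CROSS_REGION_RTT_MS, sequential_calls=5)
local = stacked_latency_ms(IN_REGION_RTT_MS, sequential_calls=5)
print(f"5 cross-region hops: {chatty} ms, 5 in-region hops: {local} ms")
```

Under these assumptions, five sequential cross-region calls add half a second before any real work is done, while the same five calls in-region add about 10 ms.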

Batching definition

Collect multiple small operations or messages and send/process them together in a single larger request or DB transaction.

Why batching helps

Batching amortizes per-request overhead (TLS, syscalls, auth, logging), increasing throughput and stabilizing tail latency.

Batching tradeoff

Bigger batches improve efficiency but add queueing delay, so you must balance latency vs throughput.

Batch size tuning

You typically cap batches by both time (e.g., 10-50ms window) and count (e.g., up to N items) to keep latency reasonable.
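One way to sketch a batcher capped by both time and count is shown below. The `Batcher` class and its parameters (`max_items`, `max_wait_s`, `clock`) are hypothetical; the clock is injectable only so the behavior can be checked without sleeping. A real implementation would also call `maybe_flush` from a timer so a quiet stream still flushes.

```python
import time


class Batcher:
    """Collects items and flushes when either the count cap or the time window is hit."""

    def __init__(self, flush, max_items=100, max_wait_s=0.05, clock=time.monotonic):
        self.flush = flush            # callback that receives a list of items
        self.max_items = max_items    # count cap
        self.max_wait_s = max_wait_s  # time cap, measured from the first queued item
        self.clock = clock
        self.items = []
        self.first_at = None

    def add(self, item):
        if not self.items:
            self.first_at = self.clock()  # the window starts with the first item
        self.items.append(item)
        self.maybe_flush()

    def maybe_flush(self):
        """Flush if the batch is full or its oldest item has waited long enough."""
        if not self.items:
            return
        full = len(self.items) >= self.max_items
        expired = self.clock() - self.first_at >= self.max_wait_s
        if full or expired:
            batch, self.items = self.items, []
            self.first_at = None
            self.flush(batch)


sent = []
batcher = Batcher(sent.append, max_items=3, max_wait_s=0.05)
for event in ["a", "b", "c"]:
    batcher.add(event)
print(sent)  # the third add hits the count cap and triggers a flush
```

The time cap bounds the queueing delay any single item can suffer; the count cap bounds batch size and memory.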

Partial failure in batching

When one item in a batch fails, you must choose whether to fail the whole batch or allow partial success with per-item status.
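Both choices can be sketched in one function. `process_batch`, its `handler` argument, and the `all_or_nothing` flag are hypothetical names; in all-or-nothing mode the sketch simply re-raises, and it is assumed the caller rolls back any side effects (for example by running the batch inside a DB transaction).

```python
def process_batch(items, handler, all_or_nothing=False):
    """Apply handler to each item and return a per-item status list."""
    results = []
    for item in items:
        try:
            results.append({"item": item, "ok": True, "value": handler(item)})
        except Exception as exc:
            if all_or_nothing:
                # Fail the whole batch; rollback is assumed to be the caller's job.
                raise
            # Partial success: record the failure and keep going.
            results.append({"item": item, "ok": False, "error": str(exc)})
    return results


statuses = process_batch([1, 0, 5], lambda x: 10 // x)
for status in statuses:
    print(status["item"], "ok" if status["ok"] else "failed")
```

Per-item status lets the client retry only the failed items, at the cost of a more complex response format.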

Compression definition

Compression shrinks payload size (e.g., gzip, Brotli, zstd) in exchange for extra CPU to compress and decompress.

When to use compression

Use compression for large, text-heavy payloads (JSON, logs, telemetry) where network cost dominates CPU cost.

When NOT to use compression

Avoid compression for tiny messages and already-compressed formats (images, video, zip files) where it adds overhead without benefit.
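The two rules above can be combined into a small guard, sketched here with Python's standard `gzip` module. The `maybe_compress` helper and the 1024-byte threshold are assumptions for illustration; a sensible threshold depends on the workload.

```python
import gzip
import json


def maybe_compress(payload: bytes, min_size: int = 1024):
    """Return (body, was_compressed); skip compression when it is unlikely to pay off."""
    if len(payload) < min_size:
        return payload, False       # tiny message: overhead outweighs savings
    compressed = gzip.compress(payload)
    if len(compressed) >= len(payload):
        return payload, False       # incompressible (e.g. already-compressed media)
    return compressed, True


# Repetitive JSON telemetry compresses well; a tiny message is sent as-is.
telemetry = json.dumps(
    [{"player_id": i, "event": "login"} for i in range(500)]
).encode()
body, used = maybe_compress(telemetry)
print(used, len(telemetry), len(body))
print(maybe_compress(b'{"ok":true}')[1])
```

The receiver needs to know whether the body was compressed; over HTTP this is normally signaled with the `Content-Encoding` header.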

Chatty protocol between services

Many small RPCs between services to fulfill one logical operation, creating latency stacking and complex dependencies.

Coarse-grained API

An API that exposes a higher-level operation (e.g., GetPlayerDashboard) returning everything needed in one call.

Why avoid chatty service calls

Chatty calls increase latency, coupling, and risk of cascading failures when any small service misbehaves.

Backend-for-Frontend (BFF) pattern

A BFF aggregates data from multiple internal services into one coarse-grained API tailored to a specific client or UI.
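A rough sketch of a BFF endpoint, assuming three internal services. `get_profile`, `get_inventory`, and `get_friends` are hypothetical stand-ins for in-region service clients; `get_player_dashboard` mirrors the GetPlayerDashboard example and fans out in parallel so the client makes one call instead of three.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical in-region service clients; real ones would make RPC or HTTP calls.
def get_profile(player_id):
    return {"id": player_id, "name": f"player-{player_id}"}


def get_inventory(player_id):
    return ["sword", "shield"]


def get_friends(player_id):
    return [2, 3]


def get_player_dashboard(player_id):
    """Coarse-grained call: gather everything the dashboard UI needs in one response."""
    with ThreadPoolExecutor(max_workers=3) as pool:
        # Fan out concurrently so total latency is close to the slowest call,
        # not the sum of all three.
        profile = pool.submit(get_profile, player_id)
        inventory = pool.submit(get_inventory, player_id)
        friends = pool.submit(get_friends, player_id)
        return {
            "profile": profile.result(),
            "inventory": inventory.result(),
            "friends": friends.result(),
        }


print(get_player_dashboard(7))
```

The fan-out still happens, but it moves from the client's slow link to fast in-region calls behind the BFF.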

Materialized view for reads

Precomputed, read-optimized data (e.g., player snapshot) used so one read can serve many UI fields without multiple service calls.
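As an illustration, a player snapshot can be kept up to date by applying small write events as they arrive, so the read side is a single lookup. The `PlayerSnapshotView` class and the event shapes below are assumptions, not a prescribed schema.

```python
class PlayerSnapshotView:
    """Toy materialized view: small events update a precomputed per-player snapshot."""

    def __init__(self):
        self.snapshots = {}

    def apply(self, event):
        # Each write is small and touches only the fields it changes.
        snap = self.snapshots.setdefault(
            event["player_id"], {"level": 1, "coins": 0, "last_match": None}
        )
        if event["type"] == "coins_earned":
            snap["coins"] += event["amount"]
        elif event["type"] == "level_up":
            snap["level"] = event["level"]
        elif event["type"] == "match_finished":
            snap["last_match"] = event["match_id"]

    def read(self, player_id):
        # One read serves every UI field; no fan-out to other services.
        return self.snapshots.get(player_id)


view = PlayerSnapshotView()
view.apply({"player_id": 1, "type": "coins_earned", "amount": 50})
view.apply({"player_id": 1, "type": "level_up", "level": 4})
view.apply({"player_id": 1, "type": "match_finished", "match_id": "m-42"})
print(view.read(1))
```

The tradeoff is that the snapshot is only as fresh as the last applied event, so the view may briefly lag the source of truth.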

Many-small-writes, few-rich-reads pattern

Services keep writes small and localized, but expose richer, aggregated read APIs to avoid chatty request fan-out.