System Design Trade-Offs

Description and Tags

Q&A flashcards to review the main system design trade-offs: polling, database modeling, messaging, concurrency, and performance vs. latency.


25 Terms

1
New cards

What are the main system design trade-offs?

  • Polling Schemes (Polling, Long Polling, SSE, WebSockets)

  • Messaging & Queues (Kafka, RabbitMQ, SQS, etc.)

  • Locking schemes (Conditional Writes, Redis, Status)

  • Database sharding (latency vs performance sharding)

  • Hot shards (vertical upsizing, fake mapping, reducing frequency of updates)

2
New cards

Rate Limiting Algorithm

Token Bucket, parameterized by the refill rate and the capacity of the bucket.

Explain that Redis holds the last timestamp and the total tokens left.

Every operation updates the last-accessed time along with the tokens left, and does this atomically using a Lua script.

Expire keys whose data is more than 1 hour old.
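The refill-and-check logic can be sketched in-process; a minimal Python sketch, assuming an illustrative `TokenBucket` class (in the real design this state lives in a Redis hash and the whole read-modify-write runs as one atomic Lua script):

```python
import time

class TokenBucket:
    """In-memory sketch of the per-key state a Redis hash would hold:
    the last-refill timestamp and the number of tokens remaining."""

    def __init__(self, capacity, refill_rate, now=None):
        self.capacity = capacity          # max tokens the bucket can hold
        self.refill_rate = refill_rate    # tokens added per second
        self.tokens = capacity
        self.last = time.monotonic() if now is None else now

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True   # request admitted
        return False      # rate limited
```

In Redis, the same read-modify-write would be a single `EVAL` of a Lua script per user key, with a TTL set on idle keys so stale buckets expire.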

3
New cards

How do you scale to a lot of users?

Consistent hashing along with replication of data.

Redis Cluster uses hash slots.
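A minimal sketch of consistent hashing with virtual nodes (the `HashRing` name is illustrative; Redis Cluster itself uses 16384 fixed hash slots assigned to nodes rather than a free ring, but the idea of moving only a small fraction of keys on membership change is the same):

```python
import bisect
import hashlib

def _hash(key):
    # Stable 64-bit position on the ring, independent of Python's salted hash().
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")

class HashRing:
    def __init__(self, nodes, vnodes=100):
        # Each physical node gets many virtual points for an even spread.
        self.ring = sorted((_hash(f"{n}#{i}"), n)
                           for n in nodes for i in range(vnodes))
        self.keys = [h for h, _ in self.ring]

    def node_for(self, key):
        # First ring position clockwise from the key's hash (wraps around).
        i = bisect.bisect(self.keys, _hash(key)) % len(self.ring)
        return self.ring[i][1]
```

Adding or removing a node moves only about 1/N of the keys; replicating each key to the next R distinct nodes on the ring gives the replication the card mentions.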

4
New cards

Minimize latency overhead

Connection pooling: long-lived TCP connections avoid repeated TCP handshakes.

Geographic distribution: place keys near where the user is (works for ~99% of users).

Aggregated queries and caches.

5
New cards

What does rate limiting teach us about maximizing availability?

Use a sharded Redis master-replica architecture along with asynchronous writes to maximize availability, since some data loss is acceptable.

6
New cards

What are the 4 polling schemes discussed?

  • Periodic polling

  • Long polling

  • Server-Sent Events (SSE)

  • WebSockets

7
New cards

What are the 5 concurrency/locking approaches?

  • Conditional Writes — DynamoDB feature that ensures an atomic write goes through only if a given condition is satisfied.

  • Database locks — useful when lock duration is short.

    • Limitation: only valid if the locking period is on the order of seconds.

  • Status table — a separate table or column tracks the status of operations.

  • Redis lock (Redlock) — distributed locking using Redis.

  • Optimistic concurrency control — uses version/timestamp; retries if conflicts occur.
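The first and last bullets can be sketched together: a version-checked write is the conditional write, and retry-on-conflict on top of it is optimistic concurrency control. A toy in-memory sketch (all names hypothetical):

```python
class VersionConflict(Exception):
    pass

class Store:
    """Toy key-value store with a per-record version counter."""
    def __init__(self):
        self.data = {}   # key -> (value, version)

    def read(self, key):
        return self.data.get(key, (None, 0))

    def write_if_version(self, key, value, expected_version):
        # The "conditional write": succeeds only if nobody else
        # committed a newer version since we read.
        _, current = self.read(key)
        if current != expected_version:
            raise VersionConflict(key)
        self.data[key] = (value, current + 1)

def increment_with_retry(store, key, retries=3):
    # Optimistic concurrency: read, attempt, and retry on conflict.
    for _ in range(retries):
        value, version = store.read(key)
        try:
            store.write_if_version(key, (value or 0) + 1, version)
            return
        except VersionConflict:
            continue  # another writer won; re-read and retry
    raise VersionConflict(key)
```

In DynamoDB the version check would be expressed as a `ConditionExpression` on the write rather than application-side code.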

8
New cards

What’s the difference between performance-oriented vs latency-oriented design?

  • Performance-Oriented: Optimized for throughput, can accept latency.

  • Latency-Oriented: Optimized for responsiveness, sacrifices throughput.

9
New cards

How do you shard data for latency vs to avoid hot shards?

  • Same-shard for latency: put related records together in one shard so queries are faster.

    • Con: this can create hot shards (too much load on one shard).

  • Uniform/random distribution: spread records evenly across shards.

    • Pro: avoids hot shards.

    • Con: queries may need to pull from multiple shards, which can increase latency.
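The two placements above can be sketched with hash-based shard assignment (`shard_same` and `shard_uniform` are hypothetical names):

```python
NUM_SHARDS = 4

def shard_same(user_id, num_shards=NUM_SHARDS):
    # All of a user's records land on one shard: single-shard queries,
    # but a very active user creates a hot shard.
    return hash(user_id) % num_shards

def shard_uniform(user_id, record_id, num_shards=NUM_SHARDS):
    # Records spread evenly across shards: no hot shard, but reading
    # one user's data becomes a scatter-gather across shards.
    return hash((user_id, record_id)) % num_shards
```

The shard key is the whole trade-off: include only the grouping key for latency, or the full record identity to spread load.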

10
New cards

What are the different messaging queues and their trade-offs?

Amazon SQS: a simple queue with a visibility timeout.

Redis Pub/Sub: a fire-and-forget approach, useful for chat applications and other notification systems.

Kafka: a queue with record history, useful for auditability and replayability.
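The SQS visibility timeout can be sketched in memory: a received message becomes invisible for a window and reappears for redelivery unless it is deleted (acknowledged) first. A minimal sketch with illustrative names:

```python
import time
import uuid

class VisibilityQueue:
    def __init__(self, visibility_timeout=30.0):
        self.timeout = visibility_timeout
        self.messages = {}   # message id -> (body, visible_at)

    def send(self, body):
        mid = str(uuid.uuid4())
        self.messages[mid] = (body, 0.0)   # visible immediately
        return mid

    def receive(self, now=None):
        now = time.monotonic() if now is None else now
        for mid, (body, visible_at) in self.messages.items():
            if visible_at <= now:
                # Hide the message until the visibility timeout elapses.
                self.messages[mid] = (body, now + self.timeout)
                return mid, body
        return None

    def delete(self, mid):
        # The "ack": only an explicit delete removes the message for good.
        self.messages.pop(mid, None)
```

If the consumer crashes before deleting, the message simply becomes visible again and is redelivered, which is the at-least-once guarantee.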

11
New cards

When designing notification system what question to ask interviewer

Clarify what kind of notification, web, sma, email push notification go towards a notification database.

12
New cards

How do you deal with a burst of writes, like a flood of likes?

Put a message queue in front of the database.

Apply a streaming aggregator to do bulk updates to the DB, short-circuiting individual writes to the message queue.
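The aggregator step can be sketched as a batch drain: count the queued like events per post and emit one bulk write per post instead of one write per like (function names are illustrative):

```python
from collections import Counter

def aggregate_likes(events, flush):
    """Drain a batch of like events and emit one bulk write per post.

    `events` stands in for messages drained from the queue; `flush`
    stands in for a bulk UPDATE ... SET likes = likes + n."""
    counts = Counter(post_id for post_id in events)
    for post_id, n in counts.items():
        flush(post_id, n)   # one DB write instead of n writes
    return counts
```

Run on a timer or batch-size trigger, this turns a burst of N likes on one post into a single row update.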

13
New cards

What clarifying questions should you ask for a rate-limiting system?

Fixed window: go with this.

Sliding window (heap): discuss this as the alternative solution.
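The sliding-window alternative can be sketched with a min-heap of request timestamps: evict entries older than the window, then admit the request only if the count is under the limit (the class name is illustrative):

```python
import heapq

class SlidingWindowLimiter:
    def __init__(self, limit, window):
        self.limit = limit     # max requests allowed per window
        self.window = window   # window length in seconds
        self.heap = []         # min-heap of admitted request timestamps

    def allow(self, now):
        # Evict timestamps that have slid out of the window.
        while self.heap and self.heap[0] <= now - self.window:
            heapq.heappop(self.heap)
        if len(self.heap) < self.limit:
            heapq.heappush(self.heap, now)
            return True
        return False
```

Compared with a fixed window, this avoids the boundary burst (2x the limit straddling a window edge) at the cost of storing one timestamp per admitted request.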

14
New cards

Clarifying question to ask when the user gives you their social media account

How does authentication happen?

15
New cards

Abstract Redis sorted sets.

Don't consider the case where all the data can't fit in a single shard.

In the worst case, assume hashing on score ID, using similar first-principles concepts.

16
New cards

Aggregator service and heaps?

Break down aggregation into a smaller component for microservice

Mentipn aggregation using hepas
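Heap-based aggregation can be sketched as merging partial counts from shards or workers and keeping only the top K via a heap rather than a full sort (a sketch with hypothetical names):

```python
import heapq
from collections import Counter

def top_k(partial_counts, k):
    """Merge per-shard partial counts and return the k heaviest items.

    Uses a heap (heapq.nlargest) so only k items are kept ordered,
    instead of sorting the entire merged count table."""
    totals = Counter()
    for counts in partial_counts:
        totals.update(counts)
    return heapq.nlargest(k, totals.items(), key=lambda kv: kv[1])
```

Each worker ships its local counts; the aggregator merges them and exposes only the top K, which is all a trending/leaderboard consumer needs.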

17
New cards

Database optimizations.

Source of ____

Reads

Writes

Make the database the source of truth.

Always try to ensure reads come from caches, and avoid reads from the database.

Always do aggregated bulk writes to the database by having an aggregator service.

18
New cards

Media Optimization Step

1) Initially send everything through a ___ REST request.

POST request, but there are size restrictions.

19
New cards

Media Optimization Part 2:

Pre-signed URL

Divide the input into chunks.

A metadata DB stores the S3 link along with its status.

20
New cards

Media Optimization Part 3:

Transcoding

Transcode into different formats asynchronously.

Trade-off between supporting more formats (and their cost) versus doing on-the-fly transcoding.

21
New cards

Media Optimization Part 4: old-data optimization.

Store old data in S3 Glacier.

22
New cards

What does the typeahead system tell us about numbers, and what does that tell us about the data layer?

Ask the interviewer whether the number looks right, and if it doesn't, assume you can always shard.

Always shard the data layer and explain how it is done.

23
New cards

Asynchronous system updates

Always have asynchronous systems create versioned updates in S3 or a database as the source of truth.

Then have an updater service whose job is to replicate this.

24
New cards

What does the typeahead system tell us about the data schema?

For the trie service, mention the data schema in detail.

25
New cards

Typeahead system personalization

Store the trie (top 1000), have a feature set saved for each user, and do a dot product.
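The personalization step can be sketched as a dot-product re-rank: each stored trie completion carries a feature vector, and completions are re-ordered by their dot product with the user's feature vector (all names are illustrative):

```python
def rerank(completions, user_features):
    """Re-rank trie completions for one user.

    completions: list of (text, feature_vector) from the stored top list;
    user_features: the user's vector, same length as each feature vector.
    A higher dot product means a better personal match and ranks first."""
    def score(item):
        _, feats = item
        return sum(u * f for u, f in zip(user_features, feats))
    return sorted(completions, key=score, reverse=True)
```

The trie serves the same global top-1000 list to everyone; only this cheap final scoring pass is per-user.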