Bloomberg System Design

Description and Tags

Pray I get this please GOD

79 Terms

1
New cards

scalability

the ability of a system to handle increased workload without degradation in performance or efficiency

2
New cards

performance

ability to deliver fast responses with low latency and to handle many requests efficiently

3
New cards

If you have a performance problem…

your system is slow for a single user

4
New cards

If you have a scalability problem…

your system is fast for a single user but slow under a heavy load

5
New cards

latency

time to perform some action or to produce some result

6
New cards

throughput

number of such actions or results per unit of time

7
New cards

Generally, should aim for maximal ___ with acceptable ___

throughput, latency

8
New cards

consistency

every read receives the most recent write or an error

9
New cards

availability

every request receives a response, without guarantee that it contains the most recent version of the information

10
New cards

partition tolerance

system continues to operate despite arbitrary partitioning due to network failures

11
New cards

Networks are not reliable, so every system needs to support ___; you’ll need to make a software tradeoff between ___ and ___

partition tolerance, consistency, availability

12
New cards

consistency and partition tolerance (CP)

guarantees that all clients see the same, most up-to-date data even during network partitions by potentially rejecting requests or timing out instead of returning stale or conflicting data

13
New cards

availability and partition tolerance (AP)

guarantees that every request receives a response even during network partitions by allowing nodes to return potentially stale or divergent data that will be reconciled later

14
New cards

CP is a good choice when…

business needs require atomic reads and writes

15
New cards

AP is a good choice when…

business needs allow for eventual consistency or when the system needs to continue working despite external errors

16
New cards

atomic reads

read operation that returns a single, indivisible, fully completed value; never a partial, intermediate, or mixed result of a write

17
New cards

atomic writes

write operation that is applied as a single, indivisible operation; either the entire update is successfully committed or none of it is, with no partial effects visible to any reader

18
New cards

weak consistency

no guarantee that a read will return the most recent write; reads may return outdated or incomplete data

19
New cards

eventual consistency

guarantees that if no new writes are made, all replicas will eventually converge to the same value; reads may temporarily return stale data but eventually all reads reflect latest write

20
New cards

strong consistency

guarantee that every read immediately reflects the most recent write; all clients see the same, up-to-date data

21
New cards

weak consistency is seen in…

memcached, video chat, realtime multiplayer games

22
New cards

eventual consistency is seen in…

DNS, email

23
New cards

eventual consistency works well with…

highly available systems

24
New cards

strong consistency is seen in…

file systems, RDBMSes

25
New cards

strong consistency works well in…

systems that need transactions

26
New cards

two complementary patterns to support high availability are…

fail-over and replication

27
New cards

fail-over

responsibility for a service automatically switches from a failed component to a standby component to minimize downtime

28
New cards

replication

maintaining multiple copies of same data across different nodes to improve availability, fault tolerance, or performance

29
New cards

active-passive

fail-over setup where one server handles all traffic (active) while another remains on standby and takes over only if the active server fails (passive)

30
New cards

active-active

multiple servers simultaneously handle traffic and share the workload, continuing service if one fails

31
New cards

disadvantages of fail-over

  1. hardware overhead: requires additional machines that may sit idle or underutilized

  2. data loss from replication lag: if the active node fails before recent writes are replicated, those writes may be lost

  3. operational complexity: fail-over systems require careful coordination, monitoring, and recovery logic

32
New cards

availability in sequence

all components must be operational for the system to work; failure of any component causes system failure

33
New cards

availability in parallel

system works as long as at least one component is operational

34
New cards

heartbeat

periodic signal exchanged between servers to confirm the active server is still functioning
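
A heartbeat is typically paired with fail-over: the standby watches for heartbeats and promotes itself once too many are missed. Below is a minimal Python sketch of that detection logic; the interval and miss threshold are illustrative assumptions, not values from these cards.

```python
import time

HEARTBEAT_INTERVAL = 1.0   # expected seconds between heartbeats (assumed value)
MISS_THRESHOLD = 3         # missed heartbeats before declaring the active node dead (assumed)

class StandbyNode:
    """Passive server that promotes itself if heartbeats stop arriving."""

    def __init__(self):
        self.last_heartbeat = time.monotonic()
        self.is_active = False

    def on_heartbeat(self):
        # Called each time a heartbeat message arrives from the active server.
        self.last_heartbeat = time.monotonic()

    def check_failover(self):
        # Promote the standby if the active server has been silent for too long.
        silence = time.monotonic() - self.last_heartbeat
        if not self.is_active and silence > HEARTBEAT_INTERVAL * MISS_THRESHOLD:
            self.is_active = True
            print("active server presumed down; standby taking over")

standby = StandbyNode()
standby.on_heartbeat()     # heartbeat received in time
standby.check_failover()   # still within the threshold, so no promotion yet
```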

35
New cards

hot standby

passive server that is already running and synchronized, allowing near-immediate takeover

36
New cards

cold standby

passive server that is powered off or not fully initialized and must start up before taking over, causing longer downtime

37
New cards

master-slave replication

one master node handles writes and propagates changes to one or more read-only slave nodes

38
New cards

master-master replication

multiple nodes can accept writes and synchronize changes between each other

39
New cards

uptime

amount of time a system is functioning correctly

40
New cards

downtime

amount of time a system is unavailable due to failures or maintenance

41
New cards

number of 9s

shorthand for describing availability percentages by the count of leading nines (e.g., 99.99% is "four 9s")

42
New cards

99.9% availability (three 9s)

system that may be unavailable for up to ~8.8 hours per year

43
New cards

99.99% availability (four 9s)

system that may be unavailable for up to ~53 minutes per year
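
These figures follow directly from the allowed unavailability multiplied by the hours in a 365-day year (8,760 hours); a quick sanity check:

(1 - 0.999) * 8,760 hours ≈ 8.76 hours of downtime per year (three 9s)

(1 - 0.9999) * 8,760 hours ≈ 0.876 hours ≈ 52.6 minutes of downtime per year (four 9s)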

44
New cards

sequential availability formula

Availability (Total) = Availability (1) * Availability (2) * … * Availability (n)

45
New cards

parallel availability formula

Availability (Total) = 1 - (1 - Availability (1)) * (1 - Availability (2)) * … * (1 - Availability (n))
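
A small Python sketch applying both formulas to two example components at 99.9% and 99.99% availability (illustrative numbers, not from these cards):

```python
def sequential_availability(availabilities):
    """All components must be up, so multiply the individual availabilities."""
    total = 1.0
    for a in availabilities:
        total *= a
    return total

def parallel_availability(availabilities):
    """At least one component must be up: 1 minus the product of the failure probabilities."""
    total_failure = 1.0
    for a in availabilities:
        total_failure *= (1.0 - a)
    return 1.0 - total_failure

components = [0.999, 0.9999]                # example availabilities (assumed values)
print(sequential_availability(components))  # ~0.99890 -> lower than either component alone
print(parallel_availability(components))    # ~0.9999999 -> higher than either component alone
```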

46
New cards

domain name system (DNS)

distributed naming system that maps human-readable domain names (www.example.com) to machine-readable IP addresses
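
The lookup itself can be reproduced with Python's standard library; www.example.com is just an illustrative hostname, and the call needs network access:

```python
import socket

# Resolve a human-readable domain name to a machine-readable IPv4 address.
ip_address = socket.gethostbyname("www.example.com")
print(ip_address)  # prints the resolved IPv4 address
```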

47
New cards

time to live (TTL)

value that specifies how long a response may be cached before it’s refreshed

48
New cards

managed DNS service

third-party service that operates DNS infrastructure on behalf of domain owners, handling scalability, availability, and configuration

49
New cards

examples of managed DNS services

CloudFlare, Route 53

50
New cards

DNS disadvantages

  1. time delay introduced when resolving domain name to IP address (if not cached)

  2. DNS servers are managed by governments, ISPs, and large companies, so businesses have little control over them

  3. DNS is vulnerable to distributed denial-of-service (DDoS) attacks, which can prevent domain name resolution even when application servers are healthy

51
New cards

content delivery network (CDN)

globally distributed network of proxy servers providing content from locations closer to user

52
New cards

What type of content do CDNs serve?

static files (HTML, CSS, JS), photos, videos

53
New cards

CDN benefits

  1. Users receive content from data centers close to them

  2. Your servers do not have to serve requests that CDN fulfills

54
New cards

push CDNs

content is proactively uploaded to CDN servers by the content owner whenever it is created or updated

55
New cards

pull CDNs

content is fetched from the origin server automatically when a user first requests it

56
New cards

push CDNs best use case

websites with relatively low traffic or infrequently updated static content

57
New cards

pull CDNs best use case

websites with high traffic and frequently accessed content

58
New cards

content expiration (push CDNs)

rules that define when cached content should be considered invalid and replaced with a newer version

59
New cards

store-efficient caching (pull CDNs)

a characteristic of pull CDNs where only recently requested content is stored on CDN servers

60
New cards

CDN disadvantages

  1. costs can be significant depending on bandwidth usage, request volume, and storage

  2. users could receive outdated content because cached copies have not yet expired or been invalidated

  3. CDNs require changing URLs for static content to point to the CDN

61
New cards

load balancer

system that distributes incoming client requests across multiple backend servers and returns responses to the correct client

62
New cards

load balancer advantages

  1. prevents requests from going to unhealthy servers

  2. prevents overloading resources

  3. helps eliminate single point of failure

63
New cards

layer 4 load balancing

makes routing decisions using IP addresses and ports without inspecting packet contents

64
New cards

layer 7 load balancing

makes routing decisions based on application data such as HTTP headers, URLs, cookies, or request content

65
New cards

layer 4 vs layer 7

layer 4 is faster and simpler with lower overhead, but less flexible

layer 7 is more flexible and intelligent, but adds complexity and resource usage
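
A toy sketch of the difference in routing decisions; the backend names and path rules are made up for illustration:

```python
import itertools

backends = ["app-1", "app-2", "app-3"]     # hypothetical backend servers
round_robin = itertools.cycle(backends)

def layer4_route(client_ip: str, client_port: int) -> str:
    # Layer 4: the decision uses only connection info (here a hash of IP and port);
    # the request contents are never inspected.
    return backends[hash((client_ip, client_port)) % len(backends)]

def layer7_route(path: str) -> str:
    # Layer 7: the decision can inspect application data such as the URL path.
    if path.startswith("/api/"):
        return "app-1"            # e.g. API traffic pinned to one backend (assumed rule)
    if path.startswith("/static/"):
        return "app-2"            # static assets sent elsewhere (assumed rule)
    return next(round_robin)      # everything else round-robins

print(layer4_route("203.0.113.7", 52311))
print(layer7_route("/api/orders"))
```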

66
New cards

load balancer caveat for sessions

sessions must be stored in shared systems like database or distributed caches (Redis, memcached)

67
New cards

load balancer disadvantages

  1. performance bottleneck if under-resourced or misconfigured

  2. increased complexity

  3. single load balancer is a single point of failure but configuring multiple load balancers adds complexity

68
New cards

reverse proxy

server that sits between external clients and one or more backend servers, receives client requests, forwards them to the appropriate backend server, and returns the backend’s response to the client

69
New cards

key property of reverse proxy

clients never communicate directly with backend servers; they only interact with the reverse proxy (protects servers)

70
New cards

IP blacklisting

practice of blocking incoming requests from specific client IP addresses at the proxy layer before they reach backend servers

71
New cards

forward proxy vs reverse proxy vs load balancer

  1. forward proxy - server that sits between clients and the internet, forwarding client requests to external servers on the clients’ behalf; represents the client; the destination server does not know the real client and only sees the forward proxy

  2. reverse proxy - server that sits between clients and backend servers, forwarding incoming client requests to internal services and returning their responses; represents the server; the client does not know the real backend server and only sees the reverse proxy; can be used with even a single server

  3. load balancer - type of reverse proxy whose main purpose is routing client requests to the appropriate servers; does not necessarily serve to protect the backend, although that is a byproduct; does not necessarily provide caching

72
New cards

reverse proxy disadvantages

  1. single point of failure

  2. increased complexity

73
New cards

application layer

system layer that implements business logic, data processing, and domain-specific functionality, typically exposed through APIs

74
New cards

asynchronous processing

processing model where tasks are executed independently of the request-response lifecycle, allowing long-running or background work to proceed without blocking client requests

75
New cards

application workers

background processes in the application layer that consume tasks from queues (e.g., jobs, events, messages) and execute them asynchronously
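
A minimal sketch of the worker pattern using only the standard library; the send_email task is a made-up example job:

```python
import queue
import threading

task_queue = queue.Queue()

def worker():
    # Application worker: consume tasks from the queue and run them
    # outside the request/response cycle.
    while True:
        task = task_queue.get()
        if task is None:               # sentinel used to shut the worker down
            break
        name, payload = task
        print(f"processing {name}: {payload}")
        task_queue.task_done()

threading.Thread(target=worker, daemon=True).start()

# A request handler would just enqueue the job and return immediately.
task_queue.put(("send_email", {"to": "user@example.com"}))
task_queue.join()                      # wait for outstanding tasks (demo only)
```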

76
New cards

microservices

architectural style where an application is composed of a collection of small, independently deployable services, each responsible for a specific business capability and communicating via lightweight protocols

77
New cards

service discovery

mechanism that enables services to dynamically locate and communicate with each other by resolving service names to network locations

78
New cards

service registry

centralized system that maintains a real-time list of available services, their network locations, and their health status
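
A minimal in-memory sketch combining the two ideas: services register their locations, and discovery resolves a name to one healthy instance. The service names and addresses are illustrative.

```python
import random

class ServiceRegistry:
    """Maps service names to the network locations of registered instances."""

    def __init__(self):
        self.services = {}                      # name -> {address: healthy?}

    def register(self, name, address):
        self.services.setdefault(name, {})[address] = True

    def mark_unhealthy(self, name, address):
        self.services[name][address] = False

    def discover(self, name):
        # Service discovery: resolve a service name to one healthy instance.
        healthy = [addr for addr, ok in self.services.get(name, {}).items() if ok]
        if not healthy:
            raise LookupError(f"no healthy instances of {name}")
        return random.choice(healthy)

registry = ServiceRegistry()
registry.register("orders", "10.0.0.5:8080")    # hypothetical instances
registry.register("orders", "10.0.0.6:8080")
print(registry.discover("orders"))
```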

79
New cards

application layer disadvantages

  1. coordinating multiple independent services requires careful design of interfaces, data consistency, and communication patterns

  2. microservices increase the difficulty of deployment, monitoring, debugging, scaling, and failure recovery because many services run independently