Power System Design & Practice
Page 1: System Design Basics
Problem Decomposition
Break the problem down into simpler modules (top-down approach).
Discuss trade-offs: no solution is perfect, so evaluate each option against the constraints and end use cases.
Interview Focus
Understand the interviewer’s intentions.
Ask high-level questions about constraints and functional requirements.
Bottlenecks
Identify potential bottlenecks in the system.
Page 2: Architectural Insights
Resources Overview
Understand the architectural pieces and resources available.
Explore how these resources work together.
Utilization and Trade-offs
Explore consistent hashing, CAP theorem, load balancing, queues, caching, replication, SQL vs. NoSQL, indexes, proxies, and data partitioning.
Page 3: Load Balancing
Distributed Systems
Types of load distribution:
Random
Round-robin, optionally weighted by each server's memory and CPU capacity.
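The two distribution types above can be sketched in a few lines. This is a minimal illustration, not a production balancer; the server names and weights are hypothetical, and the weights stand in for relative memory/CPU capacity.

```python
import itertools
import random

# Hypothetical server pool: name -> weight (relative CPU/memory capacity).
SERVERS = {"app-1": 3, "app-2": 1, "app-3": 2}


def random_choice(servers):
    """Random distribution: pick any server with equal probability."""
    return random.choice(list(servers))


def weighted_round_robin(servers):
    """Cycle through servers, each appearing `weight` times per cycle."""
    expanded = [name for name, weight in servers.items() for _ in range(weight)]
    return itertools.cycle(expanded)


picker = weighted_round_robin(SERVERS)
first_cycle = [next(picker) for _ in range(6)]
print(first_cycle)
```

In one full cycle of six requests, app-1 receives three, app-3 two, and app-2 one.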
Scalability
Achieve full scalability and redundancy by balancing load at each layer: between users and web servers, between web servers and app servers, and in front of internal cache servers.
Page 4: Smart Clients
Functionality
Smart clients balance load among service hosts and detect non-responsive hosts.
They also detect recovered hosts and handle newly added ones. The approach applies to load balancing for databases, caches, and services.
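A rough sketch of the smart-client behavior described above, under stated assumptions: the class name, the `send` and `ping` callables, and the host names are all hypothetical, and a real client would also need timeouts and thread safety.

```python
import itertools


class SmartClient:
    """Client-side load balancer over a pool of service hosts (hypothetical API)."""

    def __init__(self, hosts, send):
        self.hosts = list(hosts)
        self.send = send            # callable(host, payload); raises ConnectionError on failure
        self.down = set()           # hosts currently believed to be unresponsive
        self._counter = itertools.count()

    def add_host(self, host):
        if host not in self.hosts:
            self.hosts.append(host)

    def request(self, payload):
        """Try hosts in round-robin order, skipping ones marked as down."""
        start = next(self._counter)
        for i in range(len(self.hosts)):
            host = self.hosts[(start + i) % len(self.hosts)]
            if host in self.down:
                continue
            try:
                return self.send(host, payload)
            except ConnectionError:
                self.down.add(host)     # stop sending requests to this host
        raise RuntimeError("no healthy hosts available")

    def probe(self, ping):
        """Re-check down hosts; ping(host) returns True once a host has recovered."""
        self.down = {host for host in self.down if not ping(host)}


def fake_send(host, payload):           # stand-in transport for the example
    if host == "svc-b":
        raise ConnectionError(host)
    return f"{host} handled {payload}"


client = SmartClient(["svc-a", "svc-b", "svc-c"], fake_send)
results = [client.request(n) for n in range(4)]
```

After the first failure, svc-b is skipped until `probe` reports it as recovered.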
Scalability Options
Attractive to developers in small-scale systems because it looks easy to implement and manage; as the system grows, load balancing usually needs to move to standalone servers.
Consider hardware load balancers (high performance, costly, not trivial to configure).
Page 5: Software Load Balancers
Advantages
No need to build smart clients or pay for dedicated hardware.
A software load balancer such as HAProxy is a hybrid approach that manages requests efficiently.
Deployment Scenarios
Running on the client machine itself (e.g., bound to localhost).
Running on an intermediate server that performs health checks and balances requests.
Page 6: Databases Overview
Types of Databases
SQL (relational): structured data with a predefined schema.
NoSQL (non-relational): unstructured data with a dynamic schema.
Data Storage Approaches
Relational databases store data in rows and columns; NoSQL databases store separate data points under a dynamic schema.
Relational examples: MySQL, Oracle, Postgres, MariaDB.
Page 9: Reasons to Use ACID Compliant Databases
Importance of ACID Compliance
Reduces anomalies and maintains database integrity, critical for e-commerce and financial applications.
Data Stability
Suitable for structured, stable data that is not growing or changing rapidly.
Page 10: Reasons to Use NoSQL
Performance Solutions
Avoids bottlenecks when querying and searching large volumes of data.
Cost-effectiveness
Good for storing large volumes of data with little or no structure on cloud and commodity hardware.
Well suited to rapid, agile development because the schema can change as the application evolves.
Page 11: CAP Theorem Insights
Consistency
All nodes see the same data at the same time, typically by updating several nodes before allowing further reads.
Availability
Every request receives a response indicating success or failure.
Partition Tolerance
The system continues to operate despite message loss or partial failure.
Page 12: Limitations of Designing a Datastore
Challenges
A distributed datastore cannot guarantee all three of continuous availability, sequential consistency, and partition tolerance at once; when a network partition occurs, it must give up either consistency or availability.
Page 13: Redundancy & Replication
Benefits
Increases system reliability through duplication of critical data and services.
Provides backups and protects against single-node failures.
Page 14: Caching Fundamentals
Load Balancing and Caching
Load balancing enables horizontal scaling; caching makes better use of existing resources and reduces data-retrieval latency.
Cache strategies include global and distributed caches.
Page 15: Distributed Caching
Consistent Hashing
Consistent hashing distributes keys across cache nodes, so each node owns part of the cached data.
The main challenge is a missing node: requests for the keys it owned must be handled some other way.
Page 16: Centralized Cache Challenges
Challenges
A single cache space is simple to reason about but can be overwhelmed when usage spikes.
A global cache gives all nodes one shared cache space, which streamlines data retrieval but requires an effective eviction strategy.
Page 17: CDN (Content Distribution Networks)
Purpose
Serves large amounts of static media for sites.
If the requested media is not available on the CDN, it is fetched from the back-end servers, cached, and then served.
Page 18: Cache Invalidation Strategies
Need for Coherence
Cached data must remain coherent with database modifications.
Different strategies: Write-through, Write-around, Write-back caching.
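The three write strategies can be contrasted in a small sketch. The `Store` and `Cache` classes are hypothetical stand-ins (a dict plays the database), intended only to show where each strategy writes.

```python
class Store:
    """Stand-in for the database; counts writes for illustration."""

    def __init__(self):
        self.data = {}
        self.writes = 0

    def write(self, key, value):
        self.data[key] = value
        self.writes += 1


class Cache:
    def __init__(self, store):
        self.store = store
        self.entries = {}
        self.dirty = set()      # keys written to the cache but not yet to the store

    def write_through(self, key, value):
        # Update cache and store together: consistent, but every write pays store latency.
        self.entries[key] = value
        self.store.write(key, value)

    def write_around(self, key, value):
        # Write only to the store and drop any stale cached copy; the next read misses.
        self.store.write(key, value)
        self.entries.pop(key, None)

    def write_back(self, key, value):
        # Write only to the cache; data is lost if the cache fails before flush().
        self.entries[key] = value
        self.dirty.add(key)

    def flush(self):
        for key in list(self.dirty):
            self.store.write(key, self.entries[key])
        self.dirty.clear()

    def read(self, key):
        if key not in self.entries:             # cache miss: load from the store
            self.entries[key] = self.store.data[key]
        return self.entries[key]


store = Store()
cache = Cache(store)
cache.write_through("a", 1)
cache.write_around("b", 2)
cache.write_back("c", 3)
```

At this point "c" exists only in the cache; it reaches the store when `flush()` runs.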
Page 19: Cache Eviction Policies
Policies
Policies include FIFO, LIFO (also called FILO), LRU, and Random Replacement.
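As one example of these policies, LRU (least recently used) can be sketched with the standard library's `OrderedDict`. The class name and capacity are illustrative choices, not taken from the notes.

```python
from collections import OrderedDict


class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()      # oldest entry first, newest last

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)     # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)   # evict the least recently used entry


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")          # "a" is now more recent than "b"
cache.put("c", 3)       # over capacity: evicts "b"
print(list(cache.items))
```

Changing `popitem(last=False)` to `popitem(last=True)` before inserting the new key would give a LIFO-style policy instead.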
Page 20: Sharding and Data Partitioning
Understanding Sharding
Splitting a database or table across multiple machines to improve manageability, performance, and availability.
Methods include horizontal and vertical partitioning.
Page 21: Vertical Partitioning
Use Cases
Partition by feature, e.g., storing user info, followers, and photos on different servers.
Page 22: Partitioning Criteria
Methods of Partitioning
Key or hash-based partitioning.
List partitioning.
Round robin partitioning.
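The three criteria above might look like this in code. The shard count, the use of MD5 as the hash, and the region-to-shard table are assumptions made for the example.

```python
import hashlib
import itertools

NUM_SHARDS = 4      # assumed shard count


def hash_partition(key, num_shards=NUM_SHARDS):
    """Key/hash-based: hash the key and take it modulo the shard count."""
    digest = hashlib.md5(str(key).encode()).hexdigest()
    return int(digest, 16) % num_shards


# List partitioning: each shard owns an explicit list of values (hypothetical regions).
REGION_TO_SHARD = {"Iceland": 0, "Norway": 0, "Sweden": 0, "India": 1, "Japan": 2}


def list_partition(region):
    return REGION_TO_SHARD[region]


def round_robin_partitioner(num_shards=NUM_SHARDS):
    """Round robin: the i-th inserted row goes to shard i mod n."""
    return itertools.cycle(range(num_shards))


rr = round_robin_partitioner()
assignments = [next(rr) for _ in range(6)]
print(assignments)
```

Note that plain modulo hashing ties every key to the shard count, so changing the number of shards relocates most keys.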
Page 23: Composite Partitioning Challenges
Problems Encountered
Joins across sharded databases are difficult and inefficient because the data must be gathered from multiple servers.
Page 24: Denormalization Impacts
Consequences of Denormalization
Improves read efficiency but can introduce data inconsistencies and integrity problems.
Page 25: Rebalancing Shards
Indicators for Change
Rebalancing is needed when data or requests are not uniformly distributed across shards.
Page 26: Indexing Benefits
Purpose of Indexes
Improves retrieval speed at the cost of extra storage and slower writes, since every index must be updated when the data changes.
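A toy illustration of the trade-off, assuming an in-memory list of rows: the index is a separate structure that makes lookups direct but takes extra space and has to be maintained on writes. The table contents and function names are made up for the example.

```python
from collections import defaultdict

# Hypothetical table rows.
ROWS = [
    {"id": 1, "city": "Oslo"},
    {"id": 2, "city": "Lima"},
    {"id": 3, "city": "Oslo"},
]


def scan(rows, city):
    """Without an index: examine every row."""
    return [row["id"] for row in rows if row["city"] == city]


def build_index(rows, column):
    """The index is extra storage mapping each value to row positions."""
    index = defaultdict(list)
    for position, row in enumerate(rows):
        index[row[column]].append(position)
    return index


def lookup(rows, index, value):
    """With an index: jump straight to the matching rows."""
    return [rows[position]["id"] for position in index.get(value, [])]


city_index = build_index(ROWS, "city")
print(lookup(ROWS, city_index, "Oslo"))
```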
Page 27: Handling Requests Efficiently
Handling Loads
Combine (collapse) identical or similar requests, for example at a proxy, to reduce load on backend systems.
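One simple way to combine requests is to fetch each distinct key once per batch and fan the result out to every caller. The sketch below assumes requests arrive as a list of keys; `fetch_from_backend` is a hypothetical stand-in for a slow backend read.

```python
def collapse(requests, fetch):
    """Serve a batch of requests while calling `fetch` once per distinct key."""
    results = {}
    for key in requests:
        if key not in results:
            results[key] = fetch(key)
    return [results[key] for key in requests]


backend_calls = []


def fetch_from_backend(key):        # hypothetical slow backend read
    backend_calls.append(key)
    return f"data-for-{key}"


responses = collapse(["x", "y", "x", "x"], fetch_from_backend)
print(len(responses), "responses from", len(backend_calls), "backend calls")
```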
Page 28: Queue Mechanism Overview
Queueing in Distributed Systems
Queues let clients submit requests and move on while workers process them asynchronously.
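A minimal in-process sketch of the idea using the standard `queue` and `threading` modules. In a distributed system the queue would be a separate service; here a single worker thread and a squaring operation stand in for the real work.

```python
import queue
import threading

tasks = queue.Queue()
results = []


def worker():
    while True:
        job = tasks.get()
        if job is None:                 # sentinel: stop the worker
            tasks.task_done()
            break
        results.append(job * job)       # stand-in for slow work
        tasks.task_done()


thread = threading.Thread(target=worker)
thread.start()

for n in range(5):
    tasks.put(n)                        # the client returns as soon as the job is queued

tasks.put(None)
tasks.join()                            # wait until every queued item has been processed
thread.join()
print(results)
```

With a single worker, jobs complete in the order they were enqueued.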
Page 29: Queue Functionality and QoS
Quality of Service
Shields clients from intermittent service outages and helps ensure each request is eventually handled.
Page 30: Consistent Hashing Explanation
Overview of Consistent Hashing
A key technique for scalable distributed systems; it minimizes key reorganization when nodes are added or removed.
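To see why reorganization matters, the sketch below measures how many keys change servers under plain `hash(key) % n` placement when a fifth server is added. The key names and the use of MD5 are assumptions for the example.

```python
import hashlib


def stable_hash(key):
    return int(hashlib.md5(key.encode()).hexdigest(), 16)


def moved_fraction(keys, old_n, new_n):
    """Fraction of keys whose server changes under hash(key) % n placement."""
    moved = sum(1 for k in keys if stable_hash(k) % old_n != stable_hash(k) % new_n)
    return moved / len(keys)


keys = [f"key-{i}" for i in range(10_000)]
fraction = moved_fraction(keys, old_n=4, new_n=5)
print(f"{fraction:.0%} of keys move when going from 4 to 5 servers")
```

A key stays put only when its hash gives the same remainder modulo 4 and modulo 5, which holds for about one hash in five, so roughly four out of five keys move.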
Page 31: Implementation of Consistent Hashing
How It Works
Servers and keys are both mapped onto a circular hash ring, which allows nodes to be added or removed efficiently.
Page 32: Key-Server Mapping Strategy
Mapping Outline
Each key is hashed to a position on the ring and assigned to the first server found moving clockwise from that position, which keeps re-mapping minimal.
Page 33: Dynamic Server Management
Keys Reassignment on Server Changes
When a server is removed, only its keys move to the next server on the ring; when a server is added, it takes over only the keys that now map to it.
Page 34: Virtual Replicas for Load Balancing
Distribution Improvements
Mapping each server to multiple points on the ring (virtual replicas) spreads keys more evenly; more replicas give better balance.
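The ring, key lookup, server changes, and virtual replicas can be combined in one sketch. This is an illustrative implementation under assumptions: MD5 as the hash, 100 replicas per server, and hypothetical server names.

```python
import bisect
import hashlib


def ring_hash(value):
    return int(hashlib.md5(value.encode()).hexdigest(), 16)


class HashRing:
    def __init__(self, servers=(), replicas=100):
        self.replicas = replicas    # virtual points per server (assumed value)
        self.points = []            # sorted hash positions on the ring
        self.owner = {}             # position -> server name
        for server in servers:
            self.add(server)

    def add(self, server):
        for i in range(self.replicas):
            point = ring_hash(f"{server}#{i}")
            bisect.insort(self.points, point)
            self.owner[point] = server

    def remove(self, server):
        for i in range(self.replicas):
            point = ring_hash(f"{server}#{i}")
            self.points.remove(point)
            del self.owner[point]

    def lookup(self, key):
        """Move clockwise from the key's position to the first server point."""
        index = bisect.bisect(self.points, ring_hash(key)) % len(self.points)
        return self.owner[self.points[index]]


ring = HashRing(["cache-a", "cache-b", "cache-c"])
keys = [f"key-{i}" for i in range(3000)]
before = {k: ring.lookup(k) for k in keys}
ring.add("cache-d")
after = {k: ring.lookup(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys moved, all to cache-d")
```

Only keys that now fall to the new server move; every other key keeps its original owner, and removing the server restores the previous assignment.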
Page 35: Long Polling vs Web Sockets
Communication Protocols
Techniques for efficient, real-time communication between client and server.
Page 36: Long Polling Mechanism
Detailed Functionality
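Unlike a standard HTTP request, which the server answers immediately, a long-poll request is held open until data is available or a timeout occurs; the client then immediately sends a new request.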
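The hold-then-re-request loop can be simulated in-process without a network. In this sketch a `queue.Queue` stands in for pending server-side updates, and the status codes, timeout values, and function names are illustrative assumptions rather than a prescribed protocol.

```python
import queue
import threading

events = queue.Queue()      # stands in for updates waiting on the server


def handle_long_poll(timeout):
    """Server side: hold the request until data arrives or the timeout expires."""
    try:
        return {"status": 200, "data": events.get(timeout=timeout)}
    except queue.Empty:
        return {"status": 204, "data": None}    # nothing new; the client should re-poll


def client_loop(max_polls, timeout=0.2):
    """Client side: send a new request as soon as each response comes back."""
    received = []
    for _ in range(max_polls):
        response = handle_long_poll(timeout)
        if response["data"] is not None:
            received.append(response["data"])
    return received


# Publish an update shortly after the client starts waiting.
threading.Timer(0.05, events.put, args=["price=42"]).start()
messages = client_loop(max_polls=2)
print(messages)
```

The first poll returns as soon as the update is published; the second times out with no data, which is the point at which a real client would poll again.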
Page 37: Web Sockets Benefits
Persistent Communication
WebSockets provide a persistent, bidirectional (full-duplex) channel over a single connection, so either side can send data at any time without a new request.
Page 38: Server-Sent Events (SSE)
Long-term Connections
The client establishes a persistent, long-lived connection over which the server pushes real-time data; data flows from server to client only.
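On the wire, an SSE stream is plain text sent with the `text/event-stream` content type: each event is a set of `field: value` lines ended by a blank line. The helper names below are hypothetical; the sketch only builds the text a server would write to the open connection.

```python
def format_sse(data, event=None, event_id=None):
    """Encode one message in the text/event-stream format."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    for part in str(data).split("\n"):      # multi-line payloads use one data field per line
        lines.append(f"data: {part}")
    return "\n".join(lines) + "\n\n"        # a blank line terminates the event


def stream(updates):
    """Generator whose chunks a web framework could write to the response."""
    for number, update in enumerate(updates, start=1):
        yield format_sse(update, event="update", event_id=number)


chunks = list(stream(["hello", "line1\nline2"]))
print(chunks[0], end="")
```

In a browser, the standard `EventSource` API consumes such a stream and reconnects automatically if the connection drops.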