Power System Design & Practice

Page 1: System Design Basics

Problem Decomposition
1. Break down the problem into simpler modules (Top down approach).
2. Discuss trade-offs: no solution is perfect, so weigh each option's impact against the constraints and end use cases.
Interview Focus
- Understand the interviewer’s intentions.
- Ask clarifying questions about constraints and required functionality.
Bottlenecks
- Identify potential bottlenecks in the system.

Page 2: Architectural Insights

Resources Overview
1. Understand the architectural pieces and resources available.
2. Explore how these resources work together.
Utilization and Trade-offs
- Explore consistent hashing, the CAP theorem, load balancing, queues, caching, replication, SQL vs. NoSQL, indexes, proxies, and data partitioning.

Page 3: Load Balancing

Distributed Systems
- Types of load distribution:
  - Random
  - Round-robin (optionally weighted by each server's memory and CPU capacity).
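The distribution types above can be sketched in a few lines. This is a minimal illustration, not a production balancer; the server names and the integer weights (standing in for a memory/CPU capacity score) are assumptions made for the example.

```python
import itertools
import random


def round_robin(servers):
    """Yield servers in a fixed rotation."""
    return itertools.cycle(servers)


def weighted_round_robin(weights):
    """Yield servers in rotation, repeating each in proportion to its weight.

    `weights` maps a server name to a positive integer, e.g. a score derived
    from its memory and CPU capacity (how that score is computed is assumed).
    """
    expanded = [name for name, weight in weights.items() for _ in range(weight)]
    return itertools.cycle(expanded)


def pick_random(servers, rng=random):
    """Pick any server with equal probability."""
    return rng.choice(servers)


rr = round_robin(["web1", "web2", "web3"])
first_four = [next(rr) for _ in range(4)]   # web1, web2, web3, web1

wrr = weighted_round_robin({"big": 2, "small": 1})
first_six = [next(wrr) for _ in range(6)]   # big, big, small, big, big, small
```

Expanding the weights into a repeated list is the simplest form of weighting; real balancers usually interleave the picks more smoothly.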
Scalability
- Achieve full scalability and redundancy by balancing load at each tier: between users and web servers, between web servers and application servers, and between application servers and internal cache servers.

Page 4: Smart Clients

Functionality
- Smart clients take a pool of service hosts, balance load among them, and detect non-responsive hosts.
- They also detect recovered hosts and newly added hosts, and can provide load balancing for databases, caches, and services.
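A rough sketch of the smart-client idea follows. The `SmartClient` class, the `is_healthy` probe, and the host names are hypothetical; a real client would probe hosts over the network and handle timeouts.

```python
class SmartClient:
    """Client-side load balancer over a pool of service hosts (illustrative)."""

    def __init__(self, hosts, is_healthy):
        self.hosts = list(hosts)
        self.is_healthy = is_healthy  # hypothetical probe: host -> bool
        self._next = 0

    def add_host(self, host):
        """Register a newly added host so it starts receiving traffic."""
        if host not in self.hosts:
            self.hosts.append(host)

    def pick(self):
        """Return the next responsive host in round-robin order."""
        for _ in range(len(self.hosts)):
            host = self.hosts[self._next % len(self.hosts)]
            self._next += 1
            if self.is_healthy(host):
                return host
        raise RuntimeError("no responsive hosts")


down = {"db2"}  # hosts currently failing the health probe
client = SmartClient(["db1", "db2", "db3"], is_healthy=lambda h: h not in down)
picks = [client.pick() for _ in range(4)]   # db2 is skipped while it is down

down.clear()                                # db2 recovers
client.add_host("db4")                      # a new host joins the pool
later = {client.pick() for _ in range(8)}   # all four hosts now get traffic
```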
Scalability Options
- Attractive solution for developers in small-scale systems; as the system grows, load balancing typically moves to standalone servers.
- Consider hardware load balancers (high performance, costly, not trivial to configure).

Page 5: Software Load Balancers

Advantages
- No need to build smart clients or pay for additional hardware.
- A hybrid approach using a software load balancer such as HAProxy manages requests efficiently.
Deployment Scenarios
1. Running on the client machine, where the client connects to a local port (e.g., on localhost).
2. Running on an intermediate server that manages health checks and balances requests.

Page 6: Databases Overview

Types of Databases
1. Structured vs Unstructured.
2. Predefined schema vs Dynamic schema.
Data Storage Approaches
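- Relational databases store data in rows and columns under a predefined schema; non-relational (NoSQL) databases store separate data points under a dynamic schema.
- Relational examples: MySQL, Oracle, Postgres, MariaDB.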

Page 7: (Content Not Provided)

Page 8: (Content Not Provided)

Page 9: Reasons to Use ACID Compliant Databases

Importance of ACID Compliance
- Reduces anomalies and maintains database integrity, critical for e-commerce and financial applications.
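To make the integrity guarantee concrete, here is a small sketch of an atomic transfer using Python's built-in `sqlite3` module. The `accounts` table, the names, and the amounts are made up for illustration; the point is that a failed transfer leaves no partial update behind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
with conn:
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically; return True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return the current balances as a dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


ok = transfer(conn, "alice", "bob", 30)        # succeeds
failed = transfer(conn, "alice", "bob", 500)   # would overdraw alice, so the
                                               # credit to bob is rolled back too
```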
Data Stability
- Suitable for structured and stable data with no rapid growth or changes.

Page 10: Reasons to Use NoSQL

Performance Solutions
- Prevents data from becoming the bottleneck when querying and searching large volumes of data.
Cost-effectiveness
- Good for storing large volumes of data with little or no structure, and for making use of cloud and commodity hardware.
- Excellent for rapid and agile development (schema changes).

Page 11: CAP Theorem Insights

Consistency
- All nodes see the same data at the same time, typically achieved by updating several nodes before allowing further reads.
Availability
- Every request receives a response indicating success or failure.
Partition Tolerance
- The system continues to operate despite message loss or partial failure.

Page 12: Limitations of Designing a Datastore

Challenges
- A distributed datastore cannot guarantee all three of constant availability, sequential consistency, and partition tolerance at once; when a network partition occurs, it must give up either consistency or availability.

Page 13: Redundancy & Replication

Benefits
- Increases system reliability through duplication of critical data and services.
- Provides backups and secures against single-node failures.

Page 14: Caching Fundamentals

Load Balancing and Caching
- Load balancing helps scale horizontally across servers; caching makes better use of existing resources and reduces data-retrieval latency.
- Cache strategies include global caches and distributed caches.

Page 15: Distributed Caching

Consistent Hashing
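- Each node owns part of the cached data, and consistent hashing distributes keys across the nodes.
- Cache space grows easily by adding nodes, but a node that disappears must be handled, since requests for its keys will miss.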

Page 16: Centralized Cache Challenges

Challenges
- A global (centralized) cache gives all nodes a single shared cache space, which becomes difficult to manage when usage spikes.
- It streamlines data retrieval but requires an effective eviction strategy.

Page 17: CDN (Content Distribution Network)

Purpose
- Serves large amounts of static media for sites from locations close to users.
- If the requested media is not available locally, the CDN fetches it from the origin server, caches it, and then serves it.

Page 18: Cache Invalidation Strategies

Need for Coherence
- Cached data must remain coherent with database modifications.
- Different strategies: Write-through, Write-around, Write-back caching.
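The three write strategies differ only in where a write lands first. The sketch below uses in-memory dicts as stand-ins for the cache and the database; the class names and the `Store` write counter are illustrative assumptions.

```python
class Store:
    """Stand-in for the database: a dict plus a write counter."""

    def __init__(self):
        self.data, self.writes = {}, 0

    def write(self, key, value):
        self.data[key] = value
        self.writes += 1


class WriteThroughCache:
    """Write to the cache and the store together, keeping both coherent."""

    def __init__(self, store):
        self.store, self.cache = store, {}

    def put(self, key, value):
        self.cache[key] = value
        self.store.write(key, value)


class WriteAroundCache:
    """Write only to the store and drop any stale cached copy."""

    def __init__(self, store):
        self.store, self.cache = store, {}

    def put(self, key, value):
        self.store.write(key, value)
        self.cache.pop(key, None)  # the next read misses and reloads


class WriteBackCache:
    """Write only to the cache; flush dirty entries to the store later."""

    def __init__(self, store):
        self.store, self.cache, self.dirty = store, {}, set()

    def put(self, key, value):
        self.cache[key] = value
        self.dirty.add(key)

    def flush(self):
        for key in sorted(self.dirty):
            self.store.write(key, self.cache[key])
        self.dirty.clear()


wb_store = Store()
wb = WriteBackCache(wb_store)
wb.put("k", 1)
wb.put("k", 2)
writes_before_flush = wb_store.writes  # nothing has reached the store yet
wb.flush()                             # two cache writes become one store write
```

Write-back trades durability for speed: anything still dirty is lost if the cache fails before `flush` runs.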

Page 19: Cache Eviction Policies

Policies
- Methods include FIFO, LIFO (also called FILO), LRU, and Random Replacement.
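As one example of these policies, LRU (least recently used) can be sketched with `collections.OrderedDict`. The `LRUCache` class name and the capacity of 2 are chosen only for illustration.

```python
from collections import OrderedDict


class LRUCache:
    """Evict the least recently used entry once capacity is exceeded."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest last

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used


cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")      # "a" is now more recent than "b"
cache.put("c", 3)   # capacity exceeded, so "b" is evicted
```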

Page 20: Sharding and Data Partitioning

Understanding Sharding
- Sharding splits a large database or table across multiple machines to improve manageability, performance, and availability.
- Different methods: horizontal and vertical partitioning.

Page 21: Vertical Partitioning

Use Cases
- Based on features, e.g., separating user info, followers, and photos into different servers.

Page 22: Partitioning Criteria

Methods of Partitioning
1. Key or hash-based partitioning.
2. List partitioning.
3. Round robin partitioning.
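The three criteria above can each be expressed as a small function from a record to a partition number or name. This is a sketch under assumptions: the key format, the region lists, and the choice of MD5 as the hash are illustrative, not prescribed by the notes.

```python
import hashlib
import itertools


def hash_partition(key, num_partitions):
    """Key/hash-based: hash the key and take it modulo the partition count."""
    digest = hashlib.md5(str(key).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_partitions


def list_partition(value, assignments, default=None):
    """List-based: each partition is assigned an explicit list of values."""
    for partition, values in assignments.items():
        if value in values:
            return partition
    return default


def round_robin_partitioner(num_partitions):
    """Round robin: the i-th record goes to partition i mod n."""
    counter = itertools.count()
    return lambda _record=None: next(counter) % num_partitions


regions = {
    "nordic": {"Iceland", "Norway", "Sweden"},
    "iberia": {"Spain", "Portugal"},
}
shard_for_user = hash_partition("user:42", 4)
shard_for_norway = list_partition("Norway", regions)
assign = round_robin_partitioner(3)
rr_shards = [assign() for _ in range(5)]
```

Note that plain modulo hashing remaps most keys whenever `num_partitions` changes, which is the problem consistent hashing addresses.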

Page 23: Composite Partitioning Challenges

Problems Encountered
- Joins that span shards are inefficient because the data must be gathered from multiple servers.

Page 24: Denormalization Impacts

Consequences of Denormalization
- Denormalization avoids cross-shard joins and speeds up reads, but the duplicated data can become inconsistent and makes integrity harder to maintain.

Page 25: Rebalancing Shards

Indicators for Change
- Rebalancing becomes necessary when data or request load is distributed unevenly across shards.

Page 26: Indexing Benefits

Purpose of Indexes
- Improves retrieval speed at the cost of increased storage overhead and slower writes, since every index must be updated when the data changes.
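The trade-off can be seen in a toy in-memory index. The `users` rows and the `city` column are invented for the example; the index is simply a dict from a column value to row positions.

```python
from collections import defaultdict

users = [
    {"id": 1, "name": "Ana", "city": "Lima"},
    {"id": 2, "name": "Bo", "city": "Oslo"},
    {"id": 3, "name": "Cy", "city": "Lima"},
]


def scan(rows, column, value):
    """Without an index: examine every row."""
    return [row for row in rows if row[column] == value]


def build_index(rows, column):
    """Extra storage: map each column value to the positions of its rows."""
    index = defaultdict(list)
    for position, row in enumerate(rows):
        index[row[column]].append(position)
    return index


def lookup(rows, index, value):
    """With an index: jump straight to the matching rows."""
    return [rows[i] for i in index.get(value, [])]


city_index = build_index(users, "city")
in_lima = lookup(users, city_index, "Lima")
```

Every insert or update would also have to maintain `city_index`, which is where the write cost comes from.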

Page 27: Handling Requests Efficiently

Handling Loads
- Combine identical or similar requests into one (collapsed forwarding) to reduce load on backend systems.
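One way to combine requests is to let concurrent callers asking for the same key share a single backend call. The sketch below uses `asyncio`; the `CollapsingProxy` class and the `slow_backend` function are hypothetical stand-ins for a proxy and its origin.

```python
import asyncio


class CollapsingProxy:
    """Share one in-flight backend call among concurrent requests for a key."""

    def __init__(self, fetch):
        self._fetch = fetch      # hypothetical async backend call: key -> value
        self._in_flight = {}     # key -> Task for the pending backend call

    async def get(self, key):
        task = self._in_flight.get(key)
        if task is None:
            task = asyncio.create_task(self._fetch(key))
            self._in_flight[key] = task
            # Forget the task once it finishes so later requests refetch.
            task.add_done_callback(lambda _t: self._in_flight.pop(key, None))
        return await task


async def demo():
    calls = []

    async def slow_backend(key):
        calls.append(key)
        await asyncio.sleep(0.01)  # simulate backend latency
        return key.upper()

    proxy = CollapsingProxy(slow_backend)
    results = await asyncio.gather(*(proxy.get("logo.png") for _ in range(5)))
    return results, calls


results, backend_calls = asyncio.run(demo())  # five requests, one backend call
```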

Page 28: Queue Mechanism Overview

Queueing in Distributed Systems
- A queue accepts requests and acknowledges them immediately, while the work itself is processed asynchronously.
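A minimal in-process sketch of this pattern uses `queue.Queue` and a worker thread. The `submit` function, the acknowledgement string, and the doubling "work" are placeholders; a distributed system would use a message broker instead.

```python
import queue
import threading

tasks = queue.Queue()
results = {}


def worker():
    """Drain the queue, processing each task in the background."""
    while True:
        item = tasks.get()
        if item is None:          # sentinel: stop the worker
            tasks.task_done()
            break
        task_id, payload = item
        results[task_id] = payload * 2   # stand-in for slow work
        tasks.task_done()


def submit(task_id, payload):
    """Enqueue the work and acknowledge immediately without waiting for it."""
    tasks.put((task_id, payload))
    return f"accepted:{task_id}"


thread = threading.Thread(target=worker, daemon=True)
thread.start()
acks = [submit(i, i) for i in range(3)]   # callers get acknowledgements at once
tasks.put(None)
tasks.join()                              # wait here only to inspect the results
```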

Page 29: Queue Functionality and QoS

Quality of Service
- Queues give clients some protection from service outages, because requests wait in the queue until they can be handled, and they help enforce quality-of-service guarantees.

Page 30: Consistent Hashing Explanation

Overview of Consistent Hashing
- Consistent hashing is a key technique for scalable distributed systems; when nodes are added or removed, only a small fraction of keys must be remapped.

Page 31: Implementation of Consistent Hashing

How It Works
- Servers and keys are hashed onto the same circular ring; each key belongs to the first server found moving clockwise from the key's position, so adding or removing a node affects only neighboring keys.

Page 32: Key-Server Mapping Strategy

Mapping Outline
- Assigning keys to servers through hash functions ensuring minimal re-mapping.

Page 33: Dynamic Server Management

Keys Reassignment on Server Changes
- When a server is removed, its keys move to the next server on the ring; when a server is added, it takes over only the keys that fall between its predecessor and itself.
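The ring and its reassignment behavior can be sketched as follows. This is a minimal version with one point per server; the `HashRing` class, the server names, and the use of MD5 for ring positions are assumptions for illustration.

```python
import bisect
import hashlib


def ring_hash(value):
    """Map a string to a position on the ring (MD5 is an arbitrary choice)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Consistent-hash ring with a single point per server."""

    def __init__(self, servers=()):
        self._points = []   # sorted ring positions
        self._owner = {}    # ring position -> server
        for server in servers:
            self.add(server)

    def add(self, server):
        point = ring_hash(server)
        bisect.insort(self._points, point)
        self._owner[point] = server

    def remove(self, server):
        point = ring_hash(server)
        self._points.remove(point)
        del self._owner[point]

    def lookup(self, key):
        """Return the first server at or after the key's position, wrapping."""
        if not self._points:
            raise LookupError("ring is empty")
        i = bisect.bisect_left(self._points, ring_hash(key)) % len(self._points)
        return self._owner[self._points[i]]


keys = [f"key{i}" for i in range(100)]
ring = HashRing(["s1", "s2", "s3"])
before = {k: ring.lookup(k) for k in keys}
ring.remove("s2")
after = {k: ring.lookup(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]  # only keys that lived on s2
```

Keys owned by `s1` or `s3` keep their owner because their clockwise successor is still on the ring.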

Page 34: Virtual Replicas for Load Balancing

Distribution Improvements
- Mapping each server to multiple points (virtual replicas) on the ring spreads keys more evenly; more replicas give a more uniform load.
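To experiment with the effect of replicas, the sketch below places each server at several ring positions and counts keys per server. The `server#i` naming for virtual points, the replica counts, and the MD5 hash are illustrative assumptions; no particular distribution is claimed here, so run it to compare the two counters.

```python
import bisect
import hashlib
from collections import Counter


def ring_hash(value):
    """Map a string to a position on the ring (MD5 is an arbitrary choice)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


def build_ring(servers, replicas):
    """Place each server at `replicas` points; return (positions, owners)."""
    points = sorted(
        (ring_hash(f"{server}#{i}"), server)
        for server in servers
        for i in range(replicas)
    )
    positions = [position for position, _ in points]
    owners = [server for _, server in points]
    return positions, owners


def lookup(ring, key):
    """Return the server owning the first point at or after the key, wrapping."""
    positions, owners = ring
    i = bisect.bisect_left(positions, ring_hash(key)) % len(positions)
    return owners[i]


def load(servers, replicas, keys):
    """Count how many keys land on each server."""
    ring = build_ring(servers, replicas)
    return Counter(lookup(ring, key) for key in keys)


servers = ["s1", "s2", "s3"]
keys = [f"key{i}" for i in range(10_000)]
load_single = load(servers, 1, keys)     # one point per server
load_virtual = load(servers, 200, keys)  # 200 virtual replicas per server
```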

Page 35: Long Polling vs Web Sockets

Communication Protocols
- Long polling and WebSockets are techniques for delivering data to clients in real time without the overhead of repeated polling.

Page 36: Long Polling Mechanism

Detailed Functionality
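- Unlike a standard HTTP request, which the server answers immediately, a long-polling request is held open until data is available or a timeout expires; the client then immediately issues a new request.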
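The hold-until-data-or-timeout logic can be simulated in-process with `asyncio`, without any network code. The `LongPollChannel` class, the message text, and the timeouts are hypothetical; a real server would hold an HTTP response open instead of awaiting a queue.

```python
import asyncio


class LongPollChannel:
    """Server side of long polling: hold each request until data or timeout."""

    def __init__(self):
        self._messages = asyncio.Queue()

    async def publish(self, message):
        await self._messages.put(message)

    async def poll(self, timeout):
        """Return the next message, or None if the timeout expires first."""
        try:
            return await asyncio.wait_for(self._messages.get(), timeout)
        except asyncio.TimeoutError:
            return None  # like an empty response; the client polls again


async def demo():
    channel = LongPollChannel()

    async def client(rounds):
        received = []
        for _ in range(rounds):  # re-issue the request after each response
            message = await channel.poll(timeout=0.05)
            if message is not None:
                received.append(message)
        return received

    async def producer():
        await asyncio.sleep(0.01)  # data becomes available a little later
        await channel.publish("price=101")

    received, _ = await asyncio.gather(client(2), producer())
    return received


received = asyncio.run(demo())  # first poll returns the message, second times out
```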

Page 37: Web Sockets Benefits

Persistent Communication
- WebSockets provide a persistent, full-duplex channel over a single TCP connection, so the client and the server can each send data at any time.

Page 38: Server-Sent Events (SSE)

Long-term Connections
- The client opens a persistent, long-lived HTTP connection over which the server pushes real-time data; communication flows one way, from server to client.
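On the wire, server-sent events are plain text in the `text/event-stream` format: optional `id:` and `event:` fields, one or more `data:` lines, and a blank line ending each event. The sketch below only formats such frames; the `format_sse` and `event_stream` helpers and the sample payloads are illustrative, and sending them over an actual HTTP response is left out.

```python
def format_sse(data, event=None, event_id=None):
    """Serialize one event in the text/event-stream wire format."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    for chunk in str(data).split("\n"):   # multi-line data needs one field per line
        lines.append(f"data: {chunk}")
    return "\n".join(lines) + "\n\n"      # a blank line terminates the event


def event_stream(updates):
    """Yield one frame per update, as a server would write to the open response."""
    for i, update in enumerate(updates, start=1):
        yield format_sse(update, event="update", event_id=i)


frames = list(event_stream(["cpu=41", "cpu=57"]))
```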