Power System Design & Practice

Page 1: System Design Basics

  • Problem Decomposition

    1. Break down the problem into simpler modules (top-down approach).

    2. Discuss trade-offs: no solution is perfect; evaluate the impact based on constraints and end use cases.

  • Interview Focus

    • Understand the interviewer’s intentions.

    • Ask abstract questions about constraints and functional requirements.

  • Bottlenecks

    • Identify potential bottlenecks in the system.

Page 2: Architectural Insights

  • Resources Overview

    1. Understand the architectural pieces and resources available.

    2. Explore how these resources work together.

  • Utilization and Trade-offs

    • Explore consistent hashing, CAP theorem, load balancing, queues, caching, replication, SQL vs NoSQL, indexes, proxies, and data partitioning.

Page 3: Load Balancing

  • Distributed Systems

    • Types of load distribution:

      • Random

      • Round-robin (considering weights for memory and CPU cycles).
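
The two distribution types above can be sketched in a few lines. This is a minimal illustration, not a production balancer; the server names and weights are hypothetical, and a weight here stands in for relative memory or CPU capacity.

```python
import random
from itertools import cycle

def random_pick(servers, rng=random):
    """Random distribution: every server is equally likely for each request."""
    return rng.choice(servers)

def weighted_round_robin(weights):
    """Round-robin with weights: each server appears `weight` times per cycle."""
    # Expand each server name by its weight, then loop over the list forever.
    expanded = [name for name, weight in weights.items() for _ in range(weight)]
    return cycle(expanded)

# Hypothetical pool: "big" has twice the capacity of "small".
picker = weighted_round_robin({"big": 2, "small": 1})
first_six = [next(picker) for _ in range(6)]
```

With these weights, "big" receives two of every three requests.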

  • Scalability

    • Achieve full scalability and redundancy by balancing load at each layer: between users and web servers, between web servers and app servers, and in front of internal cache servers.

Page 4: Smart Clients

  • Functionality

    • Smart clients balance load among service hosts and detect non-responsive hosts.

    • They also detect recovered hosts and newly added ones, and can provide load balancing for databases, caches, and services.
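
A smart client of this kind might look roughly like the sketch below. The class and method names are hypothetical; a real client would run health checks itself rather than being told which hosts are down.

```python
class SmartClient:
    """Client-side balancer: rotates over a host pool and skips hosts marked down."""

    def __init__(self, hosts):
        self.hosts = list(hosts)
        self.down = set()
        self._next = 0

    def mark_down(self, host):
        # Called when a host stops responding.
        self.down.add(host)

    def mark_up(self, host):
        # Called when a previously failed host has recovered.
        self.down.discard(host)

    def add_host(self, host):
        # New hosts join the rotation immediately.
        if host not in self.hosts:
            self.hosts.append(host)

    def pick(self):
        # Try each host at most once per call, continuing from the last position.
        for _ in range(len(self.hosts)):
            host = self.hosts[self._next % len(self.hosts)]
            self._next += 1
            if host not in self.down:
                return host
        raise RuntimeError("no healthy hosts available")

client = SmartClient(["a", "b", "c"])
client.mark_down("b")
picks = [client.pick() for _ in range(4)]  # "b" is skipped while it is down
```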

  • Scalability Options

    • Smart clients are an attractive solution for developers in small-scale systems and can grow with the system.

    • Consider hardware load balancers (high performance, costly, not trivial to configure).

Page 5: Software Load Balancers

  • Advantages

    • No need to build smart clients or pay for additional hardware.

    • A hybrid approach such as HAProxy manages requests efficiently.

  • Deployment Scenarios

    1. Running on client machines (e.g., localhost).

    2. Running on intermediate servers, where the balancer manages health checks and balances requests.

Page 6: Databases Overview

  • Types of Databases

    1. Structured vs Unstructured.

    2. Predefined schema vs Dynamic schema.

  • Data Storage Approaches

    • Data in rows & columns versus separate data points in a dynamic schema.

    • Relational (SQL) examples: MySQL, Oracle, Postgres, MariaDB.

Page 7: (Content Not Provided)

Page 8: (Content Not Provided)

Page 9: Reasons to Use ACID Compliant Databases

  • Importance of ACID Compliance

    • Reduces anomalies and maintains database integrity, critical for e-commerce and financial applications.

  • Data Stability

    • Suitable for structured and stable data with no rapid growth or changes.

Page 10: Reasons to Use NoSQL

  • Performance Solutions

    • Prevents data from becoming the bottleneck when querying and searching large volumes.

  • Cost-effectiveness

    • Good for storing large volumes of data with little structure, and for making use of cloud and commodity hardware.

    • Excellent for rapid and agile development (schema changes).

Page 11: CAP Theorem Insights

  • Consistency

    • All nodes see the same data at the same time; achieved by updating several nodes before allowing further reads.

  • Availability

    • Every request receives a response, whether success or failure.

  • Partition Tolerance

    • The system continues to work despite message loss or partial failure.

Page 12: Limitations of Designing a Datastore

  • Challenges

    • A datastore cannot guarantee all three at once: continuous availability, sequential consistency, and partition tolerance. When a network partition occurs, it has to give up either consistency or availability.

Page 13: Redundancy & Replication

  • Benefits

    • Increases system reliability through duplication of critical data and services.

    • Provides backups and secures against single-node failures.

Page 14: Caching Fundamentals

  • Load Balancing and Caching

    • Load balancing achieves horizontal scalability; caching minimizes latency in data retrieval.

    • Different cache strategies including global and distributed caches.

Page 15: Distributed Caching

  • Consistent Hashing

    • Distributes keys across cache nodes so that each node owns part of the cached data.

    • Challenge: handling requests for keys owned by a node that disappears.

Page 16: Centralized Cache Challenges

  • Challenges

    • A single cache space becomes hard to manage when usage spikes.

    • A global cache streamlines data retrieval but requires an effective eviction strategy.

Page 17: CDN (Content Distribution Networks)

  • Purpose

    • Serves large amounts of static media for sites.

    • Serves content from a local copy when one is available; otherwise fetches it from the backend servers, caches it, and serves it.

Page 18: Cache Invalidation Strategies

  • Need for Coherence

    • Cached data must remain coherent with database modifications.

    • Different strategies: Write-through, Write-around, Write-back caching.
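
The three write strategies differ only in where a write lands first. The sketch below is a toy model under stated assumptions: a plain dict stands in for the database, and the class and attribute names are hypothetical.

```python
class Cache:
    """Toy cache in front of a dict-backed 'database'."""

    def __init__(self, db, policy):
        self.db = db          # backing store (a dict standing in for a database)
        self.policy = policy  # "write-through", "write-around", or "write-back"
        self.data = {}        # cached entries
        self.dirty = set()    # keys written to the cache but not yet to the db

    def write(self, key, value):
        if self.policy == "write-through":
            # Write to cache and database together; both stay coherent.
            self.data[key] = value
            self.db[key] = value
        elif self.policy == "write-around":
            # Write to the database only; drop any stale cached copy.
            self.db[key] = value
            self.data.pop(key, None)
        elif self.policy == "write-back":
            # Write to the cache only; the database is updated later by flush().
            self.data[key] = value
            self.dirty.add(key)
        else:
            raise ValueError(f"unknown policy: {self.policy}")

    def read(self, key):
        if key not in self.data:
            self.data[key] = self.db[key]  # cache miss: load from the database
        return self.data[key]

    def flush(self):
        # Persist pending write-back entries.
        for key in self.dirty:
            self.db[key] = self.data[key]
        self.dirty.clear()
```

Write-back has the lowest write latency but loses unflushed data if the cache fails; write-around avoids filling the cache with data that is never read, at the price of a miss on the first read.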

Page 19: Cache Eviction Policies

  • Policies

    • Methods include FIFO, LIFO (also called FILO), LRU, and Random Replacement.
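
As one example of an eviction policy, LRU (least recently used) can be sketched with the standard library's `OrderedDict`. The class name and its `get`/`put` interface are illustrative choices, not a reference implementation.

```python
from collections import OrderedDict

class LRUCache:
    """Evicts the least recently used entry once capacity is exceeded."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest last

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # reading marks the entry as recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used entry

cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # "a" is now more recent than "b"
cache.put("c", 3)  # capacity exceeded, so "b" is evicted
```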

Page 20: Sharding and Data Partitioning

  • Understanding Sharding

    • Refers to splitting tables across machines to enhance management, performance, and availability.

    • Different methods: horizontal and vertical partitioning.

Page 21: Vertical Partitioning

  • Use Cases

    • Data is divided by feature, e.g., storing user info, followers, and photos on different servers.

Page 22: Partitioning Criteria

  • Methods of Partitioning

    1. Key or hash-based partitioning.

    2. List partitioning.

    3. Round robin partitioning.
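
The three criteria above can each be written as a small function that maps a record to a shard number. This is a sketch under assumptions: the region table is made up for illustration, and CRC32 is just one stable hash that could be used.

```python
import zlib
from itertools import count

def hash_partition(key, num_shards):
    """Key or hash-based: hash the key, then take it modulo the shard count."""
    # crc32 is stable across runs, unlike the built-in hash() for strings.
    return zlib.crc32(key.encode("utf-8")) % num_shards

# Hypothetical list-partitioning table: each shard owns a list of values.
REGION_TO_SHARD = {"IS": 0, "NO": 0, "SE": 0, "US": 1, "CA": 1}

def list_partition(region):
    """List partitioning: look the value up in a fixed value-to-shard table."""
    return REGION_TO_SHARD[region]

def make_round_robin(num_shards):
    """Round robin: the i-th record goes to shard i mod n, regardless of content."""
    counter = count()
    def assign(_record):
        return next(counter) % num_shards
    return assign

assign = make_round_robin(3)
round_robin_shards = [assign(record) for record in ["r0", "r1", "r2", "r3"]]
```

Note that plain modulo hashing remaps most keys when the number of shards changes, which is the problem consistent hashing addresses.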

Page 23: Composite Partitioning Challenges

  • Problems Encountered

    • Joins across sharded databases are difficult and inefficient because the data has to be gathered from multiple servers.

Page 24: Denormalization Impacts

  • Consequences of Denormalization

    • Helps with efficiency but may create data inconsistencies and integrity challenges.

Page 25: Rebalancing Shards

  • Indicators for Change

    • Need arises due to non-uniform distribution of data and requests.

Page 26: Indexing Benefits

  • Purpose of Indexes

    • Improves retrieval speed at the cost of increased storage overhead and slower writes.

Page 27: Handling Requests Efficiently

  • Handling Loads

    • Combine requests to minimize heavy load on backend systems.
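
One way to combine requests is to collapse duplicates so the backend is called once per distinct key. The sketch below handles a batch synchronously; `fetch` is a hypothetical stand-in for the backend call, and a real proxy would also collapse requests that arrive concurrently.

```python
def collapse_requests(requested_keys, fetch):
    """Answer every request in the batch while calling the backend once per distinct key."""
    results = {}
    for key in requested_keys:
        if key not in results:
            results[key] = fetch(key)  # only the first request for a key hits the backend
    return [results[key] for key in requested_keys]

backend_calls = []

def fetch(key):
    # Hypothetical backend: records each call so the savings are visible.
    backend_calls.append(key)
    return key.upper()

responses = collapse_requests(["a", "b", "a", "a", "b"], fetch)
```

Five client requests are served here with two backend calls.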

Page 28: Queue Mechanism Overview

  • Queueing in Distributed Systems

    • Queues let clients submit requests asynchronously while workers process them as capacity allows.
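
A minimal in-process sketch of the idea, using the standard library's `queue` and `threading` modules. The function name, the `None` sentinel, and the worker count are illustrative assumptions; a distributed system would use a message broker rather than an in-memory queue.

```python
import queue
import threading

def run_workers(tasks, handler, num_workers=2):
    """Enqueue tasks, let worker threads drain the queue, and collect the results."""
    q = queue.Queue()
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            task = q.get()
            if task is None:  # sentinel: no more work for this worker
                q.task_done()
                break
            outcome = handler(task)
            with lock:
                results.append(outcome)
            q.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for task in tasks:
        q.put(task)  # the producer does not wait for the task to be handled
    for _ in threads:
        q.put(None)  # one sentinel per worker
    q.join()         # block until every queued item has been processed
    for thread in threads:
        thread.join()
    return results

squares = sorted(run_workers([1, 2, 3, 4], lambda n: n * n))
```

Results arrive in completion order, not submission order, which is why the example sorts them.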

Page 29: Queue Functionality and QoS

  • Quality of Service

    • Ensures clients are protected from service outages and guarantees proper request handling.

Page 30: Consistent Hashing Explanation

  • Overview of Consistent Hashing

    • Key for scalable distributed systems; minimizes reorganization during scaling.

Page 31: Implementation of Consistent Hashing

  • How It Works

    • Servers and keys are both hashed onto a circular ring; each key belongs to the first server found moving clockwise from its position, which makes node addition and removal efficient.
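
A hash ring along these lines might be sketched as follows. SHA-256 and the class and method names are assumptions made for illustration; any well-distributed hash would serve.

```python
import bisect
import hashlib

def _hash(value):
    """Map a string to a position on the ring (a large integer)."""
    return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)

class HashRing:
    def __init__(self, servers=()):
        self._points = []  # sorted ring positions
        self._owner = {}   # ring position -> server name
        for server in servers:
            self.add(server)

    def add(self, server):
        point = _hash(server)
        bisect.insort(self._points, point)
        self._owner[point] = server

    def remove(self, server):
        point = _hash(server)
        self._points.remove(point)
        del self._owner[point]

    def lookup(self, key):
        if not self._points:
            raise LookupError("ring is empty")
        # First server position clockwise from the key, wrapping around at the end.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]

ring = HashRing(["s1", "s2", "s3"])
keys = [f"key{i}" for i in range(100)]
before = {k: ring.lookup(k) for k in keys}
ring.remove("s2")
after = {k: ring.lookup(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]  # only keys that lived on "s2"
```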

Page 32: Key-Server Mapping Strategy

  • Mapping Outline

    • Keys are assigned to servers through a hash function chosen so that few keys are re-mapped when the set of servers changes.

Page 33: Dynamic Server Management

  • Keys Reassignment on Server Changes

    • When a server is removed, only its keys move to the next server on the ring; when a server is added, it takes over only the keys that fall between it and its predecessor.

Page 34: Virtual Replicas for Load Balancing

  • Distribution Improvements

    • Mapping each server to multiple points on the ring (virtual replicas) improves load balancing; more replicas give a more even distribution.
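
To see the effect of virtual replicas, the sketch below places each server at several ring positions and counts how many keys each server receives. The `server#i` naming for replica points and the SHA-256 hash are assumptions for illustration only.

```python
import bisect
import hashlib
from collections import Counter

def _hash(value):
    return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)

def build_ring(servers, replicas):
    """Return (points, owners): sorted ring positions and the server owning each."""
    entries = sorted(
        (_hash(f"{server}#{i}"), server)  # one ring point per virtual replica
        for server in servers
        for i in range(replicas)
    )
    points = [point for point, _ in entries]
    owners = [server for _, server in entries]
    return points, owners

def lookup(points, owners, key):
    # First replica point clockwise from the key, wrapping around at the end.
    idx = bisect.bisect_right(points, _hash(key)) % len(points)
    return owners[idx]

def load_per_server(servers, replicas, keys):
    points, owners = build_ring(servers, replicas)
    return Counter(lookup(points, owners, key) for key in keys)

servers = ["s1", "s2", "s3"]
keys = [f"key{i}" for i in range(3000)]
single = load_per_server(servers, 1, keys)     # one point per server
virtual = load_per_server(servers, 100, keys)  # 100 virtual replicas per server
```

Comparing `single` with `virtual` shows how the key counts per server change as replicas are added.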

Page 35: Long Polling vs Web Sockets

  • Communication Protocols

    • Understanding the need for efficient and real-time data communication techniques.

Page 36: Long Polling Mechanism

  • Detailed Functionality

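    • Unlike a standard HTTP request, which the server answers immediately, a long-polling request is held open until data is available or a timeout expires; the client then sends a new request.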
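
The server side of long polling can be approximated in-process with a blocking queue read. This is only a sketch: `mailbox` and `long_poll` are hypothetical names, and a real server would hold an HTTP connection open instead of a function call.

```python
import queue

def long_poll(mailbox, timeout_seconds):
    """Hold the request until a message arrives or the timeout expires."""
    try:
        return mailbox.get(timeout=timeout_seconds)
    except queue.Empty:
        return None  # nothing new; the client is expected to poll again

mailbox = queue.Queue()
mailbox.put("new message")
first = long_poll(mailbox, timeout_seconds=0.05)   # data is waiting, returns at once
second = long_poll(mailbox, timeout_seconds=0.05)  # no data, returns after the timeout
```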

Page 37: Web Sockets Benefits

  • Persistent Communication

    • WebSockets provide a persistent, bidirectional channel over a single connection, so client and server can each send data at any time without a new request.

Page 38: Server-Sent Events (SSE)

  • Long-term Connections

    • Establishes persistent connections for real-time data transfer from server to client.
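
On the wire, each server-sent event is a block of `field: value` lines ended by a blank line. The helper below is a minimal sketch of that framing; the function name is hypothetical, and it covers only the `event` and `data` fields.

```python
def format_sse(data, event=None):
    """Encode one message in the text/event-stream format."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # Multi-line payloads are sent as one "data:" line per line of text.
    for part in data.split("\n"):
        lines.append(f"data: {part}")
    return "\n".join(lines) + "\n\n"  # a blank line terminates the event

simple = format_sse("hello")
named = format_sse("line one\nline two", event="update")
```

A server would write such blocks to a response that stays open with the `text/event-stream` content type.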