Parallel Computer Architecture
The method of organizing resources to maximize performance and programmability within given limits.
Uniform Memory Access (UMA)
Model where all processors share physical memory uniformly, with equal access time.
Symmetric Multiprocessor
All processors have equal access to peripheral devices in a system.
Asymmetric Multiprocessor
Only one or a few processors can access peripheral devices in a system.
Peripheral Device
Devices like printers, mice, scanners, and keyboards that can transfer data to or from memory without involving the processor.
Non-Uniform Memory Access (NUMA)
Model where memory access time varies with the location of the memory relative to the processor accessing it.
Local Memory
Memory that is physically distributed among the processors, so that each processor has its own local memory.
Distributed Memory Multicomputer System
Consists of multiple computers (nodes) interconnected by a message passing network, each node has its own processor, local memory, and I/O devices.
Pipelining
Technique that divides a task, such as executing an instruction, into smaller subtasks (stages) handled by different units working concurrently, so that several tasks overlap in time, improving performance and efficiency.
Parallelism by Multiple Functional Units
The number of functional units that can efficiently be utilized is restricted by data dependencies between neighboring instructions
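To illustrate how data dependencies between neighboring instructions limit the use of multiple functional units, here is a minimal sketch. The tuple-based instruction format and the function names are hypothetical, chosen only for this example. It models an in-order machine that can issue at most two neighboring instructions per cycle, and it checks only read-after-write dependencies.

```python
# Each instruction is (destination_register, source_registers).
# This format is an assumption made for illustration only.

def has_raw_dependency(first, second):
    """Return True if `second` reads the register that `first` writes."""
    dest, _ = first
    _, sources = second
    return dest in sources

def count_issue_cycles(instructions):
    """Count cycles when two neighboring instructions may issue together
    only if the second does not depend on the first."""
    cycles = 0
    i = 0
    while i < len(instructions):
        can_pair = (i + 1 < len(instructions)
                    and not has_raw_dependency(instructions[i], instructions[i + 1]))
        i += 2 if can_pair else 1
        cycles += 1
    return cycles

independent = [("r1", ("r2", "r3")), ("r4", ("r5", "r6")),
               ("r7", ("r8", "r9")), ("r10", ("r11", "r12"))]
dependent = [("r1", ("r2", "r3")), ("r4", ("r1", "r6")),
             ("r7", ("r4", "r9")), ("r10", ("r7", "r12"))]

print(count_issue_cycles(independent))  # 2
print(count_issue_cycles(dependent))    # 4
```

With a chain of dependent instructions the second functional unit is never used, so the four instructions take four cycles instead of two.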
Superscalar Processors
Dependencies determined dynamically at runtime by hardware, instructions dispatched to instruction units using dynamic scheduling.
Parallelism at Process or Thread Level
System where each core of a multicore processor must obtain a separate flow of control, accessing the same memory and sharing caches, requiring coordination of memory accesses.
Memory System Parallelism
Increasing the number of memory units and communication bandwidth.
Communication Parallelism
Increasing the number of interconnections between elements and the communication bandwidth.
Dataflow Architectures
Architecture that executes operations based on the availability and dependencies of data rather than the sequential order of instructions. Such machines are hard to build correctly.
Coherence
Writes to a location become visible to all processors in the same order, implemented with a hardware protocol based on the model of memory consistency.
Sequential Consistency
Ensures the order of operations executed by different processes appears consistent with a global order of execution and order of operations on each individual process.
ACID Transactions
Atomicity, Consistency, Isolation, and Durability ensure the integrity and reliability of database transactions.
Distributed Shared Memory Systems
Memory architecture where physically separated memory can be addressed as a single shared address space.
Page Based Approach
Uses virtual memory to map pages of shared data to the local memory of each processor.
Shared Variable Approach
Uses routines to access shared variables distributed across processors.
Object Based Approach
Uses objects as units of data distribution and access, with each object having a set of methods that can be performed on processors.
Components of Interconnection Networks
Links, Switches, Network Interfaces.
Direct Connection Networks
Fixed point-to-point connections between neighboring nodes with fixed topology, such as rings, meshes, and cubes.
Indirect Connection Networks
Communication topology can change dynamically based on application demands, such as bus networks, multistage networks, and crossbar switches.
Routing
Determines the path from source to destination and how packets are routed.
Dimension Order Routing
Limits legal paths so that there is exactly one route from each source to each destination, by traversing the network dimensions in a fixed order.
Deterministic Routing
Route taken by a message is determined exclusively by its source and destination, not by other traffic in the network.
Minimal Routing Algorithm
Selects the shortest route toward the destination of the message.
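The three routing cards above can be illustrated together with a small sketch of dimension-order (XY) routing on a 2D mesh. The coordinate representation and the function name are assumptions made for this example. The route depends only on source and destination (deterministic), there is exactly one legal path (dimension order), and the hop count equals the Manhattan distance (minimal).

```python
def dimension_order_route(src, dst):
    """Return the nodes visited from src to dst in a 2D mesh,
    correcting the x coordinate first and then the y coordinate."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path

route = dimension_order_route((0, 0), (2, 1))
print(route)           # [(0, 0), (1, 0), (2, 0), (2, 1)]
print(len(route) - 1)  # 3 hops, the Manhattan distance
```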
Domain Name System (DNS)
Translates domain names to IP addresses for browsers to load internet resources.
IP Address
Unique numeric identifier for a device connected to the internet, used to address that device so that clients such as browsers can reach it.
DNS Recursor
Server designed to receive queries from client machines through web browsers, responsible for making additional requests to satisfy the query.
Root Nameserver
First step in translating human-readable host names into IP addresses.
Top Level Domain (TLD) Nameserver
Hosts the last portion of a hostname, such as ".com" in "example.com".
Authoritative Nameserver
Returns the IP address for the requested hostname to the DNS recursor if it has access to the requested record.
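The lookup chain described in the DNS cards (recursor, root, TLD, authoritative) can be sketched as a toy model with in-memory dictionaries. Nothing here performs a real DNS query. The server names are hypothetical, and the address 192.0.2.10 comes from a range reserved for documentation.

```python
# Hypothetical zone data, for illustration only.
ROOT = {"com": "tld-com"}
TLD = {"tld-com": {"example.com": "ns-example"}}
AUTHORITATIVE = {"ns-example": {"www.example.com": "192.0.2.10"}}

def resolve(hostname):
    """Play the role of the DNS recursor: ask the root nameserver,
    then the TLD nameserver, then the authoritative nameserver."""
    labels = hostname.split(".")
    tld_server = ROOT[labels[-1]]            # root points to the ".com" server
    domain = ".".join(labels[-2:])
    auth_server = TLD[tld_server][domain]    # TLD points to the domain's server
    return AUTHORITATIVE[auth_server].get(hostname)  # None if no record exists

print(resolve("www.example.com"))  # 192.0.2.10
```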
Transmission Control Protocol (TCP)
Provides reliable, ordered, and error-checked delivery of a stream of bytes between applications running on hosts communicating via an IP network.
User Datagram Protocol (UDP)
Connectionless communications protocol used for low-latency, loss-tolerating communication between applications on the internet. It skips connection setup, ordering, and retransmission, which enables faster transmission.
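As a minimal sketch of what a UDP exchange looks like in practice, the example below sends one datagram over the loopback interface using Python's standard `socket` module. It assumes a local loopback interface is available. Note that the sender transmits without any prior handshake, in contrast to TCP.

```python
import socket

# Receiver: bind to an OS-chosen port on the loopback interface.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)  # avoid blocking forever if the datagram is lost
address = receiver.getsockname()

# Sender: no connect/accept step is needed before sending a datagram.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"hello", address)

data, _ = receiver.recvfrom(1024)
print(data)  # b'hello'

sender.close()
receiver.close()
```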
Open Systems Interconnection (OSI) Model
Conceptual model enabling diverse communication systems to communicate using standard protocols.
Distributed System
Collection of interconnected computers working together to achieve a common goal, processing and storing data, performing computations, and providing services across multiple machines.
MapReduce
A programming model or pattern within the Hadoop framework used to access big data in the Hadoop File System (HDFS).
Hadoop
A framework that gives companies the ability to store and process huge amounts of data across clusters of machines. It includes a distributed file system (HDFS).
Apache Spark
A multi-language engine for executing data engineering, data science, and machine learning on single node machines or clusters
Map
Splits data into smaller blocks and assigns them to mappers for processing.
Reduce
Map output values with the same key are assigned to a single reducer, which aggregates them.
Combine
(Optional) A reducer that runs individually on each mapper server.
Partition
Takes the key-value pairs produced by the mappers and decides which reducer receives each key, so that all values for a given key reach the same reducer.
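The Map, Combine, Partition, and Reduce cards above can be tied together with a single-process word-count sketch. This is plain Python, not the Hadoop API, and every function name is hypothetical. The partitioner uses a simple byte-sum hash, an assumption made so that results are stable across runs.

```python
from collections import defaultdict

def map_phase(block):
    """Map: emit a (word, 1) pair for every word in one block of input."""
    return [(word, 1) for word in block.split()]

def combine(pairs):
    """Combine (optional): pre-sum counts locally on one mapper's output."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return list(totals.items())

def partition(key, num_reducers):
    """Partition: choose which reducer receives a key (toy hash)."""
    return sum(key.encode()) % num_reducers

def reduce_phase(key, values):
    """Reduce: aggregate all values that share one key."""
    return key, sum(values)

def word_count(blocks, num_reducers=2):
    reducer_inputs = [defaultdict(list) for _ in range(num_reducers)]
    for block in blocks:
        for key, value in combine(map_phase(block)):
            reducer_inputs[partition(key, num_reducers)][key].append(value)
    result = {}
    for grouped in reducer_inputs:
        for key, values in grouped.items():
            word, total = reduce_phase(key, values)
            result[word] = total
    return result

counts = word_count(["a b a", "b c"])
print(sorted(counts.items()))  # [('a', 2), ('b', 2), ('c', 1)]
```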
YARN
Goal is to split up the functionalities of resource management and job scheduling/monitoring into separate daemons.
Container
Holds physical resources on a single node, such as RAM, CPU cores, and disk.
Container Launch Context (CLC)
Contains records of dependencies, security tokens, environment variables.
Application Master
Sends the CLC to the Node Manager to request that a container be launched.
Node Manager
Takes care of individual nodes in the Hadoop cluster and manages containers related to each node. It is registered to the Resource Manager and sends each node's health status.
Resource Manager
The master daemon of YARN, which assigns resources to applications.
Scheduler
Responsible for allocating resources to various applications subject to familiar constraints of capacities, queues, etc.
Applications Manager
Responsible for accepting job submissions, negotiating the first container for executing the application-specific ApplicationMaster, and providing the service for restarting the ApplicationMaster container on failure.
Multi-tenancy
Allows access to multiple data processing engines.
Pipelining Steps
Fetch, decode, execute, write back
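A short calculation shows why overlapping these stages helps. The sketch below assumes an ideal pipeline in which every stage takes one cycle and there are no stalls. The function names are chosen for illustration.

```python
STAGES = ["fetch", "decode", "execute", "write back"]

def unpipelined_cycles(num_instructions, num_stages=len(STAGES)):
    """Each instruction runs all stages before the next one starts."""
    return num_stages * num_instructions

def pipelined_cycles(num_instructions, num_stages=len(STAGES)):
    """Fill the pipeline once, then complete one instruction per cycle."""
    if num_instructions == 0:
        return 0
    return num_stages + num_instructions - 1

print(unpipelined_cycles(100))  # 400
print(pipelined_cycles(100))    # 103
```

For long instruction streams the speedup approaches the number of stages, here 4.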
Data Parallelism
Increases the amount of data operated on at the same time
Processor Parallelism
Increases the number of processors
Atomicity (ACID)
The entire transaction takes place as a single unit: either all of its operations happen or none do
Consistency (ACID)
Database must be consistent before and after the transaction
Isolation (ACID)
Multiple transactions occur independently without interference
Durability (ACID)
Changes made by a successful transaction persist even if there is a system failure
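Atomicity and consistency can be seen in a small sketch using Python's built-in `sqlite3` module. The table, the account names, and the `transfer` helper are hypothetical. An in-memory database is used for convenience, so durability is not demonstrated here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"  # consistency rule
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()

def transfer(conn, sender, receiver, amount):
    """Move money between accounts as one atomic transaction."""
    try:
        with conn:  # commits on success, rolls back on an exception
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, receiver))
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, sender))
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(conn, "alice", "bob", 500)  # alice cannot afford this
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, balances)  # False {'alice': 100, 'bob': 50}
```

The credit to bob is undone when the debit from alice violates the constraint, so the database never shows a half-finished transfer.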
Links
A cable of one or more optical fibers or electrical wires that carries analog signals from one end to the other, where they are converted back into the original digital information
Switches
Composed of a set of I/O ports, an internal crossbar connecting inputs to outputs, internal buffers, and control logic that sets up the input-to-output connections at each point in time
Network Interfaces
Format the packets and construct the routing and control information; may also perform end-to-end error checking and flow control
Topology
The pattern to connect the individual switches to other elements like processors, memories, and other switches
Concurrency of Distributed Systems
Distributed systems leverage concurrency and parallelism to improve performance and throughput
Redundancy
Multiple copies of data or services are maintained to ensure availability in case of failures
Fault Tolerance
Mechanisms that include replication and data recovery techniques
Client Server Architecture
Clients request services or resources from central servers and central servers handle data processing and storage
Peer to Peer Architecture
Allows distributed nodes (peers) to act as both clients and servers where peers share resources directly without a central server
Microservices Architecture
An application is composed of small, independent services that each focus on a specific function and communicate through APIs; common in cloud-based applications
Distributed Storage Systems
NoSQL databases such as Cassandra and MongoDB, and the Hadoop Distributed File System (HDFS) for big data, all use distributed storage
Content Delivery Network
Uses geographical distribution to reduce latency and improve user experience
Driver
Converts the user’s code into multiple tasks that can be distributed across worker nodes
Executors
Run on the worker nodes and execute the tasks assigned to them
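The driver and executor roles can be sketched by analogy with Python's standard `concurrent.futures` module. This is not the Spark API: the thread pool stands in for executors on worker nodes, and the function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_tasks(data, num_tasks):
    """Driver role: split the job into roughly equal slices."""
    size = -(-len(data) // num_tasks)  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]

def run_task(chunk):
    """Executor role: compute a partial result for one slice."""
    return sum(x * x for x in chunk)

data = list(range(10))
tasks = split_into_tasks(data, 3)

# The pool stands in for executors running on worker nodes.
with ThreadPoolExecutor(max_workers=3) as pool:
    partials = list(pool.map(run_task, tasks))

total = sum(partials)  # the driver gathers the partial results
print(total)  # 285
```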
Complex Problems
Often require exponential time to solve, making them impractical for large datasets
P Problems
Problems that are solvable in polynomial time, so their execution time grows predictably with the input size
NP Problems
Problems for which a proposed solution can be verified in polynomial time
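The gap between checking and finding a solution can be illustrated with subset sum, a standard NP problem. In this sketch, whose function names and certificate format (a list of indices) are assumptions made for the example, verification is fast while the naive search may try up to 2^n subsets.

```python
from itertools import combinations

def verify(numbers, target, certificate):
    """Check a proposed solution (distinct, valid indices) quickly."""
    return (len(set(certificate)) == len(certificate)
            and all(0 <= i < len(numbers) for i in certificate)
            and sum(numbers[i] for i in certificate) == target)

def search(numbers, target):
    """Brute-force search: may examine every subset of indices."""
    for size in range(len(numbers) + 1):
        for indices in combinations(range(len(numbers)), size):
            if verify(numbers, target, indices):
                return list(indices)
    return None

numbers = [3, 34, 4, 12, 5, 2]
print(verify(numbers, 9, [2, 4]))  # True, since 4 + 5 == 9
print(search(numbers, 9))          # [2, 4]
```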
Decision Problems
Involve determining a binary outcome based on the input
Optimization Problems
Seek the best solution from a set of feasible solutions
Nick’s Class
The class NC: problems efficiently solvable in parallel, meaning in polylogarithmic time on a polynomial number of processors, which corresponds to low-depth circuits
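Summing n numbers is a simple example of a low-depth computation. The sketch below simulates a pairwise reduction sequentially and counts the rounds. Within a round all additions are independent, so a parallel machine could perform them at once. The function name is chosen for illustration.

```python
def tree_sum(values):
    """Sum values by pairwise addition.
    Returns (total, rounds), where rounds is the depth of the addition tree."""
    level = list(values)
    rounds = 0
    while len(level) > 1:
        # All additions in this round are independent of each other.
        next_level = [level[i] + level[i + 1]
                      for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            next_level.append(level[-1])  # odd element carries over
        level = next_level
        rounds += 1
    return (level[0] if level else 0), rounds

print(tree_sum(range(1, 9)))  # (36, 3)
```

Eight values need only 3 rounds, since the depth grows as the base-2 logarithm of the input size rather than linearly.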
Algorithmic Complexity
Refers to the efficiency of algorithms in terms of their time and space requirements. Assesses how the performance of an algorithm scales as the input size increases
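How performance scales with input size can be made concrete by counting steps. The sketch below compares a linear scan with binary search on sorted input when the target is the last element. The step-counting functions are hypothetical helpers written for this example.

```python
def linear_search_steps(sorted_items, target):
    """Count the elements examined by a left-to-right scan."""
    steps = 0
    for item in sorted_items:
        steps += 1
        if item == target:
            break
    return steps

def binary_search_steps(sorted_items, target):
    """Count the probes made by binary search on sorted input."""
    steps, lo, hi = 0, 0, len(sorted_items) - 1
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            break
        if sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return steps

for n in (1_000, 1_000_000):
    items = range(n)  # already sorted: 0, 1, ..., n - 1
    print(n, linear_search_steps(items, n - 1), binary_search_steps(items, n - 1))
```

The linear scan examines n elements, while binary search needs about log2(n) probes, so growing the input a thousandfold adds only around ten probes.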