Parallel Final Notes

83 Terms

Parallel Computer Architecture

The method of organizing a system's resources to maximize performance and programmability within the limits set by technology and cost.

Uniform Memory Access (UMA)

Model where all processors share physical memory uniformly, with equal access time.

Symmetric Multiprocessor

All processors have equal access to peripheral devices in a system.

Asymmetric Multiprocessor

Only one or a few processors can access peripheral devices in a system.

Peripheral Device

Devices like printers, mice, scanners, and keyboards that are attached to the system; with direct memory access (DMA) they can transfer data to or from memory without involving the processor.

Non-Uniform Memory Access (NUMA)

Model where memory access time varies with the location of the memory relative to the processor making the access.

Local Memory

The portion of a physically distributed memory that is attached to one processor; each processor has its own local memory, which it can access faster than memory attached to other processors.

Distributed Memory Multicomputer System

Consists of multiple computers (nodes) interconnected by a message-passing network. Each node has its own processor, local memory, and I/O devices.

Pipelining

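Technique that divides a task (such as executing an instruction) into smaller sequential stages and assigns each stage to a different hardware unit, so that several tasks are in progress concurrently, each at a different stage, improving throughput and efficiency.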

Parallelism by Multiple Functional Units

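Several independent functional units (such as ALUs, floating-point units, and load/store units) work in parallel. The number of functional units that can be used efficiently is restricted by data dependencies between neighboring instructions.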

Superscalar Processors

Processors with multiple functional units in which dependencies are determined dynamically at runtime by hardware, and decoded instructions are dispatched to the functional units using dynamic scheduling.

Parallelism at Process or Thread Level

System where each core of a multicore processor must obtain a separate flow of control, accessing the same memory and sharing caches, requiring coordination of memory accesses.

Memory System Parallelism

Increasing the number of memory units and communication bandwidth.

Communication Parallelism

Increasing the number of interconnections between elements and the communication bandwidth.

Dataflow Architectures

Architecture that executes operations as soon as their input data is available, based on data dependencies rather than the sequential order of instructions. Hard to build correctly.

Coherence

Writes to a location become visible to all processors in the same order, implemented with a hardware protocol based on the model of memory consistency.

Sequential Consistency

Ensures the order of operations executed by different processes appears consistent with a global order of execution and order of operations on each individual process.
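
Example (a minimal sketch, not a hardware model): the two hypothetical threads below each store to one variable and then load the other. Enumerating every global order that preserves each thread's program order shows which results sequential consistency allows. The thread programs and register names are assumptions made for illustration.

```python
from itertools import combinations

# Each thread is a list of operations in program order.
# ("store", var, value) writes shared memory; ("load", var, reg) reads it.
THREAD_A = [("store", "x", 1), ("load", "y", "r1")]
THREAD_B = [("store", "y", 1), ("load", "x", "r2")]


def interleavings(a, b):
    """Yield every merge of a and b that keeps each thread's own order."""
    n = len(a) + len(b)
    for slots in combinations(range(n), len(a)):
        merged, ia, ib = [], iter(a), iter(b)
        for i in range(n):
            merged.append(next(ia) if i in slots else next(ib))
        yield merged


def run(schedule):
    """Execute one global order against a single shared memory."""
    memory = {"x": 0, "y": 0}
    regs = {}
    for op, var, arg in schedule:
        if op == "store":
            memory[var] = arg
        else:
            regs[arg] = memory[var]
    return regs["r1"], regs["r2"]


# Set of (r1, r2) results reachable under sequential consistency.
outcomes = {run(s) for s in interleavings(THREAD_A, THREAD_B)}
print(sorted(outcomes))  # [(0, 1), (1, 0), (1, 1)]
```

The result (0, 0) never appears: in any single global order, at least one store precedes the other thread's load. Weaker memory models may allow it.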

ACID Transactions

Atomicity, Consistency, Isolation, and Durability ensure the integrity and reliability of database transactions.

Distributed Shared Memory Systems

Memory architecture where physically separated memory can be addressed as a single shared address space.

Page Based Approach

Uses virtual memory to map pages of shared data to the local memory of each processor.

Shared Variable Approach

Uses routines to access shared variables distributed across processors.

Object Based Approach

Uses objects as the units of data distribution and access; each object has a set of methods through which processors access its data.

Components of Interconnection Networks

Links, Switches, Network Interfaces.

Direct Connection Networks

Fixed point-to-point connections between neighboring nodes with fixed topology, such as rings, meshes, and cubes.

Indirect Connection Networks

Communication topology can change dynamically based on application demands, such as bus networks, multistage networks, and crossbar switches.

Routing

Determines the path a message or packet takes through the network from its source to its destination.

Dimension Order Routing

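Limits the legal paths to exactly one route from each source to each destination by correcting the coordinates one dimension at a time in a fixed order (for example, first X, then Y in a 2D mesh).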
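
Example (a minimal sketch, assuming a 2D mesh whose nodes are addressed by (x, y) coordinates): the hypothetical function `xy_route` moves along the X dimension until the X coordinate matches the destination, then along Y. Because the dimension order is fixed, each source/destination pair gets exactly one path.

```python
def xy_route(src, dst):
    """Return the hop-by-hop path from src to dst on a 2D mesh.

    Nodes are (x, y) tuples. The X coordinate is corrected first and the
    Y coordinate second, so each source/destination pair has one legal path.
    """
    x, y = src
    path = [(x, y)]
    while x != dst[0]:          # travel along the X dimension first
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:          # then travel along the Y dimension
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


print(xy_route((0, 0), (2, 1)))  # [(0, 0), (1, 0), (2, 0), (2, 1)]
```

The route depends only on the source and destination, so this scheme is also deterministic, and on a mesh it is minimal because every hop moves closer to the destination.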

Deterministic Routing

Route taken by a message is determined exclusively by its source and destination, not by other traffic in the network.

Minimal Routing Algorithm

Selects the shortest route toward the destination of the message.

Domain Name System (DNS)

Translates domain names to IP addresses for browsers to load internet resources.

IP Address

Unique numeric ID for a device connected to the internet; browsers use it to locate and communicate with servers.

DNS Recursor

Server designed to receive queries from client machines through web browsers, responsible for making additional requests to satisfy the query.

Root Nameserver

First step in translating human-readable host names into IP addresses; it refers the recursor to the appropriate TLD nameserver.

Top Level Domain (TLD) Nameserver

Hosts the last portion of a hostname, such as ".com" in "example.com".

Authoritative Nameserver

Returns the IP address for the requested hostname to the DNS recursor if it has access to the requested record.
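
Example (a toy sketch, not real DNS code): the lookup chain of recursor, root nameserver, TLD nameserver, and authoritative nameserver can be mimicked with dictionaries. All server names and records below are hypothetical, and the address comes from a range reserved for documentation.

```python
# Hypothetical zone data; real servers hold far more records.
ROOT = {"com": "tld-com"}                              # TLD -> TLD nameserver
TLD = {"tld-com": {"example.com": "ns-example"}}       # domain -> authoritative server
AUTHORITATIVE = {"ns-example": {"www.example.com": "192.0.2.10"}}


def resolve(hostname):
    """Play the recursor: ask root, then TLD, then the authoritative server."""
    labels = hostname.split(".")
    tld_server = ROOT[labels[-1]]                  # step 1: root nameserver
    domain = ".".join(labels[-2:])
    auth_server = TLD[tld_server][domain]          # step 2: TLD nameserver
    return AUTHORITATIVE[auth_server][hostname]    # step 3: authoritative nameserver


print(resolve("www.example.com"))  # 192.0.2.10
```

A real recursor also caches answers, so most queries do not need all three steps.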

Transmission Control Protocol (TCP)

Provides reliable, ordered, and error-checked delivery of a stream of bytes between applications running on hosts communicating via an IP network.

User Datagram Protocol (UDP)

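Connectionless communications protocol used for low-latency, loss-tolerant communication between applications on the internet. It skips the handshake, ordering, and retransmission that TCP performs, enabling faster transmission at the cost of reliability.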

Open Systems Interconnection (OSI) Model

Conceptual seven-layer model that enables diverse communication systems to communicate using standard protocols.

Distributed System

Collection of interconnected computers working together to achieve a common goal, processing and storing data, performing computations, and providing services across multiple machines.

MapReduce

A programming model or pattern within the Hadoop framework used to process big data stored in the Hadoop Distributed File System (HDFS) in parallel across a cluster.

Hadoop

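An open-source framework that gives companies the ability to store and process huge amounts of data across clusters of machines. It includes a distributed file system (HDFS), a resource manager (YARN), and the MapReduce processing model.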

Apache Spark

A multi-language engine for executing data engineering, data science, and machine learning on single node machines or clusters

Map

The input data is split into smaller blocks, and each block is assigned to a mapper, which processes it and emits intermediate key-value pairs.

Reduce

Map output values that share the same key are assigned to a single reducer, which aggregates them into the final result for that key.

Combine

(Optional) A reducer that runs individually on each mapper server.

Partition

Decides which reducer receives each key-value pair emitted by the mappers, typically by hashing the key, so that all values for one key reach the same reducer.
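
Example (a single-process sketch of the map, combine, partition, and reduce phases, not Hadoop's API): a word count in plain Python. All function names are hypothetical, and the partitioner uses a simple byte sum instead of a real hash function so that the result is reproducible.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def combine(pairs):
    """Combiner: pre-sum the counts in a single mapper's output."""
    local = defaultdict(int)
    for key, value in pairs:
        local[key] += value
    return list(local.items())


def partition(key, num_reducers):
    """Partitioner: pick the reducer for a key (same key, same reducer)."""
    return sum(key.encode()) % num_reducers


def reduce_phase(key, values):
    """Reducer: aggregate all values that share one key."""
    return key, sum(values)


def word_count(lines, num_reducers=2):
    # One mapper per line; each mapper's output goes through the combiner.
    mapper_outputs = [combine(map_phase(line)) for line in lines]
    # Shuffle: route every pair to its reducer and group values by key.
    reducers = [defaultdict(list) for _ in range(num_reducers)]
    for output in mapper_outputs:
        for key, value in output:
            reducers[partition(key, num_reducers)][key].append(value)
    # Each reducer handles its own keys independently.
    result = {}
    for groups in reducers:
        for key, values in groups.items():
            word, total = reduce_phase(key, values)
            result[word] = total
    return result


counts = word_count(["the quick fox", "the lazy dog", "the fox"])
print(counts)
```

In a real cluster the mappers and reducers run on different machines; here they run one after another only to show how data flows between the phases.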

YARN

Hadoop's resource management layer (Yet Another Resource Negotiator). Its goal is to split the functionalities of resource management and job scheduling/monitoring into separate daemons.

Container

A collection of physical resources, such as RAM, CPU cores, and disk, on a single node.

Container Launch Context (CLC)

Record that contains what is needed to launch a container: dependencies, security tokens, and environment variables.

Application Master

Requests a container from the Node Manager by sending it a Container Launch Context (CLC).

Node Manager

Takes care of an individual node in the Hadoop cluster and manages the containers on that node. It registers with the Resource Manager and sends it the node's health status.

Resource Manager

The master daemon of YARN; it assigns cluster resources to applications.

Scheduler

Responsible for allocating resources to various applications subject to familiar constraints of capacities, queues, etc.

Applications Manager

Responsible for accepting job submissions, negotiating the first container for executing the application-specific ApplicationMaster, and providing the service for restarting the ApplicationMaster container on failure.

Multi-tenancy

Allows multiple data processing engines to run on and share the same cluster.

Pipelining Steps

Fetch, decode, execute, write back
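
Example (a simplified sketch that ignores hazards and stalls): with the four stages above, instruction i can enter the first stage in cycle i and advance one stage per cycle. The hypothetical helper `pipeline_schedule` lists what is active in each cycle, which shows why n instructions on a k-stage pipeline need k + n - 1 cycles instead of n * k.

```python
STAGES = ["fetch", "decode", "execute", "write back"]


def pipeline_schedule(num_instructions, stages=STAGES):
    """Map each clock cycle to the (instruction, stage) pairs active in it.

    Instruction i enters the first stage in cycle i and advances one stage
    per cycle. Hazards and stalls are ignored.
    """
    schedule = {}
    for i in range(num_instructions):
        for s, stage in enumerate(stages):
            schedule.setdefault(i + s, []).append((i, stage))
    return schedule


schedule = pipeline_schedule(5)
pipelined_cycles = len(schedule)        # k + n - 1 = 4 + 5 - 1
sequential_cycles = 5 * len(STAGES)     # n * k = 5 * 4
print(pipelined_cycles, sequential_cycles)  # 8 20
print(schedule[3])  # cycle 3: all four stages are busy
```

From cycle 3 onward one instruction completes per cycle, even though each instruction still takes four cycles from start to finish.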

Data Parallelism

Increases the amount of data operated on at the same time

Processor Parallelism

Increases the number of processors

Atomicity (ACID)

The entire transaction takes place at once: either all of its operations happen or none of them do

Consistency (ACID)

Database must be consistent before and after the transaction

Isolation (ACID)

Multiple transactions occur independently without interference

Durability (ACID)

Changes made by a successful (committed) transaction persist even if there is a system failure
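
Example (a small sketch using Python's built-in sqlite3 module; the table, account names, and amounts are made up for illustration): a transfer that would overdraw an account violates a CHECK constraint, so the whole transaction is rolled back, including the credit that had already been applied. This shows atomicity and consistency; durability is not shown because the database lives in memory.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between two accounts as one transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "alice", "bob", 500)   # would overdraw alice
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, balances)
```

After the failed transfer both balances are unchanged: the credit to bob was undone together with the rejected debit.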

Links

A cable of one or more optical fibers or electrical wires that carries an analog signal from one end to the other, where it is converted back into the original digital information

Switches

Consists of a set of input and output ports, an internal crossbar connecting inputs to outputs, internal buffers, and control logic that sets up the input-output connections at each point in time

Network Interfaces

Formats the packets and constructs the routing and control information; may also perform end-to-end error checking and flow control

Topology

The pattern in which the individual switches are connected to other elements like processors, memories, and other switches

Concurrency of Distributed Systems

Distributed systems leverage concurrency and parallelism to improve performance and throughput

Redundancy

Multiple copies of data or services are maintained to ensure availability in case of failures

Fault Tolerance

The ability of a system to keep operating when components fail, achieved through mechanisms that include replication and data recovery techniques

Client Server Architecture

Clients request services or resources from central servers and central servers handle data processing and storage

Peer to Peer Architecture

Allows distributed nodes (peers) to act as both clients and servers where peers share resources directly without a central server

Microservices Architecture

Architecture in which an application is composed of small, independent services that each focus on a specific function and communicate through APIs; common in cloud-based applications

Distributed Storage Systems

Systems that spread stored data across multiple machines. Examples include NoSQL databases such as Cassandra and MongoDB, and the Hadoop Distributed File System (HDFS) for big data

Content Delivery Network

Geographically distributed network of servers that delivers content from a location close to each user to reduce latency and improve user experience

Driver

In Apache Spark, the process that converts the user's code into multiple tasks that can be distributed across worker nodes

Executors

Run on the worker nodes and execute the tasks assigned to them

Complex Problems

Often require exponential time to solve, making them impractical for large datasets

P Problems

Problems that are solvable in polynomial time; their execution time is bounded by a polynomial in the input size

NP Problems

Problems for which a proposed solution can be verified in polynomial time
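
Example (a minimal sketch; subset sum is chosen here as a standard NP problem, and the function name is hypothetical): finding a subset of numbers that adds up to a target may take exponential time in the worst case with known algorithms, but checking a proposed subset takes only one pass over it.

```python
def verify_subset_sum(numbers, target, certificate):
    """Check a proposed solution in time linear in the input size.

    certificate is a list of distinct indices into numbers.
    """
    if len(set(certificate)) != len(certificate):
        return False                      # an index was used twice
    if any(i < 0 or i >= len(numbers) for i in certificate):
        return False                      # an index is out of range
    return sum(numbers[i] for i in certificate) == target


numbers = [3, 34, 4, 12, 5, 2]
print(verify_subset_sum(numbers, 9, [2, 4]))   # True: 4 + 5 == 9
print(verify_subset_sum(numbers, 9, [0, 1]))   # False: 3 + 34 != 9
```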

Decision Problems

Involve determining a binary outcome based on the input

Optimization Problems

Seek the best solution from a set of feasible solutions

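Nick's Class (NC)

Represents problems efficiently solvable in parallel: those solvable in polylogarithmic time using a polynomial number of processors, or equivalently by polynomial-size circuits of low (polylogarithmic) depth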
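
Example (a sequential sketch that only counts the parallel rounds; the function name is hypothetical): summing n numbers is a standard problem in this class. Adding neighbors pairwise halves the list in every round, and all additions in one round are independent of each other, so the depth is about log2(n).

```python
def tree_sum(values):
    """Sum values by pairwise combination, returning (total, rounds).

    All additions within one round are independent, so with enough
    processors each round takes one parallel step.
    """
    level = list(values)
    rounds = 0
    while len(level) > 1:
        # Pair up neighbors; an odd element out is carried to the next round.
        level = [sum(level[i:i + 2]) for i in range(0, len(level), 2)]
        rounds += 1
    return (level[0] if level else 0), rounds


total, rounds = tree_sum(range(1, 17))
print(total, rounds)  # 136 4
```

Sixteen values need 4 rounds instead of 15 sequential additions.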

Algorithmic Complexity

Refers to the efficiency of algorithms in terms of time and space requirements. Assesses how the performance of an algorithm scales with an increase in input size