Parallel Final Theory

Last updated 4:33 PM on 12/18/23

52 Terms

Executable File Access

The ability of the operating system to locate executable files or scripts by searching the directories listed in the path, so the user does not have to specify the full path.
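
As a rough illustration of this lookup, the sketch below uses Python's standard `shutil.which`, which searches the directories in `PATH` in order. The helper name `find_executable` is only an illustrative wrapper, not part of the original notes.

```python
import os
import shutil

def find_executable(name, path=None):
    """Return the full path of `name` if a PATH directory contains it, else None."""
    # shutil.which walks the directories in PATH (or the `path` argument) in order
    return shutil.which(name, path=path)

# Directories the OS searches, split on the platform's separator (":" or ";")
search_dirs = os.environ.get("PATH", "").split(os.pathsep)
print(len(search_dirs), "directories on PATH")
print(find_executable("python3") or find_executable("python"))
```

A command that is not in any `PATH` directory yields `None`, which is why such a program must then be run by its full path.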

Convenience

Simplifies command-line usage and makes it more user-friendly by allowing programs or scripts to be run from any location.

Efficiency

Saves time and reduces navigation complexity by allowing easy access to frequently used tools.

Script execution

The ability to run scripts that rely on executables located in various directories by including those directories in the path.

Customization

The ability to customize the path to include directories relevant to specific needs, extending functionality.

Environmental portability

The ability for software to run on different systems without modification by relying on standard tools located in directories specified in the path.

Environment variable

A variable that allows users and system administrators to configure and customize the behavior of applications and the operating system.

Configuration and customization

The ability to configure and customize the behavior of applications and the operating system using environment variables.

Portability

The ability for software to adapt to different environments and systems without modification by using environment variables instead of hardcoding file paths.
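
A minimal sketch of the idea, assuming a hypothetical variable name `APP_DATA_DIR` and a hypothetical default folder `app-data` (neither comes from the original notes): the program reads its data directory from the environment and falls back to a per-user default rather than hardcoding a path.

```python
import os
from pathlib import Path

def data_dir():
    """Return the data directory, preferring the environment over a built-in default."""
    # APP_DATA_DIR is a hypothetical variable name used only for illustration
    default = Path.home() / "app-data"
    return Path(os.environ.get("APP_DATA_DIR", default))

print(data_dir())
```

The same code then runs unchanged on machines with different directory layouts; only the variable's value differs.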

Ease of Management

The ability to centrally manage critical system configurations using system-wide environment variables, which simplifies updates.

Consistency and Standardization

Using environment variables to standardize configurations makes it easier to ensure that software behaves predictably across different environments.

Uniform memory access (UMA)

All processors share the physical memory uniformly and have equal access time to every memory word. Such a system is also known as a symmetric multiprocessor (SMP).

Non-uniform memory access

Access time varies with the location of the memory word, because the shared memory is physically distributed among the processors as local memories.

Distributed memory multicomputers

Consist of multiple computers known as nodes, interconnected by a message passing network, with each node acting as an autonomous computer with private local memories.

Parallelism by pipeline

Overlapping the execution of multiple instructions by dividing them into different steps performed by dedicated hardware.
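
The benefit of overlapping can be estimated with a small calculation. The sketch below assumes an idealized pipeline (one cycle per stage, no stalls or hazards), which the card does not state; the function names are illustrative.

```python
def cycles_unpipelined(n_instructions, n_stages):
    """Each instruction runs all stages before the next one starts."""
    return n_instructions * n_stages

def cycles_pipelined(n_instructions, n_stages):
    """The first instruction fills the pipeline, then one instruction finishes per cycle."""
    return n_stages + n_instructions - 1

def speedup(n_instructions, n_stages):
    return cycles_unpipelined(n_instructions, n_stages) / cycles_pipelined(n_instructions, n_stages)

print(cycles_unpipelined(100, 5))  # 500
print(cycles_pipelined(100, 5))    # 104
```

Under these assumptions the speedup approaches the number of stages as the instruction count grows.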

Parallelism by multiple functional units

Increasing the number of functional units to perform operations simultaneously, limited by data dependencies.

Parallelism at process or thread level

Each core must obtain a separate flow of control, coordinating memory accesses and sharing resources.
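
A small sketch of separate flows of control that coordinate access to shared memory, using Python's standard `threading` module. The worker function and the counts are illustrative; note that in CPython threads illustrate the coordination pattern rather than a true multi-core speedup.

```python
import threading

counter = 0
lock = threading.Lock()

def worker(n):
    """Increment the shared counter n times, holding the lock for each update."""
    global counter
    for _ in range(n):
        with lock:  # coordinate access to the shared variable
            counter += 1

# Four independent flows of control sharing one memory location
threads = [threading.Thread(target=worker, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000
```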

Getting parallelism in the hardware

Achieving parallelism through instruction level parallelism, data parallelism, processor parallelism, memory system parallelism, and communication parallelism.

Dataflow Architecture

Representing computations as a graph of dependencies, storing operations in memories until operands are ready, and dispatching operations to processors.
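
The firing rule described above can be sketched in software: an operation waits until all its operands are available, then fires. This is only a toy interpreter with an assumed graph encoding (`node -> (function, operand names)`), not a model of any specific dataflow machine.

```python
import operator

def run_dataflow(graph, inputs):
    """graph maps node -> (function, [operand names]); inputs maps name -> value."""
    values = dict(inputs)
    pending = dict(graph)
    while pending:
        # An operation is ready once every operand has a value
        ready = [n for n, (_, deps) in pending.items() if all(d in values for d in deps)]
        if not ready:
            raise ValueError("cycle or missing operand")
        for node in ready:  # these could be dispatched to processors in parallel
            fn, deps = pending.pop(node)
            values[node] = fn(*(values[d] for d in deps))
    return values

graph = {
    "sum": (operator.add, ["a", "b"]),
    "diff": (operator.sub, ["a", "b"]),
    "prod": (operator.mul, ["sum", "diff"]),
}
result = run_dataflow(graph, {"a": 5, "b": 3})
print(result["prod"])  # (5 + 3) * (5 - 3) = 16
```

Here `sum` and `diff` have no dependency on each other, so they become ready in the same round, while `prod` must wait for both.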

Memory consistency

The rules for the order in which memory operations become visible: coherence ensures that writes to a location are visible to all processors, and event synchronization establishes the order between writes and reads.

Sequential Consistency

Achieving a total order by interleaving accesses from different processes, maintaining program order, with each memory operation appearing to complete atomically with respect to the others.
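
To make the definition concrete, the sketch below enumerates every interleaving of two small programs that keeps each program's own order, and collects the possible results. The two programs (a classic store-then-load test) and all names are illustrative assumptions, not taken from the card.

```python
from itertools import combinations

# P1: x = 1; r1 = y        P2: y = 1; r2 = x
P1 = [("write", "x", 1), ("read", "y", "r1")]
P2 = [("write", "y", 1), ("read", "x", "r2")]

def interleavings(a, b):
    """Yield every merge of a and b that preserves each list's program order."""
    n = len(a) + len(b)
    for slots in combinations(range(n), len(a)):
        ia, ib = iter(a), iter(b)
        yield [next(ia) if i in slots else next(ib) for i in range(n)]

def run(schedule):
    """Execute one total order; each operation completes atomically."""
    memory = {"x": 0, "y": 0}
    regs = {}
    for op, loc, arg in schedule:
        if op == "write":
            memory[loc] = arg
        else:
            regs[arg] = memory[loc]
    return regs["r1"], regs["r2"]

outcomes = {run(s) for s in interleavings(P1, P2)}
print(sorted(outcomes))  # [(0, 1), (1, 0), (1, 1)]
```

The result `(0, 0)` never appears: under sequential consistency at least one write precedes the other process's read in any total order.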

ACID

An acronym that stands for Atomicity, Consistency, Isolation, and Durability, defining properties of a transaction in a database system.
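
Atomicity, the first of these properties, can be sketched with Python's built-in `sqlite3` module. The `accounts` table, the names, and the amounts are made up for illustration; the `CHECK` constraint is used only to force a failure partway through a transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; either both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst))
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))

ok = transfer(conn, "alice", "bob", 500)  # the debit violates the CHECK constraint
print(ok, balances(conn))
```

The credit to `bob` runs first and succeeds, but because the debit fails, the whole transaction is rolled back and both balances are unchanged.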

Distributed Shared Memory Systems

A form of memory architecture where physically separated memories can be addressed as a single shared address space.

Page-based approach

Using virtual memory to address physically separated memories in a distributed memory system.

Shared variable approach

Accessing physically separated memories through routines that access shared variables or global variables.

Object Based Approach

Accessing physically separated memories in a distributed memory system through an object-oriented discipline.

Links

Cables with connectors at each end used to transmit analog signals from one end to the other in an interconnection network.

Switches

Components with input and output ports connected by an internal crossbar, allowing the exchange of data between processors in a parallel system.

Network interfaces

Components that attach nodes to the network. They behave differently from switch nodes, may be connected via special links, and are responsible for formatting packets and constructing routing and control information.

Interconnection Network

A network composed of switching elements, whose topology is the pattern connecting the individual switches to other elements such as processors, memories, and other switches, enabling data exchange between processors in a parallel system.

Direct connection networks

Networks with point-to-point connections between neighboring nodes, where the connections are fixed and static.

Indirect connection networks

Networks without fixed neighbors, where the topology can be changed based on application demands, including bus networks, multistage networks, and crossbar switches.

Routing

The process of determining the path between a source and its destination in an interconnection network.

Deterministic Routing

Routing where the route taken is determined exclusively by the source and destination, without being influenced by other traffic.
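
One common example of deterministic routing is dimension-order (XY) routing on a 2D mesh, which the card does not name; it is chosen here only as an illustration. The sketch assumes nodes are addressed by `(x, y)` coordinates.

```python
def xy_route(src, dst):
    """Return the hops from src to dst, moving along X first and then along Y."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:  # correct the X coordinate first
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:  # then correct the Y coordinate
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path

print(xy_route((0, 0), (2, 1)))  # [(0, 0), (1, 0), (2, 0), (2, 1)]
```

The function takes no traffic information as input, so the same source and destination always produce the same path.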

Domain Name System (DNS)

The phonebook of the internet, translating domain names to IP addresses so that web browsers can load resources.

Loading a Webpage

The process of translating a domain name to an IP address using DNS recursors, root nameservers, TLD nameservers, and authoritative nameservers.
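
The chain of lookups can be sketched with in-memory tables standing in for the servers. Everything here is hypothetical: the server names, the table layout, and the address (taken from the reserved documentation range 192.0.2.0/24). No real DNS queries are made.

```python
# Hypothetical stand-ins for the three server tiers
ROOT = {"com": "tld-com"}                                   # root: TLD -> TLD nameserver
TLD = {"tld-com": {"example.com": "ns-example"}}            # TLD: domain -> authoritative server
AUTH = {"ns-example": {"www.example.com": "192.0.2.10"}}    # authoritative: host -> IP

def resolve(hostname, cache):
    """Play the role of the recursor: walk root -> TLD -> authoritative, caching the answer."""
    if hostname in cache:
        return cache[hostname], ["cache"]
    labels = hostname.split(".")
    tld, domain = labels[-1], ".".join(labels[-2:])
    steps = []
    tld_server = ROOT[tld]
    steps.append("root")
    auth_server = TLD[tld_server][domain]
    steps.append("tld")
    ip = AUTH[auth_server][hostname]
    steps.append("authoritative")
    cache[hostname] = ip
    return ip, steps

cache = {}
print(resolve("www.example.com", cache))  # ('192.0.2.10', ['root', 'tld', 'authoritative'])
print(resolve("www.example.com", cache))  # ('192.0.2.10', ['cache'])
```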

MapReduce

A programming model used to access big data stored in HDFS, which facilitates concurrent processing by splitting data into smaller chunks and processing them in parallel.

Map

The process in MapReduce where data is divided into smaller chunks and each chunk is assigned to a mapper for processing.

Reduce

The process in MapReduce where the processed data is shuffled, sorted, and passed to reducers. All data with the same key is assigned to the same reducer, which aggregates them.

Combine

An optional step in MapReduce where a local reducer runs on each mapper's output to shrink the data before it leaves the mapper, making shuffling and sorting easier.

Partition

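The step that decides which reducer receives each key. The default partitioner in MapReduce computes a hash value of the key and assigns the key-value pair to a partition based on that hash. It creates as many partitions as there are reducers.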
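
The map, combine, partition, and reduce steps can be sketched in a single process with a word count. This is a toy model, not Hadoop code: the function names are illustrative, and `zlib.crc32` stands in for the partitioner's hash function as an assumption.

```python
import zlib
from collections import defaultdict

def map_phase(chunk):
    """Mapper: emit (word, 1) for every word in one chunk of input."""
    return [(word.lower(), 1) for word in chunk.split()]

def combine(pairs):
    """Combiner: a local reduce over one mapper's output."""
    local = defaultdict(int)
    for key, value in pairs:
        local[key] += value
    return list(local.items())

def partition(key, n_reducers):
    """Hash the key so that equal keys always go to the same reducer."""
    return zlib.crc32(key.encode("utf-8")) % n_reducers

def reduce_phase(key, values):
    """Reducer: aggregate all values that share a key."""
    return key, sum(values)

def word_count(chunks, n_reducers=2):
    partitions = [defaultdict(list) for _ in range(n_reducers)]
    for chunk in chunks:  # each chunk would go to its own mapper
        for key, value in combine(map_phase(chunk)):
            partitions[partition(key, n_reducers)][key].append(value)  # shuffle
    result = {}
    for part in partitions:  # each partition would go to its own reducer
        for key in sorted(part):  # sort keys within the partition
            k, total = reduce_phase(key, part[key])
            result[k] = total
    return result

print(word_count(["the cat sat", "the dog sat on the mat"]))
```

In the second chunk the combiner turns two `("the", 1)` pairs into one `("the", 2)` pair, so less data crosses the shuffle.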

YARN

A framework in Hadoop that splits the functionalities of resource management and job scheduling/monitoring into separate daemons: a global resource manager and a per-application application master.

Container

A collection of physical resources, such as RAM, CPU cores, and disk, on a single node in YARN. Containers are invoked using the Container Launch Context (CLC).

Application Master

In YARN, a single submitted job is called an application. The application master negotiates resources with the resource manager, tracks and monitors the progress of its application, and requests containers from the node manager by sending the CLC.

Node Manager

In YARN, the node manager takes care of individual nodes and their containers. It manages the containers assigned by the resource manager and creates and runs process containers when requested.

Resource Manager

In YARN, the resource manager is the master daemon responsible for resource management and assignment of all the applications. It forwards requests to the corresponding node managers and allocates resources.

Scheduler

In YARN, the scheduler allocates resources to the various running applications, subject to constraints such as capacities and queues.

Applications Manager

In YARN, the applications manager is responsible for accepting job submissions, negotiating the first container for running the application master, and providing the service that restarts the application master container on failure.

Multi Tenancy

A feature of YARN that allows access to multiple data processing engines, enabling the running of different types of distributed applications rather than just MapReduce.

Scalability

A feature of YARN that allows for the utilization of many nodes and clusters, making it suitable for large-scale data processing.

Compatibility

A feature of YARN that lets existing MapReduce applications written for earlier versions of Hadoop run without disruption, ensuring compatibility with different environments.

AM (Application Master) Lifecycle

The steps in the lifecycle of an application in YARN: the resource manager (RM) allocates a container to start the AM; the AM registers with the RM; the AM negotiates containers with the RM; the node manager launches the containers; the application code executes in the containers; the application status is monitored through the RM or the AM; and the AM unregisters from the RM once processing is complete.