Executable File Access
The ability of the operating system to locate executable files or scripts without specifying the full path.
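As a minimal sketch of this lookup, Python's standard `shutil.which` searches the directories listed in `PATH` much as a shell does. The helper names `find_on_path` and `path_directories` are illustrative, not part of any standard tool.

```python
import os
import shutil


def find_on_path(command):
    """Return the full path of `command` if a PATH directory contains it, else None."""
    return shutil.which(command)


def path_directories():
    """Split PATH into its directories, in the order they are searched."""
    return [d for d in os.environ.get("PATH", "").split(os.pathsep) if d]
```

Calling `find_on_path("python3")` returns a full path only when some directory in `PATH` holds that executable. Otherwise it returns `None`, which is the case where the full path would have to be typed by hand.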
Convenience
Simplifies command-line usage and makes it more user-friendly by allowing programs or scripts to be run from any location.
Efficiency
Saves time and reduces navigation complexity by allowing easy access to frequently used tools.
Script execution
The ability to run scripts that rely on executables located in various directories by including those directories in the path.
Customization
The ability to customize the path to include directories relevant to specific needs, extending functionality.
Environmental portability
The ability for software to run on different systems without modification by relying on standard tools located in directories specified in the path.
Environmental variable
A variable that allows users and system administrators to configure and customize the behavior of applications and the operating system.
Configuration and customization
The ability to configure and customize the behavior of applications and the operating system using environmental variables.
Portability
The ability for software to adapt to different environments and systems without modification by using environmental variables instead of hardcoding file paths.
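One way to picture this kind of portability: the program reads a directory from an environment variable and falls back to a default, so no file path is hardcoded. The variable name `APP_DATA_DIR` and the default `data` are assumptions made up for this sketch.

```python
import os
from pathlib import Path


def data_dir(env=os.environ):
    """Return the data directory from APP_DATA_DIR (hypothetical name), or a default."""
    return Path(env.get("APP_DATA_DIR", "data"))
```

The same code then runs unchanged on two machines that store data in different places. Only the value of the variable differs.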
Ease of Management
The ability to centrally manage critical system configurations using system-wide environmental variables, simplifying the process of updating.
Consistency and Standardization
Easier to ensure that software behaves predictably across different environments by using environmental variables to standardize configurations.
Uniform memory access
All processors share the physical memory uniformly and have equal access time to every memory word; this is the architecture of a symmetric multiprocessor (SMP).
Non-uniform memory access
Access time varies with the location of the memory word, with shared memory being physically distributed as local memories.
Distributed memory multicomputers
Consist of multiple computers known as nodes, interconnected by a message passing network, with each node acting as an autonomous computer with private local memories.
Parallelism by pipeline
Overlapping the execution of multiple instructions by dividing them into different steps performed by dedicated hardware.
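The benefit of overlapping can be estimated with the textbook cycle count for an ideal pipeline. This sketch assumes every stage takes one cycle and that there are no stalls or hazards; the function names are illustrative.

```python
def unpipelined_cycles(stages, instructions):
    """Each instruction passes through all stages before the next one starts."""
    return stages * instructions


def pipelined_cycles(stages, instructions):
    """The first instruction fills the pipeline; after that one completes per cycle."""
    return stages + instructions - 1


def ideal_speedup(stages, instructions):
    return unpipelined_cycles(stages, instructions) / pipelined_cycles(stages, instructions)
```

With 5 stages and 100 instructions this gives 500 cycles against 104, so the speedup approaches the number of stages as the instruction count grows.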
Parallelism by multiple functional units
Increasing the number of functional units to perform operations simultaneously, limited by data dependencies.
Parallelism at process or thread level
Each core must obtain a separate flow of control, coordinating memory accesses and sharing resources.
Getting parallelism in the hardware
Achieving parallelism through instruction level parallelism, data parallelism, processor parallelism, memory system parallelism, and communication parallelism.
Dataflow Architecture
Representing computations as a graph of dependencies, storing operations in memories until operands are ready, and dispatching operations to processors.
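A toy interpreter can illustrate the firing rule: an operation stays pending until all its operands have values, then it is dispatched. This is a software sketch of the idea, not a model of any real dataflow machine, and the graph encoding (name mapped to an operator and a list of operand names) is an assumption of the example.

```python
import operator


def run_dataflow(graph, inputs):
    """Fire each operation once all of its operands are available."""
    values = dict(inputs)
    pending = dict(graph)  # name -> (function, [operand names])
    fired = []
    while pending:
        ready = [name for name, (_, deps) in pending.items()
                 if all(d in values for d in deps)]
        if not ready:
            raise ValueError("no operation can fire: missing operands")
        for name in ready:
            func, deps = pending.pop(name)
            values[name] = func(*(values[d] for d in deps))
            fired.append(name)
    return values, fired


# (a + b) * c and a - c; "prod" must wait for "sum".
GRAPH = {
    "sum": (operator.add, ["a", "b"]),
    "prod": (operator.mul, ["sum", "c"]),
    "diff": (operator.sub, ["a", "c"]),
}
```

In each round, every operation in `ready` is independent of the others, so a real machine could dispatch them to different processors at once.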
Memory consistency
The coherence of memory operations, ensuring that writes to a location are visible to all processors, and establishing orders between writes and reads using event synchronization.
Sequential Consistency
Achieving a total order by interleaving accesses from different processes, maintaining program order and completing each memory operation atomically with respect to the others.
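The definition can be checked on a small two-process example by enumerating every interleaving that keeps each process's program order. The two programs below (each writes one variable, then reads the other) are a standard illustration chosen for this sketch.

```python
from itertools import combinations

# Each process writes one shared variable, then reads the other into a register.
P1 = [("write", "x", 1), ("read", "y", "r1")]
P2 = [("write", "y", 1), ("read", "x", "r2")]


def interleavings(a, b):
    """Yield every merge of a and b that preserves the order within each list."""
    n = len(a) + len(b)
    for positions in combinations(range(n), len(a)):
        chosen = set(positions)
        ia, ib = iter(a), iter(b)
        yield [next(ia) if i in chosen else next(ib) for i in range(n)]


def run(schedule):
    """Execute one interleaving atomically, step by step, on shared memory."""
    memory = {"x": 0, "y": 0}
    regs = {}
    for kind, loc, arg in schedule:
        if kind == "write":
            memory[loc] = arg
        else:
            regs[arg] = memory[loc]
    return regs["r1"], regs["r2"]


def sc_outcomes():
    return {run(s) for s in interleavings(P1, P2)}
```

Under sequential consistency the result `(r1, r2) == (0, 0)` never appears, because whichever read comes last in the total order must see the other process's write.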
ACID
An acronym that stands for Atomicity, Consistency, Isolation, and Durability, defining properties of a transaction in a database system.
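Atomicity in particular is easy to demonstrate with Python's built-in `sqlite3` module. The `accounts` table, the names, and the `CHECK` constraint are made up for this sketch. The credit is applied first on purpose, so that a failing debit has something to roll back.

```python
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
    )
    conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                     [("alice", 100), ("bob", 50)])
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move money as one transaction: both updates apply, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                         (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False  # the CHECK constraint rejected an overdraft


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

A transfer that would overdraw the source account fails on the second update, and the rollback also undoes the credit that had already been applied.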
Distributed Shared Memory Systems
A form of memory architecture where physically separated memories can be addressed as a single shared address space.
Page-based approach
Using virtual memory to address physically separated memories in a distributed memory system.
Shared variable approach
Accessing physically separated memories through routines that access shared variables or global variables.
Object Based Approach
Accessing physically separated memories in a distributed memory system through an object-oriented discipline.
Links
Cables with connectors at each end used to transmit analog signals from one end to the other in an interconnection network.
Switches
Components with input and output ports connected by an internal crossbar, allowing the exchange of data between processors in a parallel system.
Network interfaces
Components that behave differently from switch nodes and may be connected via special links, formatting packets and constructing routing and control information.
Interconnection Network
A network composed of switching elements whose topology connects the individual switches to other elements, such as processors, memories, and other switches, enabling data exchange between processors in a parallel system.
Direct connection networks
Networks with point-to-point connections between neighboring nodes, where the connections are fixed and static.
Indirect connection networks
Networks without fixed neighbors, where the topology can be changed based on application demands, including bus networks, multistage networks, and crossbar switches.
Routing
The process of determining the path between a source and its destination in an interconnection network.
Deterministic Routing
Routing where the route taken is determined exclusively by the source and destination, without being influenced by other traffic.
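Dimension-order (XY) routing on a 2D mesh is a common example of deterministic routing: the packet first travels along X, then along Y. The mesh coordinates and the function name are assumptions of this sketch.

```python
def xy_route(src, dst):
    """Return the hops from src to dst: correct X first, then Y."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path
```

`xy_route((0, 0), (2, 1))` always returns `[(0, 0), (1, 0), (2, 0), (2, 1)]`. The function takes no traffic information at all, so the same source and destination always give the same path.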
Domain Name System (DNS)
The phonebook of the internet, translating domain names to IP addresses so that web browsers can load resources.
Loading a Webpage
The process of translating a domain name to an IP address using DNS recursors, root nameservers, TLD nameservers, and authoritative nameservers.
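The chain of lookups can be sketched with plain dictionaries standing in for the servers. Everything here is simulated: the server labels are invented, and `192.0.2.10` comes from an address range reserved for documentation. No real DNS query is made.

```python
# Simulated servers (all names and addresses are placeholders).
ROOT = {"com": "tld-com"}                                   # root: TLD -> TLD nameserver
TLD = {"tld-com": {"example.com": "ns-example"}}            # TLD: domain -> authoritative
AUTHORITATIVE = {"ns-example": {"www.example.com": "192.0.2.10"}}


def resolve(name, cache=None):
    """Act as the recursor: ask root, then TLD, then the authoritative server."""
    cache = {} if cache is None else cache
    if name in cache:
        return cache[name], ["cache"]
    labels = name.split(".")
    tld, domain = labels[-1], ".".join(labels[-2:])
    steps = []
    tld_server = ROOT[tld]
    steps.append("root")
    auth_server = TLD[tld_server][domain]
    steps.append("tld")
    address = AUTHORITATIVE[auth_server][name]
    steps.append("authoritative")
    cache[name] = address
    return address, steps
```

A second lookup of the same name with the same `cache` dictionary is answered from the cache and contacts none of the simulated servers.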
MapReduce
A programming model for processing big data stored in HDFS; it enables concurrent processing by splitting the data into smaller chunks and processing them in parallel.
Map
The process in MapReduce where data is divided into smaller chunks and each chunk is assigned to a mapper for processing.
Reduce
The process in MapReduce where the processed data is shuffled, sorted, and passed to reducers. All data with the same key is assigned to the same reducer, which aggregates them.
Combine
An optional step in MapReduce where a combiner, acting as a local reducer, runs on each mapper's output to shrink the data before it is shuffled and sorted.
Partition
The step in MapReduce that decides which reducer receives each key-value pair. The default partitioner computes a hash of the key and uses it to choose the partition, and it creates as many partitions as there are reducers.
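The Map, Combine, Partition, and Reduce cards can be tied together with a single-process word count. This is a sketch of the data flow only, not Hadoop code. `zlib.crc32` stands in for the key hash because Python's built-in `hash` for strings changes between runs.

```python
import zlib
from collections import Counter, defaultdict


def map_phase(chunk):
    """Mapper: emit (word, 1) for every word in one chunk."""
    return [(word.lower(), 1) for word in chunk.split()]


def combine(pairs):
    """Combiner: sum counts locally on one mapper's output before shuffling."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return list(totals.items())


def partition(key, num_reducers):
    """Hash the key so that equal keys always reach the same reducer."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def reduce_phase(key, values):
    """Reducer: aggregate all values that share a key."""
    return key, sum(values)


def word_count(chunks, num_reducers=2):
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for chunk in chunks:                       # one mapper per chunk
        for key, value in combine(map_phase(chunk)):
            partitions[partition(key, num_reducers)][key].append(value)
    result = {}
    for part in partitions:                    # one reducer per partition
        for key in sorted(part):               # keys arrive sorted at the reducer
            word, total = reduce_phase(key, part[key])
            result[word] = total
    return result
```

For the chunks `["a b a", "b c"]`, the combiner turns the first mapper's three pairs into two before the shuffle, and the final counts are `{"a": 2, "b": 2, "c": 1}`.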
YARN
A framework in Hadoop that splits resource management and job scheduling/monitoring into separate daemons, including a global resource manager and a per-application application master.
Container
A collection of physical resources, such as RAM, CPU cores, and disk, on a single node in YARN. Containers are invoked using the Container Launch Context (CLC).
Application Master
In YARN, when a single job is submitted, it is called an application. The application master is responsible for monitoring certain aspects of the application and fulfilling its requirements by sending the CLC.
Node Manager
In YARN, the node manager takes care of individual nodes and their containers. It manages the containers assigned by the resource manager and creates and runs process containers when requested.
Resource Manager
In YARN, the resource manager is the master daemon responsible for resource management and assignment of all the applications. It forwards requests to the corresponding node managers and allocates resources.
Scheduler
In YARN, the scheduler allocates resources to the various running applications, subject to constraints such as capacities and queues.
Applications Manager
In YARN, the applications manager is responsible for accepting job submissions, negotiating the first container for running the application master, and providing the service for restarting the application master container on failure.
Multi Tenancy
A feature of YARN that allows access to multiple data processing engines, enabling the running of different types of distributed applications rather than just MapReduce.
Scalability
A feature of YARN that allows for the utilization of many nodes and clusters, making it suitable for large-scale data processing.
Compatibility
A feature of YARN that allows it to be used with various versions of Hadoop, ensuring compatibility with different environments.
AM (Application Master) Lifecycle
The steps involved in the lifecycle of an application in YARN: the resource manager allocates a container to start the AM; the AM registers with the RM; the AM negotiates containers with the RM; the node manager launches the containers; the application code executes in the container; the application status is monitored through the RM or AM; and the AM unregisters from the RM once the process is complete.
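The order of these steps can be traced with a toy simulation. The classes below are plain Python stand-ins written for this sketch and do not correspond to the Hadoop API. The monitoring step is reduced to a single log entry.

```python
class ResourceManager:
    """Toy RM: hands out container ids and tracks registered AMs."""

    def __init__(self, log):
        self.log = log
        self.registered = set()
        self.next_container = 0

    def allocate_container(self, purpose):
        self.next_container += 1
        self.log.append(f"RM allocates container {self.next_container} for {purpose}")
        return self.next_container

    def register(self, app):
        self.registered.add(app)
        self.log.append(f"AM for {app} registers with RM")

    def unregister(self, app):
        self.registered.discard(app)
        self.log.append(f"AM for {app} unregisters from RM")


class NodeManager:
    """Toy NM: launches a container by running the given callable."""

    def __init__(self, log):
        self.log = log

    def launch(self, container, work):
        self.log.append(f"NM launches container {container}")
        return work()


def run_application(app, work):
    log = []
    rm, nm = ResourceManager(log), NodeManager(log)
    am_container = rm.allocate_container(f"{app} AM")    # container for the AM
    nm.launch(am_container, lambda: None)                # AM starts
    rm.register(app)                                     # AM registers with RM
    task_container = rm.allocate_container(f"{app} task")  # AM negotiates a container
    result = nm.launch(task_container, work)             # application code runs
    log.append(f"status of {app}: finished")             # monitoring, simplified
    rm.unregister(app)                                   # AM unregisters
    return result, log
```

Printing the returned `log` shows the events in the same order as the lifecycle described above.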