Parallel Final Notes

Session 16: Parallel Computer Architecture

Parallel Computer Architecture → The method of organizing all the resources to maximize performance and programmability within the limits given by technology and cost at any point in time

Uniform Memory Access → Model where all the processors share the physical memory uniformly

  • They each have equal access time to all memory
  • Symmetric Multiprocessor → When all the processors have equal access to all the peripheral devices in a system
  • Asymmetric Multiprocessor → When only one or a few processors can access the peripheral devices in a system
  • Peripheral Device → Can transfer data to or from memory without involving the processor
    • Ex) printer, mouse, scanner, keyboard

Non-Uniform Memory Access → Model where memory access time varies with the location of the memory itself

  • Local Memory → When memory is physically distributed among all the processors
  • Each processor has its own local memory, which it can access faster than non-local memory

Distributed Memory Multicomputer System → Consists of multiple computers (nodes) that are interconnected by a message passing network

  • Each node acts autonomously, possessing a processor, local memory, and I/O devices
    • All local memories are private and only accessible to the local processors

Pipelining → A technique that divides a task into smaller subtasks and assigns them to different processing units that can work concurrently

  • Improves performance and efficiency
  • Steps:
  1. Fetch → Fetch the next instruction to be executed from memory
  2. Decode → Decode the instructions
  3. Execute → Perform the actual operation given by the instructions
  4. Write Back → Write the result to the target register
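
The steps above can be visualized with a minimal Python sketch (hypothetical instruction names; no hazards or stalls assumed) showing several instructions occupying different stages in the same cycle:

```python
# A minimal sketch of a 4-stage instruction pipeline (assumes no hazards or stalls).
STAGES = ["Fetch", "Decode", "Execute", "WriteBack"]
instructions = ["I1", "I2", "I3", "I4"]

# In cycle c, instruction i (0-based) occupies stage (c - i) if that index is valid.
total_cycles = len(instructions) + len(STAGES) - 1
for cycle in range(total_cycles):
    active = []
    for i, instr in enumerate(instructions):
        stage = cycle - i
        if 0 <= stage < len(STAGES):
            active.append(f"{instr}:{STAGES[stage]}")
    print(f"cycle {cycle + 1}: " + ", ".join(active))
# 4 instructions finish in 7 cycles instead of 16 sequential stage-slots.
```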

Parallelism by Multiple Functional Units:

  • The number of functional units that can efficiently be utilized is restricted by data dependencies between neighboring instructions
  • Superscalar Processors → Dependencies are determined dynamically at runtime by the hardware, and decoded instructions are dispatched to instruction units using dynamic scheduling

Parallelism at Process or Thread Level → System where each core of a multicore processor must obtain a separate flow of control

  • Each core accesses the same memory, and the cores may share caches
    • Memory accesses must be coordinated

Parallelism in Hardware:

  1. Data Parallelism → Increases the amount of data operated on at the same time
  2. Processor Parallelism → Increases the amount of processors
  3. Memory System Parallelism → Increases the number of memory units and increases communication bandwidth
  4. Communication Parallelism → Increases the amount of interconnections between elements and increases communication bandwidth

Dataflow Architectures → Architecture that processes data based on the availability and dependencies of the data rather than the sequential order of instructions

  • Hard to build correctly

Coherence → Implies that writes to a location become visible to all processors in the same order

  • Implemented by a hardware protocol based on a memory consistency model

Sequential Consistency → Ensures that the order in which operations are executed by different processes in the system appears to be consistent with a global order of execution, and that this order is also consistent with the order of operations on each individual process

  • Conditions:
  1. Each process issues memory operations in program order
  2. The issuing process waits for the write to be complete before issuing the next operation
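
A small Python sketch of the idea: enumerating every interleaving of two hypothetical two-operation programs that respects program order shows which outcomes sequential consistency allows (the outcome where both reads return 0 never occurs):

```python
from itertools import combinations

# Two processes, each issuing its operations in program order (condition 1).
P1 = [("write", "x", 1), ("read", "y", "r1")]
P2 = [("write", "y", 1), ("read", "x", "r2")]

def interleavings(a, b):
    """All merges of a and b that preserve each list's internal order."""
    n = len(a) + len(b)
    for positions in combinations(range(n), len(a)):
        merged, ai, bi = [], 0, 0
        for i in range(n):
            if i in positions:
                merged.append(a[ai]); ai += 1
            else:
                merged.append(b[bi]); bi += 1
        yield merged

outcomes = set()
for schedule in interleavings(P1, P2):
    mem = {"x": 0, "y": 0}
    regs = {}
    for op, loc, arg in schedule:
        if op == "write":
            mem[loc] = arg
        else:
            regs[arg] = mem[loc]
    outcomes.add((regs["r1"], regs["r2"]))

print(outcomes)  # (0, 0) never appears: at least one write precedes both reads
```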

ACID Transactions:

  1. Atomicity → Entire transaction takes place at once
  2. Consistency → Database must be consistent before and after the transaction
  3. Isolation → Multiple transactions occur independently without interference
  4. Durability → Changes made by a successful transaction persist even if there is a system failure
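
A minimal sketch of atomicity using Python's built-in sqlite3 module (hypothetical accounts table; either both updates commit or neither does):

```python
import sqlite3

# In-memory database; transfer between two hypothetical accounts.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()

try:
    with conn:  # one transaction: both updates commit together or not at all
        conn.execute("UPDATE accounts SET balance = balance - 30 WHERE name = 'alice'")
        conn.execute("UPDATE accounts SET balance = balance + 30 WHERE name = 'bob'")
        # raise RuntimeError("simulated failure")  # uncomment: both updates roll back
except RuntimeError:
    pass

print(dict(conn.execute("SELECT name, balance FROM accounts")))  # {'alice': 70, 'bob': 30}
```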

Session 17: Distributed Memory Systems

Distributed Memory Systems → A form of memory architecture where physically separated memory can be addressed as a single shared address space

  • The same physical address on two processors refers to the same location in memory
  • Page Based Approach → Uses virtual memory to map pages of shared data to the local memory of each processor
  • Shared Variable Approach → Uses routines to access shared variables that are distributed across the processors
  • Object Based Approach → Uses objects as the units of data distribution and access with each object having a set of methods that can be performed on the processors
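
As a loose single-machine analogy for the shared-variable approach above (real distributed shared memory spans separate machines), the Python sketch below accesses a shared counter only through library-provided routines and explicit coordination; the names are illustrative:

```python
from multiprocessing import Process, Value, Lock

# Worker processes access a shared counter through library routines rather than
# plain local memory, loosely mirroring the shared-variable approach.
def worker(counter, lock, n):
    for _ in range(n):
        with lock:                 # coordination a DSM runtime would otherwise provide
            counter.value += 1

if __name__ == "__main__":
    counter = Value("i", 0)        # shared integer, accessible from all processes
    lock = Lock()
    procs = [Process(target=worker, args=(counter, lock, 1000)) for _ in range(4)]
    for p in procs: p.start()
    for p in procs: p.join()
    print(counter.value)           # 4000
```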

Session 18: Interconnection Network Design

Components of Interconnection Networks:

  1. Links → A cable of one or more optical fibers or electrical wires that carries analog signals from one end to the other, from which the original digital information is recovered
  2. Switches → Composed of a set of I/O ports, an internal crossbar connecting inputs to outputs, internal buffers, and control logic that effects the I/O connection at each point in time
  3. Network Interfaces → Format packets and construct the routing and control information
  • May perform end-to-end error checking and flow control

Topology → The pattern to connect the individual switches to other elements like processors, memories, and other switches

Types of Networks:

  1. Direct Connection Networks → Fixed point-to-point connections between neighboring nodes
  • Fixed topology
  • Rings, meshes, cubes
  2. Indirect Connection Networks → Communication topology can be changed dynamically based on the application demands
  • Bus networks, multistage networks, crossbar switches

Routing → Determines which of the possible paths from source to destination are used as routes and how the route followed by each particular packet is determined

Dimension Order Routing → Limits the set of legal paths so that there is exactly one route from each source to each destination

Deterministic Routing → The route taken by a message is determined exclusively by its source and destination and not by other traffic in the network

Minimal Routing Algorithm → Selects the shortest route toward the destination of the message
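
A minimal sketch of dimension-order (XY) routing on a 2D mesh, assuming nodes are addressed by (x, y) coordinates: the x dimension is corrected first, then the y dimension, yielding exactly one minimal, deterministic route per source/destination pair:

```python
# Dimension-order (XY) routing on a 2D mesh.
def xy_route(src, dst):
    """Return the list of mesh nodes visited from src to dst (inclusive)."""
    (x, y), (dx, dy) = src, dst
    path = [(x, y)]
    while x != dx:                      # correct the x dimension first
        x += 1 if dx > x else -1
        path.append((x, y))
    while y != dy:                      # then the y dimension
        y += 1 if dy > y else -1
        path.append((x, y))
    return path

print(xy_route((0, 0), (2, 3)))
# [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (2, 3)] -- deterministic and minimal
```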

Domain Name System (DNS) → Translates domain names (e.g., nytimes.com) to IP addresses so browsers can load internet resources

IP Address → A unique identifier for a device connected to the internet; it allows browsers and other clients to reach that device

DNS Recursor → A server designed to receive queries from client machines through web browsers

  • Also responsible for making additional requests to satisfy the query

Root Nameserver → First step in translating human-readable host names into IP addresses

Top Level Domain (TLD) Nameserver → The next step in the search for a specific IP address; it hosts the last portion of a hostname (for example.com, the TLD server is the com server)

Authoritative Nameserver → If it has access to the requested record, it will return the IP address for the requested hostname to the DNS recursor
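
The whole chain above is hidden behind the operating system's resolver; a minimal Python sketch (requires network access, and example.com is just an illustrative hostname):

```python
import socket

# The OS resolver (backed by the recursor / root / TLD / authoritative chain)
# turns a hostname into IP addresses.
infos = socket.getaddrinfo("example.com", 443, proto=socket.IPPROTO_TCP)
for family, _, _, _, sockaddr in infos:
    print(family.name, sockaddr[0])   # prints the address family and IP of each result
```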

Transmission Control Protocol (TCP) → Provides reliable, ordered, and error-checked delivery of a stream of bytes between applications running on hosts communicating via an IP network

User Datagram Protocol (UDP) → Communications protocol that is used to establish low latency and loss tolerating connections between applications on the internet

  • Speeds up transmission by enabling data transfer before the receiving party provides an agreement (no handshake)
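
A minimal Python socket sketch contrasting the two (the addresses are placeholders and the connection calls are commented out so it runs offline):

```python
import socket

# TCP: connection-oriented byte stream (handshake, ordering, retransmission).
tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# tcp.connect(("example.com", 80))        # handshake happens here
# tcp.sendall(b"GET / HTTP/1.0\r\n\r\n")  # bytes arrive in order or not at all

# UDP: connectionless datagrams (no handshake, no delivery guarantee).
udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# udp.sendto(b"ping", ("203.0.113.1", 9999))  # fire-and-forget datagram

tcp.close(); udp.close()
```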

Open Systems Interconnection (OSI) Model → Conceptual model that enables diverse communication systems to communicate using standard protocols

Session 19: Distributed Systems

Distributed System → A collection of interconnected computers that work together to achieve a common goal

  • Designed to process and store data, perform computations, and provide services across multiple machines
  • Offer solutions to challenges of scalability, fault tolerance, and performance

Decentralization → Reduces single points of failure and enhances fault tolerance

Heterogeneity in Distributed Systems:

  • Distributed systems often consist of diverse hardware and software components
    • Components vary in processing power, operating systems, and programming languages
  • Can lead to compatibility and communication issues

Concurrency of Distributed Systems → Distributed systems leverage concurrency and parallelism to improve performance and throughput

Redundancy → Multiple copies of data or services are maintained to ensure availability in case of failures

Fault Tolerance → Mechanisms that include replication and data recovery techniques

Challenges and Considerations in Distributed Systems:

  1. Communication Challenges
  • Network latency, packet loss, and the need for reliable communication protocols
  2. Consistency and Data Synchronization
  • Distributed systems often employ techniques like distributed transactions, quorum systems, and consensus algorithms to maintain data integrity
  3. Security and Authentication
  • Security is critical in distributed systems, requiring authentication, authorization, and encryption components
  4. Scalability and Load Balancing
  • Load balancing is the distribution of work evenly across nodes to ensure optimal utilization (a minimal round-robin sketch follows this list)
    • Complex in large scale systems
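
A minimal round-robin load-balancing sketch (backend names are placeholders):

```python
from itertools import cycle

# Requests are handed to backend nodes in rotation.
backends = ["node-a", "node-b", "node-c"]
next_backend = cycle(backends)

def dispatch(request_id):
    target = next(next_backend)
    print(f"request {request_id} -> {target}")

for rid in range(7):
    dispatch(rid)
# request 0 -> node-a, 1 -> node-b, 2 -> node-c, 3 -> node-a, ...
```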

Client Server Architecture → Clients request services or resources from central servers, which handle data processing and storage

Peer to Peer Architecture → Allows distributed nodes (peers) to act as both clients and servers where peers share resources directly without a central server

Microservices Architecture → Where an application is composed of small, independent services that focus on a specific function and communicate through APIs

  • Common in cloud based applications

Distributed Storage Systems → NoSQL databases such as Cassandra and MongoDB, and the Hadoop Distributed File System (HDFS) for big data, all use distributed storage

Advantages of Distributed Systems:

  1. Provide horizontal scalability to handle increased workloads
  2. Using parallelism enhances performance
  3. Withstand system failures
  4. Redundancy provides fault tolerance
  5. High availability through load balancing
  6. Enable applications to serve users worldwide

Content Delivery Network (CDN) → Uses geographical distribution to reduce latency and improve user experience

Session 22: Big Data, Hadoop, Spark, & The Cloud

Big Data → Data that contains greater variety arriving in increasing volumes with ever higher velocity

  • 3 Vs → Variety, volume, velocity

Velocity:

  • 90% of the world’s data has been created in the last two years

Variety:

  • 80% of data is unstructured
  • 20% is structured

Advanced Analytic Capabilities:

  1. data/text mining
  2. machine learning
  3. pattern matching
  4. forecasting
  5. semantic analysis
  6. sentiment analysis
  7. network and cluster analysis
  8. multivariate statistics
  9. neural networks

Hadoop → A technology that gives companies the ability to store and process huge amounts of data

  • Distributed File System
  • How it works:
  1. Every file in the system is split into several blocks
  2. Each block is stored in different data nodes
  3. Each block is replicated 3 times (by default) to recover in case of data loss
  • Blocks help store large files and provide fault tolerance
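
A back-of-the-envelope sketch of the block math (assuming the common 128 MB default block size and the default replication factor of 3; the file size is hypothetical):

```python
import math

# How a file is split into HDFS blocks and replicated across the cluster.
BLOCK_SIZE_MB = 128     # assumed default block size
REPLICATION = 3         # default replication factor

file_size_mb = 1000     # hypothetical 1 GB file
blocks = math.ceil(file_size_mb / BLOCK_SIZE_MB)
stored_copies = blocks * REPLICATION

print(f"{blocks} blocks, {stored_copies} block replicas across the cluster")
# 8 blocks, 24 block replicas
```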

Apache Spark → A multi-language engine for executing data engineering, data science, and machine learning on single node machines or clusters

  • An evolution of the MapReduce model
  • Quickly performs processing tasks across multiple computers
  • Components:
  1. Driver → Converts the user’s code into multiple tasks that can be distributed across worker nodes
  2. Executors → Run on the worker nodes and execute the tasks assigned to them
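
A minimal PySpark word-count sketch (assumes the pyspark package is installed; the driver builds the job and Spark turns it into tasks executed on the workers):

```python
from pyspark.sql import SparkSession

# The driver program: builds the computation below; executors run the tasks.
spark = SparkSession.builder.appName("wordcount-sketch").getOrCreate()
sc = spark.sparkContext

lines = sc.parallelize(["big data tools", "spark and hadoop", "spark clusters"])
counts = (lines.flatMap(lambda line: line.split())
               .map(lambda word: (word, 1))
               .reduceByKey(lambda a, b: a + b))

print(counts.collect())   # e.g. [('spark', 2), ('big', 1), ...]
spark.stop()
```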

Session 24: Design Patterns

Design Pattern → A reusable solution to a commonly occurring problem within a given context in software design

  • A template/blueprint that can be applied to solve a particular design problem in a way that is effective and efficient

Creational Patterns → Deal with the process of object creation

Structural Patterns → Focus on the composition of classes or objects

Behavioral Patterns → Define the ways in which objects interact and communicate with each other
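
Minimal Python sketches of one illustrative pattern per category (Singleton, Adapter, Observer); the class names are hypothetical:

```python
# Creational: Singleton -- a class with exactly one shared instance.
class Config:
    _instance = None
    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

# Structural: Adapter -- wrap an incompatible interface behind the expected one.
class LegacyPrinter:
    def print_text(self, text): print(text)

class PrinterAdapter:
    def __init__(self, legacy): self._legacy = legacy
    def write(self, text): self._legacy.print_text(text)   # expected interface

# Behavioral: Observer -- subscribers are notified when the subject changes.
class Subject:
    def __init__(self): self._observers = []
    def subscribe(self, fn): self._observers.append(fn)
    def notify(self, event):
        for fn in self._observers: fn(event)

assert Config() is Config()                 # same instance every time
PrinterAdapter(LegacyPrinter()).write("hi") # adapted call
s = Subject(); s.subscribe(print); s.notify("changed")
```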

Session 25: P Problems & NP Problems

Algorithmic Complexity → Refers to the efficiency of algorithms in terms of time and space requirements

  • Assesses how the performance of an algorithm scales with an increase in input size

Efficient Algorithms → Crucial for solving real world problems within reasonable time frames and resource constraints

Complex Problems → Often require exponential time to solve, making them impractical for large datasets

  • Balancing accuracy and efficiency becomes a significant challenge

Polynomial (P) Problems → Problems that are solvable in polynomial time

  • They have a predictable execution time related to the input size
  • Ex) Quick sort
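
A short quick sort sketch for reference (average-case O(n log n)):

```python
# Quick sort: partition around a pivot, recurse on the two sides.
def quick_sort(items):
    if len(items) <= 1:
        return items
    pivot = items[len(items) // 2]
    left = [x for x in items if x < pivot]
    middle = [x for x in items if x == pivot]
    right = [x for x in items if x > pivot]
    return quick_sort(left) + middle + quick_sort(right)

print(quick_sort([5, 2, 9, 1, 5, 6]))  # [1, 2, 5, 5, 6, 9]
```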

Non Deterministic Polynomial (NP) Problems → Problems whose candidate solutions can be verified in polynomial time

  • While a solution may not be found quickly, a proposed solution’s validity can be checked in polynomial time
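
A sketch of the verification idea, using subset sum as an illustrative example: finding a subset that sums to the target may be hard, but checking a proposed subset (the certificate) is fast:

```python
# Verify a proposed subset-sum solution in polynomial time.
def verify_subset_sum(numbers, target, certificate):
    """certificate is a claimed subset of `numbers` summing to `target`."""
    remaining = list(numbers)
    for x in certificate:
        if x not in remaining:         # must actually come from the input
            return False
        remaining.remove(x)
    return sum(certificate) == target  # quick check, no search required

print(verify_subset_sum([3, 34, 4, 12, 5, 2], 9, [4, 5]))  # True
```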

Decision Problems → Involve determining a binary outcome based on the input

  • Ex) Is there a path between two nodes?

Optimization Problems → Seek the best solution from a set of feasible solutions

  • Ex) Finding the shortest path between two nodes

Reducing Execution Time → Parallel algorithms can significantly reduce overall computation time for certain problems

Complex Problem Considerations:

  1. Scalability → Algorithm must ensure efficiency as the problem size and available resources increase
  2. Synchronization → Algorithm must be able to coordinate parallel tasks without causing bottlenecks

P-Complete Class → A class of problems in P that are as hard as the hardest problems in P (every problem in P reduces to them)

  • An efficient parallel (NC) algorithm for one P-Complete problem would imply one for all problems in P

NP-Complete Class → A class of problems in NP to which every problem in NP can be reduced in polynomial time

  • If a polynomial time algorithm exists for one NP complete problem, it implies a polynomial time algorithm for all problems in NP

P-Complete Significance → If any P-Complete problem has an efficient parallel (NC) solution, it implies P equals NC

NP-Complete Significance → If any NP-Complete problem has a polynomial time solution, P equals NP

Nick’s Class (NC) → Represents problems efficiently solvable in parallel, emphasizing low depth circuits

P-Completeness in Parallel Computing → A class of problems addressing the complexity of parallel algorithms

Parallelism in P → Many problems in P can be efficiently solved in parallel, though P-Complete problems are believed to be inherently sequential

Parallelism in NP → More complex

  • Challenges:
  1. NP problems are more complex as they lack known polynomial time solutions
  2. Efficient parallel algorithms involve communication between processors
  • Minimizing communication overhead while maintaining efficiency is difficult
  3. Synchronizing parallel tasks so that they do not interfere with each other is difficult
  • Approaches:
  1. Divide and Conquer → Parallel computing divides complex problems into smaller, manageable tasks, processed concurrently (see the sketch after this list)
  2. Heuristic Approaches → Develop parallel algorithms that use heuristic methods to find approximate solutions quickly
  • Sacrifice optimality for speed
  3. Parallelization of Special Cases → Tailoring algorithms to exploit the structure of specific instances can enhance parallel performance
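
A minimal divide-and-conquer sketch using Python's multiprocessing module: the input is split into chunks, the partial sums are computed concurrently, and the results are combined:

```python
from multiprocessing import Pool

# Split the input into chunks, sum them in parallel worker processes, combine.
def chunk_sum(chunk):
    return sum(chunk)

if __name__ == "__main__":
    data = list(range(1_000_000))
    n_chunks = 4
    size = len(data) // n_chunks
    chunks = [data[i * size:(i + 1) * size] for i in range(n_chunks)]

    with Pool(processes=n_chunks) as pool:
        partials = pool.map(chunk_sum, chunks)   # chunks processed concurrently

    print(sum(partials))   # same result as sum(data), computed in parallel
```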

Quantum Computing → A type of computing that uses quantum mechanical phenomena (superposition and entanglement) to perform operations on data

  • Superposition → Means that a quantum bit can exist in a combination of two states at the same time
  • Entanglement → Means that two or more bits can share a quantum state and influence each other even when they are far apart
  • Superposition and entanglement allow quantum computers to perform parallel computations to solve complex problems faster

Quantum Gates → Basic building blocks of quantum circuits that perform operations on qubits (quantum bits) to manipulate their state

Interference → Occurs when two quantum states interfere such that their amplitudes add or subtract depending on their relative phases; it is used to cancel out unwanted outcomes and amplify desired outcomes
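
A small numerical sketch of superposition and interference on a single qubit (assumes numpy is installed): one Hadamard gate creates an equal superposition, and applying it again makes the amplitudes interfere so the |1⟩ component cancels:

```python
import numpy as np

ket0 = np.array([1, 0], dtype=complex)                       # |0>
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)  # Hadamard gate

superposed = H @ ket0          # equal superposition of |0> and |1>
print(np.abs(superposed) ** 2) # [0.5 0.5] -- measuring gives either outcome

back = H @ superposed          # applying H again: amplitudes interfere
print(np.round(np.abs(back) ** 2, 10))  # [1. 0.] -- the |1> amplitude cancels out
```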

Session 26: YARN

MapReduce → A programming model or pattern within the Hadoop framework that is used to access big data in the Hadoop Distributed File System (HDFS)

  • Facilitates concurrent processing by splitting the data, processing the pieces in parallel on Hadoop commodity servers, and aggregating the results
  • Processing is executed on the server where the data resides to expedite processing
  • Data access and storage is disk based
  • Process:
  1. Map → Split data into smaller blocks and assign them to mappers for processing
  2. Combine → (Optional) A local reducer that runs individually on each mapper server to pre-aggregate map output
  3. Partition → Translates the key value pairs from the mappers into another set of key value pairs to feed into the reducer
  • Decides which reducer receives each key and how the data is presented to it
  4. Reduce → Map output values that share the same key are sent to a single reducer, which aggregates them
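
A single-machine Python simulation of the phases above (no Hadoop required; the input lines are hypothetical):

```python
from collections import defaultdict

# map -> combine -> partition -> reduce, simulated on one machine.
lines = ["big data big clusters", "data nodes and data blocks"]
NUM_REDUCERS = 2

def map_phase(line):                       # 1. Map: emit (word, 1) pairs
    return [(word, 1) for word in line.split()]

def combine(pairs):                        # 2. Combine: local pre-aggregation
    local = defaultdict(int)
    for word, count in pairs:
        local[word] += count
    return list(local.items())

def partition(word):                       # 3. Partition: pick a reducer per key
    return hash(word) % NUM_REDUCERS

reducer_inputs = [defaultdict(list) for _ in range(NUM_REDUCERS)]
for line in lines:                         # "mappers" run once per input split
    for word, count in combine(map_phase(line)):
        reducer_inputs[partition(word)][word].append(count)

result = {}                                # 4. Reduce: sum the counts per word
for shard in reducer_inputs:
    for word, counts in shard.items():
        result[word] = sum(counts)

print(result)   # {'big': 2, 'data': 3, ...}
```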

YARN → Goal is to split up the functionalities of resource management and job scheduling/monitoring into separate daemons

  • It implements a global Resource Manager (RM) and a per-application Application Master (AM)

YARN Components:

  1. Container → Holds physical resources on a single node, such as disk, CPU cores, or RAM
  • Container Launch Context (CLC) → Contains records of dependencies, security tokens, and environment variables
  2. Application Master → Requests containers from the Resource Manager and presents the CLC to the Node Manager to launch them
  3. Node Manager → Takes care of individual nodes in the Hadoop cluster and manages the containers on each node
  • It registers with the Resource Manager and sends each node’s health status (stating whether the node has finished working with the resource)
  4. Resource Manager → The master daemon of YARN; assigns resources
  5. Scheduler → Responsible for allocating resources to the various applications, subject to familiar constraints such as capacities and queues
  6. Applications Manager → Responsible for accepting job submissions, negotiating the first container for executing the application-specific ApplicationMaster, and providing the service for restarting the ApplicationMaster container on failure

YARN Features:

  1. Multi-tenancy → Allows multiple data processing engines to run on the same cluster
  2. Cluster performance → Optimized utilization of clusters
  3. Compatibility → Compatible with applications built for the first version of Hadoop
  4. Scalability → The scheduler supports clusters with thousands of nodes