A. Parallel Processing System
Parallel processing
refers to one or more independent operating systems managing multiple processors and performing multiple tasks.
Parallel processing
is very fast and can share the memory unit.
Flynn’s taxonomy and classification by memory structure are leading
examples of parallel processing system classification.
2. Flynn’s classification of parallel processing systems
2.1 Single instruction stream - single data stream,
Single Instruction Stream Single Data Stream (SISD)
is a single processor system that sequentially processes an instruction and data, one at a time.
2.1 Single instruction stream - single data stream, Single Instruction Stream Single Data Stream (SISD)
It is the conventional computer architecture that follows von Neumann’s concept.
2.2 Single instruction stream - multiple data stream,
Single Instruction Stream Multiple Data Stream (SIMD)
is a structure in which a single instruction simultaneously performs the same operation on multiple data items.
2.2 Single instruction stream - multiple data stream,
Single Instruction Stream Multiple Data Stream (SIMD)
It is also called an array processor, as it enables synchronous parallel processing.
2.3 Multiple instruction streams - single data stream, Multiple Instruction Stream Single Data Stream (MISD)
Each processing unit in the MISD parallel computing architecture runs different instructions and processes the same data. The pipeline architecture is an example. It is not a widely used architecture.
2.4 Multiple instruction streams - multiple data stream,
Multiple Instruction Stream Multiple Data Stream (MIMD)
In a MIMD structure, multiple processors process different programs and different data, and most parallel computers fall into this category. It can be classified into a shared memory model and a distributed memory model, depending on how it uses the memory.
3.1 Symmetric multiprocessor (SMP)
is a tightly coupled system in which all processors use the main memory as shared memory. It is easy to program because data can be transferred through the shared memory.
3.2 Massive parallel processor (MPP)
is a distributed memory type in which each processor has an independent memory. The loosely coupled system exchanges data between processors through a network, such as Ethernet.
3.3 Non uniform memory access (NUMA)
is a structure that combines the advantages of SMP, a shared memory structure that makes it easier to develop programs, and MPP, which offers excellent scalability.
4. Types of parallel processor technology
4.1 Instruction pipelining
The technology improves CPU performance by dividing an operation into several stages and configuring a separate hardware unit for each stage, so that different instructions can be processed simultaneously.
The stages of the four-stage instruction pipeline are
instruction fetching (IF), instruction decoding (ID), operand fetching (OF), and execution (EX).
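The benefit of overlapping the four stages can be illustrated with a small timing sketch. It assumes an ideal pipeline in which every stage takes one clock cycle and no hazards occur; the function names are illustrative only.

```python
STAGES = ["IF", "ID", "OF", "EX"]  # the four stages of the instruction pipeline


def sequential_cycles(n_instructions, n_stages=len(STAGES)):
    """Cycles needed when each instruction finishes every stage before the next one starts."""
    return n_instructions * n_stages


def pipelined_cycles(n_instructions, n_stages=len(STAGES)):
    """Cycles needed in an ideal pipeline: fill it once, then one instruction completes per cycle."""
    if n_instructions == 0:
        return 0
    return n_stages + n_instructions - 1


def pipeline_schedule(n_instructions):
    """Return {cycle: [(instruction index, stage name), ...]} for an ideal pipeline."""
    schedule = {}
    for i in range(n_instructions):
        for s, stage in enumerate(STAGES):
            # Instruction i enters stage s in cycle i + s + 1 (cycles are 1-based).
            schedule.setdefault(i + s + 1, []).append((i, stage))
    return schedule
```

For example, under these assumptions five instructions need 20 cycles when processed one at a time and 8 cycles when pipelined.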
Pipeline hazard
refers to a situation in which the pipeline slows down abnormally. Pipeline hazards include the data hazard, the control hazard, and the structural hazard.
Data hazards
occur when the next instruction execution has to be delayed until the previous instruction has been completed because of the dependency between instruction operands.
Control hazards
are generated by branch instructions, such as branch and jump, which change the execution order of the instructions.
Structural hazards
are generated when instructions cannot be processed in parallel in the same clock cycle, due to hardware limitations.
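As an illustration of a data hazard, the sketch below flags read-after-write dependencies between neighboring instructions. The `Instr` format (one destination register, a tuple of source registers) and the `window` parameter are hypothetical simplifications, not a model of any particular processor.

```python
from collections import namedtuple

# Hypothetical instruction format: operation name, destination register, source registers.
Instr = namedtuple("Instr", ["op", "dest", "srcs"])


def raw_hazards(program, window=1):
    """Return (producer index, consumer index, register) triples where an instruction
    reads a register written by one of the `window` instructions just before it."""
    hazards = []
    for j, consumer in enumerate(program):
        for i in range(max(0, j - window), j):
            producer = program[i]
            if producer.dest is not None and producer.dest in consumer.srcs:
                hazards.append((i, j, producer.dest))
    return hazards


program = [
    Instr("ADD", "r1", ("r2", "r3")),
    Instr("SUB", "r4", ("r1", "r5")),  # reads r1 right after ADD writes it
    Instr("MUL", "r6", ("r7", "r8")),  # independent of the two instructions above
]
```

Here `raw_hazards(program)` reports the dependency of `SUB` on `ADD` through `r1`; in a real pipeline such a pair would need a stall or operand forwarding.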
5. Parallel programming technology
5.1 Compiler technology - OpenMP
is a compiler directive-based parallel programming API.
The execution model of OpenMP
is the fork/join model.
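The fork/join control flow can be sketched in Python with a thread pool. This is only an analogy and not OpenMP itself: OpenMP expresses the same pattern with compiler directives in C, C++, or Fortran, and Python threads do not speed up CPU-bound work in the standard interpreter. The function names are illustrative.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(chunk):
    """Work done by one worker thread: sum of squares of its chunk."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(values, n_workers=4):
    # The master thread splits the input into one chunk per worker.
    chunks = [values[i::n_workers] for i in range(n_workers)]
    # Fork: worker threads start processing the chunks.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_sum, chunks))
    # Join: all workers have finished here, and the master thread continues alone.
    return sum(partials)
```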
5.2 Message passing parallel programming model, MPI
is a parallel programming model suitable for a distributed memory system structure.
Parallel programming
tools for message passing include High Performance FORTRAN (HPF), Parallel Virtual Machine (PVM), and Message Passing Interface (MPI). MPI has become the standard.
5.3 Load balancing technologies - AMP, SMP, and BMP
adequately distribute jobs to the cores in order to increase multicore performance.
AMP, SMP, BMP model
AMP: An OS is executed independently in each processor core.
SMP: An OS manages all processor cores simultaneously. Application programs can operate in any core.
BMP: An OS manages all processor cores simultaneously, and an application program can run on a specific core.
6. Graphic processing technology
6.1 Graphics processing unit (GPU)
The hardware specializes in computer graphics calculation and is mainly used for the rendering of 3D graphics.
GPU
is dedicated to processing large-capacity image data and generates results through parallel jobs using multiple cores.
6.2 General-purpose GPU (GPGPU)
Because a GPU shows high computational performance in the matrix and vector operations that are mostly used for graphic rendering, GPGPU intends to utilize GPUs in the general computing domain as well.
Many models supporting GPGPU programming have appeared.
CUDA
CUDA is a parallel computing platform and a programming model that can significantly improve computing speed with a large number of GPU cores.
CUDA
It provides intuitive GPU programming, based on the C language, and it enables quick operation using shared memory.
CUDA
is expected to show an excellent performance improvement when applied to performing tasks suitable for parallel processing operations in various fields that require a large amount of computation, such as simulation.
7. GPU-based parallel programming technology
Open Computing Language (OpenCL)
maintained and managed by Khronos Group, is an open, general-purpose parallel computing framework developed by Apple, AMD, IBM, Intel, and NVIDIA.
Open Computing Language (OpenCL)
It is an industry standard programming model for heterogeneous computer systems, consisting of GPUs, CPUs, and other processors.
C++ Accelerated Massive Parallelism (C++ AMP)
was developed by Microsoft as an open programming model for heterogeneous computing using the CPU and GPU. C++ AMP, which was added in Visual Studio 2012, can increase the execution speed of C++ code using the GPU.
C++ Accelerated Massive Parallelism (C++ AMP)
intends to help developers create general-purpose programs using the GPU without a deep understanding of, or experience with, the DirectX API.
OpenACC
NVIDIA introduced OpenACC, a programming model based on compiler directives that abstract CUDA. OpenACC is a programming model for higher productivity, since it provides a relatively simple programming environment for developers.
Direct attached storage (DAS)
The storage connects a computer system with disks directly through a fiber channel or SCSI cable in order to utilize the storage capacity. It allows the computer system to manage the file system directly.
Network attached storage (NAS)
The storage has a separate file system management server (controller) to manage the storage media such as HDD and SSD.
Storage area network (SAN)
was developed to overcome the disadvantages of DAS. It uses a dedicated fiber channel switch for fast connection, and it makes it possible to scale up the number of connected servers and storage devices with less impact on the load of the connected network.
3. IP-SAN
This type of SAN uses the gigabit Ethernet Internet protocol (IP), instead of a fiber channel.
Fiber Channel over IP (FCIP)
is used to connect a remote SAN. It encapsulates data to TCP/IP for interconnection when transferring a frame to a remote location.
Internet fiber channel protocol (iFCP)
provides a TCP/IP connection dedicated to regional SAN, using the iFCP gateway.
Internet SCSI (iSCSI)
encapsulates SCSI commands into IP packets and transfers the I/O block data through TCP/IP. Technologies like IPSec ensure reliability.
4. Storage capacity management
Thin provisioning
The existing fixed-allocation storage technology uses a thick logical unit number (LUN), which wastes data storage space. Thin provisioning instead allocates physical capacity only as data is actually written.
Data de-duplication
improves the efficiency of disk space usage by removing duplicated data when the data is saved.
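One common way to remove duplicates, assumed here for illustration, is to split data into fixed-size chunks and store each distinct chunk only once, identified by a hash. The `DedupStore` class and its chunk size are hypothetical; real products differ in chunking strategy and metadata handling.

```python
import hashlib


class DedupStore:
    """Toy block store: identical chunks are kept once and shared by reference."""

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size
        self.chunks = {}  # digest -> chunk bytes (each distinct chunk stored once)
        self.files = {}   # file name -> ordered list of chunk digests

    def save(self, name, data):
        digests = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # a chunk already stored is not stored again
            digests.append(digest)
        self.files[name] = digests

    def load(self, name):
        """Rebuild the original data from the chunk references."""
        return b"".join(self.chunks[d] for d in self.files[name])

    def stored_bytes(self):
        return sum(len(c) for c in self.chunks.values())
```

Saving `b"AAAABBBBAAAA"` and `b"BBBBCCCC"` with a chunk size of 4 keeps only the three distinct chunks, so 12 bytes are stored instead of 20.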
5. Storage disk scheduling
Disk scheduling
A disk drive that stores data is a device that uses a rotating magnetic disk.
Disk scheduling
is a technique of efficiently processing I/O requests, when multiple users request them, in order to process different tasks.
Using disk scheduling has the following purposes:
-Maximization of the number of I/O requests serviced per unit time
-Maximization of throughput per unit time
-Minimization of the mean response time
-Minimization of response time
-Minimization of the variation of response time
Disk performance measurement indicator
Disk scheduling techniques can be compared using the indicators that measure disk performance.
Disk performance measurement indicators
include the access time, seeking time, rotational delay (rotational latency), and data transfer time.
seeking time
indicates how long it takes to move the head from the current head position, to the track containing the data.
rotational latency
indicates how long it takes, after the head has moved to the track containing the data, for the disk to rotate until the sector that contains the data reaches the head.
data transfer time
indicates how long it takes to transfer the read data to the main memory. This section describes techniques to minimize the access time by minimizing the seeking time and the rotational latency.
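A common textbook model, assumed here, treats the access time as the sum of the seeking time, the rotational latency, and the data transfer time, with the average rotational latency taken as half of one revolution. The function and parameter names are illustrative.

```python
def average_rotational_latency_ms(rpm):
    """Average rotational latency in milliseconds: half of one full revolution."""
    return 0.5 * 60_000.0 / rpm


def transfer_time_ms(n_bytes, bytes_per_second):
    """Time in milliseconds to transfer n_bytes at a sustained rate."""
    return 1000.0 * n_bytes / bytes_per_second


def access_time_ms(seek_ms, rpm, n_bytes, bytes_per_second):
    """Assumed model: access time = seeking time + rotational latency + transfer time."""
    return (seek_ms
            + average_rotational_latency_ms(rpm)
            + transfer_time_ms(n_bytes, bytes_per_second))
```

With a 4 ms seek, a 6000 rpm disk, and 1 MB transferred at 100 MB/s, the model gives 4 + 5 + 10 = 19 ms, which shows why reducing seeking time and rotational latency matters for small requests.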
First come first serve (FCFS) disk scheduling
services the requests in the order they are received. The head position moves in the order of the requested tracks in the disk standby queue.
Shortest seeking time first (SSTF) disk scheduling
The scheduling technique first services the request that is closest to the current head position, among the requested services waiting in the queue.
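The first come first serve and shortest seeking time first techniques can be compared with a short sketch that returns the service order and the total head movement in tracks. The request queue in the usage note is a widely used textbook example and is not taken from this material.

```python
def total_movement(head, order):
    """Total number of tracks the head travels when servicing `order` from `head`."""
    total, pos = 0, head
    for track in order:
        total += abs(track - pos)
        pos = track
    return total


def fcfs(head, requests):
    """First come first serve: service requests in arrival order."""
    order = list(requests)
    return order, total_movement(head, order)


def sstf(head, requests):
    """Shortest seeking time first: always service the pending request nearest to the head."""
    pending = list(requests)
    order, pos = [], head
    while pending:
        nearest = min(pending, key=lambda track: abs(track - pos))
        pending.remove(nearest)
        order.append(nearest)
        pos = nearest
    return order, total_movement(head, order)
```

With the head at track 53 and the queue 98, 183, 37, 122, 14, 124, 65, 67, the first technique moves the head 640 tracks and the second 236 tracks.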
SCAN disk scheduling
The scheduling technique first services the request that has the shortest seeking distance in the current direction of head movement. The head reverses direction only after reaching the outermost or innermost cylinder.
LOOK disk scheduling
The technique is the same as SCAN disk scheduling, except that the head changes its direction as soon as no requests remain in the current direction, without reaching the outermost or innermost cylinder.
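The difference between the SCAN and LOOK techniques shows up in the head movement. The sketch below is a simplification with two stated assumptions: track numbers run from `min_track` to `max_track`, and under SCAN the head travels to the edge only when requests remain on the other side. All names are illustrative.

```python
def movement(head, path):
    """Total number of tracks travelled when the head visits `path` in order."""
    total, pos = 0, head
    for track in path:
        total += abs(track - pos)
        pos = track
    return total


def split(head, requests, direction):
    """Requests ahead of the head (nearest first) and behind it (nearest to the turning point first)."""
    if direction == "down":
        ahead = sorted((t for t in requests if t <= head), reverse=True)
        behind = sorted(t for t in requests if t > head)
    else:
        ahead = sorted(t for t in requests if t >= head)
        behind = sorted((t for t in requests if t < head), reverse=True)
    return ahead, behind


def scan(head, requests, min_track, max_track, direction="down"):
    """SCAN: sweep to the edge of the disk, then reverse and service the rest."""
    ahead, behind = split(head, requests, direction)
    edge = min_track if direction == "down" else max_track
    path = ahead + ([edge] if behind else []) + behind
    return ahead + behind, movement(head, path)


def look(head, requests, direction="down"):
    """LOOK: reverse at the last request in the current direction instead of the edge."""
    ahead, behind = split(head, requests, direction)
    order = ahead + behind
    return order, movement(head, order)
```

For the head at track 53 moving toward track 0 on a disk with tracks 0 to 199 and the queue 98, 183, 37, 122, 14, 124, 65, 67, this sketch gives 236 tracks for SCAN and 208 tracks for LOOK.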
Circular SCAN (C-SCAN) disk scheduling
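A variation of the SCAN technique that services requests in one direction only. After reaching the last track, the head returns to the opposite end and continues, as if the inner and outer tracks were connected in a circle.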
Circular LOOK (C-LOOK) disk scheduling
It is a LOOK scheduling technique that moves the head as if the inner and outer tracks were connected in a circle: after the last request in one direction, the head jumps to the farthest request at the other end.
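The circular variants can be sketched in the same way. This sketch assumes that the head services requests while moving toward higher track numbers and that the return sweep counts as head movement; some textbooks exclude the return sweep from the total. All names are illustrative.

```python
def movement(head, path):
    """Total number of tracks travelled when the head visits `path` in order."""
    total, pos = 0, head
    for track in path:
        total += abs(track - pos)
        pos = track
    return total


def c_scan(head, requests, min_track, max_track):
    """C-SCAN: service upward to the top edge, return to the bottom edge, continue upward."""
    ahead = sorted(t for t in requests if t >= head)
    wrapped = sorted(t for t in requests if t < head)
    # The head visits both edges only when requests remain below the start position.
    path = ahead + ([max_track, min_track] if wrapped else []) + wrapped
    return ahead + wrapped, movement(head, path)


def c_look(head, requests):
    """C-LOOK: like C-SCAN, but jump from the highest request straight to the lowest one."""
    ahead = sorted(t for t in requests if t >= head)
    wrapped = sorted(t for t in requests if t < head)
    order = ahead + wrapped
    return order, movement(head, order)
```

For the head at track 53 on a disk with tracks 0 to 199 and the queue 98, 183, 37, 122, 14, 124, 65, 67, the sketch gives 382 tracks for C-SCAN and 322 tracks for C-LOOK.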
C. High Availability Storage
1. Redundant array of independent disks (RAID) technology
Large-capacity storage systems
generally have an error controller and backup function to safely store the massive volume of data.
RAID
is a storage technology that minimizes the factors that can cause failure and improves access performance by arranging a number of disks and linking them with each other to create a separate disk unit.
The main features of RAID
are improved availability, increased capacity, and increased speed.
RAID-0 (Striped disk array without fault tolerance)
consists of two or more drives and uses disk striping, which stores data by dividing it into pieces of a specific size and saves it on multiple disks at once.
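Disk striping can be sketched as a round-robin distribution of fixed-size pieces. The in-memory byte strings stand in for drives, and the function names and stripe size are illustrative assumptions.

```python
def stripe(data, n_disks, stripe_size):
    """Distribute `data` round-robin across `n_disks` in pieces of `stripe_size` bytes."""
    disks = [bytearray() for _ in range(n_disks)]
    for k, start in enumerate(range(0, len(data), stripe_size)):
        disks[k % n_disks] += data[start:start + stripe_size]
    return [bytes(d) for d in disks]


def unstripe(disks, stripe_size, total_len):
    """Reassemble the original data by reading the pieces back in round-robin order."""
    out = bytearray()
    offsets = [0] * len(disks)  # read position on each disk
    k = 0
    while len(out) < total_len:
        d = k % len(disks)
        out += disks[d][offsets[d]:offsets[d] + stripe_size]
        offsets[d] += stripe_size
        k += 1
    return bytes(out)
```

Striping `b"ABCDEFGHIJ"` over two disks with a stripe size of 2 places `b"ABEFIJ"` on the first disk and `b"CDGH"` on the second. Because no redundancy is stored, losing either disk loses the data.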
RAID-1 (Mirroring and Duplexing)
uses a mirroring technique that redundantly stores data on two drives. Since data is stored in redundancy, data can be restored, even if a drive fails.
RAID-4
has a separate parity drive and collects and stores parities for data verification and recovery.
RAID-5
improves on RAID-4 by distributing the parities across the drives, which spreads the load that RAID-4 places on the dedicated parity drive.
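The parity used by RAID-4 and RAID-5 is typically a byte-wise XOR of the data blocks in a stripe, which allows any single lost block to be rebuilt from the remaining blocks. The sketch below shows only this arithmetic; block placement across drives is omitted, and the function names are illustrative.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of a list of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(surviving_blocks, parity):
    """Recover the single missing data block from the surviving blocks and the parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_blocks(data)  # stored on the parity drive, or rotated across drives in RAID-5
```

If the drive holding `data[1]` fails, `rebuild([data[0], data[2]], parity)` returns the lost block.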
RAID-6 (Stripe set with dual distributed parity)
is similar to RAID-5, except that while RAID-5 stores one parity, RAID-6 stores two parities on different drives. The configuration is more fault-tolerant than RAID-5 and can keep data safe even if two drives fail.
RAID-10 (Striping & Mirroring)
requires at least four drives and is a combination of RAID-0 and RAID-1 to improve I/O speed while providing data stability.
Linear tape-open (LTO)
is a standard open tape drive technology that supports high-speed data processing and a large capacity.
Virtual tape library (VTL)
is a backup solution that presents disk storage as a virtual tape device to compensate for problems such as limited performance, scalability, and recovery time.
D. Graphic Compression Technology
Graphic compression type
1. Graphic compression type
Video data compression, which accounts for most of the traffic in a multimedia network, can be divided into lossless compression (reversible compression) and lossy compression (irreversible compression).
Lossless compression
is also called reversible compression.
Graphic compression type
refers to a method of restoring a compressed image without information loss from the original data while decompressing.
Lossy compression
is also called irreversible compression.
Graphic compression type
refers to a compression method in which the restored data does not match the original data before compression, because some data is lost.
Lossless compression
Since the compression and decompression algorithms are exact inverses of each other, the compression method preserves the original data's integrity, and no part of the data is lost during processing.
Lossy compression
compromises some accuracy to increase the compression rate, by allowing the loss of redundant or unnecessary data. There are two types of lossy compression methods: prediction coding and transform coding.
The prediction coding method
is used for digitizing analog signals. Instead of separately quantizing the PCM (Pulse Code Modulation) samples, it quantizes the difference between each sample and its predicted value.
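The idea of quantizing differences can be sketched as a minimal differential PCM coder. It assumes the simplest predictor, namely the previously reconstructed sample, and a uniform quantizer with step size `step`; both are illustrative choices.

```python
def dpcm_encode(samples, step=1):
    """Quantize the difference between each sample and the previous reconstructed value."""
    codes, predicted = [], 0
    for s in samples:
        code = round((s - predicted) / step)  # quantized difference
        codes.append(code)
        predicted += code * step              # track what the decoder will reconstruct
    return codes


def dpcm_decode(codes, step=1):
    """Rebuild the samples by accumulating the quantized differences."""
    out, predicted = [], 0
    for code in codes:
        predicted += code * step
        out.append(predicted)
    return out
```

For slowly changing signals the differences are small numbers, which need fewer bits than the samples themselves. With `step=1` and integer samples the round trip is exact; a larger step loses some accuracy, which is what makes the method lossy.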
The transform coding method
transforms a signal from one domain (mainly a time and space domain) to another domain (mainly a frequency domain), then compresses it.
Multimedia data
includes text, image, video, and audio data. The text has the form of plain text and non-linear hypertext.
Multimedia data, Unicode
The basic language for expressing symbols is Unicode, and it uses a lossless compression method.
Multimedia data
An image is also called a still image and refers to a photo, fax page, or video frame.
Multimedia data
In the transformation process, the JPEG uses DCT (Discrete Cosine Transform) in the first stage of compression, and the decompression uses the inverse DCT method.
Multimedia data
The transformation and inverse transformation are applied to 8 x 8 blocks.
The quantization process creates integers from the real number of the DCT transform output and converts some values to zero.
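The quantization step can be sketched as dividing each DCT coefficient by an entry of a quantization table and rounding to the nearest integer. The 2 x 2 values in the usage note are made up for brevity; JPEG works on 8 x 8 blocks with tables chosen by the encoder.

```python
def quantize(coefficients, table):
    """Divide each coefficient by its table entry and round to an integer level."""
    return [[round(c / q) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coefficients, table)]


def dequantize(levels, table):
    """Approximate the coefficients again; the rounding error is not recoverable."""
    return [[level * q for level, q in zip(l_row, q_row)]
            for l_row, q_row in zip(levels, table)]
```

For the coefficients `[[-415.4, 30.2], [4.6, -1.9]]` and the table `[[16, 11], [12, 14]]`, quantization yields `[[-26, 3], [0, 0]]`: the small coefficients become zero, and dequantization returns `[[-416, 33], [0, 0]]` rather than the original values.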
Multimedia data
The coding process arranges data in a zigzag order after quantization and before encoder input; then lossless compression is performed using run-length encoding and arithmetic coding.
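The zigzag ordering and run-length step can be sketched as follows. This is a simplification: a real JPEG encoder handles the first (DC) coefficient separately and passes the run-length symbols to an entropy coder. The `"EOB"` marker and the function names are illustrative.

```python
def zigzag_indices(n=8):
    """(row, col) pairs of an n x n block in zigzag order."""
    order = []
    for s in range(2 * n - 1):  # s = row + col identifies one anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        diagonal = [(r, s - r) for r in rows]
        # Even diagonals are walked from bottom-left to top-right, odd ones the other way.
        order.extend(reversed(diagonal) if s % 2 == 0 else diagonal)
    return order


def zigzag_scan(block):
    """Flatten a square block into a list in zigzag order."""
    return [block[r][c] for r, c in zigzag_indices(len(block))]


def run_length_encode(values):
    """Encode as (number of preceding zeros, value) pairs; trailing zeros become one "EOB"."""
    pairs, run = [], 0
    for v in values:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    if run:
        pairs.append("EOB")
    return pairs
```

Because quantization sets most high-frequency coefficients to zero and the zigzag order visits them last, the scan usually ends in a long run of zeros that collapses into a single end-of-block symbol.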
Video compression standard
The Moving Picture Experts Group (MPEG) is an international standardization organization. The official name of the group is ISO/IEC JTC1/SC29/WG11.
MPEG
created the following compression formats and additional standards.