OC

Lecture 21 - HPC

  • what is a computer: an electronic device that stores and processes digital info and follows programmed logic and instruction set

    • what are the 5 components of a computer system: CPU, memory, control unit, input units, and output units

    • What is the CPU (central processing unit): circuitry that carries out the instructions of a computer program by performing arithmetic, logic, control, and I/O operations

      • What is the speed of the CPU controlled by: the system clock

        • what does the system clock do: generates electronic pulses at regular intervals to coordinate CPU activities, with the interval long enough that even the slowest operation can finish

      • how is the performance of a CPU measured: in clock speed (in GHz) and FLOPS (floating point operations per second)

        • what does FLOPS tell us: how many floating-point operations the CPU can complete per second, i.e. its computational throughput

        • what does clock speed tell us: how many clock cycles the CPU runs per second (each instruction takes one or more cycles)

          • note: FLOPS have increased with time, but clock speed has saturated and leveled out over time

        • What is the thermal brick wall: clock rate has reached an upper limit because raising it further would need more cooling power than is practical

          • why do we need more cooling power: higher CPU speed → higher clock rate → higher electric current → more heat → lower signal-to-noise ratio

          • what is the current thermal brick wall at: hard to get > 4.0 GHz
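
A rough sketch of how clock speed and FLOPS relate: theoretical peak FLOPS is commonly estimated as cores × clock rate × floating-point operations per cycle, which is one way FLOPS can keep growing while clock speed stays flat. The core count and clock rate below are the Midway3 compute-node figures from these notes; `flops_per_cycle = 16` is an assumed value for illustration only, not a Midway3 specification.

```python
def peak_flops(cores, clock_hz, flops_per_cycle):
    """Theoretical peak floating-point operations per second."""
    return cores * clock_hz * flops_per_cycle


cores = 24             # Midway3 compute node core count (from these notes)
clock_hz = 3.0e9       # 3 GHz base frequency (from these notes)
flops_per_cycle = 16   # ASSUMPTION: illustrative value, depends on the CPU's vector units

node_peak = peak_flops(cores, clock_hz, flops_per_cycle)
print(f"{node_peak / 1e12:.3f} TFLOPS")  # prints 1.152 TFLOPS
```

Real codes usually reach only a fraction of this peak, since the estimate ignores memory access and communication time.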

    • What are memory modules: any physical device capable of storing information for immediate use

      • note: memory is not directly controlled by the CPU

      • note: it is volatile, so it needs continuous power to keep its contents

  • What is parallel computing: computation where many calculations are carried out simultaneously by breaking a big problem into smaller ones and solving them concurrently

    • what is computational gain: serial time / parallel time

    • what is parallel efficiency: computational gain / number of processors

    • what is serial computing: a single processor running the computer program

    • what is shared memory parallelism (OpenMP): multiple processors or threads work on different parts of the program but share the same memory, so they sometimes compete for resources, which slows the process down

    • what is distributed parallelism (Message Passing Interface): multiple processors work separately, each with its own memory, so they do not have to contend for resources

      • what's the issue with distributed parallelism: communication between processes is much more difficult because data must be exchanged through explicit messages
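
The computational gain and parallel efficiency definitions above can be sketched as two small functions. The timings and processor count are made-up numbers for illustration, not measurements from any real run.

```python
def computational_gain(serial_time, parallel_time):
    """Speedup: how many times faster the parallel run is than the serial run."""
    return serial_time / parallel_time


def parallel_efficiency(serial_time, parallel_time, n_processors):
    """Computational gain divided by the number of processors (1.0 is ideal)."""
    return computational_gain(serial_time, parallel_time) / n_processors


# HYPOTHETICAL timings in seconds, chosen only to illustrate the formulas
serial_seconds = 120.0
parallel_seconds = 20.0
processors = 8

gain = computational_gain(serial_seconds, parallel_seconds)
efficiency = parallel_efficiency(serial_seconds, parallel_seconds, processors)
print(f"gain = {gain}, efficiency = {efficiency}")  # prints gain = 6.0, efficiency = 0.75
```

An efficiency below 1.0, as here, means part of each processor's time went to overhead such as resource contention or communication.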

  • What is a supercomputer: computer cluster made of nodes (connected computers) that work together as a single system

    • describe the schematic of the Midway cluster: a workstation (PC) connects through the internet to the Midway login nodes, which connect to the Midway compute nodes (which do not connect directly to the internet)

  • What is an OS: operating system is software closest to the computer hardware that manages all hardware and software, abstracting hardware from user programs

  • What is SSH: Secure Shell is a cryptographic network protocol for operating network services over an unsecured network; it uses encryption to secure the connection between client and server (used for connecting to a remote supercomputer)

  • What are the statistics of the Midway3 compute nodes: 192 GB of memory, 100 Gbps network, 24 cores, 3 GHz base frequency

  • What is the storage of Midway3: 2.2 PB

  • What is the shell: text-based interface that takes commands typed at the keyboard, runs them, and prints the results as text

  • What is SLURM: workload manager that schedules jobs and manages resources between multiple users

  • What are the 4 V's of big data:

    • Volume of data, users, and connected devices

    • Velocity of data transfers, new users, and new devices

    • Variety of types of data and whether data is structured or unstructured

      • What is structured data: data that is formatted so it can be easily used by databases and other programs

        • what are examples of structured data: databases, JSON, HTML, CSV, etc.

      • What is unstructured data and what are examples: data that is not structured, e.g. web pages, documents, PDFs, emails, media, sensor data

        • what are the challenges of unstructured data: it must be scraped, cleaned, have outliers removed, and be pre-processed, edited, integrated, and prepared before it can be analyzed

    • Veracity or the accuracy of the data
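
A minimal sketch of why structured data is easy to work with: the same records stored as CSV and as JSON can each be loaded with a standard-library parser and end up as identical Python objects. The sample records (`name`, `cores`) are hypothetical and invented for this example.

```python
import csv
import io
import json

# HYPOTHETICAL sample data: the same two records in two structured formats
csv_text = "name,cores\nnode1,24\nnode2,24\n"
json_text = '[{"name": "node1", "cores": 24}, {"name": "node2", "cores": 24}]'

# CSV stores every field as text, so the numeric column is converted explicitly
csv_records = [
    {"name": row["name"], "cores": int(row["cores"])}
    for row in csv.DictReader(io.StringIO(csv_text))
]

# JSON keeps numbers as numbers, so no conversion is needed
json_records = json.loads(json_text)

print(csv_records == json_records)  # prints True
```

Unstructured data has no such parser: extracting the same two records from an email or a PDF would need the cleaning and preparation steps listed above.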

  • What are the 4 rules of data trust:

    • not all data is trustworthy

    • not all trustworthy data is correct

    • untrustworthy data is not always incorrect

    • even if the data is correct, the answer may still be wrong

  • Why is data visualization necessary: it can help create meaningful interpretations of results