Performance Fundamentals in Computer Architecture
- Performance is a broad term in computing; what it means depends on the context.
- Gaming, networking, and processor-specific tasks each have their own performance requirements.
- A user perceives a computer's performance mainly through how quickly it responds to commands.
- Traditionally, a computer's performance was often limited by its slowest component (e.g., hard disk).
- Modern performance evaluation involves benchmark tests for individual components (GPU, CPU, motherboard) and overall system performance.
- Benchmark software helps users compare hardware before purchasing.
- This lecture focuses specifically on CPU time, which is the time the CPU spends executing instructions.
- Response time refers to the total time it takes for a system to respond to a request.
- It includes queuing delays, transmission through layers, execution time, etc.
- This lecture's focus is limited to the CPU's execution time of machine code.
- Latency is the delay before a response arrives; it can vary with conditions such as processing load.
- Throughput is another performance metric, representing the amount of work done in a unit of time.
- Increasing throughput often requires adding more resources (e.g., computers).
- Improving individual computer performance can also increase throughput.
- Throughput is not the primary focus of this lecture.
- CPU time can be divided into user CPU time (for application programs) and system CPU time (for the operating system).
- This lecture will focus on user CPU time.
- Modern benchmark programs can provide numerous performance metrics at once.
- Improving program performance, in the context of CPU time, means reducing the execution time.
- If computer A completes a task in 20 seconds and computer B takes 25 seconds, computer A is more performant.
- $\frac{20}{25} = 0.8$; inverting gives $\frac{25}{20} = 1.25$, meaning A is 1.25 times faster than B.
- When analyzing CPU performance, several key factors are considered:
- How quickly an instruction is executed, which involves fetching, decoding, and executing it.
- The role of the clock signal in synchronizing operations within the CPU.
- The CPU's clock frequency, often measured in gigahertz (GHz).
- $1\ \text{GHz} = 1 \times 10^{9}$ cycles per second
- A CPU with a clock frequency of 1 GHz has a clock cycle time (period) of 1 nanosecond.
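
The two calculations in this section (the A versus B comparison and the conversion from clock frequency to cycle time) can be sketched in a few lines of Python. The function names `speedup` and `clock_period_ns` are illustrative choices, not something defined in the lecture, and the 2.5 GHz value is an assumed extra example.

```python
def speedup(time_a: float, time_b: float) -> float:
    """Return how many times faster machine A is than machine B.

    Both arguments are execution times for the same task, in the same unit.
    """
    return time_b / time_a


def clock_period_ns(frequency_ghz: float) -> float:
    """Return the clock cycle time in nanoseconds for a frequency in GHz.

    1 GHz is 1e9 cycles per second, so the period is 1 / f nanoseconds.
    """
    return 1.0 / frequency_ghz


# Computer A takes 20 s, computer B takes 25 s (the numbers from the notes).
print(speedup(20.0, 25.0))    # 1.25

# A 1 GHz clock has a 1 ns period; 2.5 GHz is an assumed extra example.
print(clock_period_ns(1.0))   # 1.0
print(clock_period_ns(2.5))   # 0.4
```

Note that the ratio is taken as B's time over A's time, so a value above 1 means A is the faster machine.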
Instruction Execution and CPI
- Not every instruction executes in a single clock cycle.
- The number of cycles per instruction (CPI) varies depending on the instruction type and processor architecture.
- CPI represents the average number of clock cycles required to execute an instruction.
- To calculate the execution time, you need to know:
- The number of instructions.
- The average CPI.
- The clock cycle time (period).
- $\text{Execution Time} = \text{Number of Instructions} \times \text{CPI} \times \text{Clock Cycle Time}$
- The architecture of the processor and the instruction set architecture (ISA) impact performance.
- Different ISAs have different instruction types, which may take varying numbers of cycles to execute.
- Compilers can optimize code, potentially reducing the number of instructions needed.
- Three main parameters affect performance:
- Number of instructions.
- CPI.
- Clock frequency.
- To compare the performance of two computers (A and B), you can measure their CPU times.
- $\text{Performance Ratio} = \frac{\text{CPU Time}_2}{\text{CPU Time}_1}$
- A smaller CPU time indicates better performance.
- Key factors influencing CPU time include CPI, instruction count, and clock frequency.
- Modern processors can dynamically adjust their clock frequencies, especially in mobile devices.
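
As a rough illustration of the execution time formula, the sketch below computes an average CPI from an instruction mix and then the resulting CPU time. The instruction classes, counts, per-class cycle costs, and the 2 GHz clock are all made-up values chosen for easy arithmetic, and the function names are hypothetical.

```python
def average_cpi(mix: dict) -> float:
    """Weighted average CPI.

    `mix` maps an instruction class name to a tuple of
    (instruction count, cycles per instruction for that class).
    """
    total_instructions = sum(count for count, _ in mix.values())
    total_cycles = sum(count * cycles for count, cycles in mix.values())
    return total_cycles / total_instructions


def cpu_time_seconds(instruction_count: int, cpi: float, clock_hz: float) -> float:
    """Execution time = instructions x CPI x clock cycle time (1 / clock_hz)."""
    return instruction_count * cpi / clock_hz


# Hypothetical program: one million instructions in three classes.
mix = {
    "alu": (500_000, 1),         # 500k instructions at 1 cycle each
    "load_store": (300_000, 2),  # 300k instructions at 2 cycles each
    "branch": (200_000, 3),      # 200k instructions at 3 cycles each
}

cpi = average_cpi(mix)
instructions = sum(count for count, _ in mix.values())
time_s = cpu_time_seconds(instructions, cpi, clock_hz=2e9)  # assumed 2 GHz clock

print(f"average CPI: {cpi:.2f}")            # average CPI: 1.70
print(f"CPU time: {time_s * 1e3:.2f} ms")   # CPU time: 0.85 ms
```

Dividing by the clock frequency is the same as multiplying by the clock cycle time, which is why the frequency appears in the denominator here.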
Additional Considerations
- Increasing clock frequency doesn't always guarantee increased performance.
- A processor's performance also depends on how well it serves multiple programs and users.
- Other factors, such as the materials used in the processor and its internal features, can influence performance.
- When selecting hardware, consider benchmark results and performance tests, particularly in graphics-intensive applications.
- Pay attention to both CPU and GPU performance, as well as the overall system performance.
- Modern benchmark programs offer comprehensive performance reports.
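
The first point in this list, that a higher clock frequency does not guarantee better performance, can be seen with a small calculation. The two processors below are hypothetical, with invented CPI and frequency values; the only assumption carried over from the lecture is the formula execution time = instructions × CPI × clock cycle time.

```python
def cpu_time_seconds(instruction_count: float, cpi: float, clock_hz: float) -> float:
    """Execution time = instructions x CPI / clock frequency."""
    return instruction_count * cpi / clock_hz


INSTRUCTIONS = 1e9  # same program on both processors (assumed count)

# Hypothetical processor P1: faster clock, but more cycles per instruction.
time_p1 = cpu_time_seconds(INSTRUCTIONS, cpi=2.0, clock_hz=3e9)

# Hypothetical processor P2: slower clock, but fewer cycles per instruction.
time_p2 = cpu_time_seconds(INSTRUCTIONS, cpi=1.2, clock_hz=2e9)

print(f"P1 (3 GHz, CPI 2.0): {time_p1:.3f} s")  # P1 (3 GHz, CPI 2.0): 0.667 s
print(f"P2 (2 GHz, CPI 1.2): {time_p2:.3f} s")  # P2 (2 GHz, CPI 1.2): 0.600 s
```

In this made-up case P2 finishes first despite its lower clock frequency, because its lower CPI more than compensates.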
Examples and Practical Advice
- The lecture references a video with detailed examples of performance calculations.
- It mentions older SPEC benchmark results but notes that they are outdated.
- It advises students not to focus too much on the older performance criteria.
- It recommends researching modern CPU benchmark software when purchasing or upgrading hardware.
- The lecture highlights the adaptive nature of modern processors, which can adjust their frequencies based on power and performance needs.