Processes I - Lecture Notes

Analogical Introduction to Processes: The Cookie Recipe

  • The Recipe Analogy:

    • The Recipe Book: Equivalent to the Program installed on a computer (e.g., Adobe Photoshop). It is a passive set of instructions.

    • Baking: Equivalent to a Process. It is the active following of instructions.

    • Materials (Flour, Butter, Eggs): Equivalent to the Information and Data (Resources) required by a process.

    • The Oven: Equivalent to the CPU. A CPU, like an oven, is used to run the process.

    • Resource Constraints: Just as you must wait for the first tray of cookies to finish in the oven before starting a second tray, you generally cannot use a CPU for a new process while another process occupies it.

Defining the Process

  • Passive vs. Active:

    • Program: A passive set of instructions stored on a secondary storage device, such as a disk.

    • Process: An active execution of a program stored in memory.

    • Transition: A program becomes a process when it is loaded into memory and begins executing.

  • Characteristics of execution:

    • Concurrency: Processes can create sub-processes to execute at the same time.

    • Sequential Fashion: Execution of a process must progress one instruction after another.

    • CPU Interaction: The CPU executes instructions sequentially until the process completes or another specific event occurs.

  • The Precise Definition:

    • A process is the maintained context (information and data) for an executing program.

    • It serves as an abstraction of a physical processor.

    • It exists to allow the Operating System (OS) to coordinate multiple concurrent activities, such as multiple users or incoming network data.

    • Required Resources: To accomplish tasks, a process needs CPU time, memory, files, and I/O devices.

Process Management and Identification

  • OS Responsibilities:

    • Creation and deletion of processes.

    • Suspension and resumption.

    • Provision of mechanisms for process synchronization and communication.

    • Handling of deadlocks.

  • Process Identifiers (PID):

    • The OS distinguishes among processes by assigning each a unique, non-negative integer process ID (PID).

    • Hierarchy: Every process has a Parent Process that started it. A process can spawn Child Processes.

    • Kernel Usage: The kernel uses the PID to manage and control the process throughout its lifetime.

  • Well-Known Process IDs:

    • PID 0 (swapper / sched): A system process that is part of the kernel and is responsible for memory management.

    • PID 1 (init): A continually-running daemon process responsible for booting up and shutting down the system. It is invoked by the kernel at the end of the bootstrap procedure.

    • Note: In many modern Linux distributions, init has been replaced by systemd.

  • Management Commands:

    • pstree: Prints a tree representation of processes showing the hierarchy.

    • ps: Shows processes run by the current user, specifically displaying PID, TTY (Terminal), TIME (aggregated execution time), and CMD (the command run).

    • renice: Changes the priority (nice value) of a running process (e.g., renice -n -5 -p 501). Lowering the nice value, which raises the priority, normally requires administrator privileges.

    • kill: Sends a signal to the process with the given PID; the default signal is SIGTERM, which asks the process to terminate (e.g., kill 501).

Process Context: User-Level and Register Context

  • User-Level Context: When a program is loaded, it is organized into four memory segments:

    • Text: Contains the executable instructions or actual program code.

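    • Data: Contains global and static variables. Those with explicit initial values are loaded from the executable; the rest are zero-initialized before the program starts running.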

    • Heap: Contains dynamic memory allocated during runtime (e.g., p = new char[1000]).

    • Stack: Contains return addresses, function parameters, and local variables.

  • Register Context: To allow the OS to switch the CPU between processes (context switching), the CPU registers must be saved so a process can be resumed without losing its state. This includes:

    • Program Counter (PC): The address of the next instruction to be executed.

    • Processor Status Register: Hardware status at the time of preemption.

    • Stack Pointer (SP): Points to the top of the kernel or user stack.

    • General-Purpose Registers: Hardware-dependent registers (R0, R1, R2, etc.).

System-Level Context and the Process Control Block (PCB)

  • System-Level Context: Includes OS resources such as open files and signal-related data structures.

  • Process Control Block (PCB): Also called a Task Control Block, this data structure stores the context information the OS needs to manage a process. The OS maintains a PCB table with one entry per process, indexed by PID.

    • Contains the System-Level Context and the Register Context, but not the User-Level Context.

  • Data Stored in the PCB:

    • Process State (e.g., Running, Waiting).

    • Program Counter.

    • CPU Registers.

    • CPU Scheduling Information (Priorities, queue pointers).

    • Memory-management information (Allocated memory ranges).

    • Accounting Information (CPU time used, clock time elapsed, time limits).

    • I/O status information (Allocated I/O devices, list of open files).

Process State Machine Models

  • General 5-State Model:

    1. Created (New): The initial state when a process is first made.

    2. Ready (sometimes called Waiting): Awaiting scheduling for execution.

    3. Running: Actively executing instructions on the CPU.

    4. Blocked: Unable to continue until a specific event (like I/O) occurs.

    5. Terminated: Execution has ended or the process was killed.

  • Detailed 9-State Model:

    1. Executing in user mode.

    2. Executing in kernel mode.

    3. Ready to run (not yet scheduled).

    4. Sleeping and residing in main memory.

    5. Ready to run, but currently swapped to secondary storage; must be brought back by process 0 (swapper) to execute.

    6. Sleeping and swapped to secondary storage.

    7. Returning from kernel to user mode, but preempted by the kernel for a context switch.

    8. Newly created and in transition; exists but is not ready or sleeping. This is the start state for all processes except PID 0.

    9. Zombie State: The process has executed an exit system call. It no longer exists but leaves a record of its exit code and statistics for its parent to collect. This is the final state.

Process Scheduling and Queues

  • Short-term Scheduler (CPU Scheduler):

    • Selects which process from the ready queue is executed next and allocates the CPU.

    • Invoked very frequently (on the scale of milliseconds).

  • Long-term Scheduler (Job Scheduler):

    • Selects which processes are brought into the ready queue from the job pool.

    • Invoked infrequently (seconds or minutes).

    • Controls the degree of multiprogramming (the number of processes held in memory at once).

  • Process Characteristics:

    • I/O-bound: Spends more time on I/O than computation; characterized by many short CPU bursts.

    • CPU-bound: Spends more time on computations; characterized by a few very long CPU bursts.

  • Scheduling Queues:

    • Job Queue: Set of all processes in the system.

    • Ready Queue: Set of all processes in main memory ready to execute.

    • Device Queues (I/O Queues): Set of processes waiting for specific I/O devices.

The Context Switch

  • Definition: The procedure of saving the state (PCB) of an old process and loading the saved state (PCB) for a new process when the CPU switches tasks. Each process will have its own PCB.

  • Overhead: Context switching is considered overhead because the system does no "useful" work during the switch. A typical switch time is approximately 1 msec.

  • Hardware Support: Some hardware, like the Sun UltraSPARC, provides multiple sets of registers per CPU. This allows the context switch to occur almost instantly by simply changing a pointer to the current register set.

Process Execution: The exec() System Call Family

  • Functionality: Replaces the current process's memory space (text, data, heap, and stack) with a brand new program from the disk.

  • Persistence: The Process ID (PID) does not change across an exec() call. Open file descriptors also remain open unless explicitly marked as close-on-exec.

  • The Six Family Members:

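    • Return value: Every member returns -1 on error (with errno set to indicate the cause) and does not return on success, because the calling program has been replaced.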

    • int execl(const char *path, const char *arg, ...): Takes full path and variable length arguments terminated by NULL.

    • int execlp(const char *file, const char *arg, ...): Searches the directories listed in the $PATH environment variable for the file; a full path is not required.

    • int execle(const char *path, const char *arg, ..., char *const envp[]): Similar to execl but allows passing environment variables.

    • int execv(const char *path, char *const argv[]): Arguments are passed as a NULL terminated array of strings.

    • int execvp(const char *file, char *const argv[]): Similar to execv but searches the system $PATH.

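    • int execvpe(const char *file, char *const argv[], char *const envp[]): Array-based arguments with environment variables, searching $PATH like execvp. This is a GNU extension; the portable form that takes an environment is execve(const char *path, char *const argv[], char *const envp[]), which does not search $PATH.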

  • Example Usage:

    • execlp("ls", "ls", "-r", "-t", "-l", (char *)NULL); is equivalent to running ls -r -t -l in the shell. The cast on the terminating NULL keeps the variable-length argument list portable.

Q&A and Practical Applications

  • Q: What happens if a parent process (like bash) is terminated?

    • A: Its child processes (like firefox or terminal) may be terminated or adopted by another system process, such as init (PID 1).

  • Q: Why is register context critical during a switch?

    • A: It allows the process to resume exactly where it left off without data corruption or loss of progress.

  • Q: What is a Zombie process?

    • A: A process that has finished execution but still has an entry in the process table to allow the parent process to read its exit status.

  • Q: Why is a context switch considered overhead?

    • A: Because the CPU is busy managing the OS transition rather than executing user instructions.