Operating Systems

Operating system(OS)

system software that manages computer hardware and software resources, providing service/platform for programs. it acts as a middle man for the user and computer.

Two main types of OS are Windows and Unix based systems.

process - A running program. OS must provide ways the run, create, terminate and suspend processes.

System resources :

CPU
graphics unit
memory
network
I/O chip
Disk

Key Functions of an OS :

process, memory and file system management.
security and access control.
making hardware resources accessible to user
Virtualization - the act of giving each process the illusion that it has exclusive access to the PCs resources. virtualization should be efficient and use the least amount of resources possible.

Process creation - the act of loading and allocating programs onto memory and telling the CPU to run them. (unfinished)

Process lifecycle - this consists of 5 stages.

initial - the process has just spawned and is being loaded
running - the process is currently using the CPU to execute instructions.
ready - the process is ready in a queue and waiting to be used by the CPU. OS is responsible for switching processes between the running and ready stages.
blocked - the process is waiting for a I/O operation. processes may need to be block reasons like: user input, opening files, accessing internet. it gets unblocked when its I/O is complete and joins the ready queue.
final - the process is finished but not completely deleted yet. the parent process will run a zombie process(done to make sure the process ended successfully, checks for crashes) to finalize everything.

Latency - the time it takes between a request being sent and the first data received in response.

CPU virtualization

technique used by operating systems to create the illusion that each process has exclusive access to the CPU. It allows multiple processes to share a single physical CPU by rapidly switching between them, giving each process a slice of CPU time. CPU virtualization enables efficient multitasking and improves overall system performance by maximizing CPU utilization. it must also be secure. This creates the appearance of parallel execution, even on single-core processors. Key aspects of CPU virtualization include:
Time-sharing: The CPU's time is divided among multiple processes.
Process scheduling: The OS determines which process runs next and for how long.
Context switching: The state of one process is saved and another is restored when switching.
Isolation: Each process operates as if it has its own dedicated CPU.

Context switching - changing the running process(switches between different tasks like when you pause one task, remember where you stopped, and start another task). only the contents inside the CPU registers during context switch. if it initiated an I/O, its a voluntary CS, if OS forced the change its an involuntary CS. if a process is scheduled/de-scheduled, its a CS.

Direct execution - a technique used in operating systems where a process is executed directly on the CPU without any intermediate interpretation or emulation. This method allows for efficient and fast execution of processes. While direct execution provides high performance, it needs to be balanced with mechanisms for protection and control to ensure system stability and security. Key aspects of direct execution include:

Efficiency: Programs run at native speed on the hardware.
Simplicity: The OS loads the program into memory and starts execution at the entry point.
Limited control: The OS temporarily loses control while the process is running.
Challenges: Requires mechanisms to regain control and protect the system from malicious or erroneous programs.

Kernel mode - a mode where the CPU can access or modify memory locations that contain the OS code or data. it can also talk to storage or network devices. you cannot do this in user mode.

system call - any user process that calls a function related to the OS.

Trap CPU instruction - pauses the user code and saves the register state, the change hardware status to kernel mode and starts running the OS code. the reverse of this is call a return-from-trap instruction.

Limited Direct Execution - used in operating systems to balance the need for efficient program execution with the requirement for system control and protection. uses direct execution with mechanisms that allow the operating system to maintain control over the system like trap instructions. LDE allows the OS to maintain system integrity and security while still providing the performance benefits of direct execution for most program operations. Key aspects of LDE include:

Direct execution of user code: Programs run directly on the CPU for most operations, ensuring efficiency.
Controlled transitions: Use of special hardware instructions (like trap and return-from-trap) to switch between user and kernel modes safely.
Periodic interrupts: The OS uses timer interrupts to regain control periodically, even if a program is in an infinite loop.
Restricted operations: Certain operations (like I/O) require transitioning to kernel mode via system calls.

cooperative scheduling - a method of process scheduling where the operating system relies on processes to voluntarily yield control of the CPU. In this approach:

Processes are expected to periodically give up control of the CPU, typically when they are idle or waiting for I/O.
The OS doesn't forcibly interrupt running processes; instead, it waits for them to cooperate.
Processes use system calls like yield() to voluntarily release the CPU.
It's simpler to implement but can be problematic if a process doesn't cooperate (e.g., enters an infinite loop).
This method was used in early operating systems but is less common in modern systems due to its limitations in handling uncooperative or faulty processes.

Non-cooperative scheduling – a method of process scheduling where the operating system or scheduler does not rely on processes to voluntarily yield control of the CPU. In this approach:

The OS actively preempts running processes, forcibly interrupting them to ensure fair distribution of CPU time among all processes.
Processes do not voluntarily release the CPU; instead, the scheduler determines when to pause one process and start another.
It is more robust than cooperative scheduling because it prevents processes from monopolizing the CPU.
Non-cooperative scheduling avoids issues like processes entering infinite loops, as the OS ensures all processes get a chance to execute.
This method is commonly used in modern operating systems (e.g., Linux, Windows), which implement preemptive multitasking to handle uncooperative or faulty processes efficiently.

interrupt handler - interrupts whatever the CPU is running and allows OS to regain control

CPU scheduling

the process of managing the allocations of the CPU to various tasks. the OS uses scheduling algorithms to decide the orders and time of each process.

There are 3 metrics that are used to measure scheduling algorithms.

turnaround time - the difference between completion and arrival time. the closer to the CPU usage the better. the smaller the better
response time - the difference between the start and arrival time. the smaller the better
fairness

there are two kind of CPU processes

CPU intensive jobs - More CPU time needed.
I/O intensive or interactive jobs - Less CPU time needed.

Scheduling algorithms

First come first out - orders all processes in the order they arrive. optimal in terms of average waiting times.
Round robin - Picks the first job ready and runs for a fixed time slice. when time slice is finished, job is placed in back of queue if not completed and other jobs are done.
Shortest time to completion first - Keeps track of the CPU time needed for each job and schedules the job with the shortest amount. It is an optimal algorithm, as it reduces the TAT to the smallest.

Tradeoffs

policies like RR that evenly divide CPU among active processes on a small time scale will perform poorly on turnaround time.

policies like STCF that are less fair cost the overall response time of a CPU.

Multi level feedback queue

-A dynamic scheduling algorithm that organizes jobs into multiple queues, each with different priority levels, typically labeled from high to low (e.g., QnQ_nQn to Q0Q_0Q0). Each queue has a fixed time slice, which represents the maximum amount of CPU time each job is allowed before it must yield the CPU. MLFQ’s main goal is to adapt to different job behaviors, optimizing both responsiveness for interactive tasks and efficient handling for long-running, CPU-bound tasks.

Rules of MLFQ Scheduling

Highest Priority Scheduling:
- Always schedule the next job from the highest non-empty priority queue.
- This ensures that high-priority jobs are given CPU access first, improving responsiveness for time-sensitive tasks.
Round-Robin Scheduling within Queues:
- All jobs within the same queue are scheduled in a round-robin fashion.
- This means that each job in the same queue gets an equal time slice, cycling through jobs in the order they arrived, which balances fairness within priority levels.
New Jobs Start at Highest Priority:
- When a new job arrives, it is initially placed in the highest-priority queue.
- By starting new jobs at the highest priority, MLFQ gives prompt attention to newly arriving tasks, making it well-suited for interactive jobs that need quick responses.
Demotion After Using Full Time Slice:
- If a job uses up its entire time slice without yielding the CPU, it is moved down to the next lower-priority queue.
- This step identifies CPU-bound jobs that tend to consume more resources, gradually lowering their priority so that they don’t monopolize CPU time.
Periodic Priority Boost:
- After a defined period, all jobs are moved back to the highest-priority queue.
- This priority boost prevents starvation (where long-running, low-priority jobs are perpetually delayed) by periodically giving every job a chance to run at a high priority.

Purpose of Each Rule in MLFQ

Rules 1–3 ensure low response times for interactive jobs**:
- By always scheduling from the highest-priority non-empty queue, interactive jobs are handled quickly.
- Starting new jobs at the highest priority also enhances responsiveness, making the system ideal for handling interactive processes.
- Round-robin within each queue provides fairness among jobs of the same priority level.
Rule 4 helps identify and manage CPU-bound jobs**:
- CPU-intensive jobs that consume their full time slice are demoted, allowing the OS to gradually prioritize more interactive jobs over time, ensuring they don’t get stuck behind long-running tasks.
Rule 5 prevents starvation**:
- The periodic boost repositions all jobs back to the highest-priority queue, ensuring that even long-running or low-priority jobs get some CPU time eventually.

Example

Consider three jobs:

Job A (interactive, often yields the CPU early),
Job B (CPU-intensive, uses full time slices),
Job C (moderate CPU use, sometimes yields).
Job A frequently yields before its time slice ends, remaining in high-priority queues to maintain responsiveness.
Job B consistently uses full time slices and is demoted gradually, as it is CPU-bound.
Job C varies in behavior, staying in middle queues since it sometimes yields and sometimes uses full slices.

Over time, Job A stays highly responsive, Job B runs less frequently, and Job C adapts based on its behavior, achieving a balance between responsiveness and CPU efficiency. Periodic boosts also ensure that Job B eventually receives high-priority access, preventing starvation.

Memory virtualization

a core concept in operating systems that abstracts RAM and presents each process with its own virtual address space. This allows processes to run independently without directly interacting with the physical memory layout, which has several key benefits:
Isolation: Each process has its own memory space. This prevents bugs in one process from affecting others, making the system more secure and stable.
Flexible Memory Management: OS can allocate memory as needed and move data between RAM and disk, adapting to different needs easily.
Efficient RAM Use: using methods like paging, the OS allows processes to use more memory than physically available.
Address Translation: Memory Management Unit converts virtual addresses into actual locations in RAM. OS uses page tables to map this, so each process sees a seamless memory block even if it’s fragmented.
Demand Paging and Swapping: OS only loads memory pages as needed, moving inactive data to disk when RAM is full to make space for active processes
Supports Multi-tasking: Multiple processes can run concurrently without interfering with each other’s memory.
Memory Overcommitment: Virtual memory allows the system to commit more memory than is physically available, enabling larger applications and more concurrent processes.

Process address spaces - parts of RAM that are made to store data to be used by the CPU. here's an example image:

Stack - used for variables and method calls in python
heap - used for objects and instance variables in python

processes are usually written in high level languages. most high level languages don't allow user interaction with the OS for memory, but they can do it on the users behalf. languages that do allow it are harder to debug and make secure.

some processes like accessing elements in an array require contiguous memory from there address spaces.

memory fragmentation

free memory is divided into non contiguous blocks, making its harder to allocation contiguous segments of memory. can happen for many reasons like process termination, variable size allocation, dynamic memory allocation.

External fragmentation

Free memory is scattered throughout the system, forming small gaps between allocated memory.

The OS can combat this by understanding typical memory request patterns using memory allocation algorithms and memory functions.

Paged memory

Paging -Divides memory into fixed-size blocks: pages (virtual memory) and page frames (physical memory). Page Table maps virtual pages to physical frames. Efficiently handles memory allocation and avoids fragmentation.
contiguous address spaces done through paging(the process of dividing RAM into equal pages). simple and avoids fragmentation.
each process is divided into pages.
Page frames - fixed-size blocks of physical memory (RAM) that correspond to virtual memory pages. They are the physical counterpart to virtual pages in the paging system.

Address Translation -Process of converting virtual addresses (used by programs) to physical addresses (used by RAM). Ensures memory isolation and security between processes.
Page Tables- Data structures storing mappings between virtual pages and physical frames. Each Page Table Entry (PTE) holds:
- Physical frame number.
- Access permissions (e.g., read/write).
- Presence bit (whether the page is in physical memory or on disk).
Speed of Address Translation (AT):
- Translation Lookaside Buffer (TLB) improves speed by caching recent address translations. TLB reduces the time spent looking up the page table, speeding up translation.
  - TLB hit: Address translation found in TLB, fast access.
  - TLB miss: Page table lookup needed, slower access.
Multi-level Paging:
- Breaks down large address spaces (e.g., 64-bit) into multiple levels of page tables.
- Reduces memory overhead but increases the complexity of translation
Locality of Access:
- Spatial locality: Pages near recently accessed addresses are likely to be accessed soon.
- Temporal locality: Recently accessed pages are likely to be accessed again.
- OS uses these principles to optimize paging and cache management (e.g., TLB effectiveness).
Page Faults:
- Occur when a process tries to access a page not currently in physical memory.
- The OS loads the page from disk and updates the page table.
Swap Space (Disk Storage):
- When physical memory (RAM) is full, the OS uses swap space on the hard disk to temporarily store inactive pages.
- Swapping pages in and out of RAM allows processes to continue running despite limited physical memory.
- Swap space is slower than RAM, so excessive paging (thrashing) can reduce system performance.

File systems

responsible for how the OS interacts with storage devices, particularly secondary storage. It provides users with an interface for managing and accessing files.

Devices and OS Interaction

hardware interface: The OS hides the complexity of storage devices (such as their internal structure) by interacting with them via hardware registers and data areas.
Interrupt-based interaction: Most interactions with devices are interrupt-based to prevent CPU waste. The process works like this:
- The OS blocks the process, requesting I/O (e.g., disk access).
- It then checks if the device is free and initiates the I/O operation.
- The device interrupts the OS once the I/O operation is complete, allowing the blocked process to resume.
DMA (Direct Memory Access): Instead of involving the CPU, the OS instructs the I/O controller to handle data transfers between memory and the device directly, freeing up the CPU for other processes.

Types of Secondary Storage

HDDs (Hard Disk Drives)

Mechanical storage devices with spinning disks and read/write heads
Generally cheaper and offer higher storage capacities
Slower access times due to mechanical nature
Better suited for large file storage and archiving

SSDs (Solid State Drives)

Use flash memory for data storage, with no moving parts
Significantly faster than HDDs in terms of read and write speeds
Lower latency, especially when reading small blocks of data
Higher throughput (data transfer rate) compared to HDDs
More energy-efficient and durable due to lack of moving parts

Both HDDs and SSDs access data in blocks, with the block size varying based on the specific device and file system.

Users interact with secondary storage via files: a named collection of related data stored persistently on the storage device, even after the system powers off.
File systems allow users to:
- Create, delete, read, write, and manipulate files.
- Use symbolic names for files (e.g., filenames and directories).
- Share files while protecting unauthorized access through access control mechanisms.
- Automatically manage the physical storage (e.g., allocating files to blocks).

File System Components

File attributes: Each file has attributes like name, size, and permissions.
Directories: containers for files and other directories, creating a hierarchical structure for data storage. they have tree-like structure, store Metadata and have access control. Files can be grouped into directories for logical organization.
- A directory file contains the file descriptors for each file within it.
- Files are named by paths, either absolute or relative.
Root directory: This is the top-level directory, with sub-directories for each user.

File System Operations

The OS offers various services for interacting with files and directories:
- Basic file operations: Creating, deleting, copying, renaming, and listing files or directories.
- System calls for programmers: Opening, closing, reading, writing, and seeking within files.
- Interactive facilities: For users to manage files via a console or a graphical interface.
Program-Level Services:
- Open/Close: Prepares files for read/write.
- Read/Write: Data transfer to/from a file.
- Seek: Moves the cursor to a specific position in the file.
- Walk: Traverses directories.

Disk and File System Organization

Disk Blocks: Disks are divided into equal-sized blocks (4KB–512KB).
Master Boot Record (MBR) - contains system boot code and partition info.
Partitions: Each partition has a boot block and a superblock (contains File System metadata).

Building a File System

Structure Example: A 256KB HDD partition with 64 blocks (4KB each).
- Metadata: First 8 blocks used for FS metadata.
- User Data: Remaining blocks hold file data.
- Inodes: Special data structure for each file, usually stored in blocks 3-7.

Inodes

Definition: Metadata for each file, including pointers to data blocks.
Structure: Typically, each file has one inode.
Pointer Limitations: Fixed-size inodes limit max file size.

Free Space Management

Bitmaps: Tracks used/free blocks (1 = used, 0 = free).
Example: Inode bitmap (block 1) and data block bitmap (block 2).

File Allocation Techniques

Multilevel Pointers (e.g., Unix ext2):
- Direct Pointers: Point directly to data blocks (efficient for small files).
- Single Indirect: Points to a block containing pointers to data blocks.
- Double/Triple Indirect: Extends max file size by adding levels of pointers.

Implementing Directories

Directory Entries: Contain mappings from filenames to inodes.
Inode-Based System: Directories are essentially files with inode numbers pointing to file data.

Path Resolution

Example: Path “/usr/ast/mbox”.
- Locate root directory inode (usually inode 2).
- Traverse each path component to locate the final file.