Encompasses file systems, input/output (I/O), and disks
Requirements of File Systems
Store a very large amount of information (at a reasonable cost)
Information must survive the termination of the process using it
Multiple processes must be able to access information concurrently
4 different types of storage devices
Mechanical rotating hard disk drive (HDD)
Solid state drive (SSD)
Tape (uses sequential access, reading a byte requires all previous bytes to be read first)
Optical disk (obsolete today)
File naming conventions
File extensions (.pdf, .exe, etc.)
File name length
FAT: 8.3 characters (8 for name, 3 for extension), later extended to 255
EXT4: 255 characters
Some special characters not allowed in file names
FAT: Not allowed: " * / : < > ? \ | (and more)
EXT4: No '\0' or '/', and the special names "." and ".." are reserved
Case sensitivity
FAT: lowercase and uppercase are treated the same
EXT4: lowercase and uppercase are different
File extensions
In Windows they correspond to a file type, so the extension determines how the file should be opened
In Unix they are just conventions; sometimes an extension determines how a file should be opened, but it's mostly for the user; the kernel does not care
Different file types
Regular files
Directories
Soft links
Character special files
Modeling serial I/O
Block special files
Modeling the disk drive
File operations
Create/Delete
Open/Close
Read/Write
Append (write to end of file)
Seek (move position within file)
Get/Set attributes
Rename (needs to be atomic)
Directories
Directories are inherently files themselves: data structures that organize and maintain information about files
File attributes are stored in-band (FAT) or out-of-band (EXT4)
Path names
Hierarchical directory
Uses a name and separator (\, /, >, …)
Can also be done relative to the current (working) directory
“.” represents the current directory, good for relative paths
“..” represents the parent directory
Ex:
Windows: \usr\ast
UNIX: /usr/ast
MULTICS: >usr>ast
Directory operations
Create/Delete
. and .. subdirectories are also created/deleted
Opendir/Closedir
Readdir (reading entries one by one)
Rename
Link/Unlink
Hard link
New name for existing file
Soft link
Windows shortcut equivalent
File system
The file system's data (including metadata) as stored on storage devices (disks)
The kernel’s file system subsystem is organized into layers
I/O control
Software/drivers needed to communicate with devices
Basic file system
Read/write and organize/manage the basic blocks of data
File-organization module
Takes the blocks and turns them into files and directories
Logical file system/Virtual file system
Takes the different file systems from the different devices and makes them look the same to the user
Disk and file system layout
First block on a disk is the Master Boot Record (MBR)
Disks can be divided up into one or more partitions, each with an independent file system
On boot, the computer’s BIOS runs first
The BIOS reads a list of storage devices to find a device it can boot from (checking whether the device has an MBR, and loading the MBR into memory if it does)
The MBR starts executing, reading the partition table to find the bootable partition and loads the PBR/Boot block of that partition, copying it into memory
PBR will find the boot loader within the files and directories, loading it into memory
Boot loader prompts the user for which version of the OS to boot, once selected it will find the kernel and copy it into memory to start executing
Nowadays BIOS + MBR replaced by UEFI + GPT
Master Boot Record (MBR)
Contains partition table (size and location of each partition; one partition marked as active / bootable) + boot code that finds and loads PBR
Boot block/Partition Boot Record (PBR)
Boot code needed to find the kernel loader inside the file system of that partition, which itself then finds the kernel
Superblock
stores key parameters about the file system (file system type, number of blocks, etc.)
Computer boot sequence order
BIOS
MBR
PBR
Boot loader
Kernel
User
Addressing
Units: a disk block is the minimal unit of allocation
Physical address: some kind of tuple that identifies a single block on disk
Each disk block has a unique Logical Block Address (LBA), so the disk looks like a flat sequence of blocks
The kernel driver translates between physical addresses and logical block addresses; the size of the blocks on the disk can differ from the block size the OS uses
Logical file address (LA)
Distance/offset from the beginning of the file (address 0) in bytes
Contiguous Allocation
Each file occupies a set of contiguous disk blocks (all blocks that are part of a file are next to each other)
Entire blocks are allocated regardless of the file size (internal fragmentation in the last block)
Accessing a file's data blocks requires its directory entry, which stores the file name, start block number, and number of blocks used
Advantages
Simple implementation, only needing starting location and length
Efficient, entire file can be read in a single operation
Disadvantages
Wasteful of space; external fragmentation (e.g., after removing files) may require compaction
Files cannot grow, need to know the size upfront as another file could be right next to the current one
Contiguous Allocation formula for mapping logical address to block address
Logical address of file A = LA
LA / Block size = Quotient Q and Remainder R
Block number B = Q + start address of file A in directory
Block offset D = R
Two disk accesses to get the data at address LA
one to get start address of file A from directory
one to read data in block B found at offset D
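A minimal sketch of this mapping in C (block size and start block are assumed example values; the function and field names are made up for illustration):

```c
#include <stdio.h>

#define BLOCK_SIZE 4096   /* assumed block size in bytes */

/* Map a logical file address LA to (block number B, offset D)
 * under contiguous allocation: B = Q + start block, D = R. */
void contiguous_map(unsigned long la, unsigned long start_block,
                    unsigned long *block, unsigned long *offset)
{
    *block  = start_block + la / BLOCK_SIZE;
    *offset = la % BLOCK_SIZE;
}

int main(void)
{
    unsigned long b, d;
    contiguous_map(10000, 7, &b, &d);          /* file assumed to start at block 7 */
    printf("block %lu, offset %lu\n", b, d);   /* -> block 9, offset 1808 */
    return 0;
}
```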
Linked-List Allocation
Each file is a linked list of blocks
First bytes of a block contain a pointer to the next block
Blocks may be scattered anywhere on the disk
Advantages
Simple implementation, only need the starting address
Easy free-space management: no space is wasted / no external fragmentation
Disadvantages
No efficient random access, as you must go block to block to find the data
Linked-List Allocation formula for mapping logical address to block address
Logical address of file A = LA
LA / (Block size - # of bytes reserved for pointer) = Quotient Q and Remainder R
Block number B = Qth block in the linked chain of blocks, starting from the start block in directory
Block offset D = R + # of bytes reserved for pointer
Q + 1 disk accesses to get the data at address LA
one access to get start block of file A from directory
Q accesses for traversing Q blocks
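A similar sketch for linked-list allocation, assuming 4 bytes of each block are reserved for the next-block pointer (all names and sizes are illustrative):

```c
#include <stdio.h>

#define BLOCK_SIZE 4096
#define PTR_SIZE   4                             /* bytes reserved for the next-block pointer */
#define DATA_PER_BLOCK (BLOCK_SIZE - PTR_SIZE)

/* Q = number of blocks to traverse from the start block found in the directory;
 * the data is at offset PTR_SIZE + R inside the Q-th block of the chain. */
void linked_map(unsigned long la, unsigned long *blocks_to_traverse,
                unsigned long *offset_in_block)
{
    *blocks_to_traverse = la / DATA_PER_BLOCK;              /* Q */
    *offset_in_block    = la % DATA_PER_BLOCK + PTR_SIZE;   /* D = R + pointer bytes */
}

int main(void)
{
    unsigned long q, d;
    linked_map(10000, &q, &d);
    printf("traverse %lu blocks, offset %lu\n", q, d);  /* -> 2 blocks, offset 1820 */
    return 0;
}
```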
Linked-List Allocation with Table
Variation of linked list allocation
Data blocks no longer store pointers, instead storing pointers in the File-Allocation Table (FAT)
In the table, each entry corresponds to a disk block number and contains a pointer to the next block or -1
Table is stored at the beginning of the disk but copied into kernel memory for speed
Used on old DOS systems, as well as USB and UEFI system partitions
Advantage
Faster random access (table in memory)
Disadvantage
Entire table has to be in memory, so this is a problem for very large disks
Linked-List Allocation with Table formula for mapping logical address to block address
Logical address of file A = LA
LA / Block size = Quotient Q and Remainder R
Block number B = Qth block in the linked chain of blocks, starting from the start block in directory
Block offset D = R
2 disk accesses + Q memory (FAT) accesses to get the data at address LA
one disk access to get the start address of file A from directory
Q memory accesses in the table to get B
one disk access to read data from block B at offset R
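A sketch of the FAT variant, with a hypothetical in-memory table fat[] where fat[b] gives the next block of the chain (or -1 at the end of the file):

```c
#include <stdio.h>

#define BLOCK_SIZE 4096
#define NUM_BLOCKS 16

/* Hypothetical in-memory FAT: fat[b] = next block of the chain, or -1 at the end. */
static int fat[NUM_BLOCKS] = { [3] = 7, [7] = 11, [11] = -1 };

/* Follow the chain Q times in memory; the data block is then read from disk. */
int fat_map(unsigned long la, int start_block, unsigned long *offset)
{
    unsigned long q = la / BLOCK_SIZE;
    *offset = la % BLOCK_SIZE;
    int b = start_block;
    while (q-- > 0)
        b = fat[b];            /* memory access, not a disk access (no bounds checks in this sketch) */
    return b;
}

int main(void)
{
    unsigned long off;
    int b = fat_map(10000, 3, &off);          /* file assumed to start at block 3 */
    printf("block %d, offset %lu\n", b, off); /* -> block 11, offset 1808 */
    return 0;
}
```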
Indexed Allocation
Associate a block-sized data structure called an index node (inode) with each file
Directory entries have a filename and corresponding inode number
Lists the attributes of the file
Lists the disk addresses of the file’s blocks
The last address(es) can point to another block of addresses, allowing the inode to be expanded dynamically on demand for bigger files
Used on many UNIX file systems and Windows NTFS
Advantages
Efficient random access
Dynamic access without external fragmentation
Lower overhead than FAT
Only the inode of the file needs to be in memory, not a table covering all the disk's blocks (as with the FAT)
Disadvantages
How to support large files that do not fit the inode?
Indexed Allocation formula for mapping logical address to block address (assuming the file is small enough to use only direct block numbers in the inode)
Logical address of file A = LA
LA / Block size = Quotient Q and Remainder R
Block number B: look up the Qth entry of the index table to obtain B
Block offset D = R
Three disk accesses to get the data at address LA
one access to get inode address from directory
one access for reading the index table of inode
one access to get data for block B at offset D
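A sketch for indexed allocation using only the direct block numbers of the inode (the inode layout shown is hypothetical):

```c
#include <stdio.h>

#define BLOCK_SIZE 4096
#define NUM_DIRECT 10

/* Hypothetical inode: only the direct block numbers are shown. */
struct inode {
    unsigned long direct[NUM_DIRECT];   /* disk block number of each file block */
};

/* One disk access fetches the inode, then direct[Q] gives block B; D = R. */
unsigned long indexed_map(const struct inode *ino, unsigned long la,
                          unsigned long *offset)
{
    *offset = la % BLOCK_SIZE;
    return ino->direct[la / BLOCK_SIZE];
}

int main(void)
{
    struct inode ino = { .direct = { 42, 17, 93 } };   /* example block numbers */
    unsigned long off;
    unsigned long b = indexed_map(&ino, 10000, &off);
    printf("block %lu, offset %lu\n", b, off);         /* -> block 93, offset 1808 */
    return 0;
}
```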
Indirection in Indexed Allocation for growing/large files
inode can contain pointer to
direct block (data block)
single indirect (a disk block filled with block numbers of direct/data blocks)
double indirect (a disk block filled with block numbers of single indirect blocks)
triple indirect and so on
Allows files to grow and be very large
An inode contains 10 direct addresses and these are 8 bytes each. Moreover it contains 2 single indirect addresses. All disk blocks are 1024B. What would the largest possible file be?
10 direct entries
2 single indirect addresses with 1024B / 8B = 128 direct entries each
= 266 direct entries * 1KB = 266KB largest possible file size
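The same calculation expressed in code, using the sizes given in the exercise:

```c
#include <stdio.h>

int main(void)
{
    const unsigned long block_size = 1024;   /* bytes per disk block */
    const unsigned long addr_size  = 8;      /* bytes per block address */
    const unsigned long direct     = 10;     /* direct addresses in the inode */
    const unsigned long single_ind = 2;      /* single indirect addresses */

    unsigned long entries_per_block = block_size / addr_size;             /* 128 */
    unsigned long total_blocks = direct + single_ind * entries_per_block; /* 266 */
    printf("max file size = %lu KB\n", total_blocks * block_size / 1024); /* 266 KB */
    return 0;
}
```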
Directories Implementation
Map file name to:
Contiguous allocation: the start block address of file
Linked list: the start block of the file
Indexed allocation: the index block
File attributes in a directory are either stored inside the directory with the filename (FAT) or instead in the inode (EXT4, NTFS)
Note that directories are stored as special files
The content of a directory file is a list of directory entries; an entry may itself refer to another directory
Directory blocks include one entry for each file within that directory (starting block number of a file)
Directories implementation in inode file systems
Every file and directory is represented by an inode
Inodes contain the metadata and the location of the file or directory data blocks on disk (note that inodes do not contain file names, the directory entry does)
For a directory, the data blocks store the directory’s file entries: file name + corresponding file number
Root directory “/” has a special inode number, usually 0 or 1
How to find the inode on the disk
Beginning part of the file system (after the superblock) contains all the inodes
block_addr(inode_num) = block_offset_of_first_inode + (inode_num * inode_size)
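A sketch of that lookup which also splits the result into a block number and an offset within that block, assuming inodes are packed contiguously starting at an assumed block number (all constants are illustrative):

```c
#include <stdio.h>

#define BLOCK_SIZE        4096   /* assumed */
#define INODE_SIZE         256   /* assumed */
#define INODE_TABLE_START    1   /* assumed block number of the first inode */

/* Byte address of an inode, split into (block, offset) for the disk read. */
void inode_location(unsigned long inode_num, unsigned long *block,
                    unsigned long *offset)
{
    unsigned long byte_addr = (unsigned long)INODE_TABLE_START * BLOCK_SIZE
                              + inode_num * INODE_SIZE;
    *block  = byte_addr / BLOCK_SIZE;
    *offset = byte_addr % BLOCK_SIZE;
}

int main(void)
{
    unsigned long b, off;
    inode_location(35, &b, &off);
    printf("inode 35 -> block %lu, offset %lu\n", b, off); /* -> block 3, offset 768 */
    return 0;
}
```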
Hard link
Two (or more) different names for the same file, in the same directory or in different directories
Possible due to different directory entries storing different file names but pointing to the same inode
All names must be on the same file system (disk partition) since each file system has its own inodes
The inode itself keeps a counter of how many filenames/entries correspond with the inode, only when the count reaches zero is the file deleted
Note that processes that are currently using the file will also increment the counter
“.” and “..” can also increment the counter in directories as they are links
Symbolic (Soft) Links
Like a Windows shortcut: a special kind of file, indicated by a file attribute (special "symlink" bit set), whose content is a string giving the location/path name of another file (not an inode number)
Registers
Read and written by kernel device drivers for:
Configuration / initialization (control register)
Read/write data
Check status of device (status register)
Accessed by the CPU via I/O ports
Buffers
Kernel device driver reads/writes them to move data
ex: GPU RAM for graphics computations and display (framebuffer), packet buffer on a Network Interface Controller
Accessed by the CPU via memory-mapped I/O (in the same address space as RAM)
I/O Ports
Each controller register has an I/O port number
Special privileged instructions to access the I/O port space (kernel only, written in assembly), so devices and memory are accessed with different instructions
Separate I/O space and memory space (RAM)
Memory-mapped I/O
All controller registers and buffers mapped into the normal RAM address space
Each controller register is assigned a unique memory address
No actual RAM memory for this address
Such addresses may be at the top of the physical address space
Advantages
Software can read/write to I/O devices as to any other memory location (can just use pointers)
Protection of I/O devices registers and buffers is the same as memory protection
Disadvantages
Need to disable caching for memory-mapped I/O
With a single shared address space, the RAM controller and every I/O device must examine each memory reference to decide whether to respond
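A kernel-space sketch of touching memory-mapped registers through volatile pointers; the device, its base address, and its register layout are entirely made up for illustration (this would fault in an ordinary user program):

```c
#include <stdint.h>

/* Hypothetical device: a status register and a data register mapped at a
 * fixed physical address near the top of the address space. */
#define DEV_BASE     0xFFFF0000UL
#define REG_STATUS   (*(volatile uint32_t *)(DEV_BASE + 0x0))
#define REG_DATA     (*(volatile uint32_t *)(DEV_BASE + 0x4))
#define STATUS_READY 0x1

/* volatile prevents the compiler from caching or reordering these accesses,
 * which matters because the addresses are device registers, not ordinary RAM. */
void dev_write_word(uint32_t word)
{
    while ((REG_STATUS & STATUS_READY) == 0)
        ;                   /* busy-wait until the device is ready */
    REG_DATA = word;        /* a plain store reaches the device register */
}
```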
Steps for CPU to send data to I/O device
CPU wants to read a word from either memory or I/O port:
Puts the address it needs on the bus address lines
Asserts a read signal on a bus control line
A second signal line indicates whether the reference is to I/O space or memory space
If memory space, memory responds to the request
If I/O space, I/O device responds to the request
If there is only memory space (no I/O ports)
RAM controller and every I/O device compares the address lines to the range of addresses that it services
If the address falls in its range, it responds to the request
Since no address is assigned to both memory and I/O device, there should be no conflict
Direct Memory Access (DMA) controller
Transfers data for the CPU
From/to an I/O Device
Between I/O devices
CPU offloads bigger data transfers here and does something else in the meantime
Contains registers to be read/written by kernel
Memory address register
Byte count register
Control registers to indicate the direction of transfer, transfer unit, etc.
CPU writes to the control registers to program the controller to do a transfer
Can do multiple data transfers at once through multiple channels
Multiple sets of registers, one for each transfer channel
CPU loads all sets of registers
Can determine next request to serve via Round-robin or priority schema
Good because it can read and write data directly from/to memory and/or I/O devices without using the CPU, except for the initial configuration
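A sketch of how a kernel might program one DMA channel; the register block and bit definitions are hypothetical:

```c
#include <stdint.h>

/* Hypothetical register block for one DMA channel (memory-mapped). */
struct dma_channel {
    volatile uint64_t mem_addr;   /* memory address register */
    volatile uint32_t byte_count; /* byte count register */
    volatile uint32_t control;    /* direction, transfer unit, start bit, ... */
};

#define DMA_DIR_DEV_TO_MEM  (1u << 0)
#define DMA_START           (1u << 31)

/* Program the channel, then the CPU is free to do other work;
 * the controller raises an interrupt when the transfer is done. */
void dma_start_read(struct dma_channel *ch, uint64_t phys_addr, uint32_t nbytes)
{
    ch->mem_addr   = phys_addr;
    ch->byte_count = nbytes;
    ch->control    = DMA_DIR_DEV_TO_MEM | DMA_START;
}
```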
Data Transfer Bus Word-at-a-time mode
DMA controller requests for transfer of one word
If the CPU wants the bus, it has to wait until transfer is done
Also called cycle stealing
Data Transfer Bus Block/Burst mode
DMA controller acquires the bus, issues a series of transfers and then releases the bus
Advantage of faster data transfer
Disadvantage of blocking the CPU and other devices for a long time if a long burst occurs (unable to access main memory, but can access caches)
Fly-by mode
DMA controller commands the device controller to transfer data directly to the destination (data only moves once)
Flow-through mode
Data transferred passes through the DMA controller
DMA controller first reads data into an internal buffer/register and then writes it to the destination
This scheme requires an extra bus cycle per word transferred, but it is more flexible
Able to perform device-device copies
Memory-to-memory copies
A DMA controller has five channels. The controller is capable of sending a 32-bit word request in 50 nanoseconds. A response takes equally long. How fast does the bus have to be to avoid being a bottleneck?
50 nanoseconds request + 50 nanoseconds reply = 100 nanoseconds per transfer = 10 million transfers per second
each transfer is 32 bits or 4 bytes, so 4 bytes * 10 million =
40 MB/sec needed to avoid bottleneck
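The same arithmetic in code:

```c
#include <stdio.h>

int main(void)
{
    double cycle_ns = 50.0 + 50.0;              /* request + response, in nanoseconds */
    double transfers_per_sec = 1e9 / cycle_ns;  /* 10 million transfers per second */
    double bytes_per_transfer = 4.0;            /* one 32-bit word */
    printf("%.0f MB/s\n", transfers_per_sec * bytes_per_transfer / 1e6);  /* 40 MB/s */
    return 0;
}
```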
External Interrupts
To interact with I/O devices
ex: CPU requests a data transfer and the device sends an interrupt when it’s ready
These are normal to have
Internal Interrupts
Occurs entirely within the CPU
Also called exceptions, and are used to handle exceptional conditions that occur during the execution of programs; such as:
Divide by zero exception
Arithmetic overflow
Page fault
Invalid instruction
Interrupts
Used by devices (and the CPU itself) to signal the kernel
The kernel (with help from the CPU) saves the state of the interrupted process, handles the interrupt, and then resumes the interrupted process (or a different one)
CPU checks for interrupts after executing every instruction
If an interrupt occurs, the hardware looks up the interrupt vector (configured by the kernel at boot time) to fetch a new program counter
The new program counter points to the code of the right kernel interrupt handler (software interrupt service routine, ISR), which is executed
Goals of I/O software
Device independence
Programs can access I/O device without having to specify the type of device in advance
Uniform naming
The name of a file or device should simply be a string, independent of the device
Error handling
Errors should be handled as close to hardware as possible (controller first, kernel device driver, …, user program last)
Synchronous (blocking) vs asynchronous (interrupt-driven)
Most physical I/O is asynchronous
Buffering
Involves considerable copying, which can negatively impact I/O performance
Shareable vs dedicated devices
Some I/O devices, like disks, can be used by many users at the same time
Other devices, such as printers, have to be dedicated to a single user
Programmed I/O
CPU writes/reads a byte/word at a time from/to main memory to/from device
CPU makes a request, then waits for the device to become ready
Buses are only one byte/word wide, so the last few steps are repeated for larger transfers
CPU time is wasted: if the device is slow, the CPU may busy-wait for a long time, constantly polling the device to see if it is ready to accept another byte/word
Applies to I/O ports, memory-mapped I/O, and hybrid
Advantage of being simple, good for small data transfers
Disadvantage of CPU being unable to do anything else until I/O ends, and inefficiency of polling
Used to implement get_user / put_user
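A kernel-space sketch of the programmed-I/O busy-wait loop; the register addresses and status bit are made up for illustration:

```c
#include <stddef.h>
#include <stdint.h>

#define REG_STATUS   (*(volatile uint8_t *)0xFFFF0000UL)  /* hypothetical status register */
#define REG_DATA     (*(volatile uint8_t *)0xFFFF0004UL)  /* hypothetical data register */
#define STATUS_READY 0x1

/* The CPU itself moves every byte and busy-waits in between:
 * simple, but the CPU can do nothing else until the loop finishes. */
void programmed_io_write(const uint8_t *buf, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        while ((REG_STATUS & STATUS_READY) == 0)
            ;                   /* poll until the device can take another byte */
        REG_DATA = buf[i];
    }
}
```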
Interrupt-driven I/O
Use of interrupts to let the processor know when the I/O device has completed an operation or has encountered an error, letting the CPU continue with other computations instead of waiting for the slower device
When the device interrupt is received, the CPU can do read/write operations to/from main memory to/from the device for one byte/word
Data transfers from the I/O module to the CPU, then to memory (for a read; vice versa for a write)
Advantages of efficiency, no busy waiting/polling
Disadvantages of an interrupt occurring on every character/data transfer; interrupts take time, so CPU cycles are wasted
I/O using DMA
DMA controller reads/writes directly from/to system memory
It has access to the system bus independent of the CPU
The CPU has to set up the operation in the DMAC (using either programmed I/O or interrupt-driven I/O)
DMAC does the transfer
Once the transfer is complete, the DMAC notifies the CPU with a single interrupt, so the CPU only gets interrupted once at the end of the entire transfer instead of once for each byte/word transferred
Advantages of most work being done by the DMA controller (better than interrupt-driven I/O); great for large data transfers
Disadvantage of requiring extra hardware DMA controller (but is standard today)
Used to implement copy_from_user / copy_to_user
A typical printed page of text contains 50 lines of 80 characters each. Imagine that a certain printer can print 6 pages per minute and that the time to write a character to the printer’s output register is so short that it can be ignored. Does it make sense to run this printer using interrupt-driven I/O if each character printed requires an interrupt that takes 50 μsec all-in to service?
50 lines * 80 characters = 4000 characters per page
6 pages per min * 4000 = 24000 characters per min = 400 characters per second, which also means 400 interrupts per second
50 μsec * 400 = 20000 μsec just for interrupts per second
20000 μsec = 20 milliseconds of interrupt handling per second, i.e. 20/1000 = about 2% of the CPU time
It’s fine to use interrupt-driven I/O in this case
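The same arithmetic in code:

```c
#include <stdio.h>

int main(void)
{
    double chars_per_page = 50.0 * 80.0;                       /* 4000 characters */
    double chars_per_sec  = 6.0 * chars_per_page / 60.0;       /* 400 chars (and interrupts) per second */
    double usec_per_sec_in_interrupts = chars_per_sec * 50.0;  /* 20000 usec per second */
    printf("interrupt overhead: %.1f%% of CPU time\n",
           usec_per_sec_in_interrupts / 1e6 * 100.0);          /* 2.0% */
    return 0;
}
```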
Hybrid
I/O port and memory-mapped I/O for the same device
Magnetic disks
HDD, Floppy disk
Moving parts (mechanical)
On physical media, reads and writes are equally fast
Cylinder/head/sector (CHS) addressing is now obsolete
Different zones now have a different number of sectors per track, so block addressing is now independent of geometry: logical block address
Solid-State drives
Different technologies
NOR
NAND
3D XPoint
Memristor
Multiple interfaces
USB
SATA, mSATA
NVMe (M.2, PCIe)
Nothing is moving (electronic), no mechanical issues
On physical media reads are faster than writes
Optical disks
CD, DVD, Blu-ray
Moving parts (mechanical)
Write-once, read multiple times
If (re-)writeable, writing is way slower than reading
Problems with a single storage device
Slow (in comparison to other components of a computer like CPU)
Modest capacity compared to organizational needs
Unreliable
Disks can totally fail (unable to read/write data)
Bad sectors (cannot read what you have written, spare sectors are available but only a limited quantity)
RAID (Redundant Array of Inexpensive Disks)
Build a storage system as an array of drives, distributing data over multiple drives
Allows data to be read in parallel and be replicated
Appears like a single disk; the rest of the system is unaware RAID is being used, so nothing else needs to change
Applies to any storage technology
Hardware RAID controller card
Each drive is connected to the RAID controller
RAID controller can issue read/write in parallel across all disks
Software RAID
Implemented at the operating system level
Cheaper but slower, and typically fewer RAID levels/configurations supported in OS than in specialized hardware
RAID 0
Striping
Data striped (split) across disks
Total storage space across all disks is divided into strips, where every strip includes k sectors
Strips are mapped round-robin to consecutive disks, and multiple strips can be read/written in parallel (parallelism)
No redundancy, if a disk fails you lose a percentage of every file
Small requests result in no performance gains, as if software asks for one sector at a time, there is no parallelism
Worse reliability than a single disk: the mean time to failure gets shorter as more disks are added
RAID 1
Redundancy by duplicating all the data
Every disk has a mirror disk, storing the same data
A read can be serviced by either of the two disks that contain the requested data
Up to twice as fast as RAID 0 for reading with parallelism
A write must be executed on both disks, but can be done in parallel, so it’s as fast as RAID 0 for writing
Needs double the disks (expensive)
Recovery consists of simply installing a new drive and copying the entire mirror drive to it, which can take a long time
RAID 2
Works on a byte/word basis (not full sector)
Stripes (splits) each byte/word into nibbles (small group of bits), with one bit per disk
Addition of Hamming code bits, one bit per extra disk (error correction code)
Hamming code is used to reconstruct the original data (but Complex and HDDs have internal error correction already)
All drives write the bits of the same nibble in sync
Complex algorithm, all drives have to be rotationally in synch
Needs very performant disk controllers, and is more complex and expensive
Many disks are required for a good ratio of data to Hamming bits
Never used in practice due to these drawbacks
RAID 3
Same as RAID 2 but uses parity bit instead of Hamming code
Given N bits {b1, b2, … bN}, the parity bit P ∈ {0, 1} is chosen so that the set {b1, b2, … bN, P} contains an even number of “1” bits
Parity bit = # of 1 bits in the set modulo 2
If any bit in {b1, b2, … bN} is lost, the remaining bits and P can be used to recover it (see the sketch below)
Store the parity codes in an extra disk
Only able to handle one failing drive and is complex (but reasonable)
All drives have to be rotationally in sync for performance reasons, so it is essentially never used in practice
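A minimal sketch of the XOR parity idea used by RAID 3/4/5: compute the parity over the data units, then rebuild a lost unit from the survivors plus the parity (the byte values are arbitrary examples):

```c
#include <stdio.h>
#include <stdint.h>

/* Parity is the XOR of all data units; XORing the survivors with the parity
 * reproduces the missing unit. The same idea works per bit, per byte, or per block. */
int main(void)
{
    uint8_t data[4] = { 0xA3, 0x5C, 0xF0, 0x11 };   /* example data "drives" */
    uint8_t parity = 0;
    for (int i = 0; i < 4; i++)
        parity ^= data[i];

    /* Pretend drive 2 is lost: rebuild it from the others plus the parity. */
    uint8_t rebuilt = parity;
    for (int i = 0; i < 4; i++)
        if (i != 2)
            rebuilt ^= data[i];

    printf("rebuilt 0x%02X, original 0x%02X\n", rebuilt, data[2]);
    return 0;
}
```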
RAID 4
Large strips with a parity strip (XOR)
Similar to RAID 3's parity bit but done at the block level rather than the bit level, so synchronized drives are not required
Similar to RAID 0 but with an additional disk for parity
If a drive crashes, lost bytes recomputed with parity
Only able to handle one failing drive
Performs poorly for small updates
Changing one sector on a single data drive requires reading the corresponding strips on all other data drives to recalculate the parity bits
Or need to compute the difference between old sector and new sector to learn how parity bits must change, which then has to be rewritten on the parity drive
Heavy load on the parity drive, as it is rewritten for any change in the other drives
Rare to see in practice, only works well for small RAID systems
RAID 5
Interleaved data and parity blocks
Rotate the assignment of data blocks and parity blocks among the drives
Avoids the bottleneck of a single disk being used for parity data
Allows multiple reads/writes to occur in parallel
Only able to handle one failing drive
Reconstructing the contents of a failed drive is a complex process, and possibly slow
Commonly used in practice for mid-size RAID systems
RAID 6
Same idea as RAID 5 but an additional parity block is used
The data is striped across the disks with two parity blocks instead of one: one regular (XOR) parity and a second parity computed differently
Writes are a bit more expensive because of the parity calculations
Reads incur no performance penalty
Protects against two simultaneous disk failures, so commonly used for very large RAID arrays
RAID 0+1
Striped disks are mirrored
When one disk fails, one entire stripe set is down, and the whole system will fail if any disk on the other side of the mirror also fails; not too reliable
Losing one disk causes that entire RAID 0 set to stop, and the array transitions to using the mirrored set
RAID 1+0
Mirrored pairs are striped
After one disk fails, no pair is down, and the whole system will fail only if the other disk in the same pair fails
RAID 5+0
Strips with distributed parity and then striped
Stripes across RAID 5 groups
Better reliability: RAID 5 can handle one failed drive before data loss, and striping across RAID 5 groups increases that to two failed drives (one in each group)
RAID 6+0
Strips with double distributed parity and then striped
Stripes across RAID 6 groups
Better reliability: RAID 6 can handle two failed drives before data loss, and striping across RAID 6 groups increases that to four failed drives (two in each group)
Mean Time to Failure (MTTF) formula
MTTF disk_array = MTTF single_disk / # disks
A RAID 5 array can fail if two or more of its drives crash within a short time interval. Suppose that the probability of one drive crashing in a given year is p. What is the probability of a k-drive RAID 5 array failing in a given year? (assume disks do not get replaced upon failure)
Probability P0 of all drives remaining healthy is (1-p)^k
Probability P1 of exactly one drive failing is kp * (1-p)^(k-1)
Probability of RAID 5 array failing in a given year P is
1 - P0 - P1 = 1 - (1-p)^k - kp * (1-p)^(k-1)
Assuming p = 1%, with k = 10, then P ≈ 0.43%
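The same formula in code (compile with -lm):

```c
#include <stdio.h>
#include <math.h>

int main(void)
{
    double p = 0.01;   /* probability of one drive failing in a year */
    int    k = 10;     /* drives in the array */

    double p0 = pow(1.0 - p, k);                 /* no failures */
    double p1 = k * p * pow(1.0 - p, k - 1);     /* exactly one failure */
    printf("P(array fails) = %.2f%%\n", (1.0 - p0 - p1) * 100.0);  /* ~0.43% */
    return 0;
}
```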
Ways of dealing with bad sectors
Internal hardware controller
Operating system
Error correction code for small sector errors
RAID for whole disk-failures
Internal hardware controller (sector errors)
The disk is tested before it is shipped; each bad sector found is substituted with a spare sector using a relocation table
Operating system (sector errors)
Acquires a list of bad sectors, builds remapping tables, and uses the remapping tables when accessing disks