File Systems and Structures

Layered File System

  • Logical File System

    • Maintains file structure via FCB (File Control Block)

  • File Organization Module

    • Translates logical block to physical block

  • Basic File System

    • Converts physical block to disk parameters

    • Example Parameters: drive 1, cylinder 73, track 2, sector 10

  • I/O Control

    • Transfers data between memory and disk

Physical Disk Structure

  • Parameters to read from disk:

    • Cylinder (Track) #: Represents a circular path on the disk.

    • Platter (Surface) #: Refers to the individual disk surface.

    • Sector #: Smallest division of a disk that holds data.

    • Transfer Size: Amount of data read/written in a single operation.

File System Units

  • Sector

    • The smallest unit that can be accessed on a disk (typically 512 bytes).

  • Block (or Cluster)

    • The smallest unit that can be allocated to construct a file.

    • File Size on Disk:

      • A 1-byte file takes at least one cluster.

      • Cluster may consist of 1 to 8 sectors.

      • Thus, a 1-byte file may require approximately 4KB of disk space.

Sector-Cluster-File Layout

  • 1 Sector

    • Size: 512 Bytes

  • 1 Cluster

    • Size: 2048 Bytes (4 Sectors)

    • Used Space and Wasted Space Analysis:

    • 1024 Byte File uses 1 cluster - 50% of the cluster is wasted.

    • 2560 Byte File uses 2 clusters - 1536 Bytes wasted (40% waste).

    • 4608 Byte File uses 3 clusters - 1536 Bytes wasted (33% waste).

File Control Block (FCB)

  • Definition: Contains metadata for files managing their storage on disk.

  • Contents of FCB include:

    • Permissions: Specifies access rights to files.

    • Dates:

    • Create date

    • Access date

    • Write date

    • Owner: User owner of the file.

    • Group: Group owners permissions.

    • ACL (Access Control List): Specifies users/groups who can access file.

    • File Size: Indicates the number of bytes the file occupies.

    • Location of File Contents: Addressing information for file storage.

  • File System Examples:

    • UNIX File System: Uses I-node structure.

    • FAT/FAT32: Uses part of FAT (File Allocation Table).

    • NTFS: Uses part of MFT (Master File Table).

Partitions

  • Definition: Disks are segmented into one or more partitions.

  • Functionality: Each partition can implement its own file system method (such as UFS, FAT, NTFS, etc.).

Disk Layout for a File System

  • Super Block: Defines aspects of the file system structure.

    • Size of the file system.

    • Size of the file descriptor area.

    • Start location of the list of free blocks.

    • Location of the FCB of the root directory.

    • Other metadata such as permissions and timestamps.

  • Boot Block: Contains information for the system to boot from the disk.

Boot Block

  • Dual Boot: Enables multiple operating systems installation on one machine.

  • Boot Loader:

    • Understands multiple OS and file systems.

    • Located in specific area of the disk.

    • Reads the Boot Block to locate the boot image.

Block Allocation Strategies

  • Contiguous Allocation

  • Linked Allocation

  • Indexed Allocation

Contiguous Block Allocation

  • Directory Layout Example

    • Directory contains blocks ranging from 0 to 31 and shows allocation patterns for files.

  • Advantages of Contiguous Block Allocation:

    • Provides efficient read/seek operations due to spatial locality in disk.

    • Disk location for both sequential and random access can be efficiently derived.

Problems with Contiguous Block Allocation

  • Not knowing block requirements at file creation time can lead to disk fragmentation when contiguous space runs out.

Linked Block Allocation

  • Advantages:

    • Less fragmentation and provides flexible file allocation.

  • Disadvantages:

    • Sequential read operations will require disk seeks to access the next block.

    • Random read operations are inefficient, resulting in O(n) time complexity, where n is the number of blocks in the file.

Indexed Block Allocation

  • Definition: Keeps an array of pointers to blocks, facilitating easier random access.

  • Example System: UNIX File System employs this method.

Free Space Management

  • Challenge: Track free blocks when files are deleted.

  • Techniques for Management:

    • Bit Vector (or Bitmap):

    • Indicates free/occupied status for disk blocks.

    • Linked List:

    • A linked structure maintained to track available blocks.

Bit Vector (Bitmap)

  • A bitmap structure where each bit corresponds to a block status.

    • 0 indicates free; 1 indicates occupied.

  • Pros: Efficient with hardware support, allows bulk searching for free blocks.

  • Cons: Size increases with disk size, potentially inefficient if it cannot be fully loaded into memory.

Linked List for Free Space

  • Pros:

    • No need for global table.

  • Cons:

    • Substantial I/O may be required to access multiple free blocks as each needs to be accessed sequentially.

UNIX File Layout Overview

  • Components include:

    • File mode

    • Owner information

    • Timestamps for various metadata operations

    • Data blocks containing actual file data

  • Block Types:

    • Direct Blocks

    • Single Indirect, Double Indirect, Triple Indirect blocks for hierarchical data storage.

I-node Structure

  • Represents FCB for UNIX systems.

  • Contains pointers for accessing file data:

    • 12 direct block pointers.

    • 3 indirect pointers: single, double, triple.

  • Block size: typically 4KB allows for accessibility of 48 KB directly through i-node.

Addressing in I-node

  • Block Size: 4K leads to:

    • A single indirect block can address up to 4 MB of data.

    • A double indirect block can address up to 4 GB of data.

    • A triple indirect block can address up to 4 TB of data.

  • Note: Any block can be found through at most 3 levels of indirection.

UNIX Directory Structure

  • Similar to a file having a type field as a directory for access controls.

    • Contains tuples of .

Lookup Example in UNIX Directory

  • Example of how to resolve the path: "/usr/bob/mbox"

  • Steps involve:

    • Following directories to find corresponding i-node numbers.

    • Accessing the data blocks from the resolved i-nodes.

File System Maintenance

  • Maintenance Tasks Include:

    • Formatting: Setting up a new file system layout.

    • Bad Blocks Management: Disks accumulate bad blocks over time; these are kept in a list to manage usage.

    • Defragmentation: Reorganizing blocks for better access performance.

    • Scanning: Post-system crashes to ensure consistency in file descriptors.

File Allocation Table (FAT)

  • Definition: The File Allocation Table is the mechanism for file management in FAT systems.

  • Characteristics and Structure:

    • Typically stored at the top of the volume, with two copies for redundancy.

    • The cluster size is determined based on drive size (list of ranges provided).

Performance Constraints and Enhancements of FAT/FAT32

  • Limitations of FAT:

    • Maximum cluster count limited to 65,536 due to 16-bit entry.

    • Partition sizes capped at 2-4 GB.

    • Performance degradation for partitions over 200 MB.

    • Inefficiencies in space usage with larger clusters.

  • Enhancements in FAT32:

    • Uses 32-bit entries allowing for more clusters and improved space management.

    • Root folder can be treated as an ordinary cluster chain, improving flexibility.

NTFS (New Technology File System)

  • Master File Table (MFT): Analogous to FAT but significantly enhanced.

  • Design Objectives:

    • Fault tolerance through transaction logging.

    • Enhanced security with fine-grained access control.

    • Capacity for large disk sizes and efficient handling.

NTFS Advantages

  • Scalability: Utilizes 64-bit cluster addresses, extending support for vast disk sizes.

  • Reliability: Maintains a transaction log for consistency and recovery practices.

  • Metadata Support: Includes specialized files for managing various filesystem operations.

    • Files like $MFT, $LOGFILE, $VOLUME, $ATTRDEF, and others are integral to the structure of NTFS.

MFT Record Structure

  • Typical Fields in MFT Record:

    • Metadata headers for organization.

    • Attribute headers detailing specific file data.

    • Free space indicators for allocation efficiency.

Interaction Process between Applications and File System

  • Process Flow Outline:

    • File control block and initialization of file pointers.

    • Open file pointer array and system-wide open file tables created.

    • Closure of interactions is executed through specific file operation sequences such as open and read.

  • Comprehensive Open and Read Operations:

    • open(file…): Involves searching directories, copying descriptors, and creating entries in system structures.

    • read(file…): Operates through logical to physical reading, extant I/O operations directed by specific file descriptors.