File Systems and Structures
Layered File System
Logical File System
Maintains file structure via FCB (File Control Block)
File Organization Module
Translates logical block to physical block
Basic File System
Converts physical block to disk parameters
Example Parameters: drive 1, cylinder 73, track 2, sector 10
I/O Control
Transfers data between memory and disk
Physical Disk Structure
Parameters to read from disk:
Cylinder (Track) #: Represents a circular path on the disk.
Platter (Surface) #: Refers to the individual disk surface.
Sector #: Smallest division of a disk that holds data.
Transfer Size: Amount of data read/written in a single operation.
File System Units
Sector
The smallest unit that can be accessed on a disk (typically 512 bytes).
Block (or Cluster)
The smallest unit that can be allocated to construct a file.
File Size on Disk:
A 1-byte file takes at least one cluster.
Cluster may consist of 1 to 8 sectors.
Thus, a 1-byte file may require approximately 4KB of disk space.
Sector-Cluster-File Layout
1 Sector
Size: 512 Bytes
1 Cluster
Size: 2048 Bytes (4 Sectors)
Used Space and Wasted Space Analysis:
1024 Byte File uses 1 cluster - 50% of the cluster is wasted.
2560 Byte File uses 2 clusters - 1536 Bytes wasted (40% waste).
4608 Byte File uses 3 clusters - 1536 Bytes wasted (33% waste).
File Control Block (FCB)
Definition: Contains metadata for files managing their storage on disk.
Contents of FCB include:
Permissions: Specifies access rights to files.
Dates:
Create date
Access date
Write date
Owner: User owner of the file.
Group: Group owners permissions.
ACL (Access Control List): Specifies users/groups who can access file.
File Size: Indicates the number of bytes the file occupies.
Location of File Contents: Addressing information for file storage.
File System Examples:
UNIX File System: Uses I-node structure.
FAT/FAT32: Uses part of FAT (File Allocation Table).
NTFS: Uses part of MFT (Master File Table).
Partitions
Definition: Disks are segmented into one or more partitions.
Functionality: Each partition can implement its own file system method (such as UFS, FAT, NTFS, etc.).
Disk Layout for a File System
Super Block: Defines aspects of the file system structure.
Size of the file system.
Size of the file descriptor area.
Start location of the list of free blocks.
Location of the FCB of the root directory.
Other metadata such as permissions and timestamps.
Boot Block: Contains information for the system to boot from the disk.
Boot Block
Dual Boot: Enables multiple operating systems installation on one machine.
Boot Loader:
Understands multiple OS and file systems.
Located in specific area of the disk.
Reads the Boot Block to locate the boot image.
Block Allocation Strategies
Contiguous Allocation
Linked Allocation
Indexed Allocation
Contiguous Block Allocation
Directory Layout Example
Directory contains blocks ranging from 0 to 31 and shows allocation patterns for files.
Advantages of Contiguous Block Allocation:
Provides efficient read/seek operations due to spatial locality in disk.
Disk location for both sequential and random access can be efficiently derived.
Problems with Contiguous Block Allocation
Not knowing block requirements at file creation time can lead to disk fragmentation when contiguous space runs out.
Linked Block Allocation
Advantages:
Less fragmentation and provides flexible file allocation.
Disadvantages:
Sequential read operations will require disk seeks to access the next block.
Random read operations are inefficient, resulting in O(n) time complexity, where n is the number of blocks in the file.
Indexed Block Allocation
Definition: Keeps an array of pointers to blocks, facilitating easier random access.
Example System: UNIX File System employs this method.
Free Space Management
Challenge: Track free blocks when files are deleted.
Techniques for Management:
Bit Vector (or Bitmap):
Indicates free/occupied status for disk blocks.
Linked List:
A linked structure maintained to track available blocks.
Bit Vector (Bitmap)
A bitmap structure where each bit corresponds to a block status.
0 indicates free; 1 indicates occupied.
Pros: Efficient with hardware support, allows bulk searching for free blocks.
Cons: Size increases with disk size, potentially inefficient if it cannot be fully loaded into memory.
Linked List for Free Space
Pros:
No need for global table.
Cons:
Substantial I/O may be required to access multiple free blocks as each needs to be accessed sequentially.
UNIX File Layout Overview
Components include:
File mode
Owner information
Timestamps for various metadata operations
Data blocks containing actual file data
Block Types:
Direct Blocks
Single Indirect, Double Indirect, Triple Indirect blocks for hierarchical data storage.
I-node Structure
Represents FCB for UNIX systems.
Contains pointers for accessing file data:
12 direct block pointers.
3 indirect pointers: single, double, triple.
Block size: typically 4KB allows for accessibility of 48 KB directly through i-node.
Addressing in I-node
Block Size: 4K leads to:
A single indirect block can address up to 4 MB of data.
A double indirect block can address up to 4 GB of data.
A triple indirect block can address up to 4 TB of data.
Note: Any block can be found through at most 3 levels of indirection.
UNIX Directory Structure
Similar to a file having a type field as a directory for access controls.
Contains tuples of .
Lookup Example in UNIX Directory
Example of how to resolve the path: "/usr/bob/mbox"
Steps involve:
Following directories to find corresponding i-node numbers.
Accessing the data blocks from the resolved i-nodes.
File System Maintenance
Maintenance Tasks Include:
Formatting: Setting up a new file system layout.
Bad Blocks Management: Disks accumulate bad blocks over time; these are kept in a list to manage usage.
Defragmentation: Reorganizing blocks for better access performance.
Scanning: Post-system crashes to ensure consistency in file descriptors.
File Allocation Table (FAT)
Definition: The File Allocation Table is the mechanism for file management in FAT systems.
Characteristics and Structure:
Typically stored at the top of the volume, with two copies for redundancy.
The cluster size is determined based on drive size (list of ranges provided).
Performance Constraints and Enhancements of FAT/FAT32
Limitations of FAT:
Maximum cluster count limited to 65,536 due to 16-bit entry.
Partition sizes capped at 2-4 GB.
Performance degradation for partitions over 200 MB.
Inefficiencies in space usage with larger clusters.
Enhancements in FAT32:
Uses 32-bit entries allowing for more clusters and improved space management.
Root folder can be treated as an ordinary cluster chain, improving flexibility.
NTFS (New Technology File System)
Master File Table (MFT): Analogous to FAT but significantly enhanced.
Design Objectives:
Fault tolerance through transaction logging.
Enhanced security with fine-grained access control.
Capacity for large disk sizes and efficient handling.
NTFS Advantages
Scalability: Utilizes 64-bit cluster addresses, extending support for vast disk sizes.
Reliability: Maintains a transaction log for consistency and recovery practices.
Metadata Support: Includes specialized files for managing various filesystem operations.
Files like
$MFT,$LOGFILE,$VOLUME,$ATTRDEF, and others are integral to the structure of NTFS.
MFT Record Structure
Typical Fields in MFT Record:
Metadata headers for organization.
Attribute headers detailing specific file data.
Free space indicators for allocation efficiency.
Interaction Process between Applications and File System
Process Flow Outline:
File control block and initialization of file pointers.
Open file pointer array and system-wide open file tables created.
Closure of interactions is executed through specific file operation sequences such as
openandread.
Comprehensive Open and Read Operations:
open(file…): Involves searching directories, copying descriptors, and creating entries in system structures.read(file…): Operates through logical to physical reading, extant I/O operations directed by specific file descriptors.