JB

File Systems

Overview

  • File systems are essential for organizing and retrieving data on random access devices.
  • They act as an index, providing a structured way to access data.
  • Formatting a disk involves structuring it for a specific file system, allocating blocks for bookkeeping data structures.
  • Key file system information is duplicated across the disk to prevent data loss.
    • This redundancy is a trade-off: it protects against data loss, but the copies can become inconsistent with one another.

Common File Systems

  • FAT (File Allocation Table):
    • Used by older Microsoft Windows versions and still common in portable USB storage devices.
  • EXT (Extended File System):
    • A family of file systems used in Linux, including EXT, EXT2, EXT3, and EXT4.
  • HFS+:
    • Used by Mac OS (Apple replaced it with APFS as the default in 2017).

Hierarchical File Systems

  • Most file systems use a hierarchical structure, representable as a tree.
  • Root Directory:
    • The top-level directory in the file system, serving as the root of the directory tree.
  • Directories and Files:
    • Directories contain files and other directories.
    • In Windows, directories are referred to as folders (the concept remains the same).
    • Files can be text files or binary files.

Special Data Structures

  • Free Block List:
    • Maintains a list of available blocks for writing data.
    • Deleting a file marks its blocks as free without removing the content immediately.
    • This allows for potential recovery of deleted files using specialized tools.
    • Commonly implemented as a bitmap, with one bit per block.
  • Metadata Blocks:
    • A contiguous cluster of blocks storing file metadata (information about files).
    • In POSIX systems, this is where inodes are stored.
    • The number of metadata blocks limits the maximum number of files that can be referenced on the disk.
    • The maximum number of files therefore depends on the space reserved for metadata when the disk is formatted, not directly on the physical disk size.
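
The free block list described above can be sketched as a bitmap with one bit per block. This is a minimal illustration, not how any particular file system lays out its bitmap; the class name FreeBlockBitmap and the 16-block "disk" are assumptions made for the example.

```python
class FreeBlockBitmap:
    """Toy free block list: one bit per block, 1 = in use, 0 = free."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.bits = bytearray((num_blocks + 7) // 8)  # all blocks start free

    def is_free(self, block):
        return ((self.bits[block // 8] >> (block % 8)) & 1) == 0

    def allocate(self):
        """Mark the first free block as used and return its number."""
        for block in range(self.num_blocks):
            if self.is_free(block):
                self.bits[block // 8] |= 1 << (block % 8)
                return block
        raise OSError("no free blocks left")

    def release(self, block):
        """Mark a block as free; the data stored in it is not touched."""
        self.bits[block // 8] &= ~(1 << (block % 8)) & 0xFF


# Hypothetical 16-block disk, used only for this demonstration.
disk = [b""] * 16
bitmap = FreeBlockBitmap(16)

first = bitmap.allocate()
second = bitmap.allocate()
disk[first] = b"hello"

# "Deleting" only flips the bit; the old content stays on the disk
# until the block is handed out again and overwritten.
bitmap.release(first)
content_after_delete = disk[first]
reused = bitmap.allocate()
```

Because release only clears a bit, content_after_delete still holds the old bytes, which is what makes recovery tools possible until the block is reused.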

Directories as Files

  • Directories are special files that contain file name-location pairs for their member files.
  • Directory files are accessed like normal files and have their own metadata blocks (inodes in POSIX).
  • Directory files contain pointers to the files and directories they contain.
  • This structure enables the hierarchical file system.
  • Root Directory:
    • The top-level directory is the root of the directory tree.
  • Root Terminology:
    • Root can refer to the root user, root directory, or root process.
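
On a real system, the name-location pairs stored in a directory can be inspected from Python's standard library. The sketch below creates a temporary directory and lists each entry with the inode number the directory records for it; the helper name read_directory and the file names are assumptions made for the example.

```python
import os
import tempfile


def read_directory(path):
    """Return the directory's entries as {name: inode number}."""
    with os.scandir(path) as entries:
        return {entry.name: entry.inode() for entry in entries}


with tempfile.TemporaryDirectory() as root:
    # Build a tiny tree: one subdirectory and one regular file.
    os.mkdir(os.path.join(root, "docs"))
    with open(os.path.join(root, "notes.txt"), "w") as handle:
        handle.write("hello")

    listing = read_directory(root)
    names = sorted(listing)

    # The directory is itself a file with its own metadata (inode).
    root_inode = os.stat(root).st_ino
    docs_is_directory = os.path.isdir(os.path.join(root, "docs"))

    for name in names:
        print(name, listing[name])
```

The inode numbers printed depend on the machine and file system, so no particular values should be expected.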

Finding Files

  • To locate a file, the OS needs to find the file's metadata block (inode), which contains pointers to the data blocks on the disk.
  • The OS starts searching from the root directory.
  • The OS traverses the directory tree, reading each directory file to find the location of the next directory file.
    • The last directory file contains the filename and the location of the file metadata block.
  • This process follows a single path through the tree from the root to the file, opening one directory file after another; directories off that path are not read.
  • Soft links (symbolic links) are supported by most file systems.
  • Hard links are a related concept in POSIX systems. On Microsoft Windows they are available on NTFS, but not on FAT.
  • Soft links allow files/directories to be accessible from multiple locations in the hierarchy.
  • They provide multiple paths to the same file through different branches of the file system.
  • A soft link is a new directory entry that points to another file or directory.
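    • In Windows, the comparable everyday concept is a shortcut, although a shortcut is an ordinary file interpreted by the shell rather than a link resolved by the file system.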
  • In Linux (EXT file systems), the metadata block of a soft link contains the path to the linked file or directory.
  • Deleting a soft link does not affect the original file or directory.
  • Deleting the original file leaves the soft link dangling (pointing to a nonexistent file).
  • Dangling references should be avoided.
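  • Soft links can create cyclic paths, which could lead to infinite loops during path resolution.
  • To avoid this, systems cap the number of symbolic links followed while resolving a single path; on Linux, exceeding the cap fails with the error ELOOP.
    • The exact limit depends on the system (current Linux kernels allow 40).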
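
The lookup procedure and the behaviour of soft links can be sketched with a toy in-memory inode table. Everything here is hypothetical: the table contents, the function names, and the cap of 40 links are assumptions made for illustration, and only absolute link targets are handled.

```python
MAX_SYMLINKS = 40  # assumed cap for this sketch; real limits vary by system

# Hypothetical inode table: inode number -> metadata.
INODES = {
    1: {"type": "dir", "entries": {"home": 2, "tmp": 5}},  # root directory
    2: {"type": "dir", "entries": {"alice": 3}},
    3: {"type": "dir", "entries": {"notes.txt": 4}},
    4: {"type": "file", "blocks": [17, 18]},  # pointers to data blocks
    5: {"type": "dir", "entries": {"notes": 6, "gone": 7, "a": 8, "b": 9}},
    6: {"type": "symlink", "target": "/home/alice/notes.txt"},
    7: {"type": "symlink", "target": "/home/bob/missing.txt"},  # dangling
    8: {"type": "symlink", "target": "/tmp/b"},  # cycle: a -> b
    9: {"type": "symlink", "target": "/tmp/a"},  # cycle: b -> a
}
ROOT_INODE = 1


def resolve(path, links_followed=0):
    """Return the inode number for an absolute path, following soft links."""
    current = ROOT_INODE  # the search always starts at the root directory
    components = [part for part in path.split("/") if part]
    for index, name in enumerate(components):
        directory = INODES[current]
        if directory["type"] != "dir":
            raise NotADirectoryError(path)
        if name not in directory["entries"]:
            raise FileNotFoundError(path)
        current = directory["entries"][name]
        node = INODES[current]
        if node["type"] == "symlink":
            if links_followed >= MAX_SYMLINKS:
                raise OSError("too many levels of symbolic links: " + path)
            # Restart the walk at the link target, keeping the unread suffix.
            rest = components[index + 1:]
            new_path = "/".join([node["target"].rstrip("/")] + rest)
            return resolve(new_path, links_followed + 1)
    return current


def try_resolve(path):
    """Resolve a path, returning the error class name on failure."""
    try:
        return resolve(path)
    except OSError as error:
        return type(error).__name__


direct = try_resolve("/home/alice/notes.txt")
via_link = try_resolve("/tmp/notes")
dangling = try_resolve("/tmp/gone")
cyclic = try_resolve("/tmp/a")
```

Here direct and via_link reach the same inode by two different paths, the dangling link fails because its target no longer exists, and the cyclic pair a and b is stopped by the link counter instead of looping forever.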

Mounting File Systems

  • Mounting is commonly used with external drives (e.g., USB thumb drives) and network drives.
  • The OS must recognize the mass storage device and its file system.
  • The OS assigns a unique identifier, allowing users/applications to reference the root directory of the mounted file system.
  • Hardware is attached, whereas the file system is mounted.

Mounting in Different OS

  • Microsoft Windows:
    • Assigns a unique letter to each new file system (e.g., C for the main drive, D for data, H/J for network drives).
  • POSIX Systems:
    • The mounted file system becomes part of the main file system hierarchy.
    • The mount point is the location where the external drive is mounted.
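    • While the file system is mounted, the previous content of the mount point is hidden and the content of the mounted disk appears in its place; the previous content becomes visible again after unmounting.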
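
POSIX-style mounting can be sketched as a mount table that maps mount points to file systems, with each path served by the longest matching mount point. This is a simplified model under stated assumptions: the dictionaries root_fs and usb_fs, the path /mnt/usb, and the lookup function are all hypothetical, and real kernels attach mounts to directory entries and do not compare path strings.

```python
# Hypothetical file systems: each maps a path inside that file system to content.
root_fs = {
    "/etc/hostname": "laptop",
    "/mnt/usb/old.txt": "hidden while mounted",
}
usb_fs = {
    "/photo.jpg": "JPEG data",
    "/docs/report.txt": "report",
}

# Mount table: mount point -> file system. The root file system is always present.
mounts = {"/": root_fs}


def lookup(path):
    """Read `path` from the file system mounted at the longest matching mount point."""
    best = "/"
    for point in mounts:
        if point == "/":
            continue
        if (path == point or path.startswith(point + "/")) and len(point) > len(best):
            best = point
    # Strip the mount point so the path is relative to the mounted file system's root.
    inner = path if best == "/" else (path[len(best):] or "/")
    return mounts[best].get(inner)


before_mount = lookup("/mnt/usb/old.txt")

mounts["/mnt/usb"] = usb_fs  # mount the USB file system
photo = lookup("/mnt/usb/photo.jpg")
hidden = lookup("/mnt/usb/old.txt")  # the old content is no longer reachable

del mounts["/mnt/usb"]  # unmount
after_unmount = lookup("/mnt/usb/old.txt")
```

The file old.txt is never deleted; it is only unreachable while another file system covers its directory.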