M09: File Systems

CS 439 Principles of Computing Systems

Overview of File System Design Components

  • File System Design Elements

  • User Interface and Interaction

  • Permissions & Security Considerations

  • Accounting and Resource Management

  • Reliability and Backup Strategies

  • Data Layout and Storage Performance

  • File System Compression and Logging

Key Elements of File System Design

File System API

  • Functions: read, write, seek, create

  • Naming: Hierarchical directory structure

  • User Interface: Easy navigation

  • Permissions: rwxrwxrwx (Read, Write, Execute)

  • Accounting: Maintenance of quotas

  • Reliability: Uses log-based structures

  • Backup: Full and incremental backups

  • Security: Encryption at file and volume levels

  • Resilience: Overcoming failures

  • Data Layout: Organizational structure of data

  • Blocking & Unit of Storage: Storage management

  • Performance: Attuned to specific hardware (e.g., SSD optimization)

  • Versioning: Mechanisms that allow file recovery

  • Basic Implementation: Underlying coding structure

  • Compression: Techniques such as zlib applied at the file level
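
The create, write, seek, and read calls listed above can be tried through Python's `os` module, which wraps the corresponding POSIX system calls. This is a minimal sketch: the file name `demo.txt`, the helper name `api_demo`, and the mode `0o644` are illustrative assumptions, not part of the course material.

```python
import os
import tempfile

def api_demo() -> bytes:
    """Create a file, write to it, seek, and read part of it back."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "demo.txt")  # hypothetical file name
        # create: O_CREAT makes the file; 0o644 requests rw-r--r-- (before umask)
        fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)
        try:
            os.write(fd, b"hello, file system")  # write starting at offset 0
            os.lseek(fd, 7, os.SEEK_SET)         # seek to byte offset 7
            return os.read(fd, 4)                # read the next 4 bytes
        finally:
            os.close(fd)

print(api_demo())  # b'file'
```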

Examples of File Systems

MacOS APFS

  • API: read, write, seek, create

  • Structure: Hierarchical directory

  • Permissions: rwxrwxrwx

  • Accounting: quotas for resource usage

  • Reliability: log-based architecture

  • Backup: supports full and incremental changes

  • Security: encryption at both file and volume level

  • Resilience: Innovative recovery options

  • Data Layout and Performance: Optimized for SSD

  • Versioning: Integrated with Time Machine

  • Basic Implementation: Zlib for compression at the file level

ZFS

  • API: read, write, seek, create

  • Directory Structure: Hierarchical

  • Permissions: rwxrwxrwx

  • Accounting: quotas

  • Reliability: mirroring and RAID-Z features

  • Backup: full/incremental changes

  • Security: block-level encryption

  • Resilience and Performance: Advanced storage management

  • Versioning: Snapshot functionality

  • Basic Implementation: block-level compression strategies

Detailed Discussion

Flat File System Concept

  • Implements a simplistic file system without complex naming

  • Leaves naming and organization under the control of upper software layers

  • Maintains attributes: location, size, modified dates, user info, protection

Flat File System Examples

Example 1: Contiguous Allocation

  • Files allocated in contiguous storage blocks

  • Pros: Fast read/write operations

  • Cons: External fragmentation of disk space; files are hard to grow in place
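
The fragmentation problem can be seen in a toy first-fit allocator. This is a sketch under assumed names (`ContiguousDisk`, a 10-block disk), not a real file system: after two files are deleted, six blocks are free, yet a four-block file cannot be placed because no free run is long enough.

```python
class ContiguousDisk:
    """Toy disk where every file occupies one contiguous run of blocks."""

    def __init__(self, num_blocks):
        self.blocks = [None] * num_blocks  # None = free, else owning file name
        self.files = {}                    # name -> (start, length)

    def allocate(self, name, length):
        """First fit: use the first run of `length` free blocks, if any."""
        run = 0
        for i, owner in enumerate(self.blocks):
            run = run + 1 if owner is None else 0
            if run == length:
                start = i - length + 1
                for b in range(start, i + 1):
                    self.blocks[b] = name
                self.files[name] = (start, length)
                return start
        return None  # no hole is big enough

    def delete(self, name):
        start, length = self.files.pop(name)
        for b in range(start, start + length):
            self.blocks[b] = None

    def free_count(self):
        return self.blocks.count(None)

disk = ContiguousDisk(10)
disk.allocate("A", 3)   # blocks 0-2
disk.allocate("B", 4)   # blocks 3-6
disk.allocate("C", 3)   # blocks 7-9
disk.delete("A")
disk.delete("C")
result = disk.allocate("D", 4)  # fails: free space is split into two runs of 3
print(disk.free_count(), result)
```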

Example 2: Linked Allocation

  • Files form linked lists of blocks

  • Pros: Files grow and shrink easily; avoids external fragmentation

  • Cons: Slower random access due to traversal requirements
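
A small simulation makes the traversal cost concrete. In this sketch (the class `LinkedDisk` and its layout are assumptions for illustration), each block stores its data together with the number of the next block, so reaching the n-th block means visiting every block before it.

```python
END = -1  # marks the last block of a file

class LinkedDisk:
    """Toy disk where each block stores (data, next_block_number)."""

    def __init__(self, num_blocks):
        self.blocks = [None] * num_blocks
        self.free = list(range(num_blocks))

    def write_file(self, chunks):
        """Store chunks in any free blocks; return the head block number."""
        ids = [self.free.pop() for _ in chunks]
        for i, (bid, chunk) in enumerate(zip(ids, chunks)):
            nxt = ids[i + 1] if i + 1 < len(ids) else END
            self.blocks[bid] = (chunk, nxt)
        return ids[0] if ids else END

    def read_block(self, head, n):
        """Return (data of the n-th block, number of blocks visited)."""
        bid, visited = head, 1
        for _ in range(n):
            bid = self.blocks[bid][1]  # follow the link stored in the block
            if bid == END:
                raise IndexError("block index past end of file")
            visited += 1
        return self.blocks[bid][0], visited

disk = LinkedDisk(8)
head = disk.write_file(["a", "b", "c", "d"])
print(disk.read_block(head, 3))  # ('d', 4): four blocks touched to reach one
```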

Example 3: File Allocation Table (FAT)

  • Linked approach that keeps the links between blocks in a separate table

  • Facilitates organized block management
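
The idea of moving the links into a table can be sketched with a plain list, where entry `i` holds the number of the block that follows block `i`. The sentinel values `EOF` and `FREE` and the helper names are assumptions chosen for this example; real FAT variants use their own reserved values.

```python
EOF, FREE = -1, -2  # illustrative sentinels, not the real FAT encodings

fat = [FREE] * 8
# A file occupying blocks 2 -> 5 -> 3; the directory records only the start.
fat[2], fat[5], fat[3] = 5, 3, EOF
directory = {"a": 2}

def chain(fat, start):
    """Follow table entries from `start` and return the file's block list."""
    out = []
    b = start
    while b != EOF:
        out.append(b)
        b = fat[b]
    return out

def append_block(fat, start):
    """Grow a file by one block: claim a free entry and link it at the tail."""
    new = fat.index(FREE)          # raises ValueError if no block is free
    tail = chain(fat, start)[-1]
    fat[tail] = new
    fat[new] = EOF
    return new

before = chain(fat, directory["a"])
append_block(fat, directory["a"])
after = chain(fat, directory["a"])
print(before, after)
```

Because the whole chain lives in the table, following it does not require reading the data blocks themselves, which is the main gain over storing the link inside each block.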

Example 4: Indexed Allocation

  • Keeps a per-file index of block pointers; several indexing schemes exist (for example direct and indirect pointers)

  • Efficient management and retrieval of files
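
One common indexing scheme, shown here as an assumption rather than the specific design the slides have in mind, combines a few direct pointers with one single-indirect block. The sizes `NUM_DIRECT` and `PTRS_PER_BLOCK` are deliberately tiny, and the class name `IndexNode` is hypothetical.

```python
NUM_DIRECT = 4       # assumption: tiny values keep the example readable
PTRS_PER_BLOCK = 4

class IndexNode:
    """Per-file index: direct pointers plus one single-indirect block."""

    def __init__(self):
        self.direct = [None] * NUM_DIRECT
        self.indirect = None  # list of pointers, allocated on first use

    def set_block(self, logical, physical):
        if logical < NUM_DIRECT:
            self.direct[logical] = physical
        elif logical < NUM_DIRECT + PTRS_PER_BLOCK:
            if self.indirect is None:
                self.indirect = [None] * PTRS_PER_BLOCK
            self.indirect[logical - NUM_DIRECT] = physical
        else:
            raise ValueError("file too large for this index")

    def lookup(self, logical):
        """Map a logical block number to a physical one (None if unmapped)."""
        if logical < NUM_DIRECT:
            return self.direct[logical]
        if logical < NUM_DIRECT + PTRS_PER_BLOCK and self.indirect is not None:
            return self.indirect[logical - NUM_DIRECT]
        return None

node = IndexNode()
node.set_block(0, 17)   # reached through a direct pointer
node.set_block(5, 42)   # reached through the indirect block
print(node.lookup(0), node.lookup(5), node.lookup(2))
```

Unlike linked allocation, any logical block is reached in a bounded number of index lookups, regardless of its position in the file.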

Adding Permissions and Features

  • I-nodes (index nodes) hold information per file: user ID, last accessed, modified dates, protection bits, type, size
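
On a POSIX system, most of these per-file fields can be inspected with `os.stat`, and `stat.filemode` renders the protection bits in the familiar `rwxrwxrwx` form. The sketch below assumes a POSIX platform; the helper `describe`, the file name `f.txt`, and the mode `0o640` are illustrative.

```python
import os
import stat
import tempfile

def describe(path):
    """Collect the inode-style attributes that os.stat exposes for a file."""
    st = os.stat(path)
    return {
        "inode": st.st_ino,
        "uid": st.st_uid,
        "size": st.st_size,
        "accessed": st.st_atime,
        "modified": st.st_mtime,
        "mode": stat.filemode(st.st_mode),       # type plus protection bits
        "is_regular": stat.S_ISREG(st.st_mode),
    }

with tempfile.TemporaryDirectory() as d:
    p = os.path.join(d, "f.txt")  # hypothetical file
    with open(p, "wb") as f:
        f.write(b"12345")
    os.chmod(p, 0o640)            # rw- for owner, r-- for group, none for others
    info = describe(p)

print(info["size"], info["mode"])
print(stat.filemode(stat.S_IFREG | 0o777))  # -rwxrwxrwx
```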

Free Space Management Strategies

  • Bit Vectors: Simple approach using bits for block availability

  • Linked Lists: Chains all free blocks in a list for management

  • Challenges: Managing metadata and efficiency concerns
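
The bit-vector approach can be sketched in a few lines: one bit per block, packed eight to a byte. The class name `BitVector` and the convention that 0 means free are assumptions for this example.

```python
class BitVector:
    """Free-space map with one bit per block (0 = free, 1 = in use)."""

    def __init__(self, num_blocks):
        self.n = num_blocks
        self.bits = bytearray((num_blocks + 7) // 8)

    def is_free(self, b):
        return ((self.bits[b // 8] >> (b % 8)) & 1) == 0

    def allocate(self):
        """Mark the lowest-numbered free block as used and return it."""
        for b in range(self.n):
            if self.is_free(b):
                self.bits[b // 8] |= 1 << (b % 8)
                return b
        return None  # disk full

    def release(self, b):
        self.bits[b // 8] &= ~(1 << (b % 8)) & 0xFF

bv = BitVector(10)
first = [bv.allocate() for _ in range(3)]  # takes blocks 0, 1, 2
bv.release(1)
reused = bv.allocate()                     # block 1 is handed out again
print(first, reused, len(bv.bits))
```

The map for 10 blocks fits in 2 bytes, which illustrates why bit vectors are compact; the linear scan in `allocate` is the efficiency concern noted above.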

Advanced Concepts

Sparse Files

  • Files with "holes" allowing efficient storage of data

  • Challenging to implement with sequential (contiguous) or linked allocation
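
A block map shows why an index-style layout handles holes naturally: only blocks that were actually written get storage, and unmapped ranges read back as zeros. This is a toy model with an assumed `BLOCK_SIZE` of 4 bytes and a hypothetical `SparseFile` class, not a real on-disk format.

```python
BLOCK_SIZE = 4  # assumption: tiny blocks keep the example small

class SparseFile:
    """File whose storage is a map from logical block number to data."""

    def __init__(self):
        self.blocks = {}  # missing key = hole, no storage consumed
        self.size = 0

    def write(self, offset, data):
        for i, byte in enumerate(data):
            pos = offset + i
            blk = self.blocks.setdefault(pos // BLOCK_SIZE, bytearray(BLOCK_SIZE))
            blk[pos % BLOCK_SIZE] = byte
        self.size = max(self.size, offset + len(data))

    def read(self, offset, length):
        out = bytearray()
        for pos in range(offset, min(offset + length, self.size)):
            blk = self.blocks.get(pos // BLOCK_SIZE)
            out.append(blk[pos % BLOCK_SIZE] if blk is not None else 0)
        return bytes(out)

f = SparseFile()
f.write(0, b"ab")
f.write(1000, b"z")   # leaves a hole between offsets 2 and 999
print(f.size, len(f.blocks), f.read(500, 3), f.read(1000, 1))
```

With contiguous or linked allocation, every block between offset 2 and offset 1000 would have to exist on disk, which is the difficulty the bullet above refers to.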

File Blocks

  • Size isn't uniform (e.g., Linux, APFS)

  • Trade-offs between performance, efficiency, storage capacity

File Extents

  • Allocate files as runs of consecutive blocks (extents) for efficiency

  • Employed by modern file systems like EXT4, APFS
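
An extent can be modeled as a (logical start, physical start, length) record, so a long run of consecutive blocks needs one entry instead of one pointer per block. The field names and the `lookup` helper below are assumptions for illustration and do not reflect the actual EXT4 or APFS on-disk structures.

```python
from typing import NamedTuple, Optional, Sequence

class Extent(NamedTuple):
    logical_start: int   # first logical block covered by this extent
    physical_start: int  # where that block lives on disk
    length: int          # number of consecutive blocks

def lookup(extents: Sequence[Extent], logical: int) -> Optional[int]:
    """Translate a logical block number to a physical block number."""
    for e in extents:
        if e.logical_start <= logical < e.logical_start + e.length:
            return e.physical_start + (logical - e.logical_start)
    return None  # not mapped

# Twelve blocks of file data described by just two records.
extents = [Extent(0, 100, 8), Extent(8, 500, 4)]
print(lookup(extents, 3), lookup(extents, 10), lookup(extents, 12))
```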

SSD Optimizations

  • Utilize low latency features of SSDs

  • Use compression to minimize data writes

  • Implement TRIM commands for enhanced performance

Naming Structures and Hierarchies

  • File names mapped to inodes in a directed acyclic graph

  • Special names: '/' denotes the root and separates path components; hard and soft links are supported

Hard Links and Soft Links

  • Hard links let several names refer to the same inode

  • Soft (symbolic) links store a path name that points to another file

  • Care must be taken to avoid cycles in naming structures
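
The difference between the two link types can be observed directly on a POSIX system with `os.link` and `os.symlink`. The sketch below assumes a platform and file system that permit both calls; the file names and the helper `link_demo` are illustrative.

```python
import os
import tempfile

def link_demo():
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "target.txt")
        hard = os.path.join(d, "hard.txt")
        soft = os.path.join(d, "soft.txt")
        with open(target, "w") as f:
            f.write("data")
        os.link(target, hard)      # second name for the same inode
        os.symlink(target, soft)   # new file whose content is a path

        same_inode = os.stat(target).st_ino == os.stat(hard).st_ino
        link_count = os.stat(target).st_nlink
        soft_is_link = os.path.islink(soft)

        os.remove(target)          # drops one name; the inode survives
        with open(hard) as f:
            hard_still_readable = f.read() == "data"
        soft_dangling = not os.path.exists(soft)  # exists() follows the link
        return (same_inode, link_count, soft_is_link,
                hard_still_readable, soft_dangling)

print(link_demo())
```

After the original name is removed, the hard link still reaches the data because the inode's link count is nonzero, while the soft link is left pointing at a name that no longer exists.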

Data Integrity and Reliability Issues

  • A single file update takes several sequential writes; an interruption between them leaves the file system inconsistent

  • Power failures can interrupt operations midway and cause corruption

Logging and Recovery Techniques

General Logging Process

  • Log operations sequentially to track actions taken in the file system

  • Write a "commit" record after an operation's log entries, so that recovery can tell complete operations from partial ones
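
The logging steps above can be sketched with an in-memory list standing in for the append-only on-disk log. The record shapes (`begin`, `update`, `commit`), the class name `Journal`, and the block labels are assumptions for this example, not a specific file system's journal format.

```python
class Journal:
    """Toy write-ahead log: records are appended strictly in order."""

    def __init__(self):
        self.records = []  # stands in for an append-only region on disk

    def log_operation(self, txid, updates):
        """Append one record per block update, then a commit record."""
        self.records.append(("begin", txid))
        for block, value in updates.items():
            self.records.append(("update", txid, block, value))
        # The commit record goes last: its presence means every update
        # belonging to this operation is already in the log.
        self.records.append(("commit", txid))

journal = Journal()
journal.log_operation(1, {"inode 7": "size=4096", "bitmap": "block 12 used"})
for rec in journal.records:
    print(rec)
```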

Logging for SSDs

  • Need for compatibility with existing systems

  • Duplicated writes (log plus final location) need to be managed to avoid write amplification

Failure Recovery Procedures

  • Check logs for current states

  • Confirm discrepancies and correct using logged operations
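
A redo-style recovery pass is one way to carry out these steps: scan the log, find the operations that reached their commit record, and reapply only those. The log layout and the names `recover`, `log`, and `disk` below are assumptions for illustration.

```python
def recover(log, disk):
    """Replay updates from committed operations; ignore uncommitted ones."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    for rec in log:
        if rec[0] == "update" and rec[1] in committed:
            _, _, block, value = rec
            disk[block] = value  # redo is idempotent, so replaying twice is safe
    return disk

log = [
    ("begin", 1),
    ("update", 1, "inode7", "size=2"),
    ("update", 1, "bitmap", "b12 used"),
    ("commit", 1),
    ("begin", 2),
    ("update", 2, "inode9", "size=5"),  # crash happened before the commit
]
# State found on disk after the crash: operation 1 was only partly applied.
disk = {"inode7": "size=2", "bitmap": "b12 free", "inode9": "size=0"}
recover(log, disk)
print(disk)
```

Operation 1 is completed from the log, while operation 2 is discarded because it never committed, so the disk ends in a consistent state.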

Implementation Considerations

  • Resource management for logs is critical

  • Balancing write durability versus read performance

  • Measuring impacts on throughput and accuracy with logging strategies
