Module 8 - File Systems
Review: Mass Storage
Disk drives are the major secondary-storage I/O devices on most computers.
Disk scheduling algorithms can improve the effective bandwidth, the average response time, and the variance in response time.
Disks are frequently made redundant via RAID algorithms for the amount of storage required on large systems.
Tertiary storage is built from disk and tape drives that uses removable media.
Chapter 13: File-System Interface
File Concept
Access Methods
Directory Structure
File System Mounting
File Sharing
Protection
File Concept
Contiguous logical address space abstracts from the physical properties of its storage devices.
Types:
Data
Numeric
Character
Binary
Program
Building a File System
File System: Layer of OS that transforms blocks of disks (or other block devices) into Files, Directories, etc.
File System Components
Disk Management: collecting disk blocks into files
Naming: Interface to find files by name, not by blocks
Protection: Layers to keep data secure
Reliability/Durability: Keeping of files durable despite crashes, media failures, attacks, etc.
User vs. System View of a File
User’s view:
Durable Data Structures
System’s view (system call interface):
Collection of Bytes
System’s view (inside OS):
Collection of blocks (a block is a logical transfer unit, while a sector is the physical transfer unit)
/
/
Translating from User to System View
What happens if the user says: give me bytes 2—12?
Fetch block corresponding to those bytes
Return just the correct portion of the block
What about: write bytes 2—12?
Fetch block => Modify portion => Write out Block
Everything inside File System is in whole size blocks
For example, getc(), putc() buffers something like 4096 bytes, even if the interface is one byte at a time
From now on, a file is a collection of blocks in the File System.
File Attributes
Name – only information kept in human-readable form.
Type – needed for systems that support different types.
Location – pointer to file location on device.
Size – current file size.
Protection – controls who can do reading, writing, executing.
Time, date, and user identification – data for protection, security, and usage monitoring.
Information about files is kept in the directory structure, which is maintained on the disk.
File Types - Name, Extension
executable
usual extension: exe, com, bin or none
function: read to run machine-language program
object
usual extension: obj, o
compiled, machine language, not linked
source code
usual extension: c, cc, java, pas, asm, a
source code in various languages
batch
usual extension: bat, sh
commands to the command interpreter
text
usual extension: txt, doc
textual data, documents
word processor
usual extension: wp, tex, rrf, doc
various word-processor formats
library
usual extension: lib, a, so, dll
libraries of routines for programmers
print or view
usual extension: mpeg, mov, rm
ASCII or binary file in a format for printing or viewing
archive
usual extension: arc, zip, tar
related files grouped into one file, sometimes compressed, for archiving or storage
multimedia
usual extension: mpeg, mov, rm
binary file containing audio or A/V information
Designing the File System: Access Patterns
How do users access files?
Need to know the type of access patterns the user is likely to throw at the system.
Sequential Access: bytes read in order (“give me the next X bytes, then give me next, etc.”)
Almost all file access are of this flavor.
Random Access (Direct): read/write element out of the middle of the array (“give me bytes i—j”)
Less frequent, but still important. For example, virtual memory backing file: page of memory stored in file.
Want this to be fast – don’t want to have to read all bytes to get to the middle of the file.
Direct Access
sequential access
reset: cp = 0;
read next: read cp; cp = cp+1;
write next: write cp; cp = cp+1;
cp is a variable that defines the current position
Designing the File System: Access Patterns
Content-based Access: (“find me 100 bytes starting with KUBI”)
Example: employee records – once you find the bytes, increase my salary by a factor of 2
Many systems don’t provide this; instead, databases are built on top of disk access to index content (requires efficient random access)
Directory Structure
A collection of nodes containing information about all files.
Both the directory structure and the files reside on disk.
Backups of these two structures are kept on tapes.
Directory Structure (Cont.)
Disks are split into one or more partitions.
Each partition contains information about the files within it. This information is kept in entries in a device directory or volume table of contents.
record name, location, size, and type
Organize the Directory (Logically) to Obtain
Efficiency – locating a file quickly.
Naming – convenient to users.
Two users can have the same name for different files.
The same file can have several different names.
Grouping – logical grouping of files by properties, (e.g., all Java programs, all games, …)
Single-Level Directory
A single directory for all users.
Naming problem: unique file name for all users.
Grouping problem: hard to group for a user, and keeping track of hundreds of files is a big problem.
Two-Level Directory
Separate directory for each user.
Path name: a user name and a file name
Can have the same file name for different users
Efficient searching
No grouping capability
Tree-Structured Directories
Efficient searching
Grouping Capability
Current directory (working directory)
cd /spell/mail/prog
type list
Microsoft Windows family of OS maintains an extended two-level directory structure, with devices and volumes assigned drive letters.
Acyclic-Graph Directories
Have shared subdirectories and files.
Acyclic-Graph Directories (Cont.)
A tree structure prohibits the sharing of files or directories
An acyclic graph allows directories to have shared subdirectories and files.
An acyclic graph, that is, a graph with no cycles, is a natural generalization of the tree-structured directory scheme.
Implementation:
(UNIX) to create a new directory entry called a link, a pointer to another file or subdirectory.
Duplicate all information in both sharing directories.
Acyclic-Graph Directories (Cont.)
Two different names (aliasing) may refer to the same file.
If dict deletes list dangling pointer. Solutions:
If a system where sharing is implemented by a symbolic link
The deletion of a link does not need to affect the original file, only the link is removed
If a file entry is deleted, the space for the files is deallocated, leaving the links dangling.
Preserve the file until all references to it are deleted.
General Graph Directory
How do we guarantee no cycles?
Allow only links to file not subdirectories.
Garbage collection: determine when the last reference has been deleted and the disk space can be reallocated.
Every time a new link is added use a cycle detection algorithm to determine whether it is OK.
File System Mounting
A file system must be mounted before it can be accessed.
The operating system is given the name of the device, and the location within the file structure at which to attach the file system (or mount point). Typically, a mount point is an empty directory at which the mounted file system will be attached.
The operating system verifies that the device contains a valid file system.
The operating system notes in its directory structure that a file system is mounted at the specified mount point.
File Sharing
Sharing of files on multi-user systems is desirable.
Sharing may be done through a protection scheme.
owner and group
On distributed systems, files may be shared across a network.
Network File System (NFS) is a common distributed file-sharing method.
Protection
Safe from physical damage (reliability)
Solution: duplicate copies of files.
Improper access (protection)
Solutions: ability to access files
File owner/creator should be able to control:
what can be done
by whom
Types of access
Read
Write
Execute
Append
Delete
List
Access Lists and Groups
Mode of access: read, write, execute
Three classes of users
RWX a) owner access 7 1 1 1
RWX b) group access 6 1 1 0
RWX c) public access 1 0 0 1
owner group public
chmod 761 game
Attach a group to a file
chgrp G game
Chapter 14: File System Implementation
File System Structure
File System Implementation
Directory Implementation
Allocation Methods
Free-Space Management
Efficiency and Performance
Recovery
Log-Structured File Systems
File-System Structure
File structure
Logical storage unit
Collection of related information
File system resides on secondary storage (disks).
Files can be rewritten in place.
Files can be accessed directly any given block of information on the disk.
File control block – storage structure consisting of information about a file
A Typical File Control Block
file permissions
file dates (create, access, write)
file owner, group, ACL
file size
file data blocks
File control block - storage structure consisting of information about a file.
On-disk File System Structures
A boot control block: contains information needed by the system to book an operating system from that partition.
A partition control block: contains partition details, such as the number of blocks in the partition, the size of the blocks, free-block count and free block pointers, and free FCB count and FCB pointer.
A directory structure is used to organize the files
An FCB contains many of the file’s details
In-Memory File System Structures
An in-memory partition table containing information about each mounted partition.
An in-memory directory structure that holds the directory information of recently accessed directories
The system-wide open-file table contains a copy of the FCB of each open file
The per-process open-file table contains a pointer to the appropriate entry in the system-wide open-file table.
In-Memory File System Structures
Open system call:
Resolves file name, finds file control block (inode)
Makes entries in per-process and system-wide tables
Returns index (called “file handle”) in open-file table
Read/write system calls:
Use file handle to locate inode
Perform appropriate reads or writes
Virtual File Systems
Virtual File Systems (VFS) provide an object-oriented way of implementing file systems.
VFS allows the same system call interface (the API) to be used for different types of file systems.
The API is to the VFS interface, rather than any specific type of file system.
Directory Implementation
Linear list of file names with a pointer to the data blocks.
simple to program
time-consuming to execute
Hash Table – linear list with hash data structure.
decreases directory search time
collisions – situations where two file names hash to the same location
fixed size and the dependence of the hash function on that size
Allocation Methods
An allocation method refers to how disk blocks are allocated for files:
Contiguous allocation
Linked allocation
Indexed allocation
Contiguous Allocation
Each file occupies a set of contiguous blocks on the disk.
Simple – only starting location (block #) and length (number of blocks) are required.
Both sequential and direct access can be supported by contiguous allocation.
IBM VM/CMS operating system uses contiguous allocation because of good performance.
Contiguous Allocation (Cont.)
Problems:
Find space for a new file.
Wasteful of space (dynamic storage-allocation problem).
Files cannot grow.
Linked Allocation
Each file is a linked list of disk blocks: blocks may be scattered anywhere on the disk.
The directory contains a pointer to the first and last blocks of the file.
Simple – need only starting address
Free-space management system – no waste of space
Problems
No random access
Linked allocation is the space required for the pointers. If a pointer requires 4 bytes out of a 512-byte block, then 0.78 percent of the disk is being used for pointers, rather than for information.
Reliability, what if a pointer were lost or damaged.
pointer block \frac{4}{512} =
Linked Allocation (Cont.)
File-allocation table (FAT) – An important variation on the linked allocation method
FAT is used by MS-DOS and OS/2.
The directory entry contains the block number of the first block of the file, and the table entry indexed by that block number then contains the block number of the next block in the file.
Indexed Allocation
Linked allocation solves the external-fragmentation and size-declaration problems of contiguous allocation.
However, linked allocation (FAT) cannot support efficient direct access
Brings all pointers together into the index block.
Pros: Can easily grow up to space allocated for index and Random access is fast
Cons: Clumsy to grow file bigger than table size, still lots of seeks: blocks may be spread over disk index table
Indexed Allocation (Cont.)
Need index table
Random access
Dynamic access without external fragmentation, but have the overhead of the index block.
Mapping from logical to physical in a file of maximum size of 256K words and a block size of 512 words. We need only 1 block for the index table.
Indexed Allocation – Mapping
Mapping from logical to physical in a file of unbounded length (block size of 512 words).
Question: how large the index block should be?
as small as possible
Big to hold enough pointers
Schemes
Linked scheme – Link blocks of index table (no limit on size), the last word in the index block could be NIL (for a small file) or is a pointer to another index block (for a large file).
Indexed Allocation – Mapping (Cont.)
Multilevel index – A variant of the linked representation is to use a first-level index block to point to a set of second-level index blocks, which in turn point to the file blocks. This approach could be continued to a third or fourth level, depending on the desired maximum file size.
outer-index index table file
Indexed Allocation – Mapping (Cont.)
Combined scheme: Used in Unix 4.1 BSD, is to keep the first, say, 13 pointers of the index block in the file’s inode (header)
The first 10 of these pointers point to direct blocks. If the block size is 4KB, up to 48 KB of data may be accessed directly.
The next 3 pointers point to indirect blocks
The first is the address of a single indirect blocks, the single indirect block is an index block, containing not data, but rather the addresses of blocks that do contain data
A double indirect block pointer
A triple indirect block
Performance
Criteria
Storage efficiency
Data-block access times
Some systems support direct-access files by using contiguous allocation and sequential access by linked allocation.
Indexed allocation is more complex. Thus, the performance of indexed allocation depends on the index structure, on the size of the file, and on the position of the block desired.
Free-Space Management
Free space list: record all free disk blocks, those not allocated to some file or directory.
Easy to get contiguous files
Bit vector: each block is represented by 1 bit.
… 0 1 2 n-1
bit[i] =
\begin{cases}
1 & \text{if block[i] is free} \
0 & \text{if block[i] is occupied}
\end{cases}Simplicity and efficiency in finding the first free block or n consecutive free blocks on the disk
Free-Space Management (Cont.)
Bit vectors are inefficient unless the entire vector is kept in main memory
Linked list (free list)
Keeping a pointer to the first free block in a special location on the disk and caching it in memory
No waste of space
Not efficient
Cannot get contiguous space easily
Free-Space Management (Cont.)
Grouping
Store the addresses of n free blocks in the first free block
The first n-1 of these blocks are actually free
The last block contains the addresses of another n free blocks
Counting
Keep the address of the first free block and the number n of free contiguous blocks that follow the first block.
Efficiency and Performance
Efficiency dependent on:
disk allocation and directory algorithms
types of data kept in the file’s directory entry
Performance
disk cache – separate section of main memory for frequently used blocks
free-behind and read-ahead – techniques to optimize sequential access
improve PC performance by dedicating section of memory as virtual disk, or RAM disk.
Various Disk-Caching Locations
The difference between a RAM disk and a disk cache is that the contents of the RAM disk are totally user-controlled, whereas those of the disk cache are under the control of the operating system.
Page Cache
A page cache caches pages rather than file-system-oriented blocks using virtual memory techniques.
Memory-mapped I/O uses a page cache.
Routine I/O through the file system uses the buffer (disk) cache.
This leads to the following figure.
Unified Buffer Cache
A unified buffer cache uses the same page cache to cache both memory-mapped pages and ordinary file system I/O.
Recovery
Consistency checking – compares data in the directory structure with data blocks on disk and tries to fix inconsistencies.
Use system programs to back up data from disk to another storage device (floppy disk, magnetic tape).
Recover lost file or disk by restoring data from backup.
Log Structured File Systems
Log structured (or journaling) file systems record each update to the file system as a transaction.
All transactions are written to a log. A transaction is considered committed once it is written to the log. However, the file system may not yet be updated.
The transactions in the log are asynchronously written to the file system. When the file system is modified, the transaction is removed from the log.
If the file system crashes, all remaining transactions in the log must still be performed.
Summary – ch. 13
File & File System
Access pattern
Sequential, direct, or content
Directory structure
One level, two levels, tree, acyclic graph, general graph
File mounting, sharing, protection
Summary – ch. 14
File system structure
FCB, on disk, in-memory, virtual file structure
Directory allocation
Contiguous, linked, index
Free space management
Bit vector, linked list, grouping, counting
Page cache
Recovery