What is a file?
It's a linear array of bytes, each of which you can read or write.
What are the types of names that a file has?
- inode
- path
- file descriptor
What is an inode?
The low-level name of a file, usually just a number. The user is normally not aware of this number. Inode numbers are unique within a file system but may be recycled after a file is deleted.
What is a directory?
A file that also has an inode
What is the content of a directory?
A list of (name, inode number) pairs, one for each file or subdirectory it contains
How do you create a file in C?
By calling open() with the O_CREAT flag
What does open() return?
It returns a file descriptor
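A minimal sketch of creating a file this way, assuming a POSIX system. The helper name create_file and the choice of flags besides O_CREAT are assumptions made for illustration.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create (or truncate) a file and return its file descriptor, or -1 on error.
   create_file is a hypothetical helper name, not part of any API. */
int create_file(const char *path)
{
    /* O_CREAT:  create the file if it does not exist
       O_WRONLY: open it for writing only
       O_TRUNC:  if it already exists, cut it to zero bytes
       S_IRUSR | S_IWUSR: the owner may read and write the new file */
    return open(path, O_CREAT | O_WRONLY | O_TRUNC, S_IRUSR | S_IWUSR);
}
```

The returned integer is what later calls such as write() and close() take as their first argument.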
What is a file descriptor?
It's an integer, private to each process, that is used to access the file. Holding a file descriptor and passing it to the read/write functions shows the OS that you are allowed to perform those operations. Therefore we can think of it as a capability.
Who manages file descriptors?
The OS, but a file descriptor created in a process is stored in the PCB of the process.
What files are always open in a process?
- Standard input
- Standard output
- Standard error
What is the standard input file used for?
The process can read it to receive input
What is the standard output file used for?
The process can write to it in order to dump information to the screen
What is the standard error file used for?
The process can write errors to it
Why are file descriptors a better way to interact with files than a path or an inode?
Because the file system only has to be traversed once, when the file is opened. The file descriptor can then be reused until it is closed.
What are some functions that the file API includes?
- open()
- dup()
- read()
- write()
- close()
What does dup() do?
It allows a process to create a new file descriptor that refers to the same underlying open file as an existing descriptor (basically duplicating the file descriptor, but the duplicate gets a new descriptor number)
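A small sketch, assuming a POSIX system, of what "same underlying open file" means in practice: a write through the original descriptor moves the offset that the duplicate sees. The helper name offset_seen_by_dup is hypothetical.

```c
#include <fcntl.h>
#include <unistd.h>

/* Write n bytes through fd, then report the current offset as seen
   through a dup()'d descriptor. Returns -1 on any error. */
off_t offset_seen_by_dup(const char *path, const char *data, size_t n)
{
    int fd = open(path, O_CREAT | O_RDWR | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    int fd2 = dup(fd);              /* new number, same open file */
    if (fd2 < 0) {
        close(fd);
        return -1;
    }

    off_t result = -1;
    if (write(fd, data, n) == (ssize_t)n)
        result = lseek(fd2, 0, SEEK_CUR);   /* query offset via the duplicate */

    close(fd);
    close(fd2);
    return result;
}
```

After writing 5 bytes through the original descriptor, the duplicate reports an offset of 5, because both descriptors share one offset.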
What does close() do?
It releases a file descriptor, so that it no longer refers to the open file and its number can be reused
How can you modify the offset that is currently being kept track of by a file descriptor?
By using lseek()
What are the possible values of the whence argument in lseek()?
- SEEK_SET
- SEEK_CUR
- SEEK_END
How is the offset of a file descriptor modified when using lseek() with the SEEK_SET whence argument?
The offset is set to the offset passed to lseek()
How is the offset of a file descriptor modified when using lseek() with the SEEK_CUR whence argument?
The offset is set to the offset in the file descriptor + the offset passed to lseek()
How is the offset of a file descriptor modified when using lseek() with the SEEK_END whence argument?
The offset is set to the size of the file + the offset passed to lseek()
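The three whence values can be tried out with a sketch like the following, assuming a POSIX system. The helper name lseek_demo and the 10-byte test content are made up for illustration.

```c
#include <fcntl.h>
#include <unistd.h>

/* Create a 10-byte file and record the offset returned by lseek()
   for each whence value. Returns 0 on success, -1 on error. */
int lseek_demo(const char *path, off_t out[3])
{
    int fd = open(path, O_CREAT | O_RDWR | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (write(fd, "0123456789", 10) != 10) {
        close(fd);
        return -1;
    }

    out[0] = lseek(fd, 4, SEEK_SET);   /* absolute:            4      */
    out[1] = lseek(fd, 3, SEEK_CUR);   /* current + 3:         4 + 3  */
    out[2] = lseek(fd, -2, SEEK_END);  /* file size + (-2):    10 - 2 */

    close(fd);
    return 0;
}
```

With a 10-byte file the three calls return 4, 7 and 8. Note that the offset passed with SEEK_CUR or SEEK_END may be negative.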
What entries does a directory always have?
One that refers to itself (".", dot) and one that refers to its parent ("..", dot dot)
How is a file deleted for each of the different name levels of a file?
- inode (and associated file) is garbage collected when there are no references
- Paths are deleted when unlink() is called
- File descriptors are deleted when close() is called or when the process quits
How does a rename of a file work?
Since we only update the path, unlink() the previous path and link() the new path
What is important about the way rename() works?
It is implemented as an atomic call. Therefore, if a crash happens during the rename, the file has either its old name or its new name, never something in between.
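One common way to make use of this atomicity is to write a new version of a file under a temporary name and then rename() it over the old one. The sketch below assumes a POSIX system; the helper name atomic_replace and the idea of passing the temporary path explicitly are illustrative choices.

```c
#include <fcntl.h>
#include <stdio.h>      /* rename() */
#include <string.h>
#include <unistd.h>

/* Replace the contents of path with text by writing tmp_path first
   and then renaming it into place. Returns 0 on success, -1 on error. */
int atomic_replace(const char *path, const char *tmp_path, const char *text)
{
    int fd = open(tmp_path, O_CREAT | O_WRONLY | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    size_t len = strlen(text);
    /* Write the new contents and force them to disk before renaming. */
    if (write(fd, text, len) != (ssize_t)len || fsync(fd) != 0) {
        close(fd);
        unlink(tmp_path);
        return -1;
    }
    if (close(fd) != 0) {
        unlink(tmp_path);
        return -1;
    }

    /* Atomic step: path now names either the old file or the new one. */
    return rename(tmp_path, path);
}
```

A reader that opens path at any moment sees either the complete old contents or the complete new contents.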
What is the file system?
It refers to the collection of files and describes how files are mapped onto physical devices. It also refers to the part of the OS that manages those files.
What is the file system divided into?
Blocks (simple file systems use a single fixed block size)
What is a typical size for a block in a file system for a simple OS?
4KB
What regions do we have in a file system?
- Data region (user data)
- inode table
- data bitmap
- inode bitmap
- superblock
Which region in a file system is usually the largest?
The data region (user data)
What is a common size of an inode?
256 bytes
Let's say we have room for 80 inodes in our inode table. How many files can the file system support?
80 files
How does an inode bitmap and a data bitmap work?
It's a simple structure where each bit indicates whether the corresponding object/block is free (0) or in use (1)
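A minimal sketch of such a bitmap, stored as an array of bytes. The function names and the choice of putting bit 0 in the lowest bit of byte 0 are assumptions for illustration; a real file system may order the bits differently.

```c
#include <stdint.h>

/* Mark object i as in use (bit = 1). */
void bitmap_set(uint8_t *bm, unsigned i)
{
    bm[i / 8] |= (uint8_t)(1u << (i % 8));
}

/* Mark object i as free (bit = 0). */
void bitmap_clear(uint8_t *bm, unsigned i)
{
    bm[i / 8] &= (uint8_t)~(1u << (i % 8));
}

/* Return 1 if object i is in use, 0 if it is free. */
int bitmap_test(const uint8_t *bm, unsigned i)
{
    return (bm[i / 8] >> (i % 8)) & 1;
}
```

With this layout, 80 inodes need only 10 bytes of bitmap.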
What does the superblock contain?
It contains information about the current file system like how many inodes and data blocks there are in the file system, where the inode table begins and so forth.
What is an inode implicitly referred to by?
A number called the i-number (the low-level name)
What is special about the i-number of an inode?
Given the number you should be directly able to calculate where on disk the corresponding inode is located.
How and by what is a disk addressable?
By a sector number (not byte addressable like memory!)
Given a sector size of 512 bytes on disk, and the byte address 20KB, how do you calculate the sector number where this address lies on disk?
(20 * 1024) / 512 = 40
What are the two formulas used to calculate the sector address of an inode?
blk = ( i-number * sizeof(inode) ) / blockSize
sector = ( ( blk * blockSize ) + iNodeStartAddr) / sectorSize
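The two formulas, together with the plain byte-address-to-sector division, can be written as small C functions. The function and parameter names are illustrative, and the example values in the usage note (i-number 32, an inode table starting at 12 KB) are assumptions, not values taken from these cards.

```c
#include <stdint.h>

/* Sector that contains a given byte address. */
uint64_t byte_to_sector(uint64_t byte_addr, uint64_t sector_size)
{
    return byte_addr / sector_size;
}

/* Sector that contains the start of the block holding inode `inumber`. */
uint64_t inode_sector(uint64_t inumber, uint64_t inode_size,
                      uint64_t block_size, uint64_t inode_start_addr,
                      uint64_t sector_size)
{
    /* Which block of the inode table holds this inode (integer division). */
    uint64_t blk = (inumber * inode_size) / block_size;
    /* Byte address of that block on disk, converted to a sector number. */
    return ((blk * block_size) + inode_start_addr) / sector_size;
}
```

For example, with 256-byte inodes, 4 KB blocks, 512-byte sectors and an inode table starting at byte 12 KB, inode 32 gives blk = 2 and sector = (8192 + 12288) / 512 = 40, the same sector as byte address 20 KB.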
What information does an inode contain?
- type (regular file, directory etc)
- size
- number of blocks allocated to it
- protection information (who owns the file)
- who can access it
and more...
What do we refer to the kind of information that is in an inode as?
Metadata
How does an inode refer to where the data blocks that belong to that file are?
By using pointers to where they are stored on disk
What is a problem with using pointers to refer to where data blocks are stored?
If a file grows huge, the number of pointers it needs can exceed the number of pointers an inode can hold.
What is a solution to the problem of having too many pointers to blocks in an inode?
Use all but the last of the pointers as direct pointers (each points to where on disk a data block is) and let the last one act as an indirect pointer (it points to a block in the data region that holds more pointers)
Given that a system uses 4KB blocks and 4-byte disk addresses, how many pointers can an entire block hold when it acts as an indirect pointer?
4096 / 4 = 1024 pointers
We are calculating the maximum file size that can be reached through direct and indirect pointers. What does each of the numbers in this formula describe?
(12 + 1024 + 1024^2) × 4 KB
- 12 direct pointers
- 1 indirect pointer that leads to a block of 1024 pointers
- 1 double indirect pointer that leads to a block of 1024 indirect pointers, each of which leads to another 1024 pointers
- the total number of pointers is multiplied by the block size of the system (4 KB)
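The same calculation as a short C function, as a sketch. The function name is made up, and it assumes exactly one indirect and one double indirect pointer, as in the formula above.

```c
#include <stdint.h>

/* Maximum file size in bytes for an inode with `direct` direct pointers,
   one indirect pointer and one double indirect pointer. */
uint64_t max_file_size(uint64_t direct, uint64_t block_size, uint64_t addr_size)
{
    uint64_t ptrs_per_block = block_size / addr_size;      /* 4096 / 4 = 1024 */
    uint64_t blocks = direct                               /* 12              */
                    + ptrs_per_block                       /* 1024            */
                    + ptrs_per_block * ptrs_per_block;     /* 1024^2          */
    return blocks * block_size;
}
```

With 12 direct pointers, 4 KB blocks and 4-byte addresses this gives 1,049,612 blocks, or 4,299,210,752 bytes, which is just over 4 GB.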
Why do we only keep a small number of direct pointers on an inode?
Because most files are small so we optimize for this case
Where do we find an unallocated inode to use for a new file and what do we have to do when we allocate it?
We find it in the inode bitmap. When allocating an inode we have to change its bit to 1 in the bitmap to indicate that it is used.
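A sketch of that allocation step: scan the inode bitmap for the first 0 bit, flip it to 1, and return its index as the new i-number. The function name and the first-fit search are assumptions for illustration; real file systems may pick a free inode differently.

```c
#include <stdint.h>

/* Find the first free inode, mark it as used and return its index.
   Returns -1 if every inode is already in use. */
int alloc_inode(uint8_t *bitmap, int num_inodes)
{
    for (int i = 0; i < num_inodes; i++) {
        uint8_t mask = (uint8_t)(1u << (i % 8));
        if ((bitmap[i / 8] & mask) == 0) {   /* bit is 0: inode is free */
            bitmap[i / 8] |= mask;           /* set it to 1: now in use */
            return i;
        }
    }
    return -1;
}
```

On disk, the changed bitmap block also has to be written back, which is one of the I/Os a file creation generates.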
What does a traversal in order to open a file in a file system look like?
1) Start at the beginning of the path, in the root directory. Read in the block that contains the inode of the root directory. Its inode number (also known as i-number) is well known (usually hardcoded to 2 in most systems).
2) Once the inode for the current i-number has been read in (2 in the first iteration), use its block pointers to read the directory's contents and find the entry that matches the next part of the path. That entry gives the i-number of the next component.
3) Repeat step 2 until you end up with the inode of the file that you wish to open.
4) With that inode read in, open() checks permissions and returns a file descriptor for the file.
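The traversal can be sketched with a toy, in-memory inode table. Everything here is a simplification for illustration: the struct and function names are hypothetical, the table is indexed directly by i-number, and a directory's entries are stored inside its inode instead of in data blocks reached through pointers.

```c
#include <string.h>

#define ROOT_INUM   2   /* well-known i-number of the root directory */
#define MAX_ENTRIES 4   /* toy limit on entries per directory */

struct toy_entry { const char *name; int inum; };
struct toy_inode {
    int is_dir;
    struct toy_entry entries[MAX_ENTRIES];   /* used by directories only */
};

/* Find the entry called name[0..len) in a directory; return its
   i-number, or -1 if there is no such entry. */
static int dir_lookup(const struct toy_inode *dir, const char *name, size_t len)
{
    for (int i = 0; i < MAX_ENTRIES; i++) {
        const char *n = dir->entries[i].name;
        if (n && strlen(n) == len && strncmp(n, name, len) == 0)
            return dir->entries[i].inum;
    }
    return -1;
}

/* Resolve an absolute path to an i-number, or -1 if it does not exist. */
int resolve_path(const struct toy_inode *table, int table_len, const char *path)
{
    if (path[0] != '/')
        return -1;

    int inum = ROOT_INUM;                    /* step 1: start at the root */
    const char *p = path;
    while (*p) {
        while (*p == '/')                    /* skip separators */
            p++;
        if (!*p)
            break;
        size_t len = strcspn(p, "/");        /* length of the next component */
        if (inum >= table_len || !table[inum].is_dir)
            return -1;                       /* cannot descend into a non-directory */
        inum = dir_lookup(&table[inum], p, len);   /* step 2: find the entry */
        if (inum < 0)
            return -1;
        p += len;                            /* step 3: repeat for the rest */
    }
    return inum;
}
```

In a real file system each loop iteration costs at least two disk reads (the inode, then the directory's data), which is why long paths are more expensive to open.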
What five I/Os does a write to a file logically generate?
One read of the data bitmap, one write of the data bitmap, one read and one write of the inode (to add the new block's location) and finally one write of the actual data block itself.
How can systems remedy the huge performance problem that would occur from reading files?
They cache important blocks in system memory (DRAM)
How can systems remedy the huge performance problem that would occur from writing files?
Buffer the writes and then write at a later point. That way multiple writes may first end up in the buffer, and then only one flush needs to happen in order to actually write to disk.
How long do most modern file systems buffer writes before propagating them to disk?
5-30 seconds
What function can be used in order to force buffered writes to disk?
fsync()
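A minimal sketch of using fsync() after a write, assuming a POSIX system. The helper name write_durably is hypothetical.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Write text to path and ask the OS to push it to disk before returning.
   Returns 0 on success, -1 on error. */
int write_durably(const char *path, const char *text)
{
    int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    size_t len = strlen(text);
    /* write() may only reach the in-memory buffer; fsync() blocks until
       the buffered data for this file has been handed to the device. */
    int ok = write(fd, text, len) == (ssize_t)len && fsync(fd) == 0;

    if (close(fd) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

The trade-off is speed: each fsync() gives up the benefit of batching writes in the buffer.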