File I/O and Metadata

Last updated 11:47 PM on 4/12/26

178 Terms

1
New cards

What does the open() system call do?

Opens or creates a file, returning a non-negative integer file descriptor. Returns -1 on error. Header: <fcntl.h>.

2
New cards

What does a file descriptor represent in the UNIX kernel?

A non-negative integer used by the kernel to refer to all open files. Every open file is identified by its file descriptor.

3
New cards

What are the three standard file descriptors and their conventional values?

0 = standard input (STDIN_FILENO), 1 = standard output (STDOUT_FILENO), 2 = standard error (STDERR_FILENO). Defined in <unistd.h>.

4
New cards

Which three open() flags are mutually exclusive and must have exactly one specified?

O_RDONLY (read only), O_WRONLY (write only), O_RDWR (read and write). Historically have values 0, 1, and 2.

5
New cards

What does the O_CREAT flag do when passed to open()?

Creates the file if it doesn't exist. Requires a third mode argument specifying access permission bits.

6
New cards

What does O_EXCL do when combined with O_CREAT in open()?

Causes open() to fail with an error if the file already exists. The test-for-existence and creation are performed as an atomic operation.
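A minimal sketch of exclusive creation follows. The helper name `create_exclusive` and the 0600 mode are assumptions made for illustration; only the `open()` flags come from the card.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: create `path` only if it does not already exist.
 * Returns a write-only descriptor on success. If the file exists,
 * open() fails, returns -1, and sets errno to EEXIST. */
int create_exclusive(const char *path)
{
    return open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
}
```

Because the kernel does the existence test and the creation in one step, two processes racing on the same pathname cannot both succeed.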

7
New cards

What does the O_TRUNC flag do?

If the file exists and is opened for write, its length is truncated to 0.

8
New cards

What does the O_APPEND flag guarantee?

Before each write, the file offset is atomically set to the current end of file, so every write appends to the file. Prevents race conditions in multi-process append scenarios.

9
New cards

What does O_NONBLOCK do?

Sets nonblocking mode for opening and subsequent I/O on a FIFO, block special file, or character special file.

10
New cards

What does O_SYNC do?

Makes each write wait for physical I/O to complete, including updating all file attributes. Every write is synchronous — data and attributes are updated together.

11
New cards

What is the difference between O_DSYNC and O_SYNC?

O_DSYNC waits for physical I/O but only updates attributes needed to read the data just written (e.g., file size). O_SYNC always updates all attributes on every write, even when overwriting existing bytes.

12
New cards

What does the creat() system call do and what is its equivalent using open()?

Creates a new file or truncates an existing one, returning a write-only file descriptor. Equivalent to open(path, O_WRONLY|O_CREAT|O_TRUNC, mode).

13
New cards

What is a key deficiency of creat()?

The file is opened only for writing. To read back a newly created temp file, you'd need creat(), close(), then open(). Better to use open(path, O_RDWR|O_CREAT|O_TRUNC, mode).
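The read-back case can be sketched as below. `write_then_read` is a hypothetical helper, and the 0600 mode is an assumption; the point is that one descriptor opened with O_RDWR serves both the write and the later read.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write `msg` to `path`, rewind, and read it back
 * through the same descriptor. Returns bytes read back, or -1 on error. */
ssize_t write_then_read(const char *path, const char *msg,
                        char *out, size_t outsz)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd == -1)
        return -1;

    size_t len = strlen(msg);
    ssize_t n = -1;
    /* The write leaves the offset at the end, so seek back to 0 first. */
    if (write(fd, msg, len) == (ssize_t)len &&
        lseek(fd, 0, SEEK_SET) != (off_t)-1)
        n = read(fd, out, outsz);

    close(fd);
    return n;
}
```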

14
New cards

What does the close() system call do?

Closes an open file, releases the descriptor, and frees any record locks the process held. Returns 0 on success, -1 on error. All open files are automatically closed when a process terminates.

15
New cards

What is the current file offset and how is it initialized?

A non-negative integer measuring bytes from the beginning of the file. Initialized to 0 when a file is opened, unless O_APPEND is set.

16
New cards

What does lseek() do and what are its three whence values?

Explicitly sets an open file's current offset. SEEK_SET = from beginning; SEEK_CUR = from current position; SEEK_END = from end of file. Returns new offset or -1 on error.

17
New cards

What happens if lseek() is called on a pipe, FIFO, or socket?

Sets errno to ESPIPE and returns -1. These types cannot be seeked.

18
New cards

What is a 'hole' in a file?

When lseek() positions past the end of a file and data is written, bytes in between are never written. They read back as zeros but may not consume disk blocks.
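As a rough illustration, the sketch below creates a hole. `make_file_with_hole` is a hypothetical name; it writes one byte, skips `gap` bytes with lseek(), writes one more byte, and reports the resulting st_size.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: returns the file size after creating a hole of
 * `gap` bytes between two single-byte writes, or -1 on error. */
off_t make_file_with_hole(const char *path, off_t gap)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd == -1)
        return -1;

    struct stat st;
    off_t size = -1;
    if (write(fd, "a", 1) == 1 &&
        lseek(fd, gap, SEEK_CUR) != (off_t)-1 && /* move past end of file */
        write(fd, "z", 1) == 1 &&                /* this write creates the hole */
        fstat(fd, &st) == 0)
        size = st.st_size;

    close(fd);
    return size;
}
```

With a gap of 1000 the reported size is 1002 bytes, and the skipped bytes read back as zeros. Whether they occupy disk blocks depends on the file system.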

19
New cards

What does the read() system call return?

Number of bytes actually read (>0), 0 at end of file, or -1 on error. Signature: ssize_t read(int fd, void *buf, size_t nbytes).

20
New cards

Name three situations where read() returns fewer bytes than requested.

1) EOF reached mid-read on a regular file; 2) Reading from a pipe/network with fewer bytes available; 3) Interrupted by a signal after partial read.
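One common way to cope with short reads is to loop until the requested count arrives. The sketch below is one such loop; `read_fully` is a hypothetical name, and retrying on EINTR is a design choice made here, not something the card prescribes.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: call read() repeatedly until `n` bytes have been
 * read, end of file is reached, or an error occurs.
 * Returns the number of bytes read (possibly fewer than n at EOF), or -1. */
ssize_t read_fully(int fd, void *buf, size_t n)
{
    char *p = buf;
    size_t left = n;

    while (left > 0) {
        ssize_t r = read(fd, p, left);
        if (r == -1) {
            if (errno == EINTR)
                continue;       /* interrupted before any data: retry */
            return -1;
        }
        if (r == 0)
            break;              /* end of file */
        p += r;
        left -= (size_t)r;
    }
    return (ssize_t)(n - left);
}
```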

21
New cards

What does the write() system call return?

Number of bytes written if OK (usually equals nbytes), or -1 on error. Signature: ssize_t write(int fd, const void *buf, size_t nbytes).

22
New cards

What causes a write error in write()?

Filling up a disk or exceeding the file size limit for a process.

23
New cards

What are the five core unbuffered I/O functions in UNIX?

open(), read(), write(), lseek(), and close(). Each invocation is a direct system call.

24
New cards

What is 'unbuffered I/O' and where are these functions defined?

Unbuffered I/O refers to a category of input and output functions where every operation, such as a read or write, invokes a system call in the kernel. Unlike the standard I/O library, which collects data in a user-space buffer before communicating with the kernel, unbuffered I/O operates directly on file descriptors. These functions are defined in POSIX.1 and the Single UNIX Specification, not in ISO C.

25
New cards

What three kernel data structures represent an open file?

1) Process table entry (table of open file descriptors); 2) File table (status flags, current offset, v-node pointer); 3) V-node table (file type info and i-node).

26
New cards

What does each process table entry's file descriptor slot contain?

Two things: (a) the file descriptor flags (e.g., close-on-exec), and (b) a pointer to a file table entry.

27
New cards

What does a file table entry contain?

The file status flags (read, write, append, sync, nonblocking), the current file offset, and a pointer to the v-node table entry.

28
New cards

What is a v-node?

A kernel structure containing information about the file type and pointers to functions that operate on the file. For most files it also contains the i-node. Linux uses a generic i-node instead.

29
New cards

When two processes open the same file, how many file table entries and v-node entries are there?

Each process gets its own file table entry (with its own current offset), but they share a single v-node table entry.

30
New cards

What is an atomic operation?

An operation composed of multiple steps that either all complete or none complete — no process can interrupt mid-way. Critical for correct concurrent file access.

31
New cards

Why is O_APPEND preferred over lseek-then-write for appending in multi-process programs?

Without O_APPEND, two processes can interleave their lseek and write calls, causing data overwrites. O_APPEND makes the seek-to-end and write atomic.
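The sketch below shows the O_APPEND behavior in a single process. `append_record` is a hypothetical helper, and the lseek() to offset 0 is there only to demonstrate that the kernel still moves the offset to end of file before the write.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: append `msg` to `path`. Returns 0 on success, -1 on error. */
int append_record(const char *path, const char *msg)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0600);
    if (fd == -1)
        return -1;

    /* Demonstration only: move the offset to the start. With O_APPEND set,
     * the write below still lands at the current end of file. */
    lseek(fd, 0, SEEK_SET);

    size_t len = strlen(msg);
    int ok = write(fd, msg, len) == (ssize_t)len;
    close(fd);
    return ok ? 0 : -1;
}
```

Calling it twice with "one" and then "two" leaves the file containing "onetwo", not "two".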

32
New cards

What do pread() and pwrite() do?

Read or write at a given offset in a single call, equivalent to lseek() followed by read() or write() except that the two steps cannot be interrupted. The current file offset is not updated, so concurrent users of the same descriptor cannot disturb each other's offsets.
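A small sketch of the "offset is not updated" property follows. `offset_after_pread` is a hypothetical helper that reads one byte at `pos` and then reports the descriptor's current offset.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: read one byte at absolute position `pos` with
 * pread(), then return the descriptor's current offset, which should be
 * the same as before the call. Returns -1 on error. */
off_t offset_after_pread(int fd, off_t pos, char *out)
{
    if (lseek(fd, 0, SEEK_CUR) == (off_t)-1)
        return -1;              /* not a seekable descriptor */
    if (pread(fd, out, 1, pos) != 1)
        return -1;
    return lseek(fd, 0, SEEK_CUR);
}
```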

33
New cards

What does dup() return and what does it share?

Returns the lowest available file descriptor number. The new descriptor shares the same file table entry — same status flags and offset. But each has its own file descriptor flags.
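The shared offset can be observed with a sketch like this one. `offset_seen_by_dup` is a hypothetical name; it writes three bytes through the original descriptor and asks the duplicate where the offset is.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: duplicate `fd`, write 3 bytes through the original,
 * and return the current offset as seen through the duplicate.
 * Returns -1 on error. */
off_t offset_seen_by_dup(int fd)
{
    int copy = dup(fd);
    if (copy == -1)
        return -1;

    off_t off = -1;
    if (write(fd, "abc", 3) == 3)
        off = lseek(copy, 0, SEEK_CUR); /* same file table entry, same offset */

    close(copy);
    return off;
}
```

On a freshly truncated file the function returns 3, because both descriptors point at one file table entry.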

34
New cards

How does dup2(filedes, filedes2) differ from dup()?

dup2() lets you specify the new descriptor number. If filedes2 is already open, it is first closed. If filedes == filedes2, it returns filedes2 without closing it.

35
New cards

What is dup(fd) equivalent to using fcntl()?

fcntl(fd, F_DUPFD, 0). And dup2(fd, fd2) is equivalent to close(fd2); fcntl(fd, F_DUPFD, fd2), but dup2 is atomic.

36
New cards

What does the sync() function do?

Queues all modified block buffers for writing to disk and returns immediately — it does NOT wait for writes to complete. Called every ~30 seconds by the update daemon.

37
New cards

What does fsync() do?

Forces all modified data and attributes for a single file to be written to disk and waits for completion before returning. Used by databases for data integrity.

38
New cards

How does fdatasync() differ from fsync()?

fdatasync() flushes only the data portions — it does not synchronously update file attributes. fsync() updates both data and attributes.

39
New cards

What is 'delayed write' in the kernel?

When data is written to a file, the kernel copies it to a buffer cache and queues it for writing to disk later. This improves performance but risks data loss on crash.

40
New cards

What are the five purposes of fcntl()?

1) Duplicate a descriptor (F_DUPFD); 2) Get/set file descriptor flags (F_GETFD/F_SETFD); 3) Get/set file status flags (F_GETFL/F_SETFL); 4) Get/set asynchronous I/O ownership, i.e. which process or process group receives the SIGIO and SIGURG signals (F_GETOWN/F_SETOWN); 5) Get/set record locks (F_GETLK/F_SETLK/F_SETLKW).

41
New cards

What is the only currently defined file descriptor flag?

FD_CLOEXEC — close-on-exec. When set, the descriptor is closed when the process calls exec(). The F_DUPFD command always clears this flag on the new descriptor.

42
New cards

What file status flags can be changed with fcntl(F_SETFL,...)?

O_APPEND, O_NONBLOCK, O_SYNC, O_DSYNC, O_RSYNC, O_FSYNC, and O_ASYNC. Access mode flags (RDONLY, WRONLY, RDWR) cannot be changed after opening.

43
New cards

How do you safely modify a file status flag with fcntl()?

Fetch existing flags with F_GETFL, OR in (or AND-NOT out) the desired bits, then set with F_SETFL. Never use F_SETFL alone — it can clear previously set flags.
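The fetch, modify, set sequence can be sketched as two small helpers. The names `set_fl` and `clr_fl` are chosen for this example.

```c
#include <fcntl.h>

/* Turn on the given file status flags without disturbing the others.
 * Returns 0 on success, -1 on error. */
int set_fl(int fd, int flags)
{
    int val = fcntl(fd, F_GETFL, 0);    /* fetch current flags */
    if (val == -1)
        return -1;
    return fcntl(fd, F_SETFL, val | flags) == -1 ? -1 : 0;
}

/* Turn off the given file status flags without disturbing the others. */
int clr_fl(int fd, int flags)
{
    int val = fcntl(fd, F_GETFL, 0);
    if (val == -1)
        return -1;
    return fcntl(fd, F_SETFL, val & ~flags) == -1 ? -1 : 0;
}
```

For instance, `set_fl(fd, O_NONBLOCK)` switches a descriptor to nonblocking mode while leaving O_APPEND as it was.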

44
New cards

What does F_GETOWN/F_SETOWN do?

Gets or sets the process ID or process group ID that receives SIGIO and SIGURG signals for asynchronous I/O on that file descriptor.

45
New cards

What is /dev/fd?

A special directory where opening /dev/fd/n is equivalent to duplicating file descriptor n. Allows programs to pass file descriptors as pathname arguments.

46
New cards

What header file is required for open(), creat(), and fcntl()?

#include <fcntl.h>

47
New cards

What header file is required for read(), write(), close(), lseek(), dup()/dup2(), sync()/fsync()?

#include <unistd.h>

48
New cards

What is the return value signature for read()?

ssize_t read(int filedes, void *buf, size_t nbytes) — returns bytes read, 0 for EOF, -1 for error.

49
New cards

What is the return value signature for write()?

ssize_t write(int filedes, const void *buf, size_t nbytes) — returns bytes written or -1 on error.

50
New cards

What is the return value signature for lseek()?

off_t lseek(int filedes, off_t offset, int whence) — returns new file offset or -1 on error.

51
New cards

What is the return signature for open()?

int open(const char *pathname, int oflag, ... /* mode_t mode */) — returns file descriptor >= 0, or -1 on error. The mode argument is used only when a file may be created.

52
New cards

What does fcntl(fd, F_GETFL, 0) return and how do you get the access mode?

Returns the file status flags. Use val & O_ACCMODE to extract only the access mode bits, then compare to O_RDONLY, O_WRONLY, or O_RDWR.
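A minimal sketch of extracting the access mode follows. `access_mode_name` and the strings it returns are assumptions made for illustration.

```c
#include <fcntl.h>

/* Hypothetical helper: describe how `fd` was opened. */
const char *access_mode_name(int fd)
{
    int val = fcntl(fd, F_GETFL, 0);
    if (val == -1)
        return "error";

    /* The access mode is a small value, not independent bits, so mask
     * with O_ACCMODE and compare instead of testing single bits. */
    switch (val & O_ACCMODE) {
    case O_RDONLY: return "read only";
    case O_WRONLY: return "write only";
    case O_RDWR:   return "read write";
    default:       return "unknown";
    }
}
```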

53
New cards

What is the BUFFSIZE that gives best I/O efficiency for disk files?

The filesystem preferred I/O block size — matching st_blksize (commonly 4096 bytes). Smaller buffers cause many system calls; larger buffers offer minimal additional gain.

54
New cards

What is the O_RSYNC flag?

Makes each read() wait until any pending writes for the same portion of the file are complete before returning.

55
New cards

What does ioctl() do?

A catch-all for I/O operations that don't fit other system calls — device-specific control, terminal I/O, magnetic tape operations.

56
New cards

TRUE or FALSE: lseek() on a file causes I/O to take place immediately.

FALSE. lseek() only records the new offset in the kernel — it causes no I/O. The offset is used by the next read() or write().

57
New cards

TRUE or FALSE: File descriptor flags and file status flags have the same scope.

FALSE. File descriptor flags apply only to a single descriptor in a single process. File status flags apply to all descriptors in any process pointing to the same file table entry.

58
New cards

TRUE or FALSE: sync() waits for all disk writes to complete before returning.

FALSE. sync() merely queues the buffers and returns immediately. fsync() is the function that waits for completion.

59
New cards

TRUE or FALSE: Two processes opening the same file always share the same current file offset.

FALSE. Each process gets its own file table entry with its own current offset. They share the v-node but not the offset.

60
New cards

TRUE or FALSE: Negative file offsets are impossible on all UNIX systems.

FALSE. Some special devices allow negative offsets. Never test lseek() return with < 0 — compare with == -1.

61
New cards

TRUE or FALSE: O_CREAT|O_EXCL guarantees atomic file creation.

TRUE. The check for existence and the creation are performed as a single atomic operation by the kernel.

62
New cards

TRUE or FALSE: dup2() and close() followed by fcntl(F_DUPFD,...) are exactly equivalent.

FALSE. dup2() is atomic. The close+fcntl approach can have a signal delivered between the two calls. They also differ in errno behavior.

63
New cards

TRUE or FALSE: creat() opens the file for both reading and writing.

FALSE. creat() opens the file for write-only. Use open() with O_RDWR|O_CREAT|O_TRUNC to read and write.

64
New cards

TRUE or FALSE: When a process terminates, the kernel automatically closes all open file descriptors.

TRUE. All open file descriptors are closed by the kernel when the process exits.

65
New cards

TRUE or FALSE: fdatasync() is supported on all major platforms including FreeBSD and macOS.

FALSE. FreeBSD 5.2.1 and Mac OS X 10.3 do not support fdatasync(). All platforms support sync() and fsync().

66
New cards

TRUE or FALSE: The file descriptor returned by open() is always the lowest-numbered unused descriptor.

TRUE. This is guaranteed. Applications exploit this — e.g., close stdout (fd 1), then open a file knowing it will receive fd 1.
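The close-then-open idiom can be sketched as below. `reopen_as` is a hypothetical helper, and it relies on two assumptions: every descriptor numbered below `target` is already open, and no other thread opens a file between the two calls.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: close `target`, then open `path` and rely on
 * open() returning the lowest-numbered unused descriptor.
 * Assumes all descriptors below `target` are in use and the process is
 * single-threaded. Returns the new descriptor, or -1 on error. */
int reopen_as(int target, const char *path, int oflag)
{
    close(target);
    return open(path, oflag);
}
```

When those assumptions might not hold, opening the file first and then calling dup2() onto the target number avoids the gap between the two calls.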

67
New cards

TRUE or FALSE: O_RDONLY, O_WRONLY, and O_RDWR are separate bits that can be ORed together.

FALSE. They are mutually exclusive with values 0, 1, and 2. Use O_ACCMODE mask with F_GETFL to extract the access mode.

68
New cards

TRUE or FALSE: O_SYNC and O_DSYNC are identical on all platforms.

FALSE. O_SYNC always updates data AND all attributes. O_DSYNC only updates attributes needed to read newly written data. Linux treats them the same; Solaris differentiates.

69
New cards

TRUE or FALSE: When using O_APPEND, you can still call lseek() to read from any position.

TRUE. O_APPEND only affects writes. lseek() for reads works normally.

70
New cards

---

71
New cards

*METADATA*

72
New cards

What do the three stat functions do?

stat() returns info about a named file. fstat() returns info about an open file descriptor. lstat() returns info about the symbolic link itself, not the file it points to.

73
New cards

What is the key difference between stat() and lstat() for symbolic links?

stat() follows symbolic links and returns info about the target. lstat() returns info about the symbolic link file itself.

74
New cards

List all fields of the struct stat structure.

st_mode (type+permissions), st_ino (inode#), st_dev (device#), st_rdev (for special files), st_nlink (link count), st_uid (owner UID), st_gid (owner GID), st_size (bytes), st_atime, st_mtime, st_ctime, st_blksize, st_blocks.

75
New cards

What does st_mode encode?

Both the file type and the nine permission bits (plus set-user-ID, set-group-ID, and sticky bit). Type is mode_t.

76
New cards

What are the seven file types in UNIX?

Regular file, directory file, block special file, character special file, FIFO (named pipe), socket, and symbolic link.

77
New cards

What macro tests for a regular file in st_mode?

S_ISREG(st_mode). Others: S_ISDIR, S_ISCHR, S_ISBLK, S_ISFIFO, S_ISLNK, S_ISSOCK. All defined in <sys/stat.h>.

78
New cards

What is the difference between a block special file and a character special file?

Block special: buffered I/O in fixed-size units (disk drives). Character special: unbuffered I/O in variable-sized units. All devices are one or the other.

79
New cards

What is a FIFO?

A file type (also called a named pipe) used for communication between processes. Has a filesystem path but data passes through without disk storage.

80
New cards

What are the nine file permission bits divided into?

Three categories of three bits: user (owner) — read/write/execute; group — read/write/execute; other (world) — read/write/execute. All stored in st_mode.

81
New cards

What does execute permission mean on a directory?

It is the 'search bit' — lets you pass through the directory as a component in a pathname. Read permission lists contents; execute lets you search for a specific file.

82
New cards

To delete a file, what permissions are required?

Write permission AND execute permission in the directory containing the file. You do NOT need read or write permission on the file itself.

83
New cards

What are the four steps the kernel uses to check file access permissions?

1) If effective UID = 0 (superuser) → allowed; 2) If effective UID = file owner → check user bits only; 3) If effective GID or one of the supplementary group IDs = file's group → check group bits only; 4) Otherwise check other bits. The steps are tried in order and stop at the first match.

84
New cards

What does the set-user-ID (SUID) bit do?

When an executable runs, its effective user ID becomes the file's owner UID. Enables programs like passwd to run with elevated privileges.

85
New cards

What does the set-group-ID (SGID) bit do?

On an executable: process's effective GID is set to the file's group owner when it runs. On a directory: new files created in it inherit the directory's group ID.

86
New cards

What are the six process IDs related to permissions?

Real user ID and real group ID (who we really are); effective user ID, effective group ID, and supplementary group IDs (used for file access checks); saved set-user-ID and saved set-group-ID (saved by exec). A process has six or more IDs because the supplementary group IDs form a list.

87
New cards

What does access() do and how does it differ from open()?

Tests file accessibility based on real user/group IDs (not effective). open() uses effective IDs. Useful for set-UID programs to check if the real user can access a file.

88
New cards

What are the four mode constants for access()?

R_OK (read), W_OK (write), X_OK (execute), F_OK (file exists). Defined in <unistd.h>.

89
New cards

What is the umask?

The file mode creation mask — a per-process value. Any permission bits that are ON in the umask are turned OFF in the mode of newly created files.

90
New cards

What does a umask of 022 mean?

Group write and other write bits are masked off. Files created with permissions 0666 will actually get 0644. A umask of 0 means no bits are masked.
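The 0666 to 0644 arithmetic can be checked with a sketch like this. `mode_after_umask` is a hypothetical helper; it sets the mask, creates the file, restores the previous mask, and returns the permission bits the file actually received.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: create `path` requesting `mode` while the umask is
 * `mask`. Returns the resulting permission bits, or -1 on error.
 * The caller's previous umask is restored before returning. */
int mode_after_umask(const char *path, mode_t mask, mode_t mode)
{
    mode_t old = umask(mask);           /* umask() returns the previous mask */
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, mode);
    umask(old);
    if (fd == -1)
        return -1;

    struct stat st;
    int result = fstat(fd, &st) == 0 ? (int)(st.st_mode & 0777) : -1;
    close(fd);
    return result;
}
```

The effective mode is `mode & ~mask`, so 0666 under a mask of 022 gives 0644.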

91
New cards

Changing a process's umask — does it affect the parent process?

No. Changes to a process's umask do not propagate to its parent (typically the shell).

92
New cards

What do chmod() and fchmod() do?

Change file access permissions on an existing file. chmod() takes a pathname; fchmod() takes an open file descriptor. Effective UID must equal file owner UID or be superuser.

93
New cards

What is the sticky bit (S_ISVTX)?

Originally kept program text in swap. On directories today: a file can only be removed or renamed by the file's owner, directory's owner, or superuser (e.g., /tmp).

94
New cards

What do chown(), fchown(), and lchown() do?

Change the user and group owner of a file. lchown() changes the symbolic link itself. Only superuser can change owner; users can change group to one they belong to.

95
New cards

What does st_size represent?

For regular files: size in bytes. For symbolic links: length of the pathname it points to. For directories: an integer multiple of the directory entry size.

96
New cards

What are the three timestamps in struct stat?

st_atime (last access/read), st_mtime (last modification/write), st_ctime (last status change — inode change). NOT creation time.

97
New cards

What is st_ctime — is it creation time?

NO. st_ctime is the last file STATUS CHANGE time — updated when the inode changes (permissions, owner, link count, size). UNIX does not store a file creation time.

98
New cards

What does utime() do?

Modifies st_atime and st_mtime for a file. st_ctime is set to the current time automatically. Requires superuser or file ownership.

99
New cards

What is st_blksize and why does it matter?

The optimal I/O block size for the file system. Reading/writing in multiples of this gives best performance. Standard I/O uses this to choose buffer sizes.

100
New cards

What does link() do?

Creates a new hard link — a new directory entry pointing to the same inode. Both names are equally primary. Link count (st_nlink) is incremented.