
Lecture 11 – Redirection, Streams & File Descriptors

Streams and Default File Descriptors

  • Stream
    • A "stream" = ordered sequence of characters flowing from a source to a sink.
    • Unix/C treat almost everything (files, terminals, pipes, sockets, etc.) as character streams.
  • Three default (a.k.a. standard) streams automatically opened for every process at start-up
    • 0 → stdin – normally connected to the keyboard (read-only, buffered)
    • 1 → stdout – normally connected to the screen (write-only; line-buffered when attached to a terminal, fully buffered when redirected to a file or pipe)
    • 2 → stderr – also the screen, but unbuffered so error messages appear immediately
  • File Descriptor (FD)
    • Integer handle maintained by the kernel; every open file (under the Unix “everything-is-a-file” idea) occupies one FD entry in the process’ FD table.
    • Each FD maps → entry in the kernel’s file table → inode table → actual file object.
    • New FDs are assigned the lowest unused non-negative integer (0,1,2 already in use).

What Is Redirection?

  • Shell syntactic sugar allowing you to replace the default stream of a process with another file/stream.
    • Example: ls > temp_file.txt
    • Before exec'ing ls in the forked child, the shell closes FD 1 and opens temp_file.txt, which obtains FD 1, so everything the program writes to stdout flows into that file.
    • Input redirection (<) replaces FD 0; output redirection replaces FD 1, with > truncating the target file and >> appending to it.
  • Mental model / phone-line metaphor:
    • Process starts with two phone lines: one for hearing (stdin) & one for speaking (stdout). You can hang up one line and re-dial another number (a file) on the same handset (FD number).

Manual Redirection in C: Basic Example

#include <stdio.h>
#include <fcntl.h>   // open, O_RDONLY, etc.
#include <stdlib.h>  // exit
#include <unistd.h>  // close
int main(void) {
    int fd;            // will hold new FD number
    char line[100];

    /* Part 1 – read 3 lines from original stdin & echo */
    for (int i = 0; i < 3; ++i) {
        if (fgets(line, sizeof line, stdin) == NULL)
            break;     // EOF or read error
        printf("%s", line);
    }

    /* Part 2 – close stdin (FD 0) */
    close(0);

    /* Part 3 – open file; relies on open() returning the lowest free FD, i.e. 0 (fragile assumption!) */
    fd = open("<a path>", O_RDONLY);
    if (fd != 0) {
        fprintf(stderr, "Could not open data as fd 0\n");
        exit(1);
    }

    /* Part 4 – same fgets now pulls from file */
    for (int i = 0; i < 3; ++i) {
        if (fgets(line, sizeof line, stdin) == NULL)
            break;     // EOF or read error
        printf("%s", line);
    }
    return 0;
}
  • Steps
    1. Use original stdin.
    2. close(0); frees FD 0.
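    3. open() returns the lowest unused descriptor, which is 0 here in a single-threaded program. The pattern is still fragile: close() and open() are two separate steps, so if another thread or a signal handler opens a file in between, it takes FD 0 and the new file lands on a different number (e.g., 3).
    4. Subsequent fgets() calls still read from FD 0, so if the assumption fails the program breaks.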

Reliable Redirection with dup2

  • dup2(old_fd, new_fd)
    • Closes new_fd if open, then makes it a duplicate of old_fd (both point to same kernel object, same file offset, etc.).
  • Improved snippet
fd = open("<a path>", O_RDONLY);     // returns (say) 3
if (fd == -1) {
    perror("open");
    exit(1);
}
int err = dup2(fd, 0);                // force FD 0 → same file
if (err == -1) {
    perror("dup2");
    exit(1);
}
close(fd);                             // original FD no longer needed; FD 0 remains open
  • Advantages
    • No reliance on next-available-FD rule.
    • Ensures exactly the FD number expected by higher-level stdio.
  • Related call: dup(old_fd) – returns the lowest available FD as a duplicate of old_fd, without letting you choose the target number; when a specific number is required, dup2 (or the Linux-specific dup3) is preferred.

File Descriptors Survive exec()

  • When a process fork()s, the child inherits copies of all open FDs; parent and child descriptors refer to the same underlying kernel objects, whose reference counts are incremented.
  • After that child performs exec(), those FDs remain open unless marked with the close-on-exec flag (FD_CLOEXEC).
  • Exploit: set up redirection before exec() so the new program automatically uses the substituted streams.

Example: Capturing ls Output in a File

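#include <fcntl.h>     // creat
#include <stdio.h>     // perror
#include <stdlib.h>    // exit
#include <sys/wait.h>  // wait
#include <unistd.h>    // fork, close, execlp

int main(void) {
    pid_t id = fork();
    if (id == 0) {                // Child
        close(1);                 // free FD 1 (stdout)
        int fd = creat("ls_output", 0644); // lowest free FD, so 1
        if (fd < 0) {
            perror("creat");
            exit(1);
        }
        execlp("ls", "ls", (char *)NULL);  // runs ls (found via PATH) with stdout → file
        perror("execlp");         // reached only if exec failed
        exit(1);
    } else if (id > 0) {          // Parent
        wait(NULL);               // reap the child
    }
    return 0;
}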
  • creat() (or open with O_WRONLY | O_CREAT | O_TRUNC) returns FD 1 because it’s the lowest free descriptor after close(1). Any text ls writes to stdout is stored in ls_output.
  • Permissions 0644 = owner read/write, group read, others read.

Practical / Ethical / Real-World Notes

  • Shells, scripting languages, and most Unix utilities sit on top of the exact FD manipulations shown; understanding them lets you diagnose pipeline or redirection bugs.
  • Unbuffered stderr prevents error messages from being lost or delayed if stdout is redirected to a file or pipe.
  • Mismanaging FDs (e.g., leaving extra descriptors open) can leak resources or expose sensitive data to programs started with exec(), such as privileged helpers; always close unneeded descriptors or set FD_CLOEXEC on them as appropriate.

Connections to Prior Material

  • Builds on earlier lectures covering:
    • fork() – process creation & parent/child relationships.
    • exec*() family – replacing process image.
    • Kernel data structures: inode, file tables.
    • Permission bits and octal representation (e.g., 0644).
  • Reinforces Unix philosophy: everything is a file & small programs glued by shell redirection form powerful workflows.

Key Takeaways / Cheat-Sheet

  • Default FDs: 0 = stdin, 1 = stdout, 2 = stderr.
  • Redirection = re-binding those numbers to other files/streams.
  • close(fd) frees a descriptor; next open() may reuse it, but dup2 gives deterministic control.
  • Setting up FDs before exec() lets child program inherit the redirection seamlessly.
  • Always handle errors (open, dup2, exec) & consider FD_CLOEXEC where leakage is dangerous.