Lecture 12 – Pipes & Redirection

Concept Recap – Streams & Redirection

  • Programs generally behave as transformations: an input stream → an output stream.
  • Redirection = changing where those streams come from / go to.
    • stdin (fd = 0) can be attached to a file instead of the keyboard.
    • stdout (fd = 1) can be written to a file instead of the terminal.
  • Pipes are a specialised, dynamic form of redirection between two live processes rather than between a process and a file.
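
To make file redirection concrete, here is a minimal sketch of roughly what a shell does for cmd > file. The helper name run_with_stdout_to_file and the 0644 file mode are assumptions chosen for illustration, not something from the lecture.

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with its stdout redirected to `path`.
 * Returns the child's exit status, or -1 on failure. */
int run_with_stdout_to_file(const char *path, char *const argv[]) {
    pid_t pid = fork();
    if (pid < 0) return -1;

    if (pid == 0) {                                   /* child */
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) _exit(126);
        if (dup2(fd, STDOUT_FILENO) < 0) _exit(126);  /* fd 1 now refers to the file */
        close(fd);                                    /* original descriptor no longer needed */
        execvp(argv[0], argv);
        _exit(127);                                   /* reached only if exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling it with an argument vector such as {"echo", "hello", NULL} should leave the command's output in the named file. The program being run is unaware of the redirection; it still just writes to fd 1.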

Pipes in Bash – User-Level View

  • Syntax: prog1 | prog2
    → output of prog1 becomes input of prog2.
  • Typical example:
    ls | wc -l ⇒ prints the number of directory entries.
  • Shell handles the low-level plumbing transparently; user only sees the | token.
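
From C, the same user-level view is available through popen(), which hands the whole command line to /bin/sh and lets the shell build the pipes. A small sketch, assuming the pipeline prints a single integer (as wc -l does); the helper name run_pipeline_long is made up for this example.

```c
#define _POSIX_C_SOURCE 200809L   /* exposes popen/pclose under strict -std modes */
#include <stdio.h>

/* Hypothetical helper: run a shell pipeline and parse the single integer it prints.
 * Returns -1 on failure. */
long run_pipeline_long(const char *pipeline) {
    FILE *out = popen(pipeline, "r");   /* the shell sets up the pipes for us */
    if (out == NULL) return -1;

    long value = -1;
    if (fscanf(out, "%ld", &value) != 1) value = -1;

    if (pclose(out) == -1) return -1;
    return value;
}
```

For instance, run_pipeline_long("ls | wc -l") would return the entry count of the current directory. The remaining sections show what the shell has to do internally to make that one | work.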

Pipes Under the Hood – Conceptual Model

  • Need a mechanism allowing concurrent producer & consumer processes.
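  • Naïve shared memory is not enough: processes have isolated address spaces by default, and hand-rolled sharing would need its own synchronisation to avoid race conditions and to block correctly.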
  • Preferred solution: an OS-managed buffer (queue) that both processes access via file descriptors.
    • Producer blocks when buffer full.
    • Consumer blocks when buffer empty.
    • Behaviour analogous to a bounded FIFO queue.
  • ASCII metaphor in lecture:
    Process 1 → [ PIPE (buffer) ] → Process 2
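
The bounded nature of the buffer can be observed directly. The sketch below fills a fresh pipe in non-blocking mode until the kernel refuses more data and reports how many bytes fitted. The helper name measure_pipe_capacity is an assumption for illustration, and the value returned depends on the system.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: fill a fresh pipe until the kernel refuses more data.
 * Returns the number of bytes accepted, or -1 on error. */
long measure_pipe_capacity(void) {
    int fd[2];
    if (pipe(fd) == -1) return -1;

    /* Non-blocking mode: a full pipe makes write() fail with EAGAIN
     * instead of putting the caller to sleep. */
    int flags = fcntl(fd[1], F_GETFL);
    if (flags == -1 || fcntl(fd[1], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }

    long total = 0;
    int full = 0;
    const char byte = 'x';
    while (!full) {
        ssize_t n = write(fd[1], &byte, 1);
        if (n == 1) {
            total++;
        } else if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            full = 1;                       /* buffer is full */
        } else if (!(n == -1 && errno == EINTR)) {
            total = -1;                     /* unexpected error */
            break;
        }
    }

    close(fd[0]);
    close(fd[1]);
    return total;
}
```

In blocking mode (the default) the final write() would simply sleep until a reader removed data, which is the "producer blocks when buffer full" behaviour described above.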

Pipes in C – File Descriptors & Buffer Mechanics

  • Creation: int fd[2]; pipe(fd);
    • fd[0] = read end.
    • fd[1] = write end.
  • A pipe is therefore two file descriptors referencing the same kernel buffer.
  • Goal: make each child think that the appropriate end is its normal stdin/stdout.
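
Before involving a second process, it can help to see both descriptors at work in a single one. A minimal sketch, assuming the message is small enough to fit in the pipe buffer (otherwise the write would block, since nothing is reading yet); pipe_round_trip is a hypothetical name.

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: push `msg` through a fresh pipe and read it back into `out`.
 * Returns the number of bytes read, or -1 on error. `out_size` must be at least 1. */
ssize_t pipe_round_trip(const char *msg, char *out, size_t out_size) {
    int fd[2];
    if (pipe(fd) == -1) return -1;

    size_t len = strlen(msg);
    ssize_t written = write(fd[1], msg, len);    /* data goes into the kernel buffer */
    close(fd[1]);                                /* no more writers */
    if (written != (ssize_t)len) {
        close(fd[0]);
        return -1;
    }

    ssize_t n = read(fd[0], out, out_size - 1);  /* and comes back out of the same buffer */
    close(fd[0]);
    if (n < 0) return -1;
    out[n] = '\0';
    return n;
}
```

Whatever is written to fd[1] is what read() on fd[0] returns, in the same order.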

Reader–Writer Protocol (dup2, close, exec)

  • WRITER process pattern:
    • close(fd[0]); // never reads
    • dup2(fd[1], 1); // stdout ← fd[1]
    • close(fd[1]); // fd[1] now redundant
    • execvp("producer", …);
  • READER process pattern:
    • close(fd[1]); // never writes
    • dup2(fd[0], 0); // stdin ← fd[0]
    • close(fd[0]);
    • execvp("consumer", …);
  • Why the extra close calls?
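    • A reader sees EOF only when every write end is closed, and a writer is told the pipe is broken only when every read end is closed. A stray open descriptor in any process therefore makes the kernel treat the pipe as still in use → potential indefinite blocking.
  • dup2(oldFD, newFD) makes newFD refer to the same open file as oldFD, silently closing newFD first if it was open.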
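
The effect of the close calls can be seen in a small experiment: a child reads until EOF and reports how many bytes it saw. This is a sketch under stated assumptions: the helper name count_bytes_in_child is made up, and the count is passed back through the exit status, so it only works for fewer than 255 bytes.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: the child counts bytes until it sees EOF on the pipe and
 * reports the count (assumed < 255) as its exit status. Returns -1 on error. */
int count_bytes_in_child(const char *data, size_t len) {
    int fd[2];
    if (pipe(fd) == -1) return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }

    if (pid == 0) {                  /* child: reader */
        close(fd[1]);                /* otherwise the child itself keeps a write end
                                        open and read() would never return 0 */
        char buf[64];
        int total = 0;
        ssize_t n;
        while ((n = read(fd[0], buf, sizeof buf)) > 0) total += (int)n;
        close(fd[0]);
        _exit(n == 0 ? total : 255);
    }

    /* parent: writer */
    close(fd[0]);                    /* parent never reads */
    ssize_t w = write(fd[1], data, len);
    close(fd[1]);                    /* last write end closed: reader gets EOF */

    int status;
    if (waitpid(pid, &status, 0) < 0 || w != (ssize_t)len) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Removing the close(fd[1]) in the child is an instructive experiment: the child's read() then never returns 0, and the parent's waitpid() never returns.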

Putting It Together with fork() – Minimal Example

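#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

int main(void) {
    char *child_cmd[]  = {"ls", NULL};
    char *parent_cmd[] = {"wc", NULL};

    int my_pipe[2];
    if (pipe(my_pipe) == -1) {
        perror("Cannot create pipe");
        exit(EXIT_FAILURE);
    }

    pid_t pid = fork();
    if (pid < 0) {
        perror("Failed fork");
        exit(EXIT_FAILURE);
    }

    if (pid > 0) {           // PARENT – runs wc
        close(my_pipe[1]);                // never writes
        dup2(my_pipe[0], STDIN_FILENO);   // stdin now reads from the pipe
        close(my_pipe[0]);
        // No wait() here: wc must start reading while ls is still writing.
        execvp("wc", parent_cmd);
        perror("execvp wc");              // reached only if exec failed
    } else {                 // CHILD – runs ls
        close(my_pipe[0]);                // never reads
        dup2(my_pipe[1], STDOUT_FILENO);  // stdout now writes to the pipe
        close(my_pipe[1]);
        execvp("ls", child_cmd);
        perror("execvp ls");              // reached only if exec failed
    }
    return EXIT_FAILURE;     // only reached when an exec failed
}

Detailed Walk-Through of Example Code

  • Setup: two NULL-terminated argument arrays for ls and wc.
  • pipe() fills my_pipe with two descriptors; it returns -1 on error, in which case the program exits.
  • fork() duplicates the current process; it returns 0 in the child and the child's PID (> 0) in the parent.
  • Parent path (consumer):
    • Closes write end my_pipe[1].
    • Redirects stdin (STDIN_FILENO = 0) to read end my_pipe[0] via dup2.
    • Closes now-redundant my_pipe[0].
    • Does not call wait() before exec: if the parent waited for ls to exit before wc started reading, ls would block forever once its output filled the pipe, and neither process could make progress. The finished ls is reaped by init once wc exits.
    • execvp("wc", …) overlays parent with wc; the code after it runs only if exec failed.
  • Child path (producer): symmetrical operations then execvp("ls", …).
  • Result mimics shell behaviour: ls | wc.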
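
A real shell does not turn itself into the consumer. It forks one child per pipeline stage, closes both pipe ends in the parent, and waits for the children. The following is a minimal sketch of that variant, not the lecture's code; the helper name run_pipeline and the exit codes 126/127 for setup and exec failures are assumptions borrowed from common shell convention.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run `producer | consumer` and return the consumer's
 * exit status, or -1 on failure. */
int run_pipeline(char *const producer[], char *const consumer[]) {
    int fd[2];
    if (pipe(fd) == -1) return -1;

    pid_t writer = fork();
    if (writer < 0) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    if (writer == 0) {                       /* first child: producer */
        close(fd[0]);
        if (dup2(fd[1], STDOUT_FILENO) == -1) _exit(126);
        close(fd[1]);
        execvp(producer[0], producer);
        _exit(127);                          /* reached only if exec failed */
    }

    pid_t reader = fork();
    if (reader == 0) {                       /* second child: consumer */
        close(fd[1]);
        if (dup2(fd[0], STDIN_FILENO) == -1) _exit(126);
        close(fd[0]);
        execvp(consumer[0], consumer);
        _exit(127);
    }

    /* The parent uses neither end. Leaving fd[1] open here would keep the
     * consumer from ever seeing EOF. */
    close(fd[0]);
    close(fd[1]);

    int status = 0;
    waitpid(writer, NULL, 0);                /* reap the producer */
    if (reader < 0 || waitpid(reader, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because both children run concurrently and the parent only waits after closing its descriptors, the producer can write any amount of data without deadlocking, and no zombies are left behind.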

Operational Limits, Errors & Edge Cases

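  • Capacity: implementation dependent (commonly 4096 or 65536 bytes; 65536 is the default on modern Linux). POSIX does not specify the capacity; it only guarantees PIPE_BUF ≥ 512 bytes for atomic writes.
  • Chunking: a blocking write() larger than the free space sleeps until a reader drains the pipe, so large transfers need a reader consuming concurrently. Splitting the data into smaller writes does not prevent blocking; it only keeps each write atomic (≤ PIPE_BUF).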
  • Error handling:
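    • Writer with no reader (all read ends closed) → SIGPIPE is delivered, which terminates the process by default; if SIGPIPE is ignored or handled, write() returns -1 with errno == EPIPE.
    • Reader with no data → read() blocks while any write end is still open; once all write ends are closed and the buffer is drained, it returns 0 (EOF).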
  • Multiple readers:
    • Pipe is FIFO; bytes are distributed in arrival order but scheduling vagaries can yield unpredictable distribution among readers.
  • Multiple writers likewise compete; atomicity is guaranteed only up to PIPE_BUF bytes (minimum 512) per POSIX.
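
The broken-pipe case is easy to reproduce in one process. The sketch below ignores SIGPIPE temporarily so that write() gets a chance to return; the helper name errno_after_write_without_reader is an assumption for illustration.

```c
#define _POSIX_C_SOURCE 200809L   /* exposes sigaction under strict -std modes */
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Hypothetical demo: write to a pipe whose read end is already closed.
 * Returns the errno observed from write(), 0 if the write unexpectedly
 * succeeded, or -1 if setup failed. */
int errno_after_write_without_reader(void) {
    /* With the default disposition SIGPIPE would kill the process before
     * write() returned, so ignore it for the duration of the demo. */
    struct sigaction ignore = {0}, previous;
    ignore.sa_handler = SIG_IGN;
    sigemptyset(&ignore.sa_mask);
    if (sigaction(SIGPIPE, &ignore, &previous) == -1) return -1;

    int fd[2];
    int result = -1;
    if (pipe(fd) == 0) {
        close(fd[0]);                          /* no reader left */
        ssize_t n = write(fd[1], "x", 1);
        result = (n == -1) ? errno : 0;
        close(fd[1]);
    }

    sigaction(SIGPIPE, &previous, NULL);       /* restore the old disposition */
    return result;
}
```

On a POSIX system the function is expected to return EPIPE. Without the sigaction() call the default SIGPIPE action would end the process at the write().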

Connections, Implications & Best Practices

  • Builds on earlier lectures covering file descriptors and process control (fork/exec).
  • Illustrates classic producer–consumer concurrency pattern.
  • Shows how shell conveniences map to explicit OS calls – valuable for debugging & systems programming.
  • Practical angle: misuse (e.g., unclosed descriptors) can cause hangs, resource leaks, or subtle race conditions.
  • Real-world relevance: same pattern underpins networking sockets, filters in UNIX pipelines, and micro-service message queues.
  • Best practices:
    • Always close() unused ends immediately.
    • Check system-call return values; respond to EPIPE.
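    • Ignore or handle SIGPIPE where appropriate; MSG_NOSIGNAL is the per-call alternative for send() on sockets and does not apply to write() on a pipe.
    • Keep writes ≤ PIPE_BUF for atomicity when there are multiple writers.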
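
Checking return values usually ends up in a small wrapper. A minimal sketch of one such wrapper follows; the name write_all is an assumption, not a standard function, and the loop makes no atomicity promise beyond what each individual write() provides.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: write all `len` bytes to `fd`, retrying after short
 * writes and EINTR. Returns 0 on success, -1 with errno set on failure
 * (for example EPIPE once the last reader is gone and SIGPIPE is ignored). */
int write_all(int fd, const void *buf, size_t len) {
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n == -1) {
            if (errno == EINTR) continue;   /* interrupted before writing anything */
            return -1;                      /* caller inspects errno, e.g. EPIPE */
        }
        p += n;                             /* short write: advance and retry */
        len -= (size_t)n;
    }
    return 0;
}
```

A caller that sees -1 with errno == EPIPE can stop producing output and exit cleanly, which is usually the right response when the consumer has gone away.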