Concept Recap – Streams & Redirection
- Programs generally behave as transformations: an input stream → an output stream.
- Redirection = changing where those streams come from / go to.
- stdin (fd = 0) can be attached to a file instead of the keyboard.
- stdout (fd = 1) can be written to a file instead of the terminal.
- Pipes are a specialised, dynamic form of redirection between two live processes rather than between a process and a file.
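As a minimal sketch of redirection at the system-call level, the function below points stdout (fd 1) at a file, writes one message, then restores the previous stdout. The helper name write_line_to_file and the file mode 0644 are illustrative assumptions, not something the lecture prescribes.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: temporarily redirect stdout to `path`, write `msg`,
 * then restore the original stdout. Returns 0 on success, -1 on error. */
int write_line_to_file(const char *path, const char *msg) {
    int file_fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (file_fd == -1) return -1;

    int saved_stdout = dup(STDOUT_FILENO);     /* remember where fd 1 pointed */
    if (saved_stdout == -1) { close(file_fd); return -1; }

    if (dup2(file_fd, STDOUT_FILENO) == -1) {  /* fd 1 now refers to the file */
        close(file_fd);
        close(saved_stdout);
        return -1;
    }
    close(file_fd);                            /* fd 1 keeps the file open */

    ssize_t n = write(STDOUT_FILENO, msg, strlen(msg));

    dup2(saved_stdout, STDOUT_FILENO);         /* undo the redirection */
    close(saved_stdout);
    return n == (ssize_t)strlen(msg) ? 0 : -1;
}
```

Calling write_line_to_file("out.txt", "hello\n") should leave the text in out.txt while nothing appears on the terminal, which is roughly what the shell arranges for prog > out.txt.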
Pipes in Bash – User-Level View
- Syntax: prog1 | prog2 → output of prog1 becomes input of prog2.
- Typical example: ls | wc -l ⇒ prints the number of directory entries.
- Shell handles the low-level plumbing transparently; user only sees the | token.
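To see the user-level view from a C program, one option is to hand the whole pipeline to the shell with popen and read back what it prints. This is only a sketch: the helper name count_directory_entries is hypothetical, and popen is used here for convenience rather than because the lecture relies on it.

```c
#define _POSIX_C_SOURCE 200809L  /* expose popen/pclose under strict C modes */
#include <stdio.h>

/* Hypothetical helper: run the shell pipeline `ls | wc -l` and return the
 * number it prints, or -1 on failure. */
long count_directory_entries(void) {
    FILE *out = popen("ls | wc -l", "r");  /* the shell wires up the pipe */
    if (out == NULL) return -1;

    long count = -1;
    if (fscanf(out, "%ld", &count) != 1) count = -1;

    pclose(out);
    return count;
}
```

The program never touches a pipe descriptor for the ls-to-wc link; the shell does all of that on its behalf.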
Pipes Under the Hood – Conceptual Model
- Need a mechanism allowing concurrent producer & consumer processes.
- Naïve shared-memory idea is unsafe (race conditions, blocking, address-space isolation between processes).
- Preferred solution: an OS-managed buffer (queue) that both processes access via file descriptors.
- Producer blocks when buffer full.
- Consumer blocks when buffer empty.
- Behaviour analogous to a bounded FIFO queue.
- ASCII metaphor in lecture:
Process 1 → [ PIPE (buffer) ] → Process 2
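The bounded-buffer behaviour can be probed with a small experiment: switch the write end to non-blocking mode and keep writing until the kernel reports that the pipe is full. This is a sketch under assumptions; the helper name measure_pipe_capacity and the 512-byte chunk size are illustrative choices, and the number returned depends on the operating system.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: fill a fresh pipe using non-blocking writes and return
 * how many bytes it accepted before reporting "full", or -1 on error. */
long measure_pipe_capacity(void) {
    int fd[2];
    if (pipe(fd) == -1) return -1;

    /* Non-blocking mode: a full pipe yields EAGAIN instead of suspending us. */
    int flags = fcntl(fd[1], F_GETFL);
    if (flags == -1 || fcntl(fd[1], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }

    char chunk[512] = {0};
    long total = 0;
    for (;;) {
        ssize_t n = write(fd[1], chunk, sizeof chunk);
        if (n > 0) { total += n; continue; }
        if (n == -1 && errno == EINTR) continue;  /* interrupted: retry */
        break;                                    /* otherwise stop writing */
    }
    int full = (errno == EAGAIN || errno == EWOULDBLOCK);

    close(fd[0]);
    close(fd[1]);
    return full ? total : -1;
}
```

Without O_NONBLOCK the same loop would simply stop inside write() once the buffer filled up, which is the "producer blocks when buffer full" rule in action.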
Pipes in C – File Descriptors & Buffer Mechanics
- Creation:
int fd[2]; pipe(fd);
- fd[0] = read end.
- fd[1] = write end.
- A pipe is therefore two file descriptors referencing the same kernel buffer.
- Goal: make each child think that the appropriate end is its normal stdin/stdout.
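Before any fork or exec is involved, the two descriptors can be exercised inside a single process. The sketch below writes a short message into fd[1] and reads it back from fd[0]; the helper name pipe_round_trip is hypothetical, and the message is assumed to be small enough to fit in the pipe buffer (otherwise the write would block forever, since nobody is reading yet).

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: push `msg` through a pipe within one process and copy
 * what comes out into `out` (out_size must be at least 1).
 * Returns the number of bytes read, or -1 on error. */
ssize_t pipe_round_trip(const char *msg, char *out, size_t out_size) {
    int fd[2];
    if (pipe(fd) == -1) return -1;

    ssize_t written = write(fd[1], msg, strlen(msg));  /* into the write end */
    close(fd[1]);                                      /* no more data coming */
    if (written == -1) { close(fd[0]); return -1; }

    ssize_t n = read(fd[0], out, out_size - 1);        /* out of the read end */
    close(fd[0]);
    if (n == -1) return -1;

    out[n] = '\0';
    return n;
}
```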
Reader–Writer Protocol (dup2, close, exec)
- WRITER process pattern:
    close(fd[0]);            // never reads
    dup2(fd[1], 1);          // stdout ⇐ fd[1]
    close(fd[1]);            // fd[1] now redundant
    execvp("producer", …);
- READER process pattern:
    close(fd[1]);            // never writes
    dup2(fd[0], 0);          // stdin ⇐ fd[0]
    close(fd[0]);
    execvp("consumer", …);
- Why the extra close calls? The kernel counts every open descriptor, so any unused end left open makes it look as if a reader/writer still exists: the reader never sees EOF (or the writer never gets EPIPE) → potential indefinite blocking.
- dup2(oldFD, newFD) makes newFD refer to the same object as oldFD, closing newFD first if it is open.
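The writer pattern and the role of the extra close calls can be tried without exec. In this sketch a child redirects its stdout into the pipe and writes a message, while the parent reads until EOF. The helper name capture_child_stdout is an assumption made for illustration; the child writes directly instead of exec'ing a producer program.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: a child follows the WRITER pattern and writes `msg`
 * to its redirected stdout; the parent collects it until EOF.
 * Returns the number of bytes collected, or -1 on error. */
ssize_t capture_child_stdout(const char *msg, char *out, size_t out_size) {
    int fd[2];
    if (pipe(fd) == -1) return -1;

    pid_t pid = fork();
    if (pid == -1) { close(fd[0]); close(fd[1]); return -1; }

    if (pid == 0) {                     /* child: the writer */
        close(fd[0]);                   /* never reads */
        dup2(fd[1], STDOUT_FILENO);     /* stdout now refers to the pipe */
        close(fd[1]);                   /* fd 1 keeps the write end open */
        if (write(STDOUT_FILENO, msg, strlen(msg)) == -1) _exit(1);
        _exit(0);                       /* exiting closes the last write end */
    }

    close(fd[1]);  /* parent must drop its write end or read() never sees EOF */

    size_t total = 0;
    ssize_t n;
    while (total < out_size - 1 &&
           (n = read(fd[0], out + total, out_size - 1 - total)) > 0)
        total += (size_t)n;
    out[total] = '\0';

    close(fd[0]);
    waitpid(pid, NULL, 0);              /* reap the child */
    return (ssize_t)total;
}
```

Removing the parent's close(fd[1]) is an instructive experiment: the read loop then never receives 0, because the parent itself still counts as a potential writer.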
Putting It Together with fork() – Minimal Example
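#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    char *child_cmd[] = {"ls", NULL};
    char *parent_cmd[] = {"wc", NULL};
    int my_pipe[2];
    if (pipe(my_pipe) == -1) { perror("Cannot create pipe"); exit(EXIT_FAILURE); }
    pid_t pid = fork();
    if (pid < 0) { perror("Failed fork"); exit(EXIT_FAILURE); }
    if (pid > 0) { // PARENT – runs wc
        close(my_pipe[1]);               // parent never writes
        dup2(my_pipe[0], STDIN_FILENO);  // stdin now reads from the pipe
        close(my_pipe[0]);               // original descriptor is redundant
        // Waiting before exec is safe only while the output of ls fits in the
        // pipe buffer; with more output, ls blocks in write() and both hang.
        wait(NULL);
        execvp("wc", parent_cmd);
        perror("execvp wc");             // reached only if exec failed
    } else { // CHILD – runs ls
        close(my_pipe[0]);               // child never reads
        dup2(my_pipe[1], STDOUT_FILENO); // stdout now writes to the pipe
        close(my_pipe[1]);               // original descriptor is redundant
        execvp("ls", child_cmd);
        perror("execvp ls");             // reached only if exec failed
    }
    return EXIT_FAILURE;                 // reached only if execvp failed
}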
Detailed Walk-Through of Example Code
- Setup: two NULL-terminated argument arrays, one for ls and one for wc.
- pipe() system call fills my_pipe[2]; returns -1 on error.
- fork() duplicates the current process; child receives pid = 0, parent gets child PID > 0.
- Parent path (consumer):
- Closes write end my_pipe[1].
- Redirects stdin (STDIN_FILENO = 0) to read end my_pipe[0] via dup2.
- Closes now-useless my_pipe[0].
- wait(NULL) blocks until the child exits and reaps it, so no zombie is left behind. Caveat: nothing reads the pipe while the parent waits, so this ordering is safe only if the output of ls fits in the pipe buffer; with more output, ls blocks in write() and both processes hang.
- execvp("wc", …) overlays parent with wc.
- Child path (producer): symmetrical operations then execvp("ls", …).
- Result mimics shell behaviour: ls | wc.
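A variant that does not depend on the producer's output fitting in the pipe buffer runs both commands in children and lets the parent wait only after it has closed both ends. The following is a sketch of that arrangement, not the lecture's own code; the helper name run_pipeline and the exit code 127 for a failed exec are illustrative assumptions.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run `producer | consumer`, each in its own child.
 * Returns the consumer's exit status, or -1 on error. */
int run_pipeline(char *const producer[], char *const consumer[]) {
    int fd[2];
    if (pipe(fd) == -1) return -1;

    pid_t writer = fork();
    if (writer == -1) { close(fd[0]); close(fd[1]); return -1; }
    if (writer == 0) {                   /* first child: producer */
        close(fd[0]);
        dup2(fd[1], STDOUT_FILENO);
        close(fd[1]);
        execvp(producer[0], producer);
        perror("execvp producer");       /* reached only if exec failed */
        _exit(127);
    }

    pid_t reader = fork();
    if (reader == -1) {
        close(fd[0]);
        close(fd[1]);
        waitpid(writer, NULL, 0);
        return -1;
    }
    if (reader == 0) {                   /* second child: consumer */
        close(fd[1]);
        dup2(fd[0], STDIN_FILENO);
        close(fd[0]);
        execvp(consumer[0], consumer);
        perror("execvp consumer");       /* reached only if exec failed */
        _exit(127);
    }

    /* The parent uses neither end; closing both lets the consumer see EOF. */
    close(fd[0]);
    close(fd[1]);

    int status = 0;
    waitpid(writer, NULL, 0);            /* reap the producer */
    if (waitpid(reader, &status, 0) == -1) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With char *ls_cmd[] = {"ls", NULL} and char *wc_cmd[] = {"wc", NULL}, calling run_pipeline(ls_cmd, wc_cmd) behaves like ls | wc. Because the consumer is already running while the parent waits, the pipe keeps draining however much the producer writes.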
Operational Limits, Errors & Edge Cases
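- Capacity: finite and implementation dependent (often 4096 or 65536 bytes; 65536 is the Linux default since kernel 2.6.11). POSIX does not fix a capacity; it only guarantees PIPE_BUF ≥ 512 for atomic writes.
- Chunking: a write() larger than the free space blocks until the reader makes room; send large data in smaller writes so that no single call stalls for long.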
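- Error handling:
- Writer with no reader (all read ends closed) → SIGPIPE is delivered, which terminates the process by default; if SIGPIPE is ignored or handled, write() returns -1 with errno == EPIPE.
- Reader with no data → read() blocks while any write end is still open; once all write ends are closed and the buffer is drained, it returns 0 (EOF).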
- Multiple readers:
- Pipe is FIFO; bytes are distributed in arrival order but scheduling vagaries can yield unpredictable distribution among readers.
- Multiple writers likewise compete; atomicity is guaranteed only up to PIPE_BUF bytes (minimum 512) per POSIX.
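The error cases and limits above can be observed directly. The sketch below contains three small probes: a write with no reader, a read with no writer, and a query for the atomic-write limit. All three helper names are hypothetical, and the first one ignores SIGPIPE for the duration of the call so that EPIPE becomes visible instead of the process being terminated.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Hypothetical helper: write to a pipe whose read end is closed.
 * Returns the errno reported by write(), or 0 if the write succeeded. */
int errno_when_no_reader(void) {
    int fd[2];
    if (pipe(fd) == -1) return errno;

    /* Ignore SIGPIPE so write() fails with EPIPE instead of killing us. */
    void (*old_handler)(int) = signal(SIGPIPE, SIG_IGN);

    close(fd[0]);                              /* no reader remains */
    int err = (write(fd[1], "x", 1) == -1) ? errno : 0;
    close(fd[1]);

    if (old_handler != SIG_ERR)
        signal(SIGPIPE, old_handler);          /* restore previous disposition */
    return err;
}

/* Hypothetical helper: read from an empty pipe whose write end is closed.
 * Returns read()'s result; 0 means end-of-file. */
ssize_t read_when_no_writer(void) {
    int fd[2];
    char byte;
    if (pipe(fd) == -1) return -1;

    close(fd[1]);                              /* no writer remains */
    ssize_t n = read(fd[0], &byte, 1);         /* does not block: EOF */
    close(fd[0]);
    return n;
}

/* Hypothetical helper: ask the system for this pipe's atomic-write limit. */
long atomic_write_limit(void) {
    int fd[2];
    if (pipe(fd) == -1) return -1;

    long limit = fpathconf(fd[1], _PC_PIPE_BUF);
    close(fd[0]);
    close(fd[1]);
    return limit;
}
```

The value returned by atomic_write_limit varies by system; POSIX only promises that it is at least 512.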
Connections, Implications & Best Practices
- Builds on earlier lectures covering file descriptors and process control (fork/exec).
- Illustrates classic producer–consumer concurrency pattern.
- Shows how shell conveniences map to explicit OS calls – valuable for debugging & systems programming.
- Ethical / practical angle: misuse (e.g., unclosed descriptors) can cause hangs, resource leaks, or subtle race conditions.
- Real-world relevance: same pattern underpins networking sockets, filters in UNIX pipelines, and micro-service message queues.
- Best practices:
- Always close() unused ends immediately.
- Check system-call return values; respond to EPIPE.
- Handle or ignore SIGPIPE where appropriate (MSG_NOSIGNAL serves the same purpose for send() on sockets, not for pipes).
- Keep writes ≤ PIPE_BUF for atomicity when multiple writers.
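One way to apply the "check return values" advice is a small wrapper that retries after partial writes and interruptions and reports everything else to the caller. This is a sketch only: the name write_all is hypothetical, and it assumes the caller has arranged for SIGPIPE to be ignored or handled, since otherwise the process is terminated before EPIPE can be returned.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: write all `len` bytes, retrying after partial writes
 * and EINTR. Returns 0 on success, -1 on error (errno set, e.g. EPIPE). */
int write_all(int fd, const char *buf, size_t len) {
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n == -1) {
            if (errno == EINTR) continue;  /* interrupted: try again */
            return -1;                     /* EPIPE and others: give up */
        }
        buf += n;                          /* skip what was accepted */
        len -= (size_t)n;
    }
    return 0;
}
```

Note that a loop like this gives up atomicity for large buffers: with several writers on one pipe, only individual writes of at most PIPE_BUF bytes are guaranteed not to interleave.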