Interprocess Communication (IPC) and Pipes in C

Overview of Interprocess Communication and Pipes

  • Interprocess Communication (IPC): IPC refers to the mechanisms provided by an operating system to allow different processes to communicate and synchronize their actions. This specific discussion focuses on Pipes as a foundational IPC method.

  • Definition of a Pipe:

    • A pipe is a simple, synchronized method of passing information between processes.

    • It functions as a special file or buffer that stores a limited amount of data in a FIFO (First-In-First-Out) or sequential manner.

    • Pipes are typically used within shell environments to connect the standard output (stdout) of one utility to the standard input (stdin) of another.

  • Synchronization Principles:

    • Write Blocking: If a process attempts to write to a full pipe, it is blocked (paused) until the reader process consumes some of the data.

    • Read Blocking: If a process attempts to read from an empty pipe, it is blocked until the writer process produces data.

    • Unbuffered I/O: Data is written to and read from pipes using the unbuffered system calls write() and read().

Types of Pipes

  • Unnamed Pipes:

    • These are not associated with any file in the file system.

    • They can only be used between related processes, such as a parent and its child process.

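    • They exist only while at least one process holds an open descriptor for them; the kernel destroys the pipe once every descriptor has been closed, which happens at the latest when the processes using it terminate.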

    • They typically support unidirectional communication.

    • Creation: Created using the pipe() system call.

  • Named Pipes (FIFOs):

    • These are associated with a file in the directory structure and have specific file access permissions.

    • They can be used for communication between unrelated processes.

    • They persist in the file system even after processes have terminated.

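    • They can support bidirectional communication, usually by creating two FIFOs, one for each direction; a single FIFO carries one byte stream, just like an unnamed pipe.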

    • Creation: Created using the mknod() system call or the mkfifo() library function.

Named Pipe Creation in C

  • To create a named pipe within a C program, the following headers are required:

    • #include <sys/types.h>

    • #include <sys/stat.h>

  • The mkfifo() Function:

    • Syntax example: if (mkfifo("test_fifo", 0777) == -1) { perror("mkfifo"); exit(1); }

    • The permission code 0777 requests full access (read, write, execute) for all users. The permissions actually set are 0777 & ~umask, so the process umask can remove some of these bits.

Unnamed Pipes: Mechanisms and File Descriptors

  • The pipe() System Call:

    • Generates an unnamed pipe and opens it for both reading and writing.

    • Syntax: int pipe(int fd[2]);

    • Return Values: Returns 0 on success and -1 if an error occurs.

    • File Descriptors: Upon success, the system populates the integer array with two file descriptors:

      • fd[0]: Associated with the read end of the pipe.

      • fd[1]: Associated with the write end of the pipe.

  • Standard File Descriptor Table Mapping (assuming only descriptors 0 to 2 were open when pipe() was called, since it returns the lowest unused numbers):

    • 0: stdin

    • 1: stdout

    • 2: stderr

    • 3: fd[0] (Read end of the pipe)

    • 4: fd[1] (Write end of the pipe)

The write() and read() System Calls

  • Writing to a Pipe:

    • #include <unistd.h>

    • Syntax: ssize_t write(int fd, const void *buf, size_t count);

    • It appends up to count bytes from the buffer to the pipe referenced by the file descriptor and returns the number of bytes actually written, or -1 on error.

    • Atomicity: A write of PIPE_BUF bytes or fewer is guaranteed to be atomic, meaning its data is not interleaved with data written to the same pipe by other processes. PIPE_BUF is 4096 bytes on Linux (POSIX requires at least 512) and is defined in /usr/include/linux/limits.h.

    • Broken Pipe (SIGPIPE): If a process tries to write to a pipe that is not open for reading by any process, the kernel sends it a SIGPIPE signal, whose default action terminates the process. If the signal is ignored or handled, write() returns -1 and sets errno to EPIPE.

  • Reading from a Pipe:

    • #include <unistd.h>

    • Syntax: ssize_t read(int fd, void *buf, size_t count);

    • Attempts to read up to count bytes in a FIFO manner; fewer bytes may be returned if less data is available.

    • Return Values:

      • Returns the actual number of bytes read, which may be less than count.

      • Returns 0 (end of file) if the pipe is empty and not open for writing by any process.

      • Returns -1 on error and sets errno.

      • The process sleeps if the pipe is empty but still open for writing by another process.

Implementation Workflow for Unnamed Pipes

  • Connecting Processes: Because unnamed pipes have no name in the file system, they are shared through the file descriptor mechanism inherited during a fork().

  • Standard Sequence of Operations:

    1. The parent process creates the unnamed pipe using pipe(). This must be done before forking.

    2. The parent process calls fork().

    3. The writer process (could be parent or child) closes the read end (fd[0]).

    4. The reader process closes the write end (fd[1]).

    5. Communication occurs via write() and read().

    6. Each process closes its remaining active pipe end when finished.

  • Example: Parent Sending Message to Child

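#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#define READ 0
#define WRITE 1

int main(void) {
    int fd[2];
    char message[] = "Hello from parent";
    char buffer[100];
    pid_t pid;

    if (pipe(fd) == -1) { // Must be created before fork()
        perror("pipe");
        exit(1);
    }
    pid = fork();
    if (pid == -1) {
        perror("fork");
        exit(1);
    }
    if (pid == 0) {
        // Child process: reader
        close(fd[WRITE]); // Close unused write end
        ssize_t n = read(fd[READ], buffer, sizeof(buffer) - 1);
        if (n == -1) {
            perror("read");
            exit(1);
        }
        buffer[n] = '\0'; // Terminate whatever was received
        printf("Child received: %s\n", buffer);
        close(fd[READ]);
    } else {
        // Parent process: writer
        close(fd[READ]); // Close unused read end
        if (write(fd[WRITE], message, strlen(message) + 1) == -1)
            perror("write");
        close(fd[WRITE]); // Reader sees EOF after consuming the data
        wait(NULL);       // Reap the child before exiting
    }
    return 0;
}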

The dup() and dup2() System Calls

  • The dup() System Call:

    • Syntax: int dup(int oldfd);

    • Creates a copy of an existing file descriptor using the lowest available unused numerical file descriptor.

    • Both descriptors share the same open pipe/file, file pointer, and access modes (R/W).

    • The function fileno(FILE *fp) can be used to convert a stream to a file descriptor for use with dup().

  • The dup2() System Call:

    • Syntax: int dup2(int oldfd, int newfd);

    • This is an atomic version of close(newfd) followed by dup(oldfd).

    • It forces newfd to refer to the same file as oldfd.

    • It is preferred because there is no time lapse between closing newfd and duplicating oldfd into its slot, preventing race conditions.

Practical Applications: Redirection and Shell Pipes

  • Implementing Command-Line Pipes (e.g., ls -al | wc):

    • To connect ls to wc, the shell redirects the stdout of the first command to the write end of a pipe, and the stdin of the second command to the read end of that same pipe.

    • Code Example Logic:

      1. Create pipe fd.

      2. Fork child 1 (for ls): Call dup2(fd[WRITE], 1), close pipe descriptors, and execl().

      3. Fork child 2 or use parent (for wc): Call dup2(fd[READ], 0), close pipe descriptors, and execl().

  • Implementing Redirection (e.g., command > file):

    1. The parent shell forks.

    2. The child shell opens the destination file (using O_CREAT | O_TRUNC | O_WRONLY).

    3. The child calls dup2(fd, 1) to replace stdout with the file descriptor.

    4. The child calls exec() to run the command. Because open file descriptors are preserved across exec() (unless marked close-on-exec), everything the command writes to stdout goes to the file.

  • Complex Chains (e.g., sort < file2 | uniq):

    • This involves combining file redirection for the input of sort and a pipe to connect the output of sort to the input of uniq.

Core Definitions

  • Kernel: The central part of the operating system; a moderator between applications and hardware resources.

  • System Calls: A set of functions provided by the kernel to allow programs to request resources, start new programs, or communicate.

  • API (Application Programming Interface): A collective term for the set of system calls available for programming interaction with the OS.