Unix Philosophy – “Everything is a File”
- Core design choice in early Unix: treat most system resources as files addressed only by a file descriptor (FD)
- Standard streams: stdin (0), stdout (1), stderr (2)
- Ordinary on-disk files
- Pipes (anonymous or named) acting as byte buffers between related processes
- Benefits
- Uniform, minimal API (read, write, close, dup2, …) regardless of the “thing” behind the FD
- Easy redirection/composition in shells (|, >, <)
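The uniform API means one helper can serve every kind of FD. A minimal sketch, assuming a hypothetical helper name `write_all` (not from the notes): it retries short writes, and nothing in it depends on whether the FD is a file, a pipe, or a terminal.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Write exactly len bytes to fd, retrying on short writes and EINTR.
 * Returns 0 on success, -1 on error (errno is set by write). */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            return -1;
        }
        p += n;                     /* advance past the bytes already written */
        len -= (size_t)n;
    }
    return 0;
}
```

Calling `write_all(STDOUT_FILENO, "hi\n", 3)` and calling it on the write end of a pipe use exactly the same code path.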
Connecting Independent Processes
- Naïve method: intermediary on-disk temporary file
- One process writes, the other reads
- Drawbacks
- Disk I/O latency (slow)
- No built-in synchronisation ⇒ races: Is writer done? Is reader still reading?
- Pipes (pipe() + fork())
- Fast in-memory buffer
- Require a parent/child (common-ancestor) relationship, because the FDs are inherited across fork(); cannot connect two arbitrary programs that are already running
- Only work inside one process tree (single machine)
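The pipe() + fork() pattern can be sketched as follows. The function name `pipe_from_child` is hypothetical; the point is that the child inherits both ends of the pipe and each side closes the end it does not use.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that writes msg into a pipe; the parent reads it back.
 * Returns the number of bytes read into out, or -1 on error. */
ssize_t pipe_from_child(const char *msg, char *out, size_t cap)
{
    int fds[2];                      /* fds[0] = read end, fds[1] = write end */
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {                  /* child: writer */
        close(fds[0]);
        ssize_t w = write(fds[1], msg, strlen(msg));
        _exit(w < 0 ? 1 : 0);
    }

    close(fds[1]);                   /* parent must close its write end to ever see EOF */
    size_t total = 0;
    ssize_t n;
    while (total < cap && (n = read(fds[0], out + total, cap - total)) > 0)
        total += (size_t)n;
    close(fds[0]);
    waitpid(pid, NULL, 0);           /* reap the child */
    return (ssize_t)total;
}
```

Unlike the temporary-file approach, the reader needs no extra synchronisation: read() blocks until data arrives and returns 0 once every write end is closed.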
fopen, fdopen, popen, pclose
FILE * fopen(const char *path, const char *mode)
- Opens/creates file, returns high-level C stream
FILE * fdopen(int fd, const char *mode)
- Wraps an existing FD inside a FILE * stream
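A small sketch of why fdopen is useful: once an FD is wrapped, formatted stdio calls such as fprintf work on it. The helper name `print_to_fd` is an assumption made for illustration.

```c
#define _POSIX_C_SOURCE 200809L      /* exposes fdopen under strict -std=c99/c11 */
#include <stdio.h>

/* Write "name=value\n" to a raw FD through a FILE * wrapper.
 * Returns 0 on success, -1 on failure. On success the FD is closed,
 * because fclose() on the stream also closes the underlying descriptor. */
int print_to_fd(int fd, const char *name, int value)
{
    FILE *fp = fdopen(fd, "w");
    if (fp == NULL)
        return -1;                   /* fd is left open in this case */

    int ok = fprintf(fp, "%s=%d\n", name, value) > 0;
    if (fclose(fp) != 0)             /* flushes buffered data, then closes fd */
        ok = 0;
    return ok ? 0 : -1;
}
```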
FILE * popen(const char *command, const char *type)
- Creates a process instead of a file
- Under the hood performs
- pipe() → builds the data channel
- fork() → child inherits ends of pipe
- execl("/bin/sh", "sh", "-c", command, NULL)
- Modes
- "w" ⇒ caller writes → child’s stdin; child’s stdout/stderr unchanged
- "r" ⇒ caller reads ← child’s stdout; child’s stdin inherited from parent (adjustable)
int pclose(FILE *stream)
- Waits for the child and returns its termination status in wait() format (extract the exit code with WEXITSTATUS)
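A minimal sketch of the popen/pclose pair in "r" mode. The helper name `first_line_of` is hypothetical; it reads one line of the command's stdout and decodes the status that pclose returns.

```c
#define _POSIX_C_SOURCE 200809L      /* exposes popen/pclose under strict -std=c99/c11 */
#include <stdio.h>
#include <sys/wait.h>

/* Run cmd through the shell, copy the first line of its stdout into out
 * (cap must be at least 1), and return the command's exit code, or -1. */
int first_line_of(const char *cmd, char *out, int cap)
{
    FILE *fp = popen(cmd, "r");      /* "r": we read what the child writes to stdout */
    if (fp == NULL)
        return -1;

    if (fgets(out, cap, fp) == NULL)
        out[0] = '\0';               /* command produced no output */

    /* Drain the rest so the child can finish writing instead of hitting a closed pipe. */
    char discard[256];
    while (fgets(discard, sizeof discard, fp) != NULL) {
    }

    int status = pclose(fp);         /* waits for the child */
    if (status == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);      /* decode the wait()-style status */
}
```

For example, `first_line_of("echo hello", line, sizeof line)` returns 0 and leaves `"hello\n"` in `line`.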
Interfaces & Encapsulation
- Analogy to C++ function interface: identical signature, different implementation
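// Loop implementation: sum of the integers a..b inclusive (a <= b)
int sum(int a, int b){
    int total = 0;
    for(int i=a;i<=b;i++) total += i;
    return total;
}
// Closed-form implementation of the same contract
// (an alternative body for sum, not a second definition in the same file)
int sum(int a, int b){
    return (a + b) * (b - a + 1) / 2; // → \frac{(a+b)(b-a+1)}{2}; multiply first so integer division does not truncate
}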
- In IPC we need an agreed protocol (interface) so either side can evolve independently as long as it obeys the contract
- Example: client never needs to care whether data came from live computation or cached file
Sockets – High-Level Idea
- Generalise pipe concept to two totally independent processes (can be on different computers)
- Acts like a bidirectional byte stream; still accessed by FD ⇒ read / write semantics hold
- Only extra requirement: both ends must speak the same application-level protocol (HTTP, FTP, custom, …)
- High flexibility: connect, disconnect, reconnect at runtime; no parent/child requirement
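The bidirectional, FD-based nature of a socket can be seen without any networking. The sketch below uses socketpair(), which is not covered in these notes and is only a stand-in here: it hands back two already-connected socket FDs, so both directions can be exercised with plain read/write. The function name `bidirectional_demo` is hypothetical.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Send one byte in each direction over a connected socket pair.
 * Returns 0 if both directions worked, -1 otherwise. */
int bidirectional_demo(void)
{
    int sv[2];                                   /* two connected endpoints */
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    char c = 0;
    int ok = write(sv[0], "a", 1) == 1           /* endpoint 0 -> endpoint 1 */
          && read(sv[1], &c, 1) == 1 && c == 'a'
          && write(sv[1], "b", 1) == 1           /* endpoint 1 -> endpoint 0 */
          && read(sv[0], &c, 1) == 1 && c == 'b';

    close(sv[0]);
    close(sv[1]);
    return ok ? 0 : -1;
}
```

A pipe would need two separate FD pairs to do the same, since each pipe carries data in one direction only.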
Typical Uses
- Web servers / browsers (HTTP)
- FTP clients/servers
- Any networked multiplayer or distributed application
Client-Server Paradigm
- One side (server) = long-lived “receptionist” waiting (listen) on a known address + port
- Other side (client) initiates (connect) when it wants service
- After connection established, roles can blur; both sides may read/write arbitrarily
- Real-world metaphor: historical telephone “Time” service — server continuously answered calls with current time
Addressing: IP & Port
- IP address ⇒ where (host)
- Unique per network interface; many processes share it
- Port number ⇒ which service (application)
- 0–65535, per-host namespace
- Multiple hosts can use same port (e.g. 80 for HTTP) without conflict
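An IP + port pair is typically carried in a struct sockaddr_in. A minimal sketch, assuming a hypothetical helper `make_ipv4_addr` and using inet_pton() to parse the dotted-quad string (a substitution made for this example; the notes themselves mention inet_aton()):

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Fill *out with an IPv4 address and port.
 * Returns 0 on success, -1 if ip is not a valid dotted-quad string. */
int make_ipv4_addr(const char *ip, uint16_t port, struct sockaddr_in *out)
{
    memset(out, 0, sizeof *out);
    out->sin_family = AF_INET;
    out->sin_port = htons(port);                 /* host -> network byte order */
    return inet_pton(AF_INET, ip, &out->sin_addr) == 1 ? 0 : -1;
}
```

After `make_ipv4_addr("127.0.0.1", 8080, &a)`, the port field holds the bytes 0x1F 0x90 in memory on any host, because network byte order is big-endian.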
Handshake Sequence (TCP Stream)
- Server
- socket() → obtain FD
- bind() → associate FD with local port (and optionally IP)
- listen() → kernel moves FD to passive state; queue length = backlog
- Client (after server ready)
- socket()
- connect() → specify remote IP + port (server’s)
- Server
- accept() → removes one pending request from queue, returns new FD dedicated to that client
- Both sides read/write freely; original listening FD keeps waiting for more clients
SERVER                          CLIENT
socket → bind → listen          socket
   ^                               |
   | (blocks)                   connect()
accept() ---------------------→ (3-way handshake in TCP)
   ↕ dedicated FD                  ↕
read/write …                    read/write …
Low-Level API (Server Side)
- Headers (POSIX / BSD sockets)
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
int sockfd = socket(PF_INET, SOCK_STREAM, 0);
- PF_INET ⇒ IPv4 protocol family
- SOCK_STREAM ⇒ reliable byte stream (TCP)
- SOCK_DGRAM ⇒ datagrams (UDP)
struct sockaddr_in addr;
addr.sin_family = AF_INET;
addr.sin_port = htons(PORT); // host → network byte order
addr.sin_addr = …; // via gethostbyname() or inet_aton()
bind(sockfd, (struct sockaddr*)&addr, sizeof(addr));
listen(sockfd, backlog); // e.g. backlog = 128
int clientfd = accept(sockfd, NULL, NULL);
- Returns new FD dedicated to that connection
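The server-side calls above omit error handling. One way to gather them into a single checked helper is sketched below; the name `make_listener`, the choice to bind to the loopback address only, and the SO_REUSEADDR step are assumptions made for this example.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a TCP socket listening on 127.0.0.1:port.
 * Returns the listening FD, or -1 on error with errno describing the failure. */
int make_listener(uint16_t port, int backlog)
{
    struct sockaddr_in addr;
    int yes = 1;

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    /* Allow re-binding the port quickly after a restart. */
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof yes) < 0)
        goto fail;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);                 /* port 0: kernel picks a free port */
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0)
        goto fail;
    if (listen(fd, backlog) < 0)
        goto fail;
    return fd;

fail: {
        int saved = errno;                       /* close() may overwrite errno */
        close(fd);
        errno = saved;
    }
    return -1;
}
```

A server would then loop on `accept(make_listener(PORT, 128), NULL, NULL)` style calls, keeping the listening FD open for the lifetime of the process.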
Low-Level API (Client Side)
- Same headers
- Build struct sockaddr_in server; (set IP + port)
int fd = socket(PF_INET, SOCK_STREAM, 0);
connect(fd, (struct sockaddr*)&server, sizeof(server));
- Data transfer
ssize_t n = read(fd, buffer, sizeof(buffer));
write(fd, msg, strlen(msg));
close(fd);
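A sketch of the client side with return values checked. Both helper names (`connect_to`, `read_until_eof`) are hypothetical. The read loop matters on stream sockets because a single read() may return fewer bytes than the peer sent.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Connect to ip:port over TCP. Returns a connected FD, or -1 on error. */
int connect_to(const char *ip, uint16_t port)
{
    struct sockaddr_in server;
    memset(&server, 0, sizeof server);
    server.sin_family = AF_INET;
    server.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &server.sin_addr) != 1)
        return -1;                               /* not a valid IPv4 string */

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    if (connect(fd, (struct sockaddr *)&server, sizeof server) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Read until the peer closes the connection or buf is full.
 * Returns the number of bytes read, or -1 on error. */
ssize_t read_until_eof(int fd, char *buf, size_t cap)
{
    size_t total = 0;
    while (total < cap) {
        ssize_t n = read(fd, buf + total, cap - total);
        if (n < 0)
            return -1;
        if (n == 0)
            break;                               /* EOF: peer closed its end */
        total += (size_t)n;
    }
    return (ssize_t)total;
}
```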
Dealing with Many Clients – Fork + Dup Example
- Goal: decouple receptionist (accept loop) from worker (actual job)
- Pattern
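#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Takes ownership of fd (the FD returned by accept)
void process_request(int fd){
    pid_t pid = fork();
    if(pid < 0){                     // fork failed
        close(fd);
        return;
    }
    if(pid > 0){                     // parent = server
        close(fd);                   // child has its own copy; otherwise the client never sees EOF
        while(waitpid(-1, NULL, WNOHANG) > 0) {} // reap finished children without blocking
        return;                      // back to accept loop
    }
    // child becomes worker
    dup2(fd, 1);                     // redirect stdout to socket
    close(fd);
    execlp("date", "date", (char *)NULL); // execlp searches PATH; execl would need a full path
    _exit(1);                        // reached only if exec failed
}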
- After dup2, the client reads directly from the spawned program’s stdout
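The redirection trick is independent of sockets: any FD can take the place of stdout. A minimal sketch, assuming a hypothetical helper `run_with_stdout` that waits for the child (simpler than a concurrent server, which would not block here):

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run the program named by args[0] with its stdout redirected to out_fd,
 * then wait for it. Returns the child's exit code, or -1 on error. */
int run_with_stdout(int out_fd, char *const args[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {                              /* child */
        if (dup2(out_fd, STDOUT_FILENO) < 0)     /* FD 1 now refers to out_fd's target */
            _exit(127);
        if (out_fd != STDOUT_FILENO)
            close(out_fd);                       /* the duplicate on FD 1 is enough */
        execvp(args[0], args);                   /* searches PATH */
        _exit(127);                              /* reached only if exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Passing a connected client socket as `out_fd` gives the behaviour described above; passing the write end of a pipe lets the caller capture the output instead.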
Designing a Real Web Server (Roadmap)
- Repeat: socket → bind → listen (server) & socket → connect (client)
- After accept:
- Fork/thread or event-driven hand-off to worker pool
- Parse request according to protocol (e.g. HTTP/1.1 headers & body)
- Generate response (file content, dynamic CGI, etc.)
- Close client FD
- In higher-level languages (Python, Go, Rust…) libraries/frameworks abstract most of the boilerplate; in C you manually manage parsing, buffers, byte-ordering, error handling
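As a taste of the manual parsing involved in C, the first line of an HTTP/1.x request can be split with sscanf. This is a deliberately small sketch under stated assumptions: the helper name `parse_request_line` and the buffer sizes are hypothetical, and a real server must also handle headers, bodies, and partial reads.

```c
#include <stdio.h>
#include <string.h>

/* Split a request line such as "GET /index.html HTTP/1.1" into its three
 * fields. Returns 0 on success, -1 if the line is malformed. */
int parse_request_line(const char *line,
                       char method[16], char path[256], char version[16])
{
    /* Field widths are buffer size minus one, so sscanf cannot overflow. */
    if (sscanf(line, "%15s %255s %15s", method, path, version) != 3)
        return -1;
    if (strncmp(version, "HTTP/", 5) != 0)       /* third field must name the protocol */
        return -1;
    return 0;
}
```

The `%s` conversion stops at whitespace, so the trailing `\r\n` of the request line is not copied into `version`.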
Cheat-Sheet: Key Functions & Macros
socket(domain, type, proto)
bind(fd, struct sockaddr*, len)
listen(fd, backlog)
accept(fd, struct sockaddr*, socklen_t*)
connect(fd, struct sockaddr*, len)
read / write / send / recv (blocking by default) / select or poll (multiplexing over many FDs, typically combined with non-blocking I/O)
htons, htonl, ntohs, ntohl: host/network byte order conversion (endianness)
PF_INET, AF_INET, SOCK_STREAM, SOCK_DGRAM
- Always check return values (-1) and errno
- Set SO_REUSEADDR with setsockopt to re-bind quickly after crash
- Consider non-blocking I/O + select/epoll for high-concurrency servers instead of fork per connection
- Backlog too low ⇒ dropped connections under high load
- Understand TCP vs UDP trade-offs (reliability vs speed, ordering, connection overhead)
Glossary
- FD (File Descriptor): integer handle returned by kernel (0…) for any open file-like object
- Pipe: unidirectional in-memory buffer connecting related processes
- Socket: endpoint for bidirectional communication between processes – may traverse networks
- Server: waits for incoming connections on known port
- Client: initiates connection to server
- Backlog: max pending connection requests queue length in kernel before accept
- sockaddr_in: IPv4-specific address struct (family, port, address)
- htons / ntohs: convert 16-bit values between host and network byte order