Notes on Protocols: SMTP, POP, IMAP, PGP, HTTP, FTP
SMTP (Simple Mail Transfer Protocol)
SMTP is the Internet standard for electronic mail transmission across IP networks. It is a connection-oriented, text-based protocol: a mail sender communicates with a mail receiver by issuing command strings and supplying the necessary data over a reliable, ordered data stream, typically a TCP connection on port 25. An SMTP session consists of commands originated by an SMTP client and corresponding responses from the SMTP server; the session is opened and session parameters are exchanged before any mail is transferred. Within a session, an SMTP transaction consists of three command/reply sequences:
1) MAIL command, to establish the return address, i.e., Return-Path, 5321.From or envelope sender. This is the address used for bounce messages.
2) RCPT command, to establish a recipient of this message. This command can be issued multiple times, one for each recipient. These addresses are also part of the envelope.
3) DATA command, to send the message text. This is the content of the message, as opposed to its envelope. It consists of a message header and a message body separated by an empty line.
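The three command/reply sequences above can be sketched as the lines a client would send. This is a minimal illustration, not a mail client: the function name smtp_transaction and the example.org / example.net addresses are made up for the example, server replies are not modelled, and the closing lone "." and the doubling of leading dots follow the usual SMTP convention for ending the DATA section.

```python
def smtp_transaction(sender, recipients, header_lines, body):
    """Return the client-side lines of one SMTP mail transaction.

    Hypothetical helper for illustration; it only builds text and
    does not open a network connection.
    """
    lines = [f"MAIL FROM:<{sender}>"]          # 1) envelope sender
    for rcpt in recipients:                    # 2) one RCPT per recipient
        lines.append(f"RCPT TO:<{rcpt}>")
    lines.append("DATA")                       # 3) start of message content
    lines.extend(header_lines)                 # message header
    lines.append("")                           # empty line separates header and body
    for text_line in body.splitlines():
        # A body line starting with "." gets an extra "." so it is not
        # mistaken for the end-of-data marker.
        lines.append("." + text_line if text_line.startswith(".") else text_line)
    lines.append(".")                          # lone dot ends the message data
    return lines


if __name__ == "__main__":
    for line in smtp_transaction(
        "alice@example.org",
        ["bob@example.net", "carol@example.net"],
        ["From: alice@example.org", "Subject: Greetings"],
        "Hello Bob and Carol.",
    ):
        print(line)
```

Note that the envelope addresses (MAIL, RCPT) are given separately from the header lines inside DATA, which is why the two can differ.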
SMTP is a delivery protocol only; it cannot pull messages from a remote server on demand. Other protocols, such as POP and IMAP, are specifically designed for retrieving messages and managing mailboxes. SMTP does have a feature to initiate mail queue processing on a remote server so that the requesting system may receive any messages destined for it, but POP and IMAP are preferred when a user’s personal computer is only intermittently powered on or its Internet connectivity is transient, because such a host cannot accept SMTP deliveries while it is offline.
Limitations and extensions
The original SMTP does not include authentication of senders, which makes it easy to abuse for spam. It transfers 7-bit ASCII text, not binary data, so binary content must be encoded, for example with the base64 MIME content transfer encoding; base64 enlarges the data by about one third, which is bandwidth-inefficient for large messages. SMTP also sends messages in the clear, offering no encryption for privacy against eavesdroppers. To address these problems, SMTP was revised with an extension mechanism; using SMTP with extensions is called ESMTP (Extended SMTP).
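The cost of base64 can be checked with a short sketch using Python's standard base64 module. The payload here is arbitrary test data chosen for the example; base64.b64encode does not add the line breaks that MIME inserts, so real messages are slightly larger still.

```python
import base64

# 768 bytes of test data, including byte values outside the ASCII range.
binary = bytes(range(256)) * 3

encoded = base64.b64encode(binary)

# Every 3 input bytes become 4 ASCII characters.
print(len(binary), len(encoded))        # 768 1024
print(encoded.isascii())                # True: safe for an ASCII-only channel
print(base64.b64decode(encoded) == binary)  # True: the encoding is reversible
```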
Flow and placement
In a typical flow, the sender’s mail server uses SMTP to deliver mail over the Internet to the receiver’s mail server, and the recipient’s client later retrieves it with POP3 or IMAP. SMTP thus handles delivery between servers, while POP and IMAP handle retrieval by end-user clients.
POP (Post Office Protocol)
POP is an application-layer Internet standard protocol used by local e-mail clients to retrieve mail from a remote server over a TCP/IP connection. The protocol has evolved through several versions, with version 3 (POP3) being the current standard. POP and IMAP are the two most prevalent Internet standard protocols for e-mail retrieval, and virtually all modern e-mail clients and servers support both. Many webmail providers (e.g., Hotmail, Gmail, Yahoo! Mail) offer IMAP and POP3 services.
The Post Office Protocol defines how an email client should talk to the POP server. Through POP, a client can retrieve mail from an ISP and delete it on the server, retrieve mail but not delete it on the server, ask whether new mail has arrived without retrieving it, or peek at a few lines of a message to decide whether it is worth retrieving.
Usually, the POP server listens on TCP port 110 for incoming connections. When a POP client connects, the server responds with "+OK" if everything is fine; a negative response begins with "-ERR" and indicates that something went wrong. The POP model emphasizes downloading mail to a single device and often deleting it from the server, which contrasts with IMAP’s multi-device approach of keeping mail on the server.
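As a rough sketch of how a client might interpret these replies, the function below classifies a single server line by its "+OK" or "-ERR" prefix. The names parse_pop3_status and POP3_COMMANDS are invented for this example; the command keywords in the table (RETR, DELE, STAT, TOP) are the standard POP3 commands that correspond to the operations described above.

```python
# Client operations from the text mapped to the POP3 commands that perform them.
POP3_COMMANDS = {
    "retrieve a message": "RETR",
    "delete a message on the server": "DELE",
    "ask how much mail is waiting": "STAT",
    "peek at the first lines of a message": "TOP",
}


def parse_pop3_status(line):
    """Split one POP3 status line into (succeeded, remaining text).

    Hypothetical helper: it only inspects the status indicator and
    does not handle multi-line responses.
    """
    line = line.rstrip("\r\n")
    if line.startswith("+OK"):
        return True, line[3:].strip()
    if line.startswith("-ERR"):
        return False, line[4:].strip()
    raise ValueError(f"not a POP3 status line: {line!r}")


if __name__ == "__main__":
    print(parse_pop3_status("+OK POP3 server ready\r\n"))
    print(parse_pop3_status("-ERR no such message\r\n"))
```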
IMAP (Internet Message Access Protocol)
IMAP stands for Internet Message Access Protocol. It is a method of accessing electronic mail or bulletin board messages stored on a (possibly shared) mail server. IMAP allows a client email program to access remote message stores as if they were local, enabling access from multiple devices (home, office, travel) without transferring messages back and forth. IMAP has become particularly important as reliance on electronic messaging has grown, since it supports access from more than one computer and enables online, offline, and disconnected modes.
Key goals for IMAP include:
Be fully compatible with Internet messaging standards, e.g., MIME.
Allow message access and management from more than one computer.
Allow access without reliance on less efficient file access protocols.
Provide support for online, offline, and disconnected access modes.
Support concurrent access to shared mailboxes.
Require no knowledge of the server’s file store format in client software.
E-mail clients using IMAP typically leave messages on the server until the user explicitly deletes them, which lets multiple clients manage the same mailbox. In contrast, POP clients connect briefly to download new messages and may then delete them from the server. Because an IMAP4 client can download message content on demand rather than fetching everything up front, IMAP can give faster responses to users with many or large messages.
PGP (Pretty Good Privacy)
Pretty Good Privacy (PGP) is a data encryption and decryption program that provides cryptographic privacy and authentication for data communication. PGP is often used for signing, encrypting, and decrypting texts, e-mails, files, directories, and even whole disk partitions, and for increasing the security of e-mail communications. It is a public key encryption package designed to protect e-mail and data files, with sophisticated key management, digital signatures, and data compression.
The operation of PGP is based on five services: authentication, confidentiality, compression, e-mail compatibility, and segmentation. Authentication is provided via a digital signature. Confidentiality is achieved by encrypting the message before transmission: the message is encrypted with a symmetric cipher under a one-time session key, and the session key is encrypted with the recipient’s public key (for example, using RSA). Compression occurs after applying the digital signature and before encryption, to save space. For e-mail compatibility, the encrypted message and signature, which form a stream of arbitrary 8-bit octets, are converted to printable ASCII characters using radix-64 encoding. Finally, to accommodate e-mail size restrictions, PGP automatically segments messages that are too long.
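The order in which these services are applied can be illustrated with standard-library stand-ins. This is emphatically not PGP and provides no security: a plain SHA-256 digest stands in for the digital signature (a real signature requires the sender's private key), the encryption step is left as a pass-through placeholder, base64 stands in for the ASCII conversion used for e-mail compatibility, and segmentation is omitted. The names pack and unpack are made up for the example.

```python
import base64
import hashlib
import zlib

DIGEST_LEN = 32  # size of a SHA-256 digest in bytes


def pack(message: bytes) -> bytes:
    # 1) Authentication stand-in: prepend a digest of the message.
    #    (Assumption: real PGP signs this digest with a private key.)
    signed = hashlib.sha256(message).digest() + message
    # 2) Compression happens after signing and before encryption.
    compressed = zlib.compress(signed)
    # 3) Confidentiality placeholder: encryption would be applied here.
    protected = compressed
    # 4) E-mail compatibility: turn arbitrary 8-bit octets into ASCII.
    return base64.b64encode(protected)


def unpack(armored: bytes) -> bytes:
    # The receiver undoes the steps in reverse order.
    protected = base64.b64decode(armored)
    signed = zlib.decompress(protected)
    digest, message = signed[:DIGEST_LEN], signed[DIGEST_LEN:]
    if hashlib.sha256(message).digest() != digest:
        raise ValueError("digest mismatch")
    return message


if __name__ == "__main__":
    armored = pack(b"Meet at noon.")
    print(armored.isascii())
    print(unpack(armored))
```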
HTTP (HyperText Transfer Protocol)
HTTP (HyperText Transfer Protocol) is used to exchange or transfer hypertext and is the foundation of the World Wide Web. It defines how messages are formatted and transmitted and what actions web servers and browsers should take in response to various commands. It is the means by which browsers fetch the files they display, whatever their format. When you enter a URL beginning with http://, you are requesting a web page, which can in turn contain other elements (such as images and links to other resources). HTTP uses TCP port 80 by default, though other ports such as 8080 can also be used.
There are three important characteristics of HTTP: it is connectionless (in the original model the client opens a connection and sends a request, and the connection is closed once the server has sent its response, so each request is handled on its own; later versions can keep a connection open for several requests), it is media independent (any data type can be sent as long as the client and server know how to handle it, with MIME types guiding content handling), and it is stateless (the server and client only know each other during a request and forget afterward).
HTTP is an application-layer protocol built atop TCP. HTTP clients (such as web browsers) and servers exchange request and response messages, and this request/response cycle forms the core of how web content is retrieved and interacted with. The three main HTTP request methods are GET, POST, and HEAD.
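To make the message exchange concrete, the sketch below builds the text of a simple HTTP/1.1 request and parses the first line of a response. It performs no network I/O; the helper names build_request and parse_status_line and the host www.example.org are assumptions made for the example.

```python
def build_request(method, path, host):
    """Return the text of a minimal HTTP/1.1 request with no body."""
    return (
        f"{method} {path} HTTP/1.1\r\n"   # request line: method, target, version
        f"Host: {host}\r\n"               # Host header, required in HTTP/1.1
        "Connection: close\r\n"           # ask the server to close after replying
        "\r\n"                            # empty line ends the header section
    )


def parse_status_line(line):
    """Split a response status line into (version, status code, reason)."""
    version, code, reason = line.rstrip("\r\n").split(" ", 2)
    return version, int(code), reason


if __name__ == "__main__":
    print(build_request("GET", "/index.html", "www.example.org"))
    print(parse_status_line("HTTP/1.1 200 OK\r\n"))
    print(parse_status_line("HTTP/1.1 404 Not Found\r\n"))
```

Changing the first argument to "HEAD" yields the same request with a different method; the server would then answer with the headers only.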
HTTP request methods
HTTP defines several request methods that indicate the desired action on a resource. The three primary ones are:
GET: requests a representation of the specified resource; GET should be side-effect-free and used for data retrieval only.
POST: submits data to be processed (e.g., from an HTML form) to the identified resource.
HEAD: asks for a response identical to a GET request but without the response body. This is useful for retrieving metadata from headers without transferring the entire content.
There are additional methods beyond these three:
PUT: uploads a representation of a resource.
DELETE: deletes a resource.
TRACE: echoes the received request, for debugging.
OPTIONS: lists the methods the server supports.
CONNECT: establishes a tunnel, typically for secure communications.
PATCH: applies a partial update to a resource.
FTP (File Transfer Protocol)
File Transfer Protocol (FTP) provides a straightforward method for transferring files over a network between a client and a server. A common use is uploading files to a website. It can also be used for downloading from the Web, though downloading is more commonly performed via HTTP. Sites with substantial downloading traffic (e.g., software repositories) often operate FTP servers to handle the load. A URL that refers to an FTP resource begins with ftp: rather than http:.
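A program can tell from the URL scheme whether FTP or HTTP is involved. The sketch below uses urlsplit from Python's standard urllib.parse module; the helper name describe_url and the example.org URLs are made up for the example.

```python
from urllib.parse import urlsplit


def describe_url(url):
    """Return (scheme, host, path) for a URL."""
    parts = urlsplit(url)
    return parts.scheme, parts.hostname, parts.path


if __name__ == "__main__":
    print(describe_url("ftp://ftp.example.org/pub/tool.tar.gz"))
    print(describe_url("http://www.example.org/index.html"))
```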