

8

NETWORK SECURITY

For the first few decades of their existence, computer networks were primarily used by university researchers for sending email and by corporate employees for sharing printers. Under these conditions, security did not get a lot of attention. But now, as millions of ordinary citizens are using networks for banking, shopping, and filing their tax returns, and weakness after weakness has been found, network security has become a problem of massive proportions. In this chapter, we will study network security from several angles, point out numerous pitfalls, and discuss many algorithms and protocols for making networks more secure.

On a historical note, network hacking already existed long before there was an Internet. Instead, the telephone network was the target, and messing around with the signaling protocol was known as phone phreaking. Phone phreaking started in the late 1950s and really took off in the 1960s and 1970s. In those days, the control signals used to authorize and route calls were still ‘‘in band’’: the phone company used sounds at specific frequencies in the same channel as the voice communication to tell the switches what to do.

One of the best-known phone phreakers is John Draper, a controversial figure who found that the toy whistle included in boxes of Cap’n Crunch cereal in the late 1960s emitted a tone of exactly 2600 Hz, which happened to be the frequency that AT&T used to authorize long-distance calls. Using the whistle, Draper was able to make long-distance calls for free. Draper became known as Captain Crunch and used the whistles to build so-called blue boxes to hack the telephone


732 NETWORK SECURITY CHAP. 8

system. In 1974, Draper was arrested for toll fraud and went to jail, but not before he had inspired two other pioneers in the Bay area, Steve Wozniak and Steve Jobs, to also engage in phone phreaking and build their own blue boxes, as well as, at a later stage, a computer that they decided to call Apple. According to Wozniak, there would have been no Apple without Captain Crunch.

Security is a broad topic and covers a multitude of sins. In its simplest form, it is concerned with making sure that nosy people cannot read, or worse yet, secretly modify messages intended for other recipients. It is also concerned with attackers who try to subvert essential network services such as BGP or DNS, render links or network services unavailable, or access remote services that they are not authorized to use. Another topic of interest is how to tell whether that message purportedly from the IRS ‘‘Pay by Friday, or else’’ is really from the IRS and not from the Mafia. Security additionally deals with the problems of legitimate messages being captured and replayed, and with people later trying to deny that they sent certain messages.

Most security problems are intentionally caused by malicious people trying to gain some benefit, get attention, or harm someone. A few of the most common perpetrators are listed in Fig. 8-1. It should be clear from this list that making a network secure involves a lot more than just keeping it free of programming errors. It involves outsmarting often intelligent, dedicated, and sometimes well-funded adversaries. Measures that will thwart casual attackers will have little impact on the serious ones.

In an article in USENIX ;login:, James Mickens of Microsoft (and now a professor at Harvard University) argued that you should distinguish between everyday attackers and, say, sophisticated intelligence services. If you are worried about garden-variety adversaries, you will be fine with common sense and basic security measures. Mickens eloquently explains the distinction:

‘‘If your adversary is the Mossad, you’re gonna die and there’s nothing that you can do about it. The Mossad is not intimidated by the fact that you employ https://. If the Mossad wants your data, they’re going to use a drone to replace your cellphone with a piece of uranium that’s shaped like a cellphone, and when you die of tumors filled with tumors, they’re going to hold a press conference and say ‘It wasn’t us’ as they wear t-shirts that say ‘IT WAS DEFINITELY US,’ and then they’re going to buy all of your stuff at your estate sale so that they can directly look at the photos of your vacation instead of reading your insipid emails about them.’’

Mickens’ point is that sophisticated attackers have advanced means to compromise your systems, and stopping them is very hard. In addition, police records show that the most damaging attacks are often perpetrated by insiders bearing a grudge. Security systems should be designed accordingly.


Adversary       Goal
Student         To have fun snooping on people’s email
Cracker         To test someone’s security system; steal data
Sales rep       To claim to represent all of Europe, not just Andorra
Corporation     To discover a competitor’s strategic marketing plan
Ex-employee     To get revenge for being fired
Accountant      To embezzle money from a company
Stockbroker     To deny a promise made to a customer by email
Identity thief  To steal credit card numbers for sale
Government      To learn an enemy’s military or industrial secrets
Terrorist       To steal biological warfare secrets

Figure 8-1. Some people who may cause security problems, and why.

8.1 FUNDAMENTALS OF NETWORK SECURITY

The classic way to deal with network security problems is to distinguish three essential security properties: confidentiality, integrity, and availability. The common abbreviation, CIA, is perhaps a bit unfortunate, given that the other common expansion of that acronym has not been shy in violating those properties in the past. Confidentiality has to do with keeping information out of the grubby little hands of unauthorized users. This is what often comes to mind when people think about network security. Integrity is all about ensuring that the information you received was really the information sent and not something that an adversary modified. Availability deals with preventing systems and services from becoming unusable due to crashes, overload situations, or deliberate misconfigurations. Good examples of attempts to compromise availability are the denial-of-service attacks that frequently wreak havoc on high-value targets such as banks, airlines, and the local high school during exam time.

In addition to the classic triumvirate of confidentiality, integrity, and availability that dominates the security domain, there are other issues that play important roles also. In particular, authentication deals with determining whom you are talking to before revealing sensitive information or entering into a business deal. Finally, nonrepudiation deals with signatures: how do you prove that your customer really placed an electronic order for 10 million left-handed doohickeys at 89 cents each when he later claims the price was 69 cents? Or maybe he claims he never placed any order after seeing that a Chinese firm is flooding the market with left-handed doohickeys for 49 cents.

All these issues occur in traditional systems, too, but with some significant differences. Integrity and secrecy are achieved by using registered mail and locking documents up. Robbing the mail train is harder now than it was in Jesse James’ day. Also, people can usually tell the difference between an original paper document and a photocopy, and it often matters to them. As a test, make a photocopy of a valid check. Try cashing the original check at your bank on Monday. Now try cashing the photocopy of the check on Tuesday. Observe the difference in the bank’s behavior.

As for authentication, people authenticate other people by various means, including recognizing their faces, voices, and handwriting. Proof of signing is handled by signatures on letterhead paper, raised seals, and so on. Tampering can usually be detected by handwriting, ink, and paper experts. None of these options are available electronically. Clearly, other solutions are needed.

Before getting into the solutions themselves, it is worth spending a few moments considering where in the protocol stack network security belongs. There is probably no one single place. Every layer has something to contribute. In the physical layer, wiretapping can be foiled by enclosing transmission lines (or better yet, optical fibers) in sealed metal tubes containing an inert gas at high pressure. Any attempt to drill into a tube will release some gas, reducing the pressure and triggering an alarm. Some military systems use this technique.

In the data link layer, packets on a point-to-point link can be encrypted as they leave one machine and decrypted as they enter another. All the details can be handled in the data link layer, with higher layers oblivious to what is going on. This solution breaks down when packets have to traverse multiple routers, however, because packets have to be decrypted at each router, leaving them vulnerable to attacks from within the router. Also, it does not allow some sessions to be protected (e.g., those involving online purchases by credit card) and others not. Nevertheless, link encryption, as this method is called, can be added to any network easily and is often useful.

In the network layer, firewalls can be deployed to prevent attack traffic from entering or leaving networks. IPsec, a protocol for IP security that encrypts packet payloads, also functions at this layer. At the transport layer, entire connections can be encrypted end-to-end, that is, process to process. Problems such as user authentication and nonrepudiation are often handled at the application layer, although occasionally (e.g., in the case of wireless networks), user authentication can take place at lower layers. Since security applies to all layers of the network protocol stack, we dedicate an entire chapter of the book to this topic.

8.1.1 Fundamental Security Principles

While addressing security concerns in all layers of the network stack is certainly necessary, it is very difficult to determine when you have addressed them sufficiently and if you have addressed them all. In other words, guaranteeing security is hard. Instead, we try to improve security as much as we can by consistently applying a set of security principles. Classic security principles were formulated as early as 1975 by Jerome Saltzer and Michael Schroeder:


1. Principle of economy of mechanism. This principle is sometimes paraphrased as the principle of simplicity. Complex systems tend to have more bugs than simple systems. Moreover, users may not understand them well and use them in a wrong or insecure way. Simple systems are good systems. For instance, PGP (Pretty Good Privacy, see Sec. 8.11) offers powerful protection for email. However, many users find it cumbersome in practice, and so far it has not gained widespread adoption. Simplicity also helps to minimize the attack surface (all the points where an attacker may interact with the system to try to compromise it). A system that offers a large set of functions to untrusted users, each implemented by many lines of code, has a large attack surface. If a function is not really needed, leave it out.

2. Principle of fail-safe defaults. Say you need to organize access to a resource. It is better to make explicit rules about when one can access the resource than to try to identify the conditions under which access to the resource should be denied. Phrased differently: a default of lack of permission is safer.

3. Principle of complete mediation. Every access to every resource should be checked for authority. It implies that we must have a way to determine the source of a request (the requester).

4. Principle of least authority. This principle, often known as POLA, states that any (sub)system should have just enough authority (privilege) to perform its task and no more. Thus, if attackers compromise such a system, they elevate their privilege by only the bare minimum.

5. Principle of privilege separation. Closely related to the previous point: it is better to split up the system into multiple POLA-compliant components than to have a single component with all the privileges combined. Again, if one component is compromised, the attackers will be limited in what they can do.

6. Principle of least common mechanism. This principle is a little trickier and states that we should minimize the amount of mechanism common to more than one user and depended on by all users. Think of it this way: if we have a choice between implementing a network routine in the operating system, where its global variables are shared by all users, or in a user-space library which, to all intents and purposes, is private to the user process, we should opt for the latter. The shared data in the operating system may well serve as an information path between different users. We shall see an example of this in the section on TCP connection hijacking.


7. Principle of open design. This states plain and simple that the design should not be secret, and it generalizes what is known as Kerckhoffs’ principle in cryptography. In 1883, the Dutch-born Auguste Kerckhoffs published two journal articles on military cryptography which stated that a cryptosystem should be secure even if everything about the system, except the key, is public knowledge. In other words, do not rely on ‘‘security by obscurity,’’ but assume that the adversary immediately gains familiarity with your system and knows the encryption and decryption algorithms.

8. Principle of psychological acceptability. The final principle is not a technical one at all. Security rules and mechanisms should be easy to use and understand. Again, many implementations of PGP protection for email fail this principle. However, acceptability entails more. Besides the usability of the mechanism, it should also be clear why the rules and mechanisms are necessary in the first place.
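The second principle above, fail-safe defaults, is easy to see in code. Here is a minimal default-deny sketch (the user and resource names are hypothetical, chosen only for illustration):

```python
# Fail-safe default: every (user, resource) pair is denied unless an
# explicit rule grants access. Forgetting a rule fails closed, not open.
ALLOWED = {
    ("alice", "payroll.db"),
    ("bob", "printer"),
}

def may_access(user, resource):
    """Grant access only when an explicit allow rule exists."""
    return (user, resource) in ALLOWED
```

An unknown user, a misspelled rule, or a brand-new resource is automatically denied, which is precisely the safe behavior the principle asks for; a deny-list written the other way around would silently grant access in all three cases.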

An important factor in ensuring security is also the concept of isolation. Isolation guarantees the separation of components (programs, computer systems, or even entire networks) that belong to different security domains or have different privileges. All interaction that takes place between the different components is mediated with proper privilege checks. Isolation, POLA, and a tight control of the flow of information between components allow the design of strongly compartmentalized systems.

Network security comprises concerns in the domain of systems and engineering as well as concerns rooted in theory, math, and cryptography. A good example of the former is the classic ping of death, which allowed attackers to crash hosts all over the Internet by using fragmentation options in IP to craft ICMP echo request packets larger than the maximum allowed IP packet size. Since the receiving side never expected such large packets, it reserved insufficient buffer memory for all the data, and the excess bytes would overwrite other data that followed the buffer in memory. Clearly, this was a bug, commonly known as a buffer overflow. An example of a cryptography problem is the 40-bit key used in the original WEP encryption for WiFi networks, which could be easily brute-forced by attackers with sufficient computational power.

8.1.2 Fundamental Attack Principles

The easiest way to structure a discussion about the systems aspects of security is to put ourselves in the shoes of the adversary. So, having introduced fundamental aspects of security above, let us now consider the fundamentals of attacks.

From an attacker’s perspective, the security of a system presents itself as a set of challenges that must be solved to reach an objective. There are multiple ways to violate confidentiality, integrity, availability, or any of the other security properties. For instance, to break the confidentiality of network traffic, an attacker may break into a system to read the data directly, trick the communicating parties into sending data without encryption and capture it, or, in a more ambitious scenario, break the encryption. All of these are used in practice and all of them consist of multiple steps. We will dive deep into the fundamentals of attacks in Sec. 8.2. As a preview, let us consider the various steps and approaches attackers may use.

1. Reconnaissance. Alexander Graham Bell once said: ‘‘Preparation is the key to success,’’ and so it is for attackers also. The first thing you do as an attacker is get to know as much about your target as you can. If you plan to attack by means of spam or social engineering, you may want to spend some time sifting through the online profiles of the people you want to trick into giving up information, or even engage in some old-fashioned dumpster diving. In this chapter, however, we limit ourselves to the technical aspects of attacks and defenses. Reconnaissance in network security is about discovering information that helps the attacker. Which machines can we reach from the outside? Using which protocols? What is the topology of the network? What services run on which machines? Et cetera. We will discuss reconnaissance in Sec. 8.2.1.

2. Sniffing and Snooping. An important step in many network attacks concerns the interception of network packets. Certainly, if sensitive information is sent ‘‘in the clear’’ (without encryption), the ability to intercept network traffic is very useful for the attacker, but even encrypted traffic can be useful: to find out the MAC addresses of the communicating parties, who talks to whom and when, etc. Moreover, an attacker needs to intercept the encrypted traffic to break the encryption. Since an attacker has access to other people’s network traffic, the ability to sniff indicates that at least the principles of least authority and complete mediation are not sufficiently enforced. Sniffing is easy on a broadcast medium such as WiFi, but how do you intercept traffic if it does not even travel over the link to which your computer is connected? Sniffing is the topic of Sec. 8.2.2.

3. Spoofing. Another basic weapon in the hands of attackers is masquerading as someone else. Spoofed network traffic pretends to originate from some other machine. For instance, we can easily transmit an Ethernet frame or IP packet with a different source address, as a means to bypass a defense or launch denial-of-service attacks, because these protocols are very simple. However, can we also do so for complicated protocols such as TCP? After all, if you send a TCP SYN segment to set up a connection to a server with a spoofed IP address, the server will reply with its SYN/ACK segment (the second phase of the connection setup) to that address, so unless the attackers are on the same network segment, they will not see the reply. Without that reply, they will not know the sequence number used by the server, and hence, they will not be able to communicate. Spoofing circumvents the principle of complete mediation: if we cannot determine who sent a request, we cannot properly mediate it. In Sec. 8.2.3, we discuss spoofing in detail.

4. Disruption. The third component of our CIA triad, availability, has grown in importance also for attackers, with devastating DoS (Denial of Service) attacks on all sorts of organizations. Moreover, in response to new defenses, these attacks have grown ever more sophisticated. One can argue that DoS attacks abuse the fact that the principle of least common mechanism is not rigorously enforced: there is insufficient isolation. In Sec. 8.2.4, we will look at the evolution of such attacks.

Using these fundamental building blocks, attackers can craft a wide range of attacks. For instance, using reconnaissance and sniffing, attackers may find the address of a potential victim computer and discover that it trusts a server, so that any request coming from that server is automatically accepted. By means of a denial-of-service (disruption) attack, they can bring down the real server to make sure it does not respond to the victim any more, and then send spoofed requests that appear to originate from the server. In fact, this is exactly how one of the most famous attacks in the history of the Internet (on the San Diego Supercomputer Center) happened. We will discuss the attack later.

8.1.3 From Threats to Solutions

After discussing the attacker’s moves, we will consider what we can do about them. Since most attacks arrive over the network, the security community quickly realized that the network may also be a good place to monitor for attacks. In Sec. 8.3, we will look at firewalls, intrusion detection systems and similar defenses.

Where Secs. 8.2 and 8.3 address the systems-related issues of attackers getting their grubby little hands on sensitive information or systems, we devote Secs. 8.4–8.9 to the more formal aspects of network security, when we discuss cryptography and authentication. Rooted in mathematics and implemented in computer systems, a variety of cryptographic primitives help ensure that even if network traffic falls into the wrong hands, nothing too bad can happen. For instance, attackers will still not be able to break confidentiality, tamper with the content, or successfully replay a network conversation. There is a lot to say about cryptography, as there are different types of primitives for different purposes (proving authenticity, encryption using public keys, encryption using symmetric keys, etc.) and each type tends to have different implementations. In Sec. 8.4, we introduce the key concepts of cryptography, and Secs. 8.5 and 8.6 discuss symmetric-key and public-key cryptography, respectively. We explore digital signatures in Sec. 8.7 and key management in Sec. 8.8.

Sec. 8.9 discusses the fundamental problem of secure authentication. Authentication is what prevents spoofing altogether: the technique by which a process verifies that its communication partner is who it is supposed to be and not an imposter. As security became increasingly important, the community developed a variety of authentication protocols. As we shall see, they tend to build on cryptography.

In the sections following authentication, we survey concrete examples of (often crypto-based) network security solutions. In Sec. 8.10, we discuss network technologies that provide communication security, such as IPsec, VPNs, and wireless security. Section 8.11 looks at the problem of email security, including explanations of PGP (Pretty Good Privacy) and S/MIME (Secure/Multipurpose Internet Mail Extensions). Section 8.12 discusses security in the wider Web domain, with descriptions of secure DNS (DNSSEC), scripting code that runs in browsers, and the Secure Sockets Layer (SSL). As we shall see, these technologies use many of the ideas discussed in the preceding sections.

Finally, we discuss social issues in Sec. 8.13. What are the implications for important rights, such as privacy and freedom of speech? What about copyright and the protection of intellectual property? Security is an important topic, so looking at it closely is worthwhile.

Before diving in, we should reiterate that security is an entire field of study in its own right. In this chapter, we focus only on networks and communication, rather than issues related to hardware, operating systems, applications, or users. This means that we will not spend much time looking at bugs, and there is nothing here about user authentication using biometrics, password security, buffer overflow attacks, Trojan horses, login spoofing, process isolation, or viruses. All of these topics are covered at length in Chap. 9 of Modern Operating Systems (Tanenbaum and Bos, 2015). The interested reader is referred to that book for the systems aspects of security. Now let us begin our journey.

8.2 THE CORE INGREDIENTS OF AN ATTACK

As a first step, let us consider the fundamental ingredients that make up an attack. Virtually all network attacks follow a recipe that mixes some variants of these ingredients in a clever manner.

8.2.1 Reconnaissance

Say you are an attacker and one fine morning you decide that you will hack organization X. Where do you start? You do not have much information about the organization and, physically, you are an Internet away from the nearest office, so dumpster diving or shoulder surfing are not options. You can always use social engineering to try to extract sensitive information from employees by sending them emails (spam), phoning them, or befriending them on social networks, but in this book, we are interested in more technical issues related to computer networks. For instance, can you find out what computers exist in the organization, how they are connected, and what services they run?

As a starting point, we assume that an attacker has a few IP addresses of machines in the organization: Web servers, name servers, login servers, or any other machines that communicate with the outside world. The first thing the attacker will want to do is explore such a server. Which TCP and UDP ports are open? An easy way to find out is simply to try to set up a TCP connection to each and every port number. If the connection is successful, there was a service listening. For instance, if the server replies on port 25, it suggests an SMTP server is present; if the connection succeeds on port 80, there is likely a Web server; etc. We can use a similar technique for UDP (e.g., if the target replies on UDP port 53, we know it runs a domain name service because that is the port reserved for DNS).
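Such a scan is easy to express with ordinary sockets. The sketch below is a minimal illustration (the host and port list are up to the caller; real scanners such as nmap are far more careful about timeouts and parallelism):

```python
import socket

def connect_scan(host, ports, timeout=0.5):
    """Try a full TCP connection to each port; return those that accept."""
    open_ports = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.settimeout(timeout)
            # connect_ex() returns 0 on success instead of raising an error.
            if s.connect_ex((host, port)) == 0:
                open_ports.append(port)
    return open_ports
```

Because every probe completes the three-way handshake, each open port it finds also shows up in the server’s connection logs, a drawback discussed next.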

Port Scanning

Probing a machine to see which ports are active is known as port scanning and may get fairly sophisticated. The technique we described earlier, where an attacker sets up a full TCP connection to the target (a so-called connect scan), is not sophisticated at all. While effective, its major drawback is that it is very visible to the target’s security team. Many servers log successful TCP connections, and showing up in logs during the reconnaissance phase is not what an attacker wants. To avoid this, she can make the connections deliberately unsuccessful by means of a half-open scan. A half-open scan only pretends to set up connections: it sends TCP segments with the SYN flag set to all port numbers of interest and waits for the server to send the corresponding SYN/ACKs for the ports that are open, but it never completes the three-way handshake. Most servers will not log these unsuccessful connection attempts.

If half-open scans are better than connect scans, why do we still discuss the latter? The reason is that half-open scans require more advanced attackers. A full connection to a TCP port is typically possible from most machines using simple tools, such as telnet, that are often available to unprivileged users. For a half-open scan, however, attackers need to determine exactly which packets should and should not be transmitted. Most systems do not have standard tools for nonprivileged users to do this, and only users with administrator privileges can perform a half-open scan.

Connect scans (sometimes referred to as open scans) and half-open scans both assume that it is possible to initiate a TCP connection from an arbitrary machine outside the victim’s network. However, perhaps the firewall does not allow connections to be set up from the attacker’s machine. For instance, it may block all SYN segments. In that case, the attacker may have to resort to more esoteric scanning techniques. For instance, rather than a SYN segment, a FIN scan will send a TCP FIN segment, which is normally used to close a connection. At first sight, this does not make sense because there is no connection to terminate. However, the response to the FIN packet is often different for open ports (with listening services behind them) and closed ports. In particular, many TCP implementations send a TCP RST packet if the port is closed, and nothing at all if it is open. Fig. 8-2 illustrates these three basic scanning techniques.

[Figure 8-2 shows three message sequence diagrams between a scanner and port 80 of a server: (a) a connect scan, in which SYN, SYN/ACK, and ACK complete the handshake; (b) a half-open scan, in which the lone SYN is answered with a SYN/ACK; and (c) a FIN scan, in which the FIN is answered with an RST.]

Figure 8-2. Basic port scanning techniques. (a) Connect scan. (b) Half-open scan. (c) FIN scan.

By this time, you are probably thinking: ‘‘If we can do this with the SYN and FIN flags, can we try some of the other flags?’’ You would be right. Any configuration that leads to different responses for open and closed ports works. Another well-known option is to set many flags at once (FIN, PSH, and URG), something known as an Xmas scan (because the packet is lit up like a Christmas tree).

Consider Fig. 8-2(a). If a connection can be established, it means the port is open. Now look at Fig. 8-2(b). A SYN/ACK reply implies the port is open. Finally, we have Fig. 8-2(c). An RST reply means the port is closed.
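These interpretation rules can be collected into one small table-driven helper (a sketch with illustrative labels of our own choosing; a real scanner must also handle ‘‘filtered’’ ports, where a firewall silently drops the probe):

```python
# Inferred port state for each (scan type, observed response) pair,
# following the rules of Fig. 8-2. None means "no response at all."
RULES = {
    ("connect", "connection-established"): "open",
    ("half-open", "SYN/ACK"): "open",
    ("half-open", "RST"): "closed",
    ("fin", "RST"): "closed",
    ("fin", None): "open",  # many TCP stacks stay silent for open ports
}

def infer_state(scan, response):
    """Classify a port from the scan type used and the response seen."""
    return RULES.get((scan, response), "unknown")
```

Note the asymmetry the table makes explicit: for SYN-based scans a reply signals an open port, while for a FIN scan a reply (the RST) signals a closed one.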

Probing for open ports is a first step. The next thing the attacker wants to know is exactly what server runs on each port: what software, what version of the software, and on what operating system. For instance, suppose we find that port 8080 is open. This is probably a Web server, although this is not certain. Even if it is a Web server, which one is it: Nginx, Lighttpd, Apache? If an attacker only has an exploit for Apache version 2.4.37 on Windows, finding out all these details, known as fingerprinting, is important. Just like in our port scans, we do so by making use of (sometimes subtle) differences in the way these servers and operating systems reply. If all of this sounds complicated, do not worry. Like many complicated things in computer networks, some helpful soul has sat down and implemented all these scanning and fingerprinting techniques for you in friendly and versatile programs such as nmap and zmap.

Traceroute

Knowing which services are active on one machine is fine and dandy, but what about the rest of the machines in the network? Given knowledge of that first IP address, attackers may try to ‘‘poke around’’ to see what else is available. For instance, if the first machine has IP address 130.37.193.191, they might also try 130.37.193.192, 130.37.193.193, and all other possible addresses on the local network. Moreover, they can use programs such as traceroute to find the path toward the original IP address. Traceroute first sends a small batch of UDP packets to the target with the time-to-live (TTL) value set to one, then another batch with the TTL set to two, then a batch with a TTL of three, and so on. The first router lowers the TTL and immediately drops the first batch of packets (because the TTL has reached zero), sending back an ICMP error message indicating that the packets have outlived their allocated life span. The second router does the same for the second batch of packets, the third for the third batch, until eventually some UDP packets reach the target. By collecting the ICMP error packets and their source IP addresses, traceroute is able to stitch together the overall route. Attackers can use the results to scan even more targets by probing the address ranges of routers close to the target, thus obtaining a rudimentary knowledge of the network topology.
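The TTL mechanism behind traceroute is easy to simulate in pure Python (the addresses below are made up; a real traceroute needs raw sockets, elevated privileges, and handling for routers that do not answer):

```python
def traceroute_sim(path):
    """Simulate traceroute along a list of hops ending at the target.

    Probes are sent with TTL = 1, 2, 3, ...; the router where the TTL
    reaches zero drops the probe and answers with an ICMP time-exceeded
    message, revealing its address (the target itself answers the last one).
    """
    discovered = []
    for ttl in range(1, len(path) + 1):
        hops_remaining = ttl
        for hop in path:
            hops_remaining -= 1          # each hop decrements the TTL
            if hops_remaining == 0:
                discovered.append(hop)   # reply comes back from this hop
                break
    return discovered

# Hypothetical route: two routers, then the target host.
print(traceroute_sim(["10.0.0.1", "192.0.2.1", "130.37.193.191"]))
```

Each increment of the TTL exposes exactly one additional hop, which is why the collected ICMP sources, in order, reproduce the route.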

8.2.2 Sniffing and Snooping (with a Dash of Spoofing)

Many network attacks start with the interception of network traffic. For this attack ingredient, we assume that the attacker has a presence in the victim’s network. For instance, the attacker brings a laptop in range of the victim’s WiFi network, or obtains access to a PC in the wired network. Sniffing on a broadcast medium, such as WiFi or the original Ethernet implementation, is easy: you just tune into the channel at a convenient location and listen for the bits as they come thundering by. To do so, attackers set their network interfaces in promiscuous mode, making them accept all packets on the channel, even those destined for another host, and use tools such as tcpdump or Wireshark to capture the traffic.

Sniffing in Switched Networks

However, in many networks, things are not so easy. Take modern Ethernet as an example. Unlike its original incarnations, Ethernet today is no longer a proper shared-medium network technology. All communication is switched and attackers, even if they are connected to the same network segment, will never receive any of the Ethernet frames destined for the other hosts on the segment. Specifically, recall

SEC. 8.2 THE CORE INGREDIENTS OF AN ATTACK 743

that Ethernet switches are self-learning and quickly build up a forwarding table. The self-learning is simple and effective: as soon as an Ethernet frame from host A arrives at port 1, the switch records that traffic for host A should be sent on port 1. Now it knows that all traffic with host A’s MAC address in the destination field of the Ethernet header should be forwarded on port 1. Likewise, it will send the traffic for host B on port 2, and so on. Once the forwarding table is complete, the switch will no longer send any traffic explicitly addressed to host B on any port other than 2. To sniff traffic, attackers must find a way to make exactly that happen.

There are several ways for an attacker to overcome the switching problem. They all use spoofing. Nevertheless, we will discuss them in this section, since the sole goal here is to sniff traffic.

The first is MAC cloning: duplicating the MAC address of the host whose traffic you want to sniff. If you claim to have this MAC address (by sending out Ethernet frames with that address), the switch will duly record this in its table and henceforth send all traffic bound for the victim to your machine instead. Of course, this assumes that you know this address, but you should be able to obtain it from the ARP requests sent by the target, which are, after all, broadcast to all hosts in the network segment. Another complicating factor is that your mapping will be removed from the switch as soon as the original owner of the MAC address starts communicating again, so you will have to repeat this switch table poisoning constantly.

As an alternative, but in the same vein, attackers can use the fact that the switch table has a limited size and flood the switch with Ethernet frames with fake source addresses. The switch does not know the MAC addresses are fake and simply records them until the table is full, evicting older entries to include the new ones if need be. Since the switch now no longer has an entry for the target host, it reverts to broadcast for all traffic towards it. MAC flooding makes your Ethernet behave like a broadcast medium again and party like it is 1979.
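Both the self-learning behavior and the flooding attack can be sketched with a toy switch. The table capacity and addresses below are made up; real switches hold thousands of entries, which is why attackers must generate fake addresses at a high rate.

```python
# Sketch of a self-learning switch and the MAC-flooding attack against it.

class Switch:
    def __init__(self, capacity=4):
        self.table = {}           # MAC address -> port
        self.capacity = capacity  # real tables are large but finite

    def frame(self, src, dst, in_port):
        # Learn: remember which port the source MAC lives on, evicting
        # the oldest entry if the table is full (a simplification).
        if src not in self.table and len(self.table) >= self.capacity:
            self.table.pop(next(iter(self.table)))
        self.table[src] = in_port
        # Forward: known destination -> one port; unknown -> flood everywhere.
        return [self.table[dst]] if dst in self.table else ["all ports"]

sw = Switch()
sw.frame(src="B", dst="A", in_port=2)        # switch learns that B is on port 2
print(sw.frame(src="A", dst="B", in_port=1)) # traffic for B now goes only to port 2

# MAC flooding: fake source addresses push B's entry out of the table...
for i in range(10):
    sw.frame(src=f"FAKE-{i}", dst="X", in_port=3)

# ...so frames for B are broadcast again, and the attacker can sniff them.
print(sw.frame(src="A", dst="B", in_port=1))
```

The first lookup returns only port 2; after the flood, the switch has forgotten B and falls back to broadcasting, just as the text describes.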

Instead of confusing the switch, attackers can also target hosts directly in a so-called ARP spoofing or ARP poisoning attack. Recall from Chap. 5 that the ARP protocol helps a computer find the MAC address corresponding to an IP address. For this purpose, the ARP implementation on a machine maintains a table with mappings from IP to MAC addresses for all hosts that have communicated with this machine (the ARP table). Each entry has a time-to-live (TTL) of, typically, a few tens of minutes. After that, the MAC address of the remote party is silently forgotten, assuming there is no further communication between these parties (in which case the TTL is reset), and all subsequent communication requires an ARP lookup first. The ARP lookup is simply a broadcast message that says something like: ‘‘Folks, I am looking for the MAC address of the host with IP address 192.168.2.24. If this is you, please let me know.’’ The lookup request contains the requester’s MAC address, so host 192.168.2.24 knows where to send the reply, and also the requester’s IP address, so 192.168.2.24 can add the IP-to-MAC mapping of the requester to its own ARP table.


Whenever the attacker sees such an ARP request for host 192.168.2.24, she can race to supply the requester with her own MAC address. In that case, all communication for 192.168.2.24 will be sent to the attacker’s machine. In fact, since ARP implementations tend to be simple and stateless, the attacker can often just send ARP replies even if there was no request at all: the ARP implementation will accept the replies at face value and store the mappings in its ARP table.
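The statelessness is the whole problem, as a minimal sketch shows. The addresses below are made up; the point is that nothing in the handler checks whether a request was ever sent, or who is answering.

```python
# Sketch of why ARP spoofing works: a stateless ARP cache that, like many
# real implementations, accepts replies it never asked for.

arp_table = {}  # IP address -> MAC address

def handle_arp_reply(ip, mac):
    # No check that we sent a request, and no check on who answered.
    arp_table[ip] = mac

# The victim legitimately learns a neighbor's MAC address...
handle_arp_reply("192.168.2.24", "aa:aa:aa:aa:aa:aa")

# ...but an unsolicited, spoofed reply from the attacker simply overwrites it.
handle_arp_reply("192.168.2.24", "ee:ee:ee:ee:ee:ee")

print(arp_table["192.168.2.24"])  # all traffic for 192.168.2.24 now goes to the attacker
```

Doing this for both endpoints, and forwarding the frames onward, is exactly the stealthy MITM gateway described below.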

By using this same trick on both communicating parties, the attacker receives all the traffic between them. By subsequently forwarding the frames to the right MAC addresses again, the attacker has installed a stealthy MITM (Man-in-the-Middle) gateway, capable of intercepting all traffic between the two hosts.

8.2.3 Spoofing (beyond ARP)

In general, spoofing means sending bytes over the network with a falsified source address. Besides ARP packets, attackers may spoof any other type of network traffic. For instance, SMTP (Simple Mail Transfer Protocol) is a friendly, text-based protocol that is used everywhere for sending email. It uses the Mail From: header as an indication of the source of an email, but by default it does not check the correctness of the email address. In other words, you can put anything you want in this header. All replies will be sent to this address. Incidentally, the content of the Mail From: header is not even shown to the recipient of the email message. Instead, your mail client shows the content of a separate From: header. However, there is no check on this field either, and SMTP allows you to falsify it, so that the email that you send to your fellow students informing them that they failed the course appears to have been sent by the course instructor. If you additionally set the Mail From: header to your own email address, all replies sent by panicking students will end up in your mailbox. What fun you will have! Less innocently, criminals frequently spoof email to send phishing emails from seemingly trusted sources. That email from ‘‘your doctor’’ telling you to click on the link below to get urgent information about your medical test may lead to a site that says everything is normal, but fails to mention that it just downloaded a virus to your computer. The one from ‘‘your bank’’ can be bad for your financial health.
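The two independent ‘‘from’’ fields are easy to demonstrate with Python’s standard library. Nothing is sent here; we only build a message. All addresses are made up.

```python
# Sketch of the displayed From: header vs. the SMTP envelope sender (MAIL FROM).
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "instructor@university.example"   # what the recipient's mail client displays
msg["To"] = "student@university.example"
msg["Subject"] = "You failed the course"
msg.set_content("Please contact me immediately.")

# The envelope sender (MAIL FROM) is passed separately during the SMTP
# dialogue and never appears in the displayed headers; a server that does
# not verify it will accept anything here:
envelope_from = "attacker@elsewhere.example"

print(msg["From"], "!=", envelope_from)  # two unchecked, independent fields
```

When sending for real (e.g., with smtplib), the envelope sender is a separate argument to the send call, which is precisely why neither field can be trusted on its own.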

ARP spoofing occurs at the link layer, and SMTP spoofing at the application layer, but spoofing may happen at any layer in the protocol stack. Sometimes, spoofing is easy. For instance, anyone with the ability to craft custom packets can create fake Ethernet frames, IP datagrams, or UDP packets. You only need to change the source address and that is it: these protocols do not have any way to detect the tampering. Other protocols are much more challenging. For instance, in TCP connections the endpoints maintain state, such as the sequence and acknowledgement numbers, that makes spoofing much trickier. Unless the attacker can sniff or guess the appropriate sequence numbers, the spoofed TCP segments will be rejected by the receiver as ‘‘out-of-window.’’ As we shall see later, there are other substantial difficulties as well.


Even the simple protocols allow attackers to cause a lot of damage. Shortly, we will see how spoofed UDP packets may lead to devastating DoS (Denial-of-Service) attacks. First, however, we consider how spoofing permits attackers to intercept what clients send to a server, by spoofing UDP datagrams in DNS.

DNS Spoofing

Since DNS uses UDP for its requests and replies, spoofing should be easy. For instance, just like in the ARP spoofing attack, we could wait for a client to send a lookup request for domain trusted-services.com and then race with the legitimate domain name system to provide a false reply that informs the client that trusted-services.com is located at an IP address owned by us. Doing so is easy if we can sniff the traffic coming from the client (and, thus, see the DNS lookup request to which to respond), but what if we cannot see the request? After all, if we can already sniff the communication, intercepting it via DNS spoofing is not that useful. Also, what if we want to intercept the traffic of many people instead of just one?

The simplest solution, if attackers share the local name server of the victim, is that they send their own request for, say, trusted-services.com, which in turn will trigger the local name server to do a lookup for this IP address on their behalf by contacting the next name server in the lookup process. The attackers immediately ‘‘reply’’ to this request by the local name server with a spoofed reply that appears to come from the next name server. The result is that the local name server stores the falsified mapping in its cache and serves it to the victim when it finally does the lookup for trusted-services.com (and anyone else who may be looking up the same name). Note that even if the attackers do not share the local name server, the attack may still work if the attacker can trick the victim into doing a lookup request with the attacker-provided domain name. For instance, the attacker could send an email that urges the victim to click on a link, so that the browser will do the name lookup for the attacker. After poisoning the mapping for trusted-services.com, all subsequent lookups for this domain will return the false mapping.

The astute reader will object that this is not so easy at all. After all, each DNS request carries a 16-bit query ID and a reply is accepted only if the ID in the reply matches. But if the attackers cannot see the request, they have to guess the identifier. For a single reply, the odds of getting it right are one in 65,536. On average, an attacker would have to send tens of thousands of DNS replies in a very short time to falsify a single mapping at the local name server, and do so without being noticed. Not easy.

Birthday Attack

There is an easier way that is sometimes referred to as a birthday attack (or birthday paradox, even though strictly speaking it is not a paradox at all). The idea for this attack comes from a technique that math professors often use in their


probability courses. The question is: how many students do you need in a class before the probability of having two people with the same birthday exceeds 50%? Most of us expect the answer to be way over 100. In fact, probability theory says it is just 23. With 23 people, the probability of none of them having the same birthday is:

365/365 × 364/365 × 363/365 × . . . × 343/365 = 0.492703

In other words, the probability of two students celebrating their birthday on the same day is over 50%.

More generally, if there is some mapping between inputs and outputs with n inputs (people, identifiers, etc.) and k possible outputs (birthdays, identifiers, etc.), there are n(n − 1)/2 input pairs. If n(n − 1)/2 > k, the chance of having at least one match is pretty good. Thus, approximately, a match is likely for n > √(2k). The key is that rather than look for a match for one particular student’s birthday, we compare everyone to everyone else, and any match counts.
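Both the exact birthday product and the √(2k) rule of thumb are easy to check, and the same arithmetic applies directly to 16-bit DNS query IDs:

```python
# Verifying the birthday numbers from the text, plus the n > sqrt(2k)
# rule of thumb for when a collision becomes likely.
from math import prod, sqrt

def p_all_distinct(n, k=365):
    """Probability that n samples from k equally likely values are all distinct."""
    return prod((k - i) / k for i in range(n))

print(p_all_distinct(23))   # about 0.4927: a shared birthday is >50% likely at 23 people
print(sqrt(2 * 365))        # about 27: same order of magnitude as the exact answer of 23
print(sqrt(2 * 65536))      # about 362: roughly how many random 16-bit IDs collide
```

The last line is the ominous one for DNS: a few hundred values drawn from a 65,536-entry space already make a collision likely.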

Using this insight, the attackers first send a few hundred DNS requests for the domain mapping they want to falsify. The local name server will try to resolve each of these requests individually by asking the next-level name server. This is perhaps not very smart, because why would you send multiple queries for the same domain, but few people have argued that name servers are smart, and this is how the popular BIND name server operated for a long time. Anyway, immediately after sending the requests, the attackers also send hundreds of spoofed ‘‘replies’’ for the lookup, each pretending to come from the next-level name server and carrying a different guess for the query ID. The local name server implicitly performs the many-to-many comparison for us because if any reply ID matches that of a request sent by the local name server, the reply will be accepted. Note how this scenario resembles that of the students’ birthdays: the name server compares all requests sent by the local name server with all spoofed replies.
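A rough model makes the advantage concrete. With r outstanding requests and s spoofed replies, there are r × s request/reply pairs, each matching a random 16-bit ID with probability 1/65,536; the numbers below are illustrative and the independence assumption is an approximation.

```python
# Rough success probability of the birthday variant of DNS spoofing.

def p_spoof_success(requests, replies, id_space=65536):
    pairs = requests * replies
    # Approximate each pair as an independent 1/id_space chance of matching.
    return 1 - (1 - 1 / id_space) ** pairs

print(p_spoof_success(1, 1))       # a single blind guess: about 0.0000153
print(p_spoof_success(300, 300))   # a few hundred each way: roughly 3 in 4
```

Compare this with the one-request case in the previous paragraphs: going from one outstanding request to a few hundred turns a hopeless guessing game into an attack that usually succeeds on the first burst.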

By poisoning the local name server for a particular Web site, say, the attackers obtain access to the traffic sent to this site for all clients of the name server. By setting up their own connections to the Web site and then relaying all communication from the clients and all communication from the server, they now serve as a stealthy man-in-the-middle.

Kaminsky Attack

Things may get even worse when attackers poison the mapping not just for a single Web site, but for an entire zone. The attack is known as Dan Kaminsky’s DNS attack and it caused a huge panic among information security officers and network administrators the world over. To see why everybody got their knickers in a twist, we should go into DNS lookups in a little more detail.


Consider a DNS lookup request for the IP address of www.cs.vu.nl. Upon reception of this request, the local name server, in turn, sends a request either to the root name server or, more commonly, to the TLD (top-level domain) name server for the .nl domain. The latter is more common because the IP address of the TLD name server is often already in the local name server’s cache. Figure 8-3 shows this request by the local name server (asking for an ‘‘A record’’ for the domain) in a recursive lookup with query ID 1337.

Figure 8-3. A DNS request for www.cs.vu.nl. (The figure shows a UDP packet with source port x and destination port 53, transaction ID 1337, one question (‘‘What is the A record of www.cs.vu.nl?’’), and flags indicating a standard query with recursion desired, RD = 1.)

The TLD server does not know the exact mapping, but does know the names of the DNS servers of Vrije Universiteit, which it sends back in a reply, since it does not do recursive lookups, thank you very much. The reply, shown in Fig. 8-4, has a few interesting fields to discuss. First, we observe, without going into details, that the flags indicate explicitly that the server does not want to do recursive lookups, so the remainder of the lookup will be iterative. Second, the query ID of the reply is also 1337, matching that of the lookup. Third, the reply provides the symbolic names of the name servers of the university, ns1.vu.nl and ns2.vu.nl, as NS records. These answers are authoritative and, in principle, suffice for the local name server to complete the query: by first performing a lookup for the A record of one of the name servers and subsequently contacting it, it can ask for the IP address of www.cs.vu.nl. However, doing so means that it will first contact the same TLD name server again, this time to ask for the IP address of the university’s name server, and as this incurs an extra round trip time, it is not very efficient. To avoid this extra lookup, the TLD name server helpfully provides the IP addresses of the two university name servers as additional records in its reply, each with a short TTL. These additional records are known as DNS glue records and are the key to the Kaminsky attack.

Here is what the attackers will do. First, they send lookup requests for a nonexisting subdomain of the university domain, like ohdeardankaminsky.vu.nl. Since the subdomain does not exist, no name server can provide the mapping from its


Figure 8-4. A DNS reply sent by the TLD name server. (The figure shows a UDP packet with source port 53 and destination port x, the same as in the request; transaction ID 1337; flags that may indicate this is a reply and recursion is not available, RA = 0; the original question; zero answers; two resource records of authoritative servers, ns1.vu.nl and ns2.vu.nl; and two additional/glue records mapping ns1.vu.nl to 130.37.129.4 and ns2.vu.nl to 130.37.129.5.)

cache. The local name server will instead contact the TLD name server. Immediately after sending the requests, the attackers also send many spoofed replies, pretending to be from the TLD name server, just like in a regular DNS spoofing attack, except this time the reply indicates that the TLD name server does not know the answer (i.e., it does not provide the A record), does not do recursive lookups, and advises the local name server to complete the lookup by contacting one of the university name servers. It may even provide the real names of these name servers. The only things the attackers falsify are the glue records, for which they supply IP addresses that they control. As a result, every lookup for any subdomain of .vu.nl will contact the attackers’ name server, which can provide a mapping to any IP address it wants. In other words, the attackers are able to operate as man-in-the-middle for any site in the university domain!
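The attack flow can be sketched against a toy resolver. Everything here is made up (the resolver, the addresses, the 200-replies-per-round burst), and a real resolver checks more than just the transaction ID, but the essential weakness is the same: one matched guess poisons the glue record for the whole zone.

```python
# Sketch of the Kaminsky attack against a toy resolver cache.
import random
random.seed(42)  # fixed seed so the illustration is deterministic

class ToyResolver:
    def __init__(self):
        self.cache = {}    # name -> IP, including name-server glue
        self.pending = {}  # outstanding query ID -> queried name

    def lookup(self, name):
        qid = random.randrange(65536)
        self.pending[qid] = name
        return qid

    def receive_reply(self, qid, glue):
        if qid in self.pending:      # the only check: does the ID match?
            del self.pending[qid]
            self.cache.update(glue)  # glue records accepted on faith
            return True
        return False

resolver = ToyResolver()
ATTACKER_IP = "198.51.100.66"  # hypothetical attacker-controlled address

poisoned = False
for _ in range(10_000):  # bound the simulation; success comes much sooner
    # Trigger a lookup for a fresh, nonexistent subdomain (never cached),
    # then spray spoofed replies with guessed IDs carrying forged glue.
    resolver.lookup(f"random{random.randrange(1 << 30)}.vu.nl")
    for _ in range(200):
        guess = random.randrange(65536)
        if resolver.receive_reply(guess, {"ns1.vu.nl": ATTACKER_IP}):
            poisoned = True
            break
    if poisoned:
        break

print(resolver.cache["ns1.vu.nl"])  # every future *.vu.nl lookup asks the attacker
```

Because each round uses a fresh nonexistent name, the attacker never has to wait for a cached entry to expire, which is what made Kaminsky’s variant so much worse than ordinary cache poisoning.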

While not all name server implementations were vulnerable to this attack, most of them were. Clearly, the Internet had a problem. An emergency meeting was hastily organized in Microsoft’s headquarters in Redmond. Kaminsky later stated that all of this was shrouded in such secrecy that ‘‘there were people on jets to Microsoft who didn’t even know what the bug was.’’

So how did these clever people solve the problem? The answer is, they didn’t, not really. What they did do is make it harder. Recall that a core problem of these DNS spoofing attacks is that the query ID is only 16 bits, making it possible to guess it, either directly or by means of a birthday attack. A larger query ID makes the attack much less likely to succeed. However, simply changing the format of the DNS protocol message is not so easy and would also break many existing systems.


The solution was to extend the length of the random ID without really extending the query ID, by instead introducing randomness also in the UDP source port. When sending out a DNS request to, say, the TLD name server, a patched name server would pick a random port out of thousands of possible port numbers and use that as the UDP source port. Now the attacker must guess not just the query ID, but also the port number, and do so before the legitimate reply arrives. The 0x20 encoding that we described in Chap. 7 exploits the case-insensitive nature of DNS queries to add even more bits to the transaction ID.
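A back-of-the-envelope calculation shows why port randomization helps so much. The port-space size below is illustrative (the usable ephemeral range varies by system):

```python
# Effect of source-port randomization on blind DNS spoofing. A spoofed
# reply must now match both the 16-bit query ID and the UDP source port.

def guess_probability(replies, id_space=65536, port_space=1):
    targets = id_space * port_space
    return 1 - (1 - 1 / targets) ** replies

# 300 requests x 300 replies, as in the birthday attack above:
print(guess_probability(300 * 300))                    # ID only: roughly 3 in 4
print(guess_probability(300 * 300, port_space=60000))  # ID + random port: about 0.00002
```

Multiplying the search space by tens of thousands does not make blind spoofing impossible, only impractical, which is exactly what the text means by ‘‘they didn’t, not really.’’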

Fortunately, DNSSEC provides a more solid defense against DNS spoofing. DNSSEC consists of a collection of extensions to DNS that offer both integrity and origin authentication of DNS data to DNS clients. However, DNSSEC deployment has been extremely slow. The initial work on DNSSEC was conducted in the early 1990s and the first RFC was published by the IETF in 1997; DNSSEC is now starting to see more widespread deployment, as we will discuss later in this chapter.

TCP Spoofing

Compared to the protocols discussed so far, spoofing in TCP is infinitely more complicated. When attackers want to pretend that a TCP segment came from another computer on the Internet, they not only have to guess the port number, but also the correct sequence numbers. Moreover, keeping a TCP connection in good shape while injecting spoofed TCP segments is very complicated. We distinguish between two cases:

1. Connection spoofing. The attacker sets up a new connection, pretending to be someone at a different computer.

2. Connection hijacking. The attacker injects data in a connection that already exists between two parties, pretending to be either of these two parties.

The best-known example of TCP connection spoofing was the attack by Kevin Mitnick against the San Diego Supercomputing Center (SDSC) on Christmas day 1994. It is one of the most famous hacks in history, and the subject of several books and movies. Incidentally, one of them is a fairly big-budget flick called ‘‘Takedown,’’ which is based on a book written by the system administrator of the Supercomputing Center. (Perhaps not surprisingly, the administrator in the movie is portrayed as a very cool guy.) We discuss it here because it illustrates the difficulties in TCP spoofing quite well.

Kevin Mitnick had a long history of being an Internet bad boy before he set his sights on SDSC. Incidentally, attacking on Christmas day is generally a good idea because on public holidays there are fewer users and administrators around. After some initial reconnaissance, Mitnick discovered that an (X-terminal) computer in SDSC had a trust relationship with another (server) machine in the same center.


Fig. 8-5(a) shows the configuration. Specifically, the server was implicitly trusted and anyone on the server could log in on the X-terminal as administrator using remote shell (rsh) without the need to enter a password. His plan was to set up a TCP connection to the X-terminal, pretending to be the server, and use it to turn off password protection altogether—in those days, this could be done by writing ‘‘+ +’’ in the .rhosts file.

Doing so, however, was not easy. If Mitnick had sent a spoofed TCP connection setup request (a SYN segment) to the X-terminal with the IP address of the server (step 1 in Fig. 8-5(b)), the X-terminal would have sent its SYN/ACK reply to the actual server, and this reply would have been invisible to Mitnick (step 2 in Fig. 8-5(b)). As a result, he would not know the X-terminal’s initial sequence number (ISN), a more-or-less random number that he would need for the third phase of the TCP handshake (which, as we saw earlier, is the first segment that may contain data). What is worse, upon reception of the SYN/ACK, the server would have immediately responded with an RST segment to terminate the connection setup (step 3 in Fig. 8-5(c)). After all, there must have been a problem, as it never sent a SYN segment.

Figure 8-5. Challenges faced by Kevin Mitnick during the attack on SDSC. (Panel (a) shows the trust relationship between the server and the X-terminal: the server can log in without a password. In panel (b), Mitnick sends a spoofed SYN to the X-terminal (step 1), whose SYN/ACK goes to the trusted server and is not visible to Mitnick (step 2). In panel (c), the trusted server responds with an RST that terminates the handshake (step 3).)

Note that the problem of the invisible SYN/ACK, and hence the missing initial sequence number (ISN), would not have been a problem at all if the ISN had been predictable, for instance, if it started at 0 for every new connection. However, since the ISN was chosen more or less randomly for every connection, Mitnick needed to find out how it was generated in order to predict the number that the X-terminal would use in its invisible SYN/ACK to the server.

To overcome these challenges, Mitnick launched his attack in several steps. First, he interacted extensively with the X-terminal using nonspoofed SYN messages (step 1 in Fig. 8-6(a)). While these TCP connection attempts did not get him access to the machine, they did give him a sequence of ISNs. Fortunately for Kevin, the ISNs were not that random. He stared at the numbers for a while until he found a pattern and was confident that given one ISN, he would be able to predict the next one. Next, he made sure that the trusted server would not be able to reset his connection attempts by launching a DoS attack that made the server unresponsive (step 2 in Fig. 8-6(b)). Now the path was clear to launch his real attack.


After sending the spoofed SYN packet (step 3 in Fig. 8-6(b)), he predicted the sequence number that the X-terminal would be using in its SYN/ACK reply to the server (step 4 in Fig. 8-6(b)) and used this in the third and final step, where he sent the command echo ‘‘+ +’’ >> .rhosts as data to the port used by the remote shell daemon (step 5 in Fig. 8-6(c)). After that, he could log in from any machine without a password.

Figure 8-6. Mitnick’s attack. (In panel (a), Mitnick probes the X-terminal with nonspoofed connections to guess the ISN pattern (step 1). In panel (b), he makes the trusted server unresponsive with a DoS attack so it cannot send an RST (step 2), sends a spoofed SYN to the X-terminal (step 3), and the X-terminal’s SYN/ACK goes to the silenced server (step 4). In panel (c), he completes the third phase of the TCP handshake with the guessed ACK number and the data echo ‘‘+ +’’ >> .rhosts (step 5).)

Since one of the main weaknesses exploited by Mitnick was the predictability of TCP’s initial sequence numbers, the developers of network stacks have since spent much effort on improving the randomness of TCP’s choice for these security-sensitive numbers. As a result, the Mitnick attack is no longer practical. Modern attackers need to find a different way to guess the initial sequence numbers, for instance, the one employed in the connection hijacking attack we describe next.
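The ISN-prediction step is easy to sketch. The generator below is a made-up stand-in for the weak schemes of the era (some early stacks incremented the ISN by a fixed step per connection; the step and seed here are invented):

```python
# Sketch of why predictable ISNs were fatal: a toy generator that bumps
# the ISN by a constant per connection, and an attacker who infers the step.

class WeakIsnGenerator:
    def __init__(self, seed, step=64_000):
        self.isn, self.step = seed, step

    def next_isn(self):
        self.isn = (self.isn + self.step) % 2**32
        return self.isn

target = WeakIsnGenerator(seed=123_456_789)

# Phase 1: open a few legitimate connections and observe the ISNs.
samples = [target.next_isn() for _ in range(3)]
step = samples[1] - samples[0]            # infer the pattern from the probes

# Phase 2: predict the ISN the target will use for the NEXT (spoofed)
# connection, without ever seeing its SYN/ACK.
predicted = (samples[-1] + step) % 2**32
actual = target.next_isn()
print(predicted == actual)                # the blind handshake can be completed
```

With a cryptographically random ISN, `step` carries no information and the prediction fails with overwhelming probability, which is why modern stacks randomize it.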

TCP Connection Hijacking

Compared to connection spoofing, connection hijacking adds even more hurdles to overcome. For now, let us assume that the attackers are able to eavesdrop on an existing connection between two communicating parties (because they are on the same network segment) and therefore know the exact sequence numbers and all other relevant information related to this communication. In a hijacking attack, the aim is to take over an existing connection by injecting data into the stream.

To make this concrete, let us assume that the attacker wants to inject some data into the TCP connection that exists between a client who is logged in to a Web application at a server, with the aim of making either the client or the server receive attacker-injected bytes. In our example, the sequence numbers of the last bytes sent by the client and server are 1000 and 12,500, respectively. Assume that all data received so far have been acknowledged and the client and server are not currently sending any data. Now the attacker injects, say, 100 bytes into the TCP stream to the server, by sending a spoofed packet with the client’s IP address and source port, as well as the server’s IP address and destination port. This 4-tuple is enough to make the network stack demultiplex the data to the right socket. In addition, the attacker provides the appropriate sequence number (1001) and acknowledgement number (12,501), so TCP will pass the 100-byte payload to the Web server.


However, there is a problem. After passing the injected bytes to the application, the server will acknowledge them to the client: ‘‘Thank you for the bytes, I am now ready to receive byte number 1101.’’ This message comes as a surprise to the client, who thinks the server is confused. After all, it never sent any data, and still intends to send byte 1001. It promptly tells the server so, by sending an empty segment with sequence number 1001 and acknowledgement number 12,501. ‘‘Wow,’’ says the server, ‘‘thanks, but this looks like an old ACK. By now, I already received the next 100 bytes. Best tell the remote party about this.’’ It resends the ACK (seq = 12,501, ack = 1101), which leads to another ACK by the client, and so on. This phenomenon is known as an ACK storm. It will never stop until one of the ACKs gets lost (because TCP does not retransmit dataless ACKs).
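The deadlock is pure bookkeeping, as a small sketch shows (sequence numbers follow the example in the text; the five-round loop is arbitrary, since in reality the storm only ends when an ACK is lost):

```python
# Sketch of the ACK-storm bookkeeping after 100 bytes are injected into
# an otherwise idle connection.

client = {"next_seq": 1001, "expects": 12501}  # client's view: it sent nothing
server = {"next_seq": 12501, "expects": 1101}  # server's view after the injection

def client_ack():
    # "I never sent that!" -> empty segment restating the client's position.
    return {"seq": client["next_seq"], "ack": client["expects"]}

def server_ack():
    # "That seq is old, I already have byte 1100" -> re-ACK what it received.
    return {"seq": server["next_seq"], "ack": server["expects"]}

storm = []
for _ in range(5):           # five round trips of pure, dataless ACKs
    storm.append(server_ack())
    storm.append(client_ack())

# Neither side's state ever changes, so the exchange repeats identically.
print(storm[0] == storm[-2] and storm[1] == storm[-1])
```

Since neither ACK carries data, neither side ever advances, and nothing in TCP itself breaks the cycle: only packet loss does.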

How does the attacker quell the ACK storm? There are several tricks and we will discuss all of them. The simplest one is to tear down the connection explicitly by sending an RST segment to the communicating parties. Alternatively, the attacker may be able to use ARP poisoning to cause one of the ACKs to be sent to a nonexisting address, forcing it to get lost. An alternative strategy is to desynchronize the two sides of the connection so much that all data sent by the client will be ignored by the server and vice versa. Doing so by sending lots of data is quite involved, but an attacker can easily accomplish this at the connection setup phase. The idea is as follows. The attacker waits until the client sets up a connection to the server. As soon as the server replies with a SYN/ACK, the attacker sends it an RST packet to terminate the connection, immediately followed by a SYN packet, with the same IP address and TCP source port as the ones originally used by the client, but a different client-side sequence number. After the subsequent SYN/ACK by the server, the server and client are both in the established state, but they cannot communicate with each other, because their sequence numbers are so far apart that they are always out-of-window. Instead, the attacker plays the role of man-in-the-middle and relays data between the two parties, able to inject data at will.

Off-Path TCP Exploits

Some of the attacks are very complex and hard to even understand, let alone defend against. In this section we will look at one of the more complicated ones. In most cases, attackers are not on the same network segment and cannot sniff the traffic between the parties. Attacks in such a scenario are known as off-path TCP exploits and are very tricky to pull off. Even if we ignore the ACK storm, the attacker needs a lot of information to inject data into an existing connection:

1. Even before the actual attack, the attackers should discover that there is a connection between two parties on the Internet to begin with.

2. Then they should determine the port numbers to use.

3. Finally, they need the sequence numbers.


Quite a tall order if you are on the other side of the Internet, but not necessarily impossible. Decades after the Mitnick attack on SDSC, security researchers discovered a new vulnerability that permitted them to perform an off-path TCP exploit on widely deployed Linux systems. They described their attack in a paper titled ‘‘Off-Path TCP Exploits: Global Rate Limit Considered Dangerous,’’ which is a very apt title, as we shall see. We discuss it here because it illustrates that secret information can sometimes leak in an indirect way.

Ironically, the attack was made possible by a novel feature that was supposed to make the system more secure, not less secure. Recall that we said off-path data injections were very difficult because the attacker had to guess the port numbers and the sequence numbers and getting this right in a brute force attack is unlikely. Still, you just might get it right. Especially since you do not even have to get the sequence number exactly right, as long as the data you send is ‘‘in-window.’’ This means that with some (small) probability, attackers may reset, or inject data into existing connections. In August 2010, a new TCP extension appeared in the form of RFC 5961 to remedy this problem.

RFC 5961 changed how TCP handled the reception of SYN segments, RST segments, and regular data segments. The reason that the vulnerability existed only in Linux is that only Linux implemented the RFC correctly. To explain what it did, we should consider first how TCP worked before the extension. Let us consider the reception of SYN segments first. Before RFC 5961, whenever TCP received a SYN segment for an already existing connection, it would discard the packet if it was out-of-window, but it would reset the connection if it was in-window. The reason is that upon receiving a SYN segment, TCP would assume that the other side had restarted and thus that the existing connection was no longer valid. This is not good, as an attacker only needs to get one SYN segment with a sequence number somewhere in the receiver window to reset a connection. What RFC 5961 proposed instead was to not reset the connection immediately, but first send a challenge ACK to the apparent sender of the SYN. If the packet did come from the legitimate remote peer, it means that it really did lose the previous connection and is now setting up a new one. Upon receiving the challenge ACK, it will therefore send an RST packet with the correct sequence number. The attackers cannot do this since they never received the challenge ACK.

The same story holds for RST segments. In traditional TCP, hosts would drop RST packets if they were out-of-window, and reset the connection if they were in-window. To make it harder to reset someone else’s connection, RFC 5961 proposed to reset the connection immediately only if the sequence number in the RST segment was exactly the one at the start of the receiver window (i.e., the next expected sequence number). If the sequence number is not an exact match, but still in-window, the host does not drop the connection, but sends a challenge ACK. If the sender is legitimate, it will send an RST packet with the right sequence number.

Finally, for data segments, old-style TCP conducts two checks. First, it checks the sequence number. If that was in-window, it also checks the acknowledgement

754 NETWORK SECURITY CHAP. 8

number. It considers acknowledgement numbers valid as long as they fall in an (enormous) interval. Let us denote the sequence number of the first unacknowledged byte by FUB and the sequence number of the next byte to be sent by NEXT. All packets with acknowledgement numbers in [FUB − 2GB, NEXT] are valid: half the ACK number space. This is easy to get right for an attacker! Moreover, if the acknowledgement number also happens to be in-window, it would process the data and advance the window in the usual way. Instead, RFC 5961 says that while we should accept packets with acknowledgement numbers that are (roughly) in-window, we should send challenge ACKs for the ones that are in the window [FUB − 2GB, FUB − MAXWIN], where MAXWIN is the largest window ever advertised by the peer.
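In code, the old acceptance test and the new challenge window look roughly like this. This is a sketch using 32-bit modular arithmetic; FUB, NEXT, and MAXWIN are as defined above, and the helper `between` is ours, not part of any real stack.

```python
# ACK-number checks around RFC 5961, using 32-bit modular arithmetic.
MOD = 2 ** 32
GB2 = 2 ** 31          # "2GB" of sequence space

def between(x, lo, hi):
    """True if x lies in the circular interval [lo, hi]."""
    return (x - lo) % MOD <= (hi - lo) % MOD

def ack_valid_old(ack, fub, nxt):
    # Pre-RFC 5961: anything in [FUB - 2GB, NEXT] was acceptable,
    # half of the whole ACK space, easy for an attacker to hit.
    return between(ack, (fub - GB2) % MOD, nxt)

def ack_needs_challenge(ack, fub, maxwin):
    # RFC 5961: ACKs in [FUB - 2GB, FUB - MAXWIN] get a challenge ACK
    # instead of being processed.
    return between(ack, (fub - GB2) % MOD, (fub - maxwin) % MOD)
```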

The designers of the protocol extension quickly recognized that it may lead to a huge number of challenge ACKs, and proposed ACK throttling as a solution. In the implementation of Linux, this meant that it would send at most 100 challenge ACKs per second, across all connections. In other words, a global variable shared by all connections kept track of how many challenge ACKs were sent and if the counter reached 100, it would send no more challenge ACKs for that one-second interval, whatever happened.
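Such a throttle can be sketched in a handful of lines. The limit of 100 and the one-second window follow the description above; the injectable clock parameter is only there to make the sketch testable.

```python
import time

# One global challenge-ACK budget shared by all connections: at most
# `limit` challenge ACKs per one-second interval, then silence.

class ChallengeAckThrottle:
    def __init__(self, limit=100, clock=time.monotonic):
        self.limit = limit
        self.clock = clock
        self.window_start = clock()
        self.sent = 0

    def may_send(self):
        now = self.clock()
        if now - self.window_start >= 1.0:  # a new one-second interval
            self.window_start = now
            self.sent = 0
        if self.sent < self.limit:
            self.sent += 1
            return True
        return False                         # budget exhausted this interval
```

Note that `self.sent` is a single counter shared by every connection on the machine, which is exactly the kind of global state the attack exploits.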

All this sounds good, but there is a problem. A single global variable represents shared state that can serve as a side channel for clever attacks. Let us take the first obstacle the attackers must overcome: are the two parties communicating? Recall that a challenge ACK is sent in three scenarios:

1. A SYN segment has the right source and destination IP addresses and port numbers, regardless of the sequence number.

2. A RST segment where the sequence number is in-window.

3. A data segment where additionally the acknowledgement number is in the challenge window.

Let us say that the attackers want to know whether a user at 130.37.20.7 is talking to a Web server (destination port 80) at 37.60.194.64. Since the attackers need not get the sequence number right, they only need to guess the source port number. To do so, they set up their own connection to the Web server and send 100 RST packets in quick succession, in response to which the server sends 100 challenge ACKs, unless it has already sent some challenge ACKs, in which case it would send fewer. However, this is quite unlikely. In addition to the 100 RSTs, the attackers therefore send a spoofed SYN segment, pretending to be the client at 130.37.20.7, with a guessed port number. If the guess is wrong, nothing happens and the attackers will still receive the 100 challenge ACKs. However, if they guessed the port number correctly, we end up in scenario (1), where the server sends a challenge ACK to the legitimate client. But since the server can only send 100 challenge ACKs per second, this means that the attackers receive only 99. In other words, by counting the number of challenge ACKs, the attackers can determine not just that the two hosts

SEC. 8.2 THE CORE INGREDIENTS OF AN ATTACK 755

are communicating, but even the (hidden) source port number of the client. Of course, you need quite a few tries to get it right, but this is definitely doable. Also, there are various techniques to make this more efficient.

Once the attackers have the port number, they can move to the next phase of the attack: guessing the sequence and acknowledgement numbers. The idea is quite similar. For the sequence number, the attackers again send 100 legitimate RST packets (spurring the server into sending challenge ACKs) and an additional spoofed RST packet with the right IP addresses and the now known port numbers, as well as a guessed sequence number. If the guess is in-window, we are in scenario (2). Thus, by counting the challenge ACKs the attackers receive, they can determine whether the guess was correct.

Finally, for the acknowledgement number they send, in addition to the 100 RST packets, a data packet with all fields filled in correctly, but with a guess for the acknowledgement number, and apply the same trick. Now the attackers have all the information they need to reset the connection, or inject data.
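The counting trick can be captured in a toy model. The budget of 100 challenge ACKs per second follows the text; everything else about real TCP is abstracted away.

```python
# Toy model of the challenge-ACK counting side channel. The attacker
# sends 100 RSTs on its own connection plus one spoofed probe; if the
# probe's guess is correct, the server burns one challenge ACK on the
# legitimate client, and the attacker counts only 99 replies.

def challenge_acks_received(guess_correct, budget=100, own_rsts=100):
    if guess_correct and budget > 0:
        budget -= 1            # challenge ACK goes to the real client
    received = 0
    for _ in range(own_rsts):  # replies to the attacker's own RSTs
        if budget > 0:
            budget -= 1
            received += 1
    return received
```

A count of 100 means the guess was wrong; a count of 99 means it was right, and the attacker has learned the port (or sequence, or acknowledgement) number without ever seeing the victim’s traffic.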

The off-path TCP attack is a good illustration of three things. First, it shows how crazy complicated network attacks may get. Second, it is an excellent example of a network-based side-channel attack. Such attacks leak important information in an indirect way. In this case, the attackers learned all the connection details by counting something that appears very unrelated. Third, the attack shows that global shared state is the core problem of such side-channel attacks. Side-channel vulnerabilities appear everywhere, in both software and hardware, and in all cases, the root cause is the sharing of some important resource. Of course, we knew this already, as it is a violation of Saltzer and Schroeder’s general principle of least common mechanism, which we discussed in the beginning of this chapter. From a security perspective, it is good to remember that often sharing is not caring!

Before we move to the next topic (disruption and denial of service), it is good to know that data injection is not just nice in theory, it is actively used in practice. After the revelations by Edward Snowden in 2013, it became clear that the NSA (National Security Agency) ran a mass surveillance operation. One of its activities was Quantum, a sophisticated network attack that used packet injection to redirect targeted users connecting to popular services (such as Twitter, Gmail, or Facebook) to special servers that would then hack the victims’ computers to give the NSA complete control. NSA denies everything, of course. It almost even denies its own existence. An industry joke goes:

Q: What does NSA stand for?

A: No Such Agency

8.2.4 Disruption

Attacks on availability are known as denial-of-service attacks. They occur when a victim receives data it cannot handle and, as a result, becomes unresponsive. There are various reasons why a machine may stop responding:


1. Crashes. The attacker sends content that causes the victim to crash or hang. An example of such an attack was the ping of death we discussed earlier.

2. Algorithmic complexity. The attacker sends data that is crafted specifically to create a lot of (algorithmic) overhead. Suppose a server allows clients to send rich search queries. In that case, an algorithmic complexity attack may consist of a number of complicated regular expressions that incur the worst-case search time for the server.

3. Flooding/swamping. The attacker bombards the victim with such a massive flood of requests or replies that the poor system cannot keep up. Often, but not always, the victim eventually crashes.
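The algorithmic-complexity case (item 2 above) is easy to demonstrate with a classic catastrophic-backtracking regular expression. This is a toy example: the pattern and inputs are made up, but the blow-up is real.

```python
import re
import time

# The pattern (a+)+$ forces exponential backtracking on inputs of the
# form "aaa...a!" -- each extra 'a' roughly doubles the matching time.
EVIL = re.compile(r"^(a+)+$")

def match_time(n):
    payload = "a" * n + "!"     # trailing '!' guarantees a failed match
    start = time.perf_counter()
    EVIL.match(payload)
    return time.perf_counter() - start

# A server that accepts such patterns (or such inputs) from clients can
# be driven to 100% CPU with a handful of short requests.
```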

Flooding attacks have become a major headache for organizations because these days it is very easy and cheap to carry out large-scale DoS attacks. For a few dollars or euros, you can rent a botnet consisting of many thousands of machines to attack any address you like. If the attack data is sent from a large number of distributed machines, we refer to the attack as a DDoS (Distributed Denial-of-Service) attack. Specialized services on the Internet, known as booters or stressers, offer user-friendly interfaces to help even nontechnical users launch them.

SYN Flooding

In the old days, DDoS attacks were quite simple. For instance, you would use a large number of hacked machines to launch a SYN flooding attack. All of these machines would send TCP SYN segments to the server, often spoofed to make it appear as if they came from different machines. While the server responded with a SYN/ACK, nobody would complete the TCP handshake, leaving the server dangling. That is quite expensive: a host can keep only a limited number of connections in the half-open state. After that, it no longer accepts new connections.

There are many solutions for SYN flooding attacks. For instance, we may simply drop half-open connections when we reach a limit to give preference to new connections, or reduce the SYN-received timeout. An elegant and very simple solution, supported by many systems today, goes by the name of SYN cookies, also briefly discussed in Chap. 6. Systems protected with SYN cookies use a special algorithm to determine the initial sequence number in such a way that the server does not need to remember anything about a connection until it receives the third packet in the three-way handshake. Recall that a sequence number is 32 bits wide. With SYN cookies, the server chooses the initial sequence number as follows:

1. The top 5 bits are the value of t modulo 32, where t is a slowly incrementing timer (e.g., a timer that increases every 64 seconds).

2. The next 3 bits are an encoding of the MSS (maximum segment size), giving eight possible values for the MSS.


3. The remaining 24 bits are the value of a cryptographic hash over the timestamp t and the source and destination IP addresses and port numbers.

The advantage of this sequence number is that the server can just stick it in a SYN/ACK and forget about it. If the handshake never completes, it is no skin off its back (or off whatever it is the server has on its back). If the handshake does complete, with the acknowledgement containing the server’s sequence number plus one, the server is able to reconstruct all the state it requires to establish the connection. First, it checks that the cryptographic hash matches a recent value of t and then quickly rebuilds the SYN queue entry using the MSS encoded in the 3 bits. While SYN cookies allow only eight different segment sizes and make the sequence number grow faster than usual, the impact is minimal in practice. What is particularly nice is that the scheme is compatible with normal TCP and does not require the client to support the same extension.
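The construction can be sketched as follows. The hash function, the secret, and the MSS table here are illustrative choices, not what any particular operating system uses.

```python
import hashlib

SECRET = b"per-boot server secret"                           # hypothetical
MSS_TABLE = [536, 1220, 1440, 1460, 2960, 4380, 8960, 9000]  # example values

def syn_cookie(t, mss_index, src, sport, dst, dport):
    """5 bits of timer, 3 bits of MSS index, 24 bits of keyed hash."""
    msg = f"{t}|{src}|{sport}|{dst}|{dport}".encode()
    h = int.from_bytes(hashlib.sha256(SECRET + msg).digest()[:3], "big")
    return ((t % 32) << 27) | (mss_index << 24) | h

def check_cookie(cookie, t_now, src, sport, dst, dport, max_age=2):
    """Recompute for recent timer values; return the MSS on success."""
    mss_index = (cookie >> 24) & 0x7
    for t in range(t_now, t_now - max_age - 1, -1):
        if syn_cookie(t, mss_index, src, sport, dst, dport) == cookie:
            return MSS_TABLE[mss_index]
    return None   # stale or forged: drop the final ACK silently
```

The server stores nothing between the SYN and the final ACK: everything it needs is recomputed from the cookie that the client echoes back (plus one) in the acknowledgement number.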

Of course, it is still possible to launch a DDoS attack even in the presence of SYN cookies by completing the handshake, but this is more expensive for the attackers (as their own machines have limits on open TCP connections also), and more importantly, it prevents TCP attacks with spoofed IP addresses.

Reflection and Amplification in DDoS Attacks

However, TCP-based DDoS attacks are not the only game in town. In recent years, more and more of the large-scale DDoS attacks have used UDP as the transport protocol. Spoofing UDP packets is typically easy. Moreover, with UDP it is possible to trick legitimate servers on the Internet into launching so-called reflection attacks on a victim. In a reflection attack, the attacker sends a request with a spoofed source address to a legitimate UDP service, for instance, a name server. The server will then reply to the spoofed address. If we do this from a large number of servers, the deluge of UDP reply packets is more than likely to take down the victim. Reflection attacks have two main advantages:

1. By adding the extra level of indirection, the attacker makes it difficult for the victim to block the senders somewhere in the network (after all, the senders are all legitimate servers).

2. Many services can amplify the attack by sending large replies to small requests.

These amplification-based DDoS attacks have been responsible for some of the largest volumes of DDoS attack traffic in history, easily reaching into the terabit-per-second range. What the attacker must do for a successful amplification attack is to look for publicly accessible services with a large amplification factor: for instance, one where a small request packet becomes a large reply packet, or


better still, multiple large reply packets. The byte amplification factor represents the relative gain in bytes, while the packet amplification factor represents the relative gain in packets. Figure 8-7 shows the amplification factors for several popular protocols. While these numbers may look impressive, it is good to remember that these are averages and individual servers may have even higher ones. Interestingly, DNSSEC, the protocol that was intended to fix the security problems of DNS, has a much higher amplification factor than plain old DNS, exceeding 100 for some servers. Not to be outdone, misconfigured memcached servers (fast in-memory databases) clocked an amplification factor well exceeding 50,000 during a massive amplification attack of 1.7 Tbps in 2018.

Protocol      Byte amplification   Packet amplification
NTP                  556.9                  3.8
DNS                   54.6                  2.1
Bittorrent             3.8                  1.6

Figure 8-7. Amplification factors for popular protocols.
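The arithmetic behind these attacks is simple but sobering. Here is a quick sketch using the average byte-amplification factors from Fig. 8-7; the attacker bandwidth in the comment is a made-up example.

```python
# Traffic arriving at the victim per unit of traffic the attacker sends,
# reflected off open servers (average byte factors from Fig. 8-7).
AMPLIFICATION = {"NTP": 556.9, "DNS": 54.6, "Bittorrent": 3.8}

def victim_gbps(attacker_gbps, protocol):
    return attacker_gbps * AMPLIFICATION[protocol]

# A botnet with a combined uplink of just 2 Gbps, bounced off open NTP
# servers, lands over a terabit per second at the victim.
```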

Defending against DDoS Attacks

Defending against such enormous streams of traffic is not easy, but several defenses exist. One fairly straightforward technique is to block traffic close to the source. The most common way to do so is using a technique called egress filtering, whereby a network device such as a firewall blocks all outgoing packets whose source IP addresses do not correspond to those inside the network to which it is attached. This, of course, requires the firewall to know what packets could possibly arrive with a particular source IP address, which is typically only possible at the edge of the network; for example, a university network might know all IP address ranges on its campus network and could thus block outgoing traffic from any IP address that it did not own. The dual of egress filtering is ingress filtering, whereby a network device filters all incoming traffic with internal IP addresses.
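In code, egress filtering is little more than a prefix check at the edge device. This is a sketch; the campus prefix below is a made-up example.

```python
import ipaddress

# Egress filter: outgoing packets must carry a source address from one
# of the site's own prefixes; everything else is presumed spoofed.
SITE_PREFIXES = [ipaddress.ip_network("130.37.0.0/16")]   # example prefix

def egress_allowed(src_ip):
    src = ipaddress.ip_address(src_ip)
    return any(src in net for net in SITE_PREFIXES)

# An ingress filter is the mirror image: drop *incoming* packets whose
# source address claims to be inside SITE_PREFIXES.
```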

Another measure we can take is to try and absorb the DDoS attack with spare capacity. Doing so is expensive and may be unaffordable on an individual basis for all but the biggest players. Fortunately, there is no reason to do this individually. By pooling resources that can be used by many parties, even smaller players can afford DDoS protection. Like insurance, the assumption is that not everybody will be attacked at the same time.

So what insurance will you get? Several organizations offer to protect your Web site by means of cloud-based DDoS protection, which uses the strength of the cloud to scale up capacity as and when needed to fend off DoS attacks. At its core, the defense consists of the cloud shielding and even hiding the IP address of the real server. All requests are sent to proxies in the cloud that filter out the


malicious traffic the best they can (although doing so may not be so easy for advanced attacks), and forward the benign requests to the real server. If the number of requests or the amount of traffic for a specific server increases, the cloud will allocate more resources to handling these packets. In other words, the cloud ‘‘absorbs’’ the flood of data. Typically, it may also operate as a scrubber to sanitize the data as well. For instance, it may remove overlapping TCP segments or weird combinations of TCP flags, and serve in general as a WAF (Web Application Firewall).

To relay the traffic via the cloud-based proxies, Web site owners can choose between several options with different price tags. If they can afford it, they can opt for BGP blackholing. In this case, the assumption is that the Web site owner controls an entire /24 block of 256 addresses. The idea is that the owner simply withdraws the BGP announcements for that block from its own routers. Instead, the cloud-based security provider starts announcing this block from its network, so that all traffic for the server will go to the cloud first. However, not everybody has entire network blocks to play around with, or can afford the cost of BGP rerouting. For them, there is the more economical option to use DNS rerouting. In this case, the Web site’s administrators change the DNS mappings in their name servers to point to servers in the cloud, rather than the real server. In either case, visitors will send their packets to the proxies owned by the cloud-based security provider first, and these cloud-based proxies subsequently forward the packets to the real server.

DNS rerouting is easier to implement, but the security guarantees of the cloud-based security provider are only strong if the real IP address of the server remains hidden. If the attackers obtain this address, they can bypass the cloud and attack the server directly. Unfortunately, there are many ways in which the IP address may leak. Like FTP, some Web applications send the IP address to the remote party in-band, so there is not a lot one can do in those cases. Alternatively, attackers could look at historical DNS data to see what IP addresses were registered for the server in the past. Several companies collect and sell such historical DNS data.

8.3 FIREWALLS AND INTRUSION DETECTION SYSTEMS

The ability to connect any computer, anywhere, to any other computer, anywhere, is a mixed blessing. For individuals at home, wandering around the Internet is lots of fun. For corporate security managers, it is a nightmare. Most companies have large amounts of confidential information online—trade secrets, product development plans, marketing strategies, financial analyses, tax records, etc. Disclosure of this information to a competitor could have dire consequences.

In addition to the danger of information leaking out, there is also a danger of information leaking in. In particular, viruses, worms, and other digital pests can breach security, destroy valuable data, and waste large amounts of administrators’


time trying to clean up the mess they leave. Often they are imported by careless employees who want to play some nifty new game.

Consequently, mechanisms are needed to keep ‘‘good’’ bits in and ‘‘bad’’ bits out. One method is to use encryption, which protects data in transit between secure sites. However, it does nothing to keep digital pests and intruders from getting onto the company’s LAN. To see how to accomplish this goal, we need to look at firewalls.

8.3.1 Firewalls

Firewalls are just a modern adaptation of that old medieval security standby: digging a wide and deep moat around your castle. This design forced everyone entering or leaving the castle to pass over a single drawbridge, where they could be inspected by the I/O police. With networks, the same trick is possible: a company can have many LANs connected in arbitrary ways, but all traffic to or from the company is forced through an electronic drawbridge (firewall), as shown in Fig. 8-8. No other route exists.

[Figure 8-8 shows an internal network and a DMZ (DeMilitarized zone, containing Web and email servers) separated from the external Internet by a firewall at the security perimeter.]

Figure 8-8. A firewall protecting an internal network.

The firewall acts as a packet filter. It inspects each and every incoming and outgoing packet. Packets meeting some criterion described in rules formulated by the network administrator are forwarded normally. Those that fail the test are unceremoniously dropped.

The filtering criterion is typically given as rules or tables that list sources and destinations that are acceptable, sources and destinations that are blocked, and default rules about what to do with packets coming from or going to other machines. In the common case of a TCP/IP setting, a source or destination might consist of an IP address and a port. Ports indicate which service is desired. For example, TCP port 25 is for mail, and TCP port 80 is for HTTP. Some ports can simply be blocked outright. For example, a company could block incoming packets for all IP addresses combined with TCP port 79. It was once popular for the Finger service


to look up people’s email addresses but is barely used today due to its role in a now-infamous (accidental) attack on the Internet in 1988.

Other ports are not so easily blocked. The difficulty is that network administrators want security but cannot cut off communication with the outside world. That arrangement would be much simpler and better for security, but there would be no end to user complaints about it. This is where arrangements such as the DMZ (DeMilitarized Zone) shown in Fig. 8-8 come in handy. The DMZ is the part of the company network that lies outside of the security perimeter. Anything goes here. By placing a machine such as a Web server in the DMZ, computers on the Internet can contact it to browse the company Web site. Now the firewall can be configured to block incoming TCP traffic to port 80 so that computers on the Internet cannot use this port to attack computers on the internal network. To allow the Web server to be managed, the firewall can have a rule to permit connections between internal machines and the Web server.

Firewalls have become much more sophisticated over time in an arms race with attackers. Originally, firewalls applied a rule set independently for each packet, but it proved difficult to write rules that allowed useful functionality but blocked all unwanted traffic. Stateful firewalls map packets to connections and use TCP/IP header fields to keep track of connections. This allows for rules that, for example, allow an external Web server to send packets to an internal host, but only if the internal host first establishes a connection with the external Web server. Such a rule is not possible with stateless designs that must either pass or drop all packets from the external Web server.
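The difference can be made concrete with a toy connection table. This is only a sketch: a real stateful firewall also tracks TCP state, timeouts, and much more.

```python
# Minimal stateful filter: admit an external packet only if an internal
# host opened the connection first.

established = set()  # (int_ip, int_port, ext_ip, ext_port)

def handle_outbound(int_ip, int_port, ext_ip, ext_port):
    established.add((int_ip, int_port, ext_ip, ext_port))
    return "pass"

def handle_inbound(ext_ip, ext_port, int_ip, int_port):
    if (int_ip, int_port, ext_ip, ext_port) in established:
        return "pass"   # reply on a connection we initiated
    return "drop"       # unsolicited; a stateless filter could not tell
```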

Another level of sophistication up from stateful processing is for the firewall to implement application-level gateways. This processing involves the firewall looking inside packets, beyond even the TCP header, to see what the application is doing. With this capability, it is possible to distinguish HTTP traffic used for Web browsing from HTTP traffic used for peer-to-peer file sharing. Administrators can write rules to spare the company from peer-to-peer file sharing but allow Web browsing that is vital for business. For all of these methods, outgoing traffic can be inspected as well as incoming traffic, for example, to prevent sensitive documents from being emailed outside of the company.

As the above discussion should make abundantly clear, firewalls violate the standard layering of protocols. They are network layer devices, but they peek at the transport and application layers to do their filtering. This makes them fragile. For instance, firewalls tend to rely on standard port numbering conventions to determine what kind of traffic is carried in a packet. Standard ports are often used, but not by all computers, and not by all applications either. Some peer-to-peer applications select ports dynamically to avoid being easily spotted (and blocked). Moreover, encryption hides higher-layer information from the firewall. Finally, a firewall cannot readily talk to the computers that communicate through it to tell them what policies are being applied and why their connection is being dropped. It must simply pretend that it is a broken wire. For these reasons, networking purists


consider firewalls to be a blemish on the architecture of the Internet. However, the Internet can be a dangerous place if you are a computer. Firewalls help with that problem, so they are likely to stay.

Even if the firewall is perfectly configured, plenty of security problems still exist. For example, if a firewall is configured to allow in packets from only specific networks (e.g., the company’s other plants), an intruder outside the firewall can spoof the source addresses to bypass this check. If an insider wants to ship out secret documents, he can encrypt them or even photograph them and ship the photos as JPEG files, which bypasses any email filters. And we have not even discussed the fact that, although three-quarters of all attacks come from outside the firewall, the attacks that come from inside the firewall, for example, from disgruntled employees, may be the most damaging (Verizon, 2009).

A different problem with firewalls is that they provide a single perimeter of defense. If that defense is breached, all bets are off. For this reason, firewalls are often used in a layered defense. For example, a firewall may guard the entrance to the internal network, and each computer may also run its own firewall. Readers who think that one security checkpoint is enough clearly have not made an international flight on a scheduled airline recently. As a result, many networks now have multiple levels of firewall, all the way down to per-host firewalls—a simple example of defense in depth. Suffice it to say that in both airports and computer networks, if attackers have to compromise multiple independent defenses, it is much harder for them to breach the entire system.

8.3.2 Intrusion Detection and Prevention

Besides firewalls and scrubbers, network administrators may deploy a variety of other defensive measures, such as intrusion detection systems and intrusion prevention systems, to be described shortly. As the name implies, the role of an IDS (Intrusion Detection System) is to detect attacks—ideally before they can do any damage. For instance, an IDS may generate warnings early on, at the onset of an attack, when it observes port scanning or a brute-force ssh password attack (where an attacker simply tries many popular passwords to try and log in), or when it finds the signature of the latest and greatest exploit in a TCP connection. However, it may also detect attacks only at a later stage, when a system has already been compromised and now exhibits unusual behavior.

We can categorize intrusion detection systems by considering where they work and how they work. A HIDS (Host-based IDS) works on the endpoint itself, say a laptop or server, and scans, for instance, the behavior of the software or the network traffic to and from a Web server only on that machine. In contrast, a NIDS (Network IDS) checks the traffic for a set of machines on the network. Both have advantages and disadvantages.

A NIDS is attractive because it protects many machines, with the ability to correlate events associated with different hosts, and does not use up resources on the


machines it protects. In other words, the IDS has no impact on the performance of the machines in its protection domain. On the other hand, it is difficult to handle issues that are system specific. As an example, suppose that a TCP connection contains overlapping TCP segments: packet A contains bytes 1–200 while packet B contains bytes 100–300. Clearly, there is overlap between the bytes in the payloads. Let us also assume that the bytes in the overlapping region are different. What is the IDS to do?

The real question is: which bytes will be used by the receiving host? If the host uses the bytes of packet A, the IDS should check these bytes for malicious content and ignore the ones in packet B. However, what if the host instead uses the bytes in packet B? And what if some hosts in the network take the bytes in packet A and some take the bytes in packet B? Even if the hosts are all the same and the IDS knows how they reassemble the TCP streams, there may still be difficulties. Suppose all hosts will normally take the bytes in packet A. If the IDS looks at that packet, it is still wrong if the destination of the packet is two or three network hops away and the TTL value in packet A is 1, so it never even reaches its destination. Tricks that attackers play with TTL, or with overlapping byte ranges in IP fragments or TCP segments, are called IDS evasion techniques.
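The ambiguity is easy to demonstrate. The toy reassembler below supports two policies for overlapping data; a NIDS that assumes the wrong one inspects bytes the end host never sees. The segment contents are made up for illustration.

```python
# Reassemble (offset, data) segments under two overlap policies:
# 'first' keeps the bytes that arrived first, 'last' lets later
# segments overwrite earlier ones.

def reassemble(segments, policy="first"):
    buf = {}
    for offset, data in segments:
        for i, byte in enumerate(data):
            pos = offset + i
            if policy == "last" or pos not in buf:
                buf[pos] = byte
    return bytes(buf[pos] for pos in sorted(buf))

segs = [(0, b"abcdef"), (3, b"XYZUVW")]   # bytes 3-5 overlap, and differ
# reassemble(segs, "first") and reassemble(segs, "last") yield different
# streams, yet both are plausible interpretations of the same packets.
```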

Another problem with a NIDS is encryption. If the network bytes are no longer decipherable, it becomes much harder for the IDS to determine if they are malicious. This is another example of one security measure (encryption) reducing the protection offered by another (the IDS). As a work-around, administrators may give the encryption keys to the NIDS. This works, but is not ideal, as it creates additional key management headaches. Also, observe that the IDS sees all the network traffic and tends to contain a great many lines of code itself. In other words, it may form a very juicy target for attackers. Break the IDS and you get access to all network traffic!

A host-based IDS’s drawbacks are that it uses resources at each machine on which it runs and that it sees only a small fraction of the events in the network. On the other hand, it does not suffer as much from evasion problems, as it can check the traffic after it has been reassembled by the very network stack of the machine it is trying to protect. Also, in cases such as IPsec, where packets are encrypted and decrypted in the network layer, the IDS may check the data after decryption.

Besides the different locations of an IDS, we also have some choice in how an IDS determines whether something poses a threat. There are two main categories. Signature-based intrusion detection systems use patterns in terms of bytes or sequences of packets that are symptoms of known attacks. If you know that a UDP packet to port 53 with 10 specific bytes at the start of the payload is part of an exploit E, an IDS can easily scan the network traffic for this pattern and raise an alert when it detects it. The alert is specific (‘‘I have detected E’’) and has a high confidence (‘‘I know that it is E’’). However, with a signature-based IDS, you only detect threats that are known and for which a signature is available. Alternatively, an IDS may raise an alert if it sees unusual behavior. For instance, a computer that


normally only exchanges SMTP and DNS traffic with a few IP addresses suddenly starts sending HTTP traffic to many completely unknown IP addresses outside the local network. An IDS may classify this as fishy. Since such anomaly-based intrusion detection systems, or anomaly detection systems for short, trigger on any abnormal behavior, they are capable of detecting new attacks as well as old ones. The disadvantage is that the alerts do not carry a lot of explanation. Hearing that ‘‘something unusual happened in the network’’ is much less specific and much less useful than learning that ‘‘the security camera at the gate is now being attacked by the Hajime malware.’’

An IPS (Intrusion Prevention System) should not only detect the attack, but also stop it. In that sense, it is a glorified firewall. For instance, when the IPS sees a packet with the Hajime signature, it can drop it on the floor rather than allowing it to reach the security camera. To do so, the IPS should sit on the path towards the target and take decisions about accepting or dropping traffic ‘‘on the fly.’’ In contrast, an IDS may reside elsewhere in the network, as long as we mirror all the traffic so it sees it. Now you may ask: why bother? Why not simply deploy an IPS and be done with the threats entirely? Part of the answer is performance: the processing at the IPS determines the speed of the data transfer. If you have very little time, you may not be able to analyze the data very deeply. More importantly, what if you get it wrong? Specifically, what if your IPS decides a connection contains an attack and drops it, even though it is benign? That is really bad if the connection is important, for example, when your business depends on it. It may be better to raise an alert and let someone look into it, to decide if it really was malicious.

In fact, it is important to know how often your IDS or IPS gets it right. If it raises too many false alerts (false positives), you may end up spending a lot of time and money chasing those. If, on the other hand, it is conservative and often does not raise alerts when attacks do take place (false negatives), attackers may still easily compromise your system. The number of false positives (FPs) and false negatives (FNs) with respect to the true positives (TPs) and true negatives (TNs) determines the usefulness of your protection. We commonly express these properties in terms of precision and recall. Precision is a metric that indicates how many of the alarms that you generated were justified. In mathematical terms: P = TP/(TP + FP). Recall indicates how many of the actual attacks you detected: R = TP/(TP + FN). Sometimes, we combine the two values in what is known as the F-measure: F = 2PR/(P + R). Finally, we are sometimes simply interested in how often an IDS or IPS got things right. In that case, we use the accuracy as a metric: A = (TP + TN)/total.
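The four metrics, directly as defined above; the counts in the example comment are made up for illustration.

```python
# Precision, recall, F-measure, and accuracy from TP/FP/FN/TN counts.

def precision(tp, fp):
    return tp / (tp + fp)

def recall(tp, fn):
    return tp / (tp + fn)

def f_measure(p, r):
    return 2 * p * r / (p + r)

def accuracy(tp, tn, total):
    return (tp + tn) / total

# Example: 90 true alarms, 10 false alarms, 30 missed attacks, and 870
# correctly quiet periods give P = 0.9, R = 0.75, and A = 0.96.
```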

While it is always true that high recall and high precision are better than low values, the numbers of false negatives and false positives are typically somewhat inversely correlated: if one goes down, the other goes up. However, the trade-off for what ranges are acceptable varies from situation to situation. If you are the Pentagon, you care deeply about not getting compromised. In that case, you may be willing to chase down a few more false positives, as long as you do not have many false negatives. If, on the other hand, you are a school, things may be less critical and you may choose to not spend your money on an administrator who spends most of his working days analyzing false alarms.

SEC. 8.3 FIREWALLS AND INTRUSION DETECTION SYSTEMS 765

There is one final thing we need to explain about these metrics to make you appreciate the importance of false positives. We will use an analogy similar to the one introduced by Stefan Axelsson in an influential paper that explained why intrusion detection is difficult (Axelsson, 1999). Suppose that there is a disease that affects 1 in 100,000 people in practice. Anyone diagnosed with the disease dies within a month. Fortunately, there is a great test to see if someone is infected. The test has 99% accuracy: if a patient is sick (S), the test will be positive (in the medical world a positive test is a bad thing!) in 99% of the cases, while for healthy patients (H), the test will be negative (Neg) in 99% of the cases. One day you take the test and, blow me down, the test is positive (i.e., indicates Pos). The million-dollar question: how bad is this? Phrased differently: should you say goodbye to friends and family, sell everything you own in a yard sale, and live a (short) life of debauchery for the remaining 30-odd days? Or not?

To answer this question we should look at the math. What we are interested in is the probability that you have the disease given that you tested positive: P(S|Pos). What we know is:

P(Pos|S) = 0.99

P(Neg|H) = 0.99

P(S) = 0.00001

To calculate P(S|Pos), we use the famous Bayes theorem:

P(S|Pos) = P(S)P(Pos|S) / P(Pos)

In our case, there are only two possible outcomes for the test and two possible outcomes for you having the disease. In other words,

P(Pos) = P(S)P(Pos|S) + P(H)P(Pos|H)

where P(H) = 1 − P(S) and P(Pos|H) = 1 − P(Neg|H), so that:

P(Pos) = P(S)P(Pos|S) + (1 − P(S))(1 − P(Neg|H))

= 0.00001 × 0.99 + 0.99999 × 0.01

so that

P(S|Pos) = (0.00001 × 0.99) / (0.00001 × 0.99 + 0.99999 × 0.01) ≈ 0.00098


In other words, the probability of you having the disease is less than 0.1%. No need to panic yet. (Unless of course you did prematurely sell all your belongings in that yard sale.)

What we see here is that the final probability is strongly dominated by the false positive rate P(Pos|H) = 1 − P(Neg|H) = 0.01. The reason is that the base rate of incidents is so small (0.00001) that all the other terms in the equation hardly count. This problem is referred to as the Base Rate Fallacy. If we substitute ‘‘under attack’’ for ‘‘sick,’’ and ‘‘alert’’ for ‘‘positive test,’’ we see that the base rate fallacy is extremely important for any IDS or IPS solution. It motivates the need for keeping the number of false positives low.
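The same calculation can be written out as a short script, using the probabilities from the text:

```python
# Base-rate fallacy worked out numerically (numbers from the text's example).
p_s = 0.00001            # P(S): base rate of the disease (or of an attack)
p_pos_given_s = 0.99     # P(Pos|S): test sensitivity
p_neg_given_h = 0.99     # P(Neg|H): test specificity

# Total probability of a positive result, over sick and healthy patients.
p_pos = p_s * p_pos_given_s + (1 - p_s) * (1 - p_neg_given_h)

# Bayes' theorem: probability of being sick given a positive test.
p_s_given_pos = p_s * p_pos_given_s / p_pos

print(p_s_given_pos)     # well under 0.1%
```

Varying p_s in this script shows how quickly the result improves as the base rate rises, which is exactly why the fallacy matters for rare events.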

Besides the fundamental security principles by Saltzer and Schroeder, many people have offered additional, often very practical principles. One that is particularly useful to mention here is the pragmatic principle of defense in depth. Often it is a good idea to use multiple complementary techniques to protect a system. For instance, to stop attacks, we may use a firewall and an intrusion detection system and a virus scanner. While no single measure may be foolproof by itself, the idea is that it is much harder to bypass all of them at the same time.

8.4 CRYPTOGRAPHY

Cryptography comes from the Greek words for ‘‘secret writing.’’ It has a long and colorful history going back thousands of years. In this section, we will just sketch some of the highlights, as background information for what follows. For a complete history of cryptography, Kahn’s (1995) book is recommended reading. For a comprehensive treatment of modern security and cryptographic algorithms, protocols, and applications, and related material, see Kaufman et al. (2002). For a more mathematical approach, see Kraft and Washington (2018). For a less mathematical approach, see Esposito (2018).

Professionals make a distinction between ciphers and codes. A cipher is a character-for-character or bit-for-bit transformation, without regard to the linguistic structure of the message. In contrast, a code replaces one word with another word or symbol. Codes are not used any more, although they have a glorious history.

The most successful code ever devised was used by the United States Marine Corps during World War II in the Pacific. They simply had Navajo Marines talking to each other in their native language using specific Navajo words for military terms, for example, chay-da-gahi-nail-tsaidi (literally: tortoise killer) for antitank weapon. The Navajo language is highly tonal, exceedingly complex, and has no written form. And not a single person in Japan knew anything about it. In September 1945, the San Diego Union published an article describing the previously secret use of the Navajos to foil the Japanese, telling how effective it was. The Japanese never broke the code and many Navajo code talkers were awarded


high military honors for extraordinary service and bravery. The fact that the U.S. broke the Japanese code but the Japanese never broke the Navajo code played a crucial role in the American victories in the Pacific.

8.4.1 Introduction to Cryptography

Historically, four groups of people have used and contributed to the art of cryptography: the military, the diplomatic corps, diarists, and lovers. Of these, the military has had the most important role and has shaped the field over the centuries. Within military organizations, the messages to be encrypted have traditionally been given to poorly paid, low-level code clerks for encryption and transmission. The sheer volume of messages prevented this work from being done by a few elite specialists.

Until the advent of computers, one of the main constraints on cryptography had been the ability of the code clerk to perform the necessary transformations, often on a battlefield with little equipment. An additional constraint has been the difficulty in switching over quickly from one cryptographic method to another, since this entails retraining a large number of people. However, the danger of a code clerk being captured by the enemy has made it essential to be able to change the cryptographic method instantly if need be. These conflicting requirements have given rise to the model of Fig. 8-9.

[Figure 8-9 shows the plaintext, P, entering the encryption method, E, under the encryption key, K, producing the ciphertext, C = EK(P). The ciphertext passes an intruder (a passive intruder just listens; an active intruder can alter messages) on its way to the decryption method, D, which uses the decryption key, K, to recover the plaintext, P.]

Figure 8-9. The encryption model (for a symmetric-key cipher).

The messages to be encrypted, known as the plaintext, are transformed by a function that is parametrized by a key. The output of the encryption process, known as the ciphertext, is then transmitted, often by messenger or radio. We assume that the enemy, or intruder, hears and accurately copies down the complete ciphertext. However, unlike the intended recipient, he does not know what the


decryption key is and so cannot decrypt the ciphertext easily. Sometimes the intruder can not only listen to the communication channel (passive intruder) but can also record messages and play them back later, inject his own messages, or modify legitimate messages before they get to the receiver (active intruder). The art of breaking ciphers, known as cryptanalysis, and the art of devising them (cryptography) are collectively known as cryptology.

It will often be useful to have a notation for relating plaintext, ciphertext, and keys. We will use C = EK(P) to mean that the encryption of the plaintext P using key K gives the ciphertext C. Similarly, P = DK(C) represents the decryption of C to get the plaintext again. It then follows that

DK(EK(P)) = P

This notation suggests that E and D are just mathematical functions, which they are. The only tricky part is that both are functions of two parameters, and we have written one of the parameters (the key) as a subscript, rather than as an argument, to distinguish it from the message.
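As a toy illustration of this notation (not a secure cipher, just a keyed function and its inverse), a repeating-key XOR satisfies DK(EK(P)) = P, because XOR is its own inverse:

```python
# Toy keyed functions E and D: XOR each plaintext byte with a repeating key.
# This is for illustrating the E/D notation only, not for real security.
def E(key: bytes, plaintext: bytes) -> bytes:
    return bytes(p ^ key[i % len(key)] for i, p in enumerate(plaintext))

def D(key: bytes, ciphertext: bytes) -> bytes:
    return E(key, ciphertext)   # XOR twice with the same key restores the input

K = b"K3y"
P = b"attack at dawn"
C = E(K, P)
assert D(K, C) == P             # D_K(E_K(P)) = P
```

The key appears as a parameter of both functions, just as the subscript K does in the notation.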

A fundamental rule of cryptography is that one must assume that the cryptanalyst knows the methods used for encryption and decryption. In other words, the cryptanalyst knows how the encryption method, E, and decryption method, D, of Fig. 8-9 work in detail. The amount of effort necessary to invent, test, and install a new algorithm every time the old method is compromised (or thought to be compromised) has always made it impractical to keep the encryption algorithm secret. Thinking it is secret when it is not does more harm than good.

This is where the key enters. The key consists of a (relatively) short string that selects one of many potential encryptions. In contrast to the general method, which may only be changed every few years, the key can be changed as often as required. Thus, our basic model is a stable and publicly known general method parametrized by a secret and easily changed key. The idea that the cryptanalyst knows the algorithms and that the secrecy lies exclusively in the keys is called Kerckhoffs’ principle, named after the Dutch-born military cryptographer Auguste Kerckhoffs, who first published it in a military journal in 1883 (Kerckhoffs, 1883). Thus, we have

Kerckhoffs’ principle: all algorithms must be public; only the keys are secret.

The nonsecrecy of the algorithm cannot be emphasized enough. Trying to keep the algorithm secret, known in the trade as security by obscurity, never works. Also, by publicizing the algorithm, the cryptographer gets free consulting from a large number of academic cryptologists eager to break the system so they can publish papers demonstrating how smart they are. If many experts have tried to break the algorithm for a long time after its publication and no one has succeeded, it is probably pretty solid. (On the other hand, researchers have found bugs in open source security solutions such as OpenSSL that were over a decade old, so the common belief that ‘‘given enough eyeballs, all bugs are shallow’’ does not always work in practice.)

Since the real secrecy is in the key, its length is a major design issue. Consider a simple combination lock. The general principle is that you enter digits in sequence. Everyone knows this, but the key is secret. A key length of two digits means that there are 100 possibilities. A key length of three digits means 1000 possibilities, and a key length of six digits means a million. The longer the key, the higher the work factor the cryptanalyst has to deal with. The work factor for breaking the system by exhaustive search of the key space is exponential in the key length. Secrecy comes from having a strong (but public) algorithm and a long key. To prevent your kid brother from reading your email, perhaps even 64-bit keys will do. For routine commercial use, 256 bits should be used. To keep major governments at bay, keys of at least 256 bits, and preferably more, are needed. Incidentally, these numbers are for symmetric encryption, where the encryption and the decryption key are the same. We will discuss the differences between symmetric and asymmetric encryption later.
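A back-of-the-envelope sketch of the exponential work factor (the trial rate is an assumption for illustration):

```python
# The key space doubles with every extra key bit, so exhaustive search
# becomes exponentially more expensive as keys get longer.
def keyspace(bits: int) -> int:
    return 2 ** bits

# At an assumed 10**9 trials per second, an attacker expects to search
# half the key space before finding the key.
expected_seconds = keyspace(64) / 2 / 10**9
print(expected_seconds / (365 * 24 * 3600))   # roughly 290 years for 64 bits
```

Each additional bit doubles this figure, which is why the jump from 64 to 256 bits is so dramatic.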

From the cryptanalyst’s point of view, the cryptanalysis problem has three principal variations. When he has a quantity of ciphertext and no plaintext, he is confronted with the ciphertext-only problem. The cryptograms that appear in the puzzle section of newspapers pose this kind of problem. When the cryptanalyst has some matched ciphertext and plaintext, the problem is called the known-plaintext problem. Finally, when the cryptanalyst has the ability to encrypt pieces of plaintext of his own choosing, we have the chosen plaintext problem. Newspaper cryptograms could be broken trivially if the cryptanalyst were allowed to ask such questions as ‘‘What is the encryption of ABCDEFGHIJKL?’’

Novices in the cryptography business often assume that if a cipher can withstand a ciphertext-only attack, it is secure. This assumption is very naive. In many cases, the cryptanalyst can make a good guess at parts of the plaintext. For example, the first thing many computers say when you boot them up is ‘‘login:’’. Equipped with some matched plaintext-ciphertext pairs, the cryptanalyst’s job becomes much easier. To achieve security, the cryptographer should be conservative and make sure that the system is unbreakable even if his opponent can encrypt arbitrary amounts of chosen plaintext.

Encryption methods have historically been divided into two categories: substitution ciphers and transposition ciphers. We will now deal with each of these briefly as background information for modern cryptography.

8.4.2 Two Fundamental Cryptographic Principles

Although we will study many different cryptographic systems in the pages ahead, two principles underlying all of them are important to understand. Pay attention. You violate them at your peril.

Redundancy

The first principle is that all encrypted messages must contain some redundancy, that is, information not needed to understand the message. An example may make it clear why this is needed. Consider a mail-order company, The Couch Potato (TCP), with 60,000 products. Thinking they are being very efficient, TCP’s programmers decide that ordering messages should consist of a 16-byte customer name followed by a 3-byte data field (1 byte for the quantity and 2 bytes for the product number). The last 3 bytes are to be encrypted using a very long key known only by the customer and TCP.

At first, this might seem secure, and in a sense it is because passive intruders cannot decrypt the messages. Unfortunately, it also has a fatal flaw that renders it useless. Suppose that a recently fired employee wants to punish TCP for firing her. Just before leaving, she takes the customer list with her. She works through the night writing a program to generate fictitious orders using real customer names. Since she does not have the list of keys, she just puts random numbers in the last 3 bytes, and sends hundreds of orders off to TCP.

When these messages arrive, TCP’s computer uses the customer’s name to locate the key and decrypt the message. Unfortunately for TCP, almost every 3-byte message is valid, so the computer begins printing out shipping instructions. While it might seem a bit odd for a customer to order 837 sets of children’s swings or 540 sandboxes, for all the computer knows, the customer might be planning to open a chain of franchised playgrounds. In this way, an active intruder (the ex-employee) can cause a massive amount of trouble, even though she cannot understand the messages her computer is generating.

This problem can be solved by the addition of redundancy to all messages. For example, if order messages are extended to 12 bytes, the first 9 of which must be zeros, this attack no longer works because the ex-employee can no longer generate a large stream of valid messages. The moral of the story is that all messages must contain considerable redundancy so that active intruders cannot send random junk and have it be interpreted as a valid message. Thus we have:

Cryptographic principle 1: Messages must contain some redundancy

However, adding redundancy makes it easier for cryptanalysts to break messages. Suppose that the mail-order business is highly competitive, and The Couch Potato’s main competitor, The Sofa Tuber, would dearly love to know how many sandboxes TCP is selling, so it taps TCP’s phone line. In the original scheme with 3-byte messages, cryptanalysis was nearly impossible because after guessing a key, the cryptanalyst had no way of telling whether it was right because almost every message was technically legal. With the new 12-byte scheme, it is easy for the cryptanalyst to tell a valid message from an invalid one.

In other words, upon decrypting a message, the recipient must be able to tell whether it is valid by simply inspecting the message and perhaps performing a simple computation. This redundancy is needed to prevent active intruders from sending garbage and tricking the receiver into decrypting the garbage and acting on the ‘‘plaintext.’’

However, this same redundancy makes it much easier for passive intruders to break the system, so there is some tension here. Furthermore, the redundancy should never be in the form of n 0s at the start or end of a message, since running such messages through some cryptographic algorithms gives more predictable results, making the cryptanalysts’ job easier. A CRC polynomial (see Chapter 3) is much better than a run of 0s since the receiver can easily verify it, but it generates more work for the cryptanalyst. Even better is to use a cryptographic hash, a concept we will explore later. For the moment, think of it as a better CRC.
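A sketch of such a validity check, using the hypothetical 12-byte order format from the TCP example (the field layout and names here are ours):

```python
import os

# Hypothetical decrypted order: 9 zero bytes of redundancy, then 1 quantity
# byte and a 2-byte product number. Only the redundancy is checked here.
def looks_valid(decrypted_order: bytes) -> bool:
    return len(decrypted_order) == 12 and decrypted_order[:9] == b"\x00" * 9

real_order = b"\x00" * 9 + bytes([5]) + (1234).to_bytes(2, "big")
assert looks_valid(real_order)

# Random junk, like the ex-employee's forged orders, now passes the check
# only with probability 2**-72, since all 9 redundancy bytes must be zero.
forged = os.urandom(12)
```

As the text notes, a CRC or cryptographic hash over the message is a better choice of redundancy than a fixed run of zeros.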

Freshness

The second cryptographic principle is that measures must be taken to ensure that each message received can be verified as being fresh, that is, sent very recently. This measure is needed to prevent active intruders from playing back old messages. If no such measures were taken, our ex-employee could tap TCP’s phone line and just keep repeating previously sent valid messages. Thus,

Cryptographic principle 2: Some method is needed to foil replay attacks

One such measure is including in every message a timestamp valid only for, say, 60 seconds. The receiver can then just keep messages around for 60 seconds and compare newly arrived messages to previous ones to filter out duplicates. Messages older than 60 seconds can be thrown out, since any replays sent more than 60 seconds later will be rejected as too old. The interval should not be too short (e.g., 5 seconds) because the sender’s and receiver’s clocks may be slightly out of sync. Measures other than timestamps will be discussed later.
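A minimal sketch of such a replay filter, assuming each message carries a sender timestamp (all names here are ours):

```python
import time

WINDOW = 60.0        # seconds for which a message's timestamp is valid
seen = {}            # message -> timestamp of first receipt, kept for WINDOW

def accept(message: bytes, timestamp: float, now: float) -> bool:
    if now - timestamp > WINDOW:        # too old: a late replay, reject
        return False
    # Purge remembered messages that have aged out of the window.
    for m in [m for m, t in seen.items() if now - t > WINDOW]:
        del seen[m]
    if message in seen:                 # duplicate within the window: replay
        return False
    seen[message] = timestamp
    return True

now = time.time()
assert accept(b"order#1", now, now)           # fresh, first time: accepted
assert not accept(b"order#1", now, now + 5)   # replayed 5 s later: duplicate
assert not accept(b"order#1", now, now + 90)  # replayed later still: too old
```

Note that the filter only needs to remember messages for one window, which is what makes the scheme practical.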

8.4.3 Substitution Ciphers

In a substitution cipher, each letter or group of letters is replaced by another letter or group of letters to disguise it. One of the oldest known ciphers is the Caesar cipher, attributed to Julius Caesar. With this method, a becomes D, b becomes E, c becomes F, . . . , and z becomes C. For example, attack becomes DWWDFN. In our examples, plaintext will be given in lowercase letters, and ciphertext in uppercase letters.

A slight generalization of the Caesar cipher allows the ciphertext alphabet to be shifted by k letters, instead of always three. In this case, k becomes a key to the general method of circularly shifted alphabets. The Caesar cipher may have fooled Pompey, but it has not fooled anyone since.
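A minimal sketch of this shifted-alphabet cipher (the function name is ours); with k = 3 it reproduces the attack → DWWDFN example above:

```python
# Caesar cipher generalized to a shift of k letters (k = 3 is Caesar's own).
# Lowercase plaintext in, uppercase ciphertext out, as in the text.
def shift_encrypt(plaintext: str, k: int) -> str:
    return "".join(chr((ord(c) - ord("a") + k) % 26 + ord("A"))
                   for c in plaintext)

print(shift_encrypt("attack", 3))   # -> DWWDFN
```

Since there are only 26 possible shifts, trying them all by hand is enough to break this cipher.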

The next improvement is to have each of the symbols in the plaintext, say, the 26 letters for simplicity, map onto some other letter. For example,


plaintext:  a b c d e f g h i j k l m n o p q r s t u v w x y z
ciphertext: Q W E R T Y U I O P A S D F G H J K L Z X C V B N M

The general system of symbol-for-symbol substitution is called a monoalphabetic substitution cipher, with the key being the 26-letter string corresponding to the full alphabet. For the key just given, the plaintext attack would be transformed into the ciphertext QZZQEA.

At first glance, this might appear to be a safe system because although the cryptanalyst knows the general system (letter-for-letter substitution), he does not know which of the 26! ≈ 4 × 10^26 possible keys is in use. In contrast with the Caesar cipher, trying all of them is not a promising approach. Even at 1 nsec per solution, a million cores working in parallel would take 10,000 years to try all the keys.

Nevertheless, given a surprisingly small amount of ciphertext, the cipher can be broken easily. The basic attack takes advantage of the statistical properties of natural languages. In English, for example, e is the most common letter, followed by t, o, a, n, i, etc. The most common two-letter combinations, or digrams, are th, in, er, re, and an. The most common three-letter combinations, or trigrams, are the, ing, and, and ion.

A cryptanalyst trying to break a monoalphabetic cipher would start out by counting the relative frequencies of all letters in the ciphertext. Then he might tentatively assign the most common one to e and the next most common one to t. He would then look at trigrams to find a common one of the form tXe, which strongly suggests that X is h. Similarly, if the pattern thYt occurs frequently, the Y probably stands for a. With this information, he can look for a frequently occurring trigram of the form aZW, which is most likely and. By making guesses at common letters, digrams, and trigrams and knowing about likely patterns of vowels and consonants, the cryptanalyst builds up a tentative plaintext, letter by letter.
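The first, counting, step of this attack can be sketched as follows; the mapping it returns is only a tentative starting point, to be refined by the digram and trigram reasoning described above:

```python
from collections import Counter

# Approximate frequency order of letters in English text.
ENGLISH_ORDER = "etoanirshdlucmfwgypbvkjxqz"

def guess_mapping(ciphertext: str) -> dict:
    """Tentatively pair ciphertext letters, most common first, with the
    English frequency order: most common -> 'e', next -> 't', and so on."""
    counts = Counter(c for c in ciphertext.lower() if c.isalpha())
    ranked = [c for c, _ in counts.most_common()]
    return dict(zip(ranked, ENGLISH_ORDER))
```

On a long enough ciphertext, the first few pairings are usually right, which is enough of a foothold for the trigram step.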

Another approach is to guess a probable word or phrase. For example, consider the following ciphertext from an accounting firm (blocked into groups of five characters):

CTBMN BYCTC BTJDS QXBNS GSTJC BTSWX CTQTZ CQVUJ QJSGS TJQZZ MNQJS VLNSX VSZJU JDSTS JQUUS JUBXJ DSKSU JSNTK BGAQJ ZBGYQ TLCTZ BNYBN QJSW

A likely word in a message from an accounting firm is financial. Using our knowledge that financial has a repeated letter (i), with four other letters between their occurrences, we look for repeated letters in the ciphertext at this spacing. We find 12 hits, at positions 6, 15, 27, 31, 42, 48, 56, 66, 70, 71, 76, and 82. However, only two of these, 31 and 42, have the next letter (corresponding to n in the plaintext) repeated in the proper place. Of these two, only 31 also has the a correctly positioned, so we know that financial begins at position 30. From this point on, deducing the key is easy by using the frequency statistics for English text and looking for nearly complete words to finish off.

8.4.4 Transposition Ciphers

Substitution ciphers preserve the order of the plaintext symbols but disguise them. Transposition ciphers, in contrast, reorder the letters but do not disguise them. Figure 8-10 depicts a common transposition cipher, the columnar transposition. The cipher is keyed by a word or phrase not containing any repeated letters. In this example, MEGABUCK is the key. The purpose of the key is to order the columns, with column 1 being under the key letter closest to the start of the alphabet, and so on. The plaintext is written horizontally, in rows, padded to fill the matrix if need be. The ciphertext is read out by columns, starting with the column whose key letter is the lowest.

M E G A B U C K
7 4 5 1 2 8 3 6
p l e a s e t r
a n s f e r o n
e m i l l i o n
d o l l a r s t
o m y s w i s s
b a n k a c c o
u n t s i x t w
o t w o a b c d

Plaintext:
pleasetransferonemilliondollarstomyswissbankaccountsixtwotwo

Ciphertext:
AFLLSKSOSELAWAIATOOSSCTCLNMOMANT
ESILYNTWRNNTSOWDPAEDOBUOERIRICXB

Figure 8-10. A transposition cipher.

To break a transposition cipher, the cryptanalyst must first be aware that he is dealing with a transposition cipher. By looking at the frequency of E, T, A, O, I, N, etc., it is easy to see if they fit the normal pattern for plaintext. If so, the cipher is clearly a transposition cipher because in such a cipher every letter represents itself, keeping the frequency distribution intact.

The next step is to make a guess at the number of columns. In many cases, a probable word or phrase may be guessed at from the context. For example, suppose that our cryptanalyst suspects that the plaintext phrase milliondollars occurs somewhere in the message. Observe that digrams MO, IL, LL, LA, IR, and OS occur in the ciphertext as a result of this phrase wrapping around. The ciphertext letter O follows the ciphertext letter M (i.e., they are vertically adjacent in column 4) because they are separated in the probable phrase by a distance equal to the key length. If a key of length seven had been used, the digrams MD, IO, LL, LL, IA, OR, and NS would have occurred instead. In fact, for each key length, a different set of digrams is produced in the ciphertext. By hunting for the various possibilities, the cryptanalyst can often easily determine the key length.


The remaining step is to order the columns. When the number of columns, k, is small, each of the k(k − 1) column pairs can be examined in turn to see if its digram frequencies match those for English plaintext. The pair with the best match is assumed to be correctly positioned. Now each of the remaining columns is tentatively tried as the successor to this pair. The column whose digram and trigram frequencies give the best match is tentatively assumed to be correct. The next column is found in the same way. The entire process is continued until a potential ordering is found. Chances are that the plaintext will be recognizable at this point (e.g., if milloin occurs, it is clear what the error is).

Some transposition ciphers accept a fixed-length block of input and produce a fixed-length block of output. These ciphers can be completely described by giving a list telling the order in which the characters are to be output. For example, the cipher of Fig. 8-10 can be seen as a 64-character block cipher. Its output is 4, 12, 20, 28, 36, 44, 52, 60, 5, 13, . . . , 62. In other words, the fourth input character, a, is the first to be output, followed by the twelfth, f, and so on.
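The columnar transposition of Fig. 8-10 can be sketched in a few lines (the function name is ours); running it on the MEGABUCK example reproduces the ciphertext in the figure:

```python
# Columnar transposition keyed by a word without repeated letters. The
# read-out order of the columns follows the alphabetical order of the
# key letters, as described in the text.
def transpose_encrypt(plaintext: str, key: str) -> str:
    cols = len(key)
    # Pad the plaintext to fill the matrix (Fig. 8-10 pads with abcd...).
    pad = (-len(plaintext)) % cols
    plaintext += "abcdefgh"[:pad]
    rows = [plaintext[i:i + cols] for i in range(0, len(plaintext), cols)]
    order = sorted(range(cols), key=lambda i: key[i])
    return "".join(row[i] for i in order for row in rows).upper()

msg = "pleasetransferonemilliondollarstomyswissbankaccountsixtwotwo"
print(transpose_encrypt(msg, "MEGABUCK"))
```

The sorted column indices for MEGABUCK are A, B, C, E, G, K, M, U, which is exactly the 1 2 3 4 5 6 7 8 read-out order shown under the key in the figure.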

8.4.5 One-Time Pads

Constructing an unbreakable cipher is actually quite easy; the technique has been known for decades. First, choose a random bit string as the key. Then convert the plaintext into a bit string, for example, by using its ASCII representation. Finally, compute the XOR (eXclusive OR) of these two strings, bit by bit. The resulting ciphertext cannot be broken because in a sufficiently large sample of ciphertext, each letter will occur equally often, as will every digram, every trigram, and so on. This method, known as the one-time pad, is immune to all present and future attacks, no matter how much computational power the intruder has. The reason derives from information theory: there is simply no information in the message because all possible plaintexts of the given length are equally likely.

An example of how one-time pads are used is given in Fig. 8-11. First, message 1, ‘‘I love you.’’ is converted to 7-bit ASCII. Then a one-time pad, pad 1, is chosen and XORed with the message to get the ciphertext. A cryptanalyst could try all possible one-time pads to see what plaintext came out for each one. For example, the one-time pad listed as pad 2 in the figure could be tried, resulting in plaintext 2, ‘‘Elvis lives,’’ which may or may not be plausible (a subject beyond the scope of this book). In fact, for every 11-character ASCII plaintext, there is a one-time pad that generates it. That is what we mean by saying there is no information in the ciphertext: you can get any message of the correct length out of it.
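A short demonstration of this ambiguity (the xor helper is ours): given only the ciphertext, an analyst can construct a pad that ‘‘decrypts’’ it to any plaintext of the right length.

```python
import os

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

message = b"I love you."                 # 11 characters, as in Fig. 8-11
pad1 = os.urandom(len(message))          # a random one-time pad
ciphertext = xor(message, pad1)

# For any candidate plaintext of the same length, a perfectly consistent
# pad exists: pad2 = ciphertext XOR candidate.
pad2 = xor(ciphertext, b"Elvis lives")
assert xor(ciphertext, pad2) == b"Elvis lives"   # wrong but consistent
assert xor(ciphertext, pad1) == message          # the real plaintext
```

Both pads explain the ciphertext equally well, which is exactly why the ciphertext carries no information about which plaintext was sent.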

One-time pads are great in theory, but have a number of disadvantages in practice. To start with, the key cannot be memorized, so both sender and receiver must carry a written copy with them. If either one is subject to capture, written keys are clearly undesirable. Additionally, the total amount of data that can be transmitted is limited by the amount of key available. If the spy strikes it rich and discovers a wealth of data, he may find himself unable to transmit them back to headquarters


Message 1:   1001001 0100000 1101100 1101111 1110110 1100101 0100000 1111001 1101111 1110101 0101110
Pad 1:       1010010 1001011 1110010 1010101 1010010 1100011 0001011 0101010 1010111 1100110 0101011
Ciphertext:  0011011 1101011 0011110 0111010 0100100 0000110 0101011 1010011 0111000 0010011 0000101

Pad 2:       1011110 0000111 1101000 1010011 1010111 0100110 1000111 0111010 1001110 1110110 1110110
Plaintext 2: 1000101 1101100 1110110 1101001 1110011 0100000 1101100 1101001 1110110 1100101 1110011

Figure 8-11. The use of a one-time pad for encryption and the possibility of getting any possible plaintext from the ciphertext by the use of some other pad.

because the key has been used up. Another problem is the sensitivity of the method to lost or inserted characters. If the sender and receiver get out of synchronization, all data from then on will appear garbled.

With the advent of computers, the one-time pad might potentially become practical for some applications. The source of the key could be a special DVD that contains several gigabytes of information and, if transported in a DVD movie box and prefixed by a few minutes of video, would not even be suspicious. Of course, at gigabit network speeds, having to insert a new DVD every 30 sec could become tedious. And the DVDs must be personally carried from the sender to the receiver before any messages can be sent, which greatly reduces their practical utility. Also, given that very soon nobody will use DVD or Blu-ray discs any more, anyone caught carrying around a box of them should perhaps be regarded with suspicion.

Quantum Cryptography

Interestingly, there may be a solution to the problem of how to transmit the one-time pad over the network, and it comes from a very unlikely source: quantum mechanics. This area is still experimental, but initial tests are promising. If it can be perfected and made efficient, virtually all cryptography will eventually be done using one-time pads since they are provably secure. Below we will briefly explain how this method, quantum cryptography, works. In particular, we will describe a protocol called BB84 after its authors and publication year (Bennett and Brassard, 1984).

Suppose that a user, Alice, wants to establish a one-time pad with a second user, Bob. Alice and Bob are called principals, the main characters in our story. For example, Bob is a banker with whom Alice would like to do business. The names ‘‘Alice’’ and ‘‘Bob’’ have been used for the principals in virtually every paper and book on cryptography since Ron Rivest introduced them many years ago (Rivest et al., 1978). Cryptographers love tradition. If we were to use ‘‘Andy’’ and ‘‘Barbara’’ as the principals, no one would believe anything in this chapter. So be it.

If Alice and Bob could establish a one-time pad, they could use it to communicate securely. The obvious question is: how can they establish it without having previously exchanged it physically (using DVDs, books, or USB sticks)? We can assume that Alice and Bob are at the opposite ends of an optical fiber over which they can send and receive light pulses. However, an intrepid intruder, Trudy, can cut the fiber to splice in an active tap. Trudy can read all the bits sent in both directions. She can also send false messages in both directions. The situation might seem hopeless for Alice and Bob, but quantum cryptography can shed some new light on the subject.

Quantum cryptography is based on the fact that light comes in microscopic little packets called photons, which have some peculiar properties. Furthermore, light can be polarized by being passed through a polarizing filter, a fact well known to both sunglasses wearers and photographers. If a beam of light (i.e., a stream of photons) is passed through a polarizing filter, all the photons emerging from it will be polarized in the direction of the filter’s axis (e.g., vertically). If the beam is now passed through a second polarizing filter, the intensity of the light emerging from the second filter is proportional to the square of the cosine of the angle between the axes. If the two axes are perpendicular, no photons get through. The absolute orientation of the two filters does not matter; only the angle between their axes counts.

To generate a one-time pad, Alice needs two sets of polarizing filters. Set one consists of a vertical filter and a horizontal filter. This choice is called a rectilinear basis. A basis (plural: bases) is just a coordinate system. The second set of filters is the same, except rotated 45 degrees, so one filter runs from the lower left to the upper right and the other filter runs from the upper left to the lower right. This choice is called a diagonal basis. Thus, Alice has two bases, which she can rapidly insert into her beam at will. In reality, Alice does not have four separate filters, but a crystal whose polarization can be switched electrically to any of the four allowed directions at great speed. Bob has the same equipment as Alice. The fact that Alice and Bob each have two bases available is essential to quantum cryptography.

For each basis, Alice now assigns one direction as 0 and the other as 1. In the example presented below, we assume she chooses vertical to be 0 and horizontal to be 1. Independently, she also chooses lower left to upper right as 0 and upper left to lower right as 1. She sends these choices to Bob as plaintext, fully aware that Trudy will be able to read her message.

Now Alice picks a one-time pad, for example, based on a random number generator (a complex subject all by itself). She transfers it bit by bit to Bob, choosing one of her two bases at random for each bit. To send a bit, her photon gun emits one photon polarized appropriately for the basis she is using for that bit. For example, she might choose bases of diagonal, rectilinear, rectilinear, diagonal, rectilinear, etc. To send her one-time pad of 1001110010100110 with these bases, she would send the photons shown in Fig. 8-12(a). Given the one-time pad and the sequence of bases, the polarization to use for each bit is uniquely determined. Bits sent one photon at a time are called qubits.


[Figure 8-12. An example of quantum cryptography. For each of the 16 bit positions (numbered 0 to 15), the figure shows: (a) the data bits Alice sends, 1001110010100110, as polarized photons; (b) the bases Bob picks at random; (c) what Bob gets; (d) whether Bob chose the correct basis (No Yes No Yes No No No Yes Yes No Yes Yes Yes No Yes No); (e) the resulting one-time pad, 01011001, taken from the correctly guessed positions; (f) Trudy’s randomly chosen bases; and (g) Trudy’s pad, x 0 x 1 x x x ? 1 x ? ? 0 x ? x, where a digit means Trudy captured that pad bit, ‘‘?’’ marks a pad bit she missed, and ‘‘x’’ marks a bit that is not part of the pad.]

Bob does not know which bases to use, so he picks one at random for each arriving photon and just uses it, as shown in Fig. 8-12(b). If he picks the correct basis, he gets the correct bit. If he picks the incorrect basis, he gets a random bit because if a photon hits a filter polarized at 45 degrees to its own polarization, it randomly jumps to the polarization of the filter or to a polarization perpendicular to the filter, with equal probability. This property of photons is fundamental to quantum mechanics. Thus, some of the bits are correct and some are random, but Bob does not know which are which. Bob’s results are depicted in Fig. 8-12(c).

How does Bob find out which bases he got right and which he got wrong? He tells Alice (in plaintext) which basis he used for each bit, and she tells him (also in plaintext) which choices were right and which were wrong, as shown in Fig. 8-12(d). From this information, both of them can build a bit string from the correct guesses, as shown in Fig. 8-12(e). On the average, this bit string will be half the length of the original bit string, but since both parties know it, they can use it as a one-time pad. All Alice has to do is transmit a bit string slightly more than twice the desired length, and she and Bob will have a one-time pad of the desired length. Done.

But wait a minute. We forgot Trudy for the moment. Suppose that she is curious about what Alice has to say and cuts the fiber, inserting her own detector and transmitter. Unfortunately for her, she does not know which basis to use for each photon either. The best she can do is pick one at random for each photon, just as Bob does. An example of her choices is shown in Fig. 8-12(f). When Bob later reports (in plaintext) which bases he used and Alice tells him (in plaintext) which ones are correct, Trudy now knows when she got it right and when she got it wrong. In Fig. 8-12, she got it right for bits 0, 1, 2, 3, 4, 6, 8, 12, and 13. But she knows from Alice’s reply in Fig. 8-12(d) that only bits 1, 3, 7, 8, 10, 11, 12, and 14 are part of the one-time pad. For four of these bits (1, 3, 8, and 12), she guessed right and captured the correct bit. For the other four (7, 10, 11, and 14), she guessed wrong and does not know the bit transmitted. Thus, Bob knows the one-time pad starts with 01011001, from Fig. 8-12(e), but all Trudy has is 01?1??0?, from Fig. 8-12(g).
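The basis-matching bookkeeping in this exchange is easy to simulate classically. The Python sketch below (the function and variable names are invented for this illustration) models a photon read in the wrong basis as a uniformly random bit; it demonstrates the logic of the protocol, not real quantum hardware.

```python
import random

def bb84_sketch(pad_bits, seed=0):
    # Classical simulation of the bookkeeping: when Bob's basis matches
    # Alice's he reads the bit correctly; otherwise he reads a random bit,
    # mimicking a photon measured in the wrong basis.
    rng = random.Random(seed)
    alice_bases = [rng.choice("RD") for _ in pad_bits]  # R = rectilinear, D = diagonal
    bob_bases = [rng.choice("RD") for _ in pad_bits]
    bob_bits = [b if ab == bb else rng.randint(0, 1)
                for b, ab, bb in zip(pad_bits, alice_bases, bob_bases)]
    # Bob announces his bases; both sides keep only the matching positions.
    keep = [i for i, (ab, bb) in enumerate(zip(alice_bases, bob_bases)) if ab == bb]
    return [pad_bits[i] for i in keep], [bob_bits[i] for i in keep]

alice_pad, bob_pad = bb84_sketch([1,0,0,1,1,1,0,0,1,0,1,0,0,1,1,0])
assert alice_pad == bob_pad          # surviving positions always agree
```

Because Bob keeps only the positions where his basis matched Alice’s, the surviving bits always agree, and on average half of the transmitted bits survive.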

Of course, Alice and Bob are aware that Trudy may have captured part of their one-time pad, so they would like to reduce the information Trudy has. They can do this by performing a transformation on it. For example, they could divide the one-time pad into blocks of 1024 bits, square each one to form a 2048-bit number, and use the concatenation of these 2048-bit numbers as the one-time pad. With her partial knowledge of the bit string transmitted, Trudy has no way to generate its square and so has nothing. The transformation from the original one-time pad to a different one that reduces Trudy’s knowledge is called privacy amplification. In practice, complex transformations in which every output bit depends on every input bit are used instead of squaring.

Poor Trudy. Not only does she have no idea what the one-time pad is, but her presence is not a secret either. After all, she must relay each received bit to Bob to trick him into thinking he is talking to Alice. The trouble is, the best she can do is transmit the qubit she received, using the polarization she used to receive it, and about half the time she will be wrong, causing many errors in Bob’s one-time pad.

When Alice finally starts sending data, she encodes it using a heavy forward-error-correcting code. From Bob’s point of view, a 1-bit error in the one-time pad is the same as a 1-bit transmission error. Either way, he gets the wrong bit. If there is enough forward error correction, he can recover the original message despite all the errors, but he can easily count how many errors were corrected. If this number is far more than the expected error rate of the equipment, he knows that Trudy has tapped the line and can act accordingly (e.g., tell Alice to switch to a radio channel, call the police, etc.). If Trudy had a way to clone a photon so she had one photon to inspect and an identical photon to send to Bob, she could avoid detection, but at present no way to clone a photon perfectly is known. And even if Trudy could clone photons, the value of quantum cryptography to establish one-time pads would not be reduced.

Although quantum cryptography has been shown to operate over distances of 60 km of fiber, the equipment is complex and expensive. Still, the idea has promise if it can be made to scale up and become cheaper. For more information about quantum cryptography, see Clancy et al. (2019).

8.5 SYMMETRIC-KEY ALGORITHMS

Modern cryptography uses the same basic ideas as traditional cryptography (transposition and substitution), but its emphasis is different. Traditionally, cryptographers have used simple algorithms. Nowadays, the reverse is true: the object is to make the encryption algorithm so complex and involuted that even if the cryptanalyst acquires vast mounds of enciphered text of his own choosing, he will not be able to make any sense of it at all without the key.

The first class of encryption algorithms we will study in this chapter are called symmetric-key algorithms because they use the same key for encryption and decryption. Fig. 8-9 illustrates the use of a symmetric-key algorithm. In particular, we will focus on block ciphers, which take an n-bit block of plaintext as input and transform it using the key into an n-bit block of ciphertext.

Cryptographic algorithms can be implemented in either hardware (for speed) or software (for flexibility). Although most of our treatment concerns the algorithms and protocols, which are independent of the actual implementation, a few words about building cryptographic hardware may be of interest. Transpositions and substitutions can be implemented with simple electrical circuits. Figure 8-13(a) shows a device, known as a P-box (P stands for permutation), used to effect a transposition on an 8-bit input. If the 8 bits are designated from top to bottom as 01234567, the output of this particular P-box is 36071245. By appropriate internal wiring, a P-box can be made to perform any transposition and do it at practically the speed of light since no computation is involved, just signal propagation. This design follows Kerckhoffs’ principle: the attacker knows that the general method is permuting the bits. What he does not know is which bit goes where.

[Figure 8-13. Basic elements of product ciphers. (a) P-box. (b) S-box. (c) Product. Panel (a) shows an 8-bit P-box, panel (b) a 3-bit S-box built from a decoder, an internal P-box, and an encoder, and panel (c) a product cipher in which 12 input lines pass through alternating permutation stages (P1 to P4) and banks of S-boxes (S1 to S12).]

Substitutions are performed by S-boxes, as shown in Fig. 8-13(b). In this example, a 3-bit plaintext is entered and a 3-bit ciphertext is output. The 3-bit input selects one of the eight lines exiting from the first stage and sets it to 1; all the other lines are 0. The second stage is a P-box. The third stage encodes the selected input line in binary again. With the wiring shown, if the eight octal numbers 01234567 were input one after another, the output sequence would be 24506713.


In other words, 0 has been replaced by 2, 1 has been replaced by 4, etc. Again, by appropriate wiring of the P-box inside the S-box, any substitution can be accomplished. Furthermore, such a device can be built in hardware to achieve great speed, since encoders and decoders have only one or two (subnanosecond) gate delays and the propagation time across the P-box may well be less than 1 picosec.

The real power of these basic elements only becomes apparent when we cascade a whole series of boxes to form a product cipher, as shown in Fig. 8-13(c). In this example, 12 input lines are transposed (i.e., permuted) by the first stage (P1). In the second stage, the input is broken up into four groups of 3 bits, each of which is substituted independently of the others (S1 to S4). This arrangement shows a method of approximating a larger S-box from multiple, smaller S-boxes. It is useful because small S-boxes are practical for a hardware implementation (e.g., an 8-bit S-box can be realized as a 256-entry lookup table), but large S-boxes become quite unwieldy to build (e.g., a 12-bit S-box would at a minimum need 2^12 = 4096 crossed wires in its middle stage). Although this method is less general, it is still powerful. By including a sufficiently large number of stages in the product cipher, the output can be made to be an exceedingly complicated function of the input.

Product ciphers that operate on k-bit inputs to produce k-bit outputs are common. One common value for k is 256. A hardware implementation usually has at least 20 physical stages, instead of just 7 as in Fig. 8-13(c). A software implementation has a loop with at least eight iterations, each one performing S-box-type substitutions on subblocks of the 64- to 256-bit data block, followed by a permutation that mixes the outputs of the S-boxes. Often there is a special initial permutation and one at the end as well. In the literature, the iterations are called rounds.
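To make the structure concrete, here is a toy product cipher in Python along the lines of Fig. 8-13(c). The P-box and S-box tables are invented for this example and have no cryptographic strength; the point is only the alternation of permutation and substitution over several rounds.

```python
# A toy product cipher on 8-bit blocks: each round substitutes the two
# 4-bit halves through a small S-box, then permutes the 8 bits through a
# P-box. Tables are made up for illustration; this is not a real cipher.
PBOX = [3, 6, 0, 7, 1, 2, 4, 5]           # output bit i comes from input bit PBOX[i]
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]  # a 4-bit S-box (16-entry table)

def permute(block):
    bits = [(block >> (7 - i)) & 1 for i in range(8)]
    out = [bits[PBOX[i]] for i in range(8)]
    return sum(b << (7 - i) for i, b in enumerate(out))

def round_fn(block):
    hi, lo = SBOX[block >> 4], SBOX[block & 0xF]  # two small S-boxes side by side
    return permute((hi << 4) | lo)

def encrypt(block, rounds=4):
    for _ in range(rounds):
        block = round_fn(block)
    return block
```

Because the S-box and the P-box are both permutations, each round is invertible, so the whole cipher is a bijection on 8-bit blocks, which is exactly what makes decryption possible.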

8.5.1 The Data Encryption Standard

In January 1977, the U.S. Government adopted a product cipher developed by IBM as its official standard for unclassified information. This cipher, DES (Data Encryption Standard), was widely adopted by the industry for use in security products. It is no longer secure in its original form, but in a modified form it is still used here and there. The original version was controversial because IBM specified a 128-bit key, but after discussions with NSA, IBM ‘‘voluntarily’’ decided to reduce the key length to 56 bits, which cryptographers at the time said was too small.

DES operates essentially as shown in Fig. 8-13(c), but on bigger units. The plaintext (in binary) is broken up into 64-bit units, and each one is encrypted separately by doing permutations and substitutions parametrized by the 56-bit key on each of 16 consecutive rounds. In effect, it is a gigantic monoalphabetic substitution cipher on an alphabet with 64-bit characters (about which more shortly).

As early as 1979, IBM realized that 56 bits was much too short and devised a backward-compatible scheme to increase the key length by having two 56-bit keys used at once, for a total of 112 bits worth of key (Tuchman, 1979). The new scheme, called triple DES, is still in use and works like this.

[Figure 8-14. (a) Triple encryption using DES. (b) Decryption. In (a), the plaintext P passes through E with K1, D with K2, and E with K1 again to produce the ciphertext C. In (b), the ciphertext C passes through D with K1, E with K2, and D with K1 to recover P.]

Obvious questions are: (1) Why two keys instead of three? and (2) Why encryption-decryption-encryption? The answer to both is that if a computer that uses triple DES has to talk to one that uses only single DES, it can set both keys to the same value and then apply triple DES to give the same result as single DES. This design made it easier to phase in triple DES. It is basically obsolete now, but still in use in some change-resistant applications.
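The E-D-E composition and its backward compatibility can be sketched with a stand-in block cipher. The toy ‘‘cipher’’ below is just XOR with the key, not real DES, but the composition and the K1 = K2 collapse work the same way.

```python
# Sketch of the triple-DES E-D-E construction, with a toy one-block
# "cipher" (XOR with the key) standing in for real DES.
def E(k, block): return block ^ k        # toy encryption
def D(k, block): return block ^ k        # toy decryption (XOR is its own inverse)

def triple_encrypt(k1, k2, block):
    return E(k1, D(k2, E(k1, block)))    # encrypt-decrypt-encrypt

def triple_decrypt(k1, k2, block):
    return D(k1, E(k2, D(k1, block)))    # the mirror image

# Round trip works with two distinct keys.
assert triple_decrypt(0xA5, 0x3C, triple_encrypt(0xA5, 0x3C, 0x42)) == 0x42
# With k1 == k2, E-D-E collapses to a single encryption -- the property
# that made triple DES backward compatible with single DES.
assert triple_encrypt(0x77, 0x77, 0x42) == E(0x77, 0x42)
```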

8.5.2 The Advanced Encryption Standard

As DES began approaching the end of its useful life, even with triple DES, NIST (National Institute of Standards and Technology), the agency of the U.S. Dept. of Commerce charged with approving standards for the U.S. Federal Government, decided that the government needed a new cryptographic standard for unclassified use. NIST was keenly aware of all the controversy surrounding DES and well knew that if it just announced a new standard, everyone knowing anything about cryptography would automatically assume that NSA had built a back door into it so NSA could read everything encrypted with it. Under these conditions, probably no one would use the standard and it would have died quietly.

So, NIST took a surprisingly different approach for a government bureaucracy: it sponsored a cryptographic bake-off (contest). In January 1997, researchers from all over the world were invited to submit proposals for a new standard, to be called AES (Advanced Encryption Standard). The bake-off rules were:

1. The algorithm must be a symmetric block cipher.

2. The full design must be public.

3. Key lengths of 128, 192, and 256 bits must be supported.

4. Both software and hardware implementations must be possible.

5. The algorithm must be public or licensed on nondiscriminatory terms.

Fifteen serious proposals were made, and public conferences were organized in which they were presented and attendees were actively encouraged to find flaws in all of them. In August 1998, NIST selected five finalists, primarily on the basis of their security, efficiency, simplicity, flexibility, and memory requirements (important for embedded systems). More conferences were held and more potshots taken at the contestants.

In October 2000, NIST announced that it had selected Rijndael, invented by Joan Daemen and Vincent Rijmen. The name Rijndael, pronounced Rhine-doll (more or less), is derived from the last names of the authors: Rijmen + Daemen. In November 2001, Rijndael became the AES U.S. Government standard, published as FIPS (Federal Information Processing Standard) 197. Owing to the extraordinary openness of the competition, the technical properties of Rijndael, and the fact that the winning team consisted of two young Belgian cryptographers (who were unlikely to have built in a back door just to please NSA), Rijndael has become the world’s dominant cryptographic cipher. AES encryption and decryption are now part of the instruction set of some CPUs.

Rijndael supports key lengths and block sizes from 128 bits to 256 bits in steps of 32 bits. The key length and block length may be chosen independently. However, AES specifies that the block size must be 128 bits and the key length must be 128, 192, or 256 bits. It is doubtful that anyone will ever use 192-bit keys, so de facto, AES has two variants: a 128-bit block with a 128-bit key and a 128-bit block with a 256-bit key.

In our treatment of the algorithm, we will examine only the 128/128 case because this is the commercial norm. A 128-bit key gives a key space of 2^128 ≈ 3 × 10^38 keys. Even if NSA manages to build a machine with 1 billion parallel processors, each being able to evaluate one key per picosecond, it would take such a machine about 10^10 years to search the key space. By then the sun will have burned out, so the folks then present will have to read the results by candlelight.
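The estimate is easy to check with integer arithmetic:

```python
# Back-of-the-envelope check of the brute-force estimate above:
# 10**9 parallel processors, each testing one key per picosecond.
keys = 2**128                          # size of the key space
keys_per_second = 10**9 * 10**12       # processors x keys/sec per processor
seconds_per_year = 365 * 24 * 3600
years = keys // (keys_per_second * seconds_per_year)
assert 10**9 < years < 10**11          # about 10**10 years, as claimed
```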

Rijndael

From a mathematical perspective, Rijndael is based on Galois field theory, which gives it some provable security properties. However, it can also be viewed as C code, without getting into the mathematics.

Like DES, Rijndael uses both substitution and permutations, and it also uses multiple rounds. The number of rounds depends on the key size and block size, being 10 for 128-bit keys with 128-bit blocks and moving up to 14 for the largest key or the largest block. However, unlike DES, all operations involve an integral number of bytes, to allow for efficient implementations in both hardware and software. DES is bit oriented and software implementations are slow as a result.

The algorithm has been designed not only for great security, but also for great speed. A good software implementation on a 2-GHz machine should be able to achieve an encryption rate of 700 Mbps, which is fast enough to encrypt over a dozen 4K videos in real time. Hardware implementations are faster still.

8.5.3 Cipher Modes

Despite all this complexity, AES (or DES, or any block cipher for that matter) is basically a monoalphabetic substitution cipher using big characters (128-bit characters for AES and 64-bit characters for DES). Whenever the same plaintext block goes in the front end, the same ciphertext block comes out the back end. If you encrypt the plaintext abcdefgh 100 times with the same DES or AES key, you get the same ciphertext 100 times. An intruder can exploit this property to help subvert the cipher.

Electronic Code Book Mode

To see how this monoalphabetic substitution cipher property can be used to partially defeat the cipher, we will use (triple) DES because it is easier to depict 64-bit blocks than 128-bit blocks, but AES has exactly the same problem. The straightforward way to use DES to encrypt a long piece of plaintext is to break it up into consecutive 8-byte (64-bit) blocks and encrypt them one after another with the same key. The last piece of plaintext is padded out to 64 bits, if need be. This technique is known as ECB mode (Electronic Code Book mode) in analogy with old-fashioned code books where each plaintext word was listed, followed by its ciphertext (usually a five-digit decimal number).

In Fig. 8-15, we have the start of a computer file listing the annual bonuses a company has decided to award to its employees. This file consists of consecutive 32-byte records, one per employee, in the format shown: 16 bytes for the name, 8 bytes for the position, and 8 bytes for the bonus. Each of the sixteen 8-byte blocks (numbered from 0 to 15) is encrypted by (triple) DES.

Name              Position   Bonus
Adams, Leslie     Clerk      $10
Black, Robin      Boss       $500,000
Collins, Kim      Manager    $100,000
Davis, Bobbie     Janitor    $5

Bytes:            16         8          8

Figure 8-15. The plaintext of a file encrypted as 16 DES blocks.

Leslie just had a fight with the boss and is not expecting much of a bonus. Kim, in contrast, is the boss’ favorite, and everyone knows this. Leslie can get access to the file after it is encrypted but before it is sent to the bank. Can Leslie rectify this unfair situation, given only the encrypted file?

No problem at all. All Leslie has to do is make a copy of the 12th ciphertext block (which contains Kim’s bonus) and use it to replace the fourth ciphertext block (which contains Leslie’s bonus). Even without knowing what the 12th block says, Leslie can expect to have a much merrier Christmas this year. (Copying the eighth ciphertext block is also a possibility, but is more likely to be detected; besides, Leslie is not a greedy person.)
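Leslie’s attack is easy to demonstrate. The sketch below uses a made-up per-block ‘‘cipher’’ (XOR with a fixed key) in place of triple DES and only two of the four records of Fig. 8-15; the attack itself never decrypts anything.

```python
# A toy demonstration of the ECB block-swapping attack. The record
# layout follows Fig. 8-15 (16 name + 8 position + 8 bonus bytes); the
# per-block "cipher" is invented (XOR with an 8-byte key).
KEY = bytes(range(1, 9))

def ecb_encrypt(data):
    # Encrypt each 8-byte block independently -- that is ECB's weakness.
    return b"".join(bytes(a ^ b for a, b in zip(data[i:i+8], KEY))
                    for i in range(0, len(data), 8))

ecb_decrypt = ecb_encrypt                  # the XOR toy is self-inverse

def record(name, position, bonus):
    return (name.ljust(16) + position.ljust(8) + bonus.ljust(8)).encode()

plain = record("Adams, Leslie", "Clerk", "$10") + \
        record("Collins, Kim", "Manager", "$100,000")
cipher = bytearray(ecb_encrypt(plain))

# Copy Kim's bonus block (block 7) over Leslie's bonus block (block 3),
# working purely on ciphertext.
cipher[3*8:4*8] = cipher[7*8:8*8]
tampered = ecb_decrypt(bytes(cipher))
assert tampered[24:32] == b"$100,000"      # Leslie's bonus field now reads $100,000
```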

Cipher Block Chaining Mode

To thwart this type of attack, all block ciphers can be chained in various ways so that replacing a block the way Leslie did will cause the plaintext decrypted starting at the replaced block to be garbage. One method to do so is cipher block chaining. In this method, shown in Fig. 8-16, each plaintext block is XORed with the previous ciphertext block before being encrypted. Consequently, the same plaintext block no longer maps onto the same ciphertext block, and the encryption is no longer a big monoalphabetic substitution cipher. The first block is XORed with a randomly chosen IV (Initialization Vector), which is transmitted (in plaintext) along with the ciphertext.

[Figure 8-16. Cipher block chaining. (a) Encryption. (b) Decryption. In (a), each plaintext block P0, P1, P2, P3 is XORed with the previous ciphertext block (the IV for the first block) and then run through the encryption box E under the key to produce C0, C1, C2, C3. In (b), each ciphertext block is run through the decryption box D and then XORed with the previous ciphertext block (or the IV) to recover the plaintext.]

We can see how cipher block chaining mode works by examining the example of Fig. 8-16. We start out by computing C0 = E(P0 XOR IV). Then we compute C1 = E(P1 XOR C0), and so on. Decryption also uses XOR to reverse the process, with P0 = IV XOR D(C0), and so on. Note that the encryption of block i is a function of all the plaintext in blocks 0 through i − 1, so the same plaintext generates different ciphertext depending on where it occurs. A transformation of the type Leslie made will result in nonsense for two blocks starting at Leslie’s bonus field. To an astute security officer, this peculiarity might suggest where to start the ensuing investigation.

Cipher block chaining also has the advantage that the same plaintext block will not result in the same ciphertext block, making cryptanalysis more difficult. In fact, this is the main reason it is used.
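The chaining logic is independent of the underlying block cipher. The sketch below substitutes a toy XOR ‘‘cipher’’ for DES or AES, but the XOR-then-encrypt chaining is exactly the scheme described above.

```python
import os

# Cipher block chaining with a toy 8-byte block "cipher" (XOR with the
# key) standing in for DES or AES. Only the chaining logic is realistic.
def E(block, key): return bytes(a ^ b for a, b in zip(block, key))
def D(block, key): return E(block, key)      # the XOR toy is self-inverse

def xor(a, b): return bytes(x ^ y for x, y in zip(a, b))

def cbc_encrypt(blocks, key, iv):
    out, prev = [], iv
    for p in blocks:
        c = E(xor(p, prev), key)             # C_i = E(P_i XOR C_{i-1})
        out.append(c)
        prev = c
    return out

def cbc_decrypt(blocks, key, iv):
    out, prev = [], iv
    for c in blocks:
        out.append(xor(D(c, key), prev))     # P_i = D(C_i) XOR C_{i-1}
        prev = c
    return out

key, iv = bytes(range(8)), os.urandom(8)     # random IV, sent in plaintext
plain = [b"ABCDEFGH", b"ABCDEFGH"]           # two identical plaintext blocks
cipher = cbc_encrypt(plain, key, iv)
assert cipher[0] != cipher[1]                # chaining hides the repetition
assert cbc_decrypt(cipher, key, iv) == plain
```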

Cipher Feedback Mode

However, cipher block chaining has the disadvantage of requiring an entire 64-bit block to arrive before decryption can begin. For byte-by-byte encryption, cipher feedback mode using (triple) DES is used, as shown in Fig. 8-17. For AES, the idea is exactly the same, only a 128-bit shift register is used. In this figure, the state of the encryption machine is shown after bytes 0 through 9 have been encrypted and sent. When plaintext byte 10 arrives, as illustrated in Fig. 8-17(a), the DES algorithm operates on the 64-bit shift register to generate a 64-bit ciphertext. The leftmost byte of that ciphertext is extracted and XORed with P10. That byte is transmitted on the transmission line. In addition, the shift register is shifted left 8 bits, causing C2 to fall off the left end, and C10 is inserted in the position just vacated at the right end by C9.

[Figure 8-17. Cipher feedback mode. (a) Encryption. (b) Decryption. In both panels, a 64-bit shift register holding C2 through C9 is run through the encryption box E under the key; the leftmost byte of the result is selected and XORed with P10 to give C10 (encryption), or with C10 to give P10 (decryption); the ciphertext byte C10 is then shifted into the register.]

Note that the contents of the shift register depend on the entire previous history of the plaintext, so a pattern that repeats multiple times in the plaintext will be encrypted differently each time in the ciphertext. As with cipher block chaining, an initialization vector is needed to start the ball rolling.

Decryption with cipher feedback mode works the same way as encryption. In particular, the content of the shift register is encrypted, not decrypted, so the selected byte that is XORed with C10 to get P10 is the same one that was XORed with P10 to generate C10 in the first place. As long as the two shift registers remain identical, decryption works correctly. This is illustrated in Fig. 8-17(b).

A problem with cipher feedback mode is that if one bit of the ciphertext is accidentally inverted during transmission, the 8 bytes that are decrypted while the bad byte is in the shift register will be corrupted. Once the bad byte is pushed out of the shift register, correct plaintext will once again be generated thereafter. Thus, the effects of a single inverted bit are relatively localized and do not ruin the rest of the message, but they do ruin as many bits as the shift register is wide.
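The feedback logic and its error window can be sketched as follows. The keystream function is a made-up keyed fold standing in for ‘‘encrypt the shift register with DES and take the leftmost byte’’; only the shift-register mechanics are realistic.

```python
def toy_keystream_byte(reg, key):
    # Stand-in for "run DES on the shift register and take the leftmost
    # byte": a keyed fold that depends on every register byte.
    h = 0xA5
    for b, k in zip(reg, key):
        h = ((h * 131) & 0xFF) ^ b ^ k
    return h

def cfb(data, key, iv, decrypt=False):
    reg, out = bytes(iv), bytearray()
    for byte in data:
        o = byte ^ toy_keystream_byte(reg, key)
        out.append(o)
        fb = byte if decrypt else o        # the CIPHERTEXT byte is fed back
        reg = reg[1:] + bytes([fb])        # shift left one byte, insert it
    return bytes(out)

key, iv = bytes(range(8)), bytes(8)
msg = b"THE QUICK BROWN FOX"
ct = bytearray(cfb(msg, key, iv))
ct[5] ^= 0x01                              # a single transmission bit error
bad = cfb(bytes(ct), key, iv, decrypt=True)
assert bad[:5] == msg[:5]                  # bytes before the error decrypt fine
assert bad[14:] == msg[14:]                # and so do bytes after the bad byte
                                           # leaves the 8-byte register
```

Byte 5 itself comes out with the flipped bit, bytes 6 through 13 are garbled while the bad byte sits in the register, and everything from byte 14 on is correct again, just as the text describes.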

Stream Cipher Mode

Nevertheless, applications exist in which having a 1-bit transmission error mess up 64 bits of plaintext is too large an effect. For these applications, a fourth option, stream cipher mode, exists. It works by encrypting an initialization vector (IV), using a key to get an output block. The output block is then encrypted, using the key to get a second output block. This block is then encrypted to get a third block, and so on. The (arbitrarily large) sequence of output blocks, called the keystream, is treated like a one-time pad and XORed with the plaintext to get the ciphertext, as shown in Fig. 8-18(a). Note that the IV is used only on the first step. After that, the output is encrypted. Also, note that the keystream is independent of the data, so it can be computed in advance, if need be, and is completely insensitive to transmission errors. Decryption is shown in Fig. 8-18(b).

[Figure 8-18. A stream cipher. (a) Encryption. (b) Decryption. In both panels, the IV is fed into the encryption box E under the key to produce the keystream, which is XORed with the plaintext to give the ciphertext in (a), or with the ciphertext to recover the plaintext in (b).]

Decryption occurs by generating the same keystream at the receiving side. Since the keystream depends only on the IV and the key, it is not affected by transmission errors in the ciphertext. Thus, a 1-bit error in the transmitted ciphertext generates only a 1-bit error in the decrypted plaintext.

It is essential never to use the same (key, IV) pair twice with a stream cipher because doing so will generate the same keystream each time. Using the same keystream twice exposes the ciphertext to a keystream reuse attack. Imagine that the plaintext block, P0, is encrypted with the keystream to get P0 XOR K0. Later, a second plaintext block, Q0, is encrypted with the same keystream to get Q0 XOR K0. An intruder who captures both of these ciphertext blocks can simply XOR them together to get P0 XOR Q0, which eliminates the key. The intruder now has the XOR of the two plaintext blocks. If one of them is known or can be reasonably guessed, the other can also be found. In any event, the XOR of two plaintext streams can be attacked by using statistical properties of the message.


For example, for English text, the most common character in the stream will probably be the XOR of two spaces, followed by the XOR of space and the letter ‘‘e’’, and so on. In short, equipped with the XOR of two plaintexts, the cryptanalyst has an excellent chance of deducing both of them.
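The keystream reuse attack is simple enough to demonstrate in a few lines; the two messages here are invented for the example.

```python
import os

# Keystream reuse: XORing two ciphertexts encrypted under the same
# keystream cancels the keystream and leaks the XOR of the plaintexts.
p = b"transfer $100 to alice"
q = b"transfer $999 to trudy"
keystream = os.urandom(len(p))

cp = bytes(a ^ k for a, k in zip(p, keystream))
cq = bytes(a ^ k for a, k in zip(q, keystream))

leak = bytes(a ^ b for a, b in zip(cp, cq))
assert leak == bytes(a ^ b for a, b in zip(p, q))   # the keystream is gone

# If the intruder knows (or guesses) one plaintext, the other falls out.
recovered_q = bytes(a ^ b for a, b in zip(leak, p))
assert recovered_q == q
```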

8.6 PUBLIC-KEY ALGORITHMS

Historically, distributing the keys has always been the weakest link in most cryptosystems. No matter how strong a cryptosystem was, if an intruder could steal the key, the system was worthless. Cryptologists always took for granted that the encryption key and decryption key were the same (or easily derived from one another). But the key had to be distributed to all users of the system. Thus, it seemed as if there was an inherent problem. Keys had to be protected from theft, but they also had to be distributed, so they could not be locked in a bank vault.

In 1976, two researchers at Stanford University, Diffie and Hellman (1976), proposed a radically new kind of cryptosystem, one in which the encryption and decryption keys were so different that the decryption key could not feasibly be derived from the encryption key. In their proposal, the (keyed) encryption algorithm, E, and the (keyed) decryption algorithm, D, had to meet three requirements. These requirements can be stated simply as follows:

1. D(E(P)) = P.

2. It is exceedingly difficult to deduce D from E.

3. E cannot be broken by a chosen plaintext attack.

The first requirement says that if we apply D to an encrypted message, E(P), we get the original plaintext message, P, back. Without this property, the legitimate receiver could not decrypt the ciphertext. The second requirement speaks for itself. The third requirement is needed because, as we shall see in a moment, intruders may experiment with the algorithm to their hearts’ content. Under these conditions, there is no reason that the encryption key cannot be made public.

The method works like this. A person, say, Alice, who wants to receive secret messages, first devises two algorithms meeting the above requirements. The encryption algorithm and Alice’s key are then made public, hence the name public-key cryptography. Alice might put her public key on her home page on the Web, for example. We will use the notation EA to mean the encryption algorithm parametrized by Alice’s public key. Similarly, the (secret) decryption algorithm parameterized by Alice’s private key is DA. Bob does the same thing, publicizing EB but keeping DB secret.

Now let us see if we can solve the problem of establishing a secure channel between Alice and Bob, who have never had any previous contact. Both Alice’s encryption key, EA, and Bob’s encryption key, EB, are assumed to be in publicly readable files. Now Alice takes her first message, P, computes EB(P), and sends it to Bob. Bob then decrypts it by applying his secret key DB [i.e., he computes DB(EB(P)) = P]. No one else can read the encrypted message, EB(P), because the encryption system is assumed to be strong and because it is too difficult to derive DB from the publicly known EB. To send a reply, R, Bob transmits EA(R). Alice and Bob can now communicate securely.

A note on terminology is perhaps useful here. Public-key cryptography requires each user to have two keys: a public key, used by the entire world for encrypting messages to be sent to that user, and a private key, which the user needs for decrypting messages. We will consistently refer to these keys as the public and private keys, respectively, and distinguish them from the secret keys used for conventional symmetric-key cryptography.

8.6.1 RSA

The only catch is that we need to find algorithms that indeed satisfy all three requirements. Due to the potential advantages of public-key cryptography, many researchers are hard at work, and some algorithms have already been published. One good method was discovered by a group at M.I.T. (Rivest et al., 1978). It is known by the initials of the three discoverers (Rivest, Shamir, Adleman): RSA. It has survived all attempts to break it for more than 40 years and is considered very strong. Much practical security is based on it. For this reason, Rivest, Shamir, and Adleman were given the 2002 ACM Turing Award. Its major disadvantage is that it requires keys of at least 2048 bits for good security (versus 256 bits for symmetric-key algorithms), which makes it quite slow.

The RSA method is based on some principles from number theory. We will now summarize how to use the method; for details, consult their paper.

1. Choose two large primes, p and q (say, 1024 bits).

2. Compute n = p × q and z = (p − 1) × (q − 1).

3. Choose a number relatively prime to z and call it d.

4. Find e such that e × d = 1 mod z.

With these parameters computed in advance, we are ready to begin encryption. Divide the plaintext (regarded as a bit string) into blocks, so that each plaintext message, P, falls in the interval 0 ≤ P < n. Do that by grouping the plaintext into blocks of k bits, where k is the largest integer for which 2^k < n is true.

To encrypt a message, P, compute C = P^e (mod n). To decrypt C, compute P = C^d (mod n). It can be proven that for all P in the specified range, the encryption and decryption functions are inverses. To perform the encryption, you need e and n. To perform the decryption, you need d and n. Therefore, the public key consists of the pair (e, n) and the private key consists of (d, n).
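The four steps can be sketched in a few lines of Python (the primes and d below are hypothetical toy values; real RSA uses primes of 1024 bits or more plus padding such as OAEP):

```python
# Toy walkthrough of the four RSA steps above. Requires Python 3.8+ for
# pow(d, -1, z), the modular inverse.
p, q = 61, 53                 # step 1: two (very small) primes
n = p * q                     # step 2: n = 3233
z = (p - 1) * (q - 1)         # step 2: z = 3120
d = 7                         # step 3: a number relatively prime to z
e = pow(d, -1, z)             # step 4: e such that e * d = 1 (mod z)

P = 1234                      # a plaintext block, 0 <= P < n
C = pow(P, e, n)              # encryption: C = P^e (mod n)
assert pow(C, d, n) == P      # decryption: P = C^d (mod n)
```

Note that the text chooses d first and then derives e; with unpadded textbook RSA like this, the two exponents play symmetric roles.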


The security of the method is based on the difficulty of factoring large numbers. If the cryptanalyst could factor the (publicly known) n, he could then find p and q, and from these z. Equipped with knowledge of z and e, d can be found using Euclid’s algorithm. Fortunately, mathematicians have been trying to factor large numbers for at least 300 years, and the accumulated evidence suggests that it is an exceedingly difficult problem.

At the time, Rivest and colleagues concluded that factoring a 500-digit number would require 10^25 years using brute force. In both cases, they assumed the best-known algorithm and a computer with a 1-µsec instruction time. With a million chips running in parallel, each with an instruction time of 1 nsec, it would still take 10^16 years. Even if computers continue to get faster by an order of magnitude per decade, it will be many years before factoring a 500-digit number becomes feasible, at which time our descendants can simply choose p and q still larger. However, it will probably not come as a surprise that the attacks have made progress and are now significantly faster.

A trivial pedagogical example of how the RSA algorithm works is given in Fig. 8-19. For this example, we have chosen p = 3 and q = 11, giving n = 33 and z = 20 (since (3 − 1) × (11 − 1) = 20). A suitable value for d is d = 7, since 7 and 20 have no common factors. With these choices, e can be found by solving the equation 7e = 1 (mod 20), which yields e = 3. The ciphertext, C, corresponding to a plaintext message, P, is given by C = P^3 (mod 33). The ciphertext is decrypted by the receiver by making use of the rule P = C^7 (mod 33). The figure shows the encryption of the plaintext ‘‘SUZANNE’’ as an example.

Plaintext (P)           Ciphertext (C)                            After decryption
Symbolic   Numeric   P^3      P^3 (mod 33)   C^7           C^7 (mod 33)   Symbolic
   S         19      6859          28        13492928512        19            S
   U         21      9261          21        1801088541         21            U
   Z         26      17576         20        1280000000         26            Z
   A         01      1              1        1                  01            A
   N         14      2744           5        78125              14            N
   N         14      2744           5        78125              14            N
   E         05      125           26        8031810176         05            E

(The P^3 columns are the sender’s computation; the C^7 columns are the receiver’s computation.)

Figure 8-19. An example of the RSA algorithm.

Because the primes chosen for this example are so small, P must be less than 33, so each plaintext block can contain only a single character. The result is a monoalphabetic substitution cipher, not very impressive. If instead we had chosen p and q ≈ 2^512, we would have n ≈ 2^1024, so each block could be up to 1024 bits or 128 eight-bit characters, versus 8 characters for DES and 16 characters for AES.
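The figure’s arithmetic can be checked in a few lines (a sketch using the letter-to-number coding A = 1, ..., Z = 26 from the figure):

```python
# Reproducing the Fig. 8-19 toy example: p = 3, q = 11, n = 33, e = 3, d = 7.
n, e, d = 33, 3, 7
numeric = [ord(c) - ord('A') + 1 for c in "SUZANNE"]   # S=19, U=21, ..., E=5
cipher = [pow(P, e, n) for P in numeric]               # C = P^3 (mod 33)
plain = [pow(C, d, n) for C in cipher]                 # P = C^7 (mod 33)
assert plain == numeric
# cipher is [28, 21, 20, 1, 5, 5, 26], matching the figure's middle column
```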


It should be pointed out that using RSA as we have described is similar to using a symmetric algorithm in ECB mode—the same input block gives the same output block. Therefore, some form of chaining is needed for data encryption. However, in practice, most RSA-based systems use public-key cryptography primarily for distributing one-time 128- or 256-bit session keys for use with some symmetric-key algorithm such as AES. RSA is too slow for actually encrypting large volumes of data but is widely used for key distribution.

8.6.2 Other Public-Key Algorithms

Although RSA is still widely used, it is by no means the only public-key algorithm known. The first public-key algorithm was the knapsack algorithm (Merkle and Hellman, 1978). The idea here is that someone owns a very large number of objects, each with a different weight. The owner encodes the message by secretly selecting a subset of the objects and placing them in the knapsack. The total weight of the objects in the knapsack is made public, as is the list of all possible objects and their corresponding weights. The list of objects in the knapsack is kept secret. With certain additional restrictions, the problem of figuring out a possible list of objects with the given weight was thought to be computationally infeasible and formed the basis of the public-key algorithm.

The algorithm’s inventor, Ralph Merkle, was quite sure that this algorithm could not be broken, so he offered a $100 reward to anyone who could break it. Adi Shamir (the ‘‘S’’ in RSA) promptly broke it and collected the reward. Undeterred, Merkle strengthened the algorithm and offered a $1000 reward to anyone who could break the new one. Ronald Rivest (the ‘‘R’’ in RSA) promptly broke the new one and collected the reward. Merkle did not dare offer $10,000 for the next version, so ‘‘A’’ (Leonard Adleman) was out of luck. Nevertheless, the knapsack algorithm is not considered secure and is not used in practice any more.

Other public-key schemes are based on the difficulty of computing discrete logarithms or on elliptic curves (Menezes and Vanstone, 1993). Algorithms that use discrete logarithms have been invented by El Gamal (1985) and Schnorr (1991). Elliptic curves, meanwhile, are based on a branch of mathematics that is not so well known except among the elliptic curve illuminati.

A few other schemes exist, but those based on the difficulty of factoring large numbers, computing discrete logarithms modulo a large prime, and elliptic curves are by far the most important. These problems are thought to be genuinely difficult to solve—mathematicians have been working on them for many years without any great breakthroughs. Elliptic curves in particular enjoy a lot of interest because the elliptic curve discrete logarithm problem is even harder than factorization. The Dutch mathematician Arjen Lenstra proposed a way to compare cryptographic algorithms by computing how much energy you need to break them. According to this calculation, breaking a 228-bit RSA key takes the energy equivalent to that needed to boil less than a teaspoon of water. Breaking an elliptic curve key of that length would require as much energy as you would need to boil all the water on the planet. Paraphrasing Lenstra: with all the water evaporated, including that in the bodies of would-be code breakers, the problem would run out of steam.

8.7 DIGITAL SIGNATURES

The authenticity of many legal, financial, and other documents is determined by the presence or absence of an authorized handwritten signature. And photocopies do not count. For computerized message systems to replace the physical transport of paper-and-ink documents, a method must be found to allow documents to be signed in an unforgeable way.

The problem of devising a replacement for handwritten signatures is a difficult one. Basically, what is needed is a system by which one party can send a signed message to another party in such a way that the following conditions hold:

1. The receiver can verify the claimed identity of the sender.

2. The sender cannot later repudiate the contents of the message.

3. The receiver cannot possibly have concocted the message himself.

The first requirement is needed, for example, in financial systems. When a customer’s computer orders a bank’s computer to buy a ton of gold, the bank’s computer needs to be able to make sure that the computer giving the order really belongs to the customer whose account is to be debited. In other words, the bank has to authenticate the customer (and the customer has to authenticate the bank).

The second requirement is needed to protect the bank against fraud. Suppose that the bank buys the ton of gold, and immediately thereafter the price of gold drops sharply. A dishonest customer might then proceed to sue the bank, claiming that he never issued any order to buy gold. When the bank produces the message in court, the customer may deny having sent it. The property that no party to a contract can later deny having signed it is called nonrepudiation. The digital signature schemes that we will now study help provide it.

The third requirement is needed to protect the customer in the event that the price of gold shoots up and the bank tries to construct a signed message in which the customer asked for one bar of gold instead of one ton. In this fraud scenario, the bank just keeps the rest of the gold for itself.

8.7.1 Symmetric-Key Signatures

One approach to digital signatures is to have a central authority that knows everything and whom everyone trusts, say, Big Brother (BB). Each user then chooses a secret key and carries it by hand to BB’s office. Thus, only Alice and BB know Alice’s secret key, KA, and so on. In case you get lost with all the notation, symbols, and subscripts, have a look at Fig. 8-20, which summarizes the most important notations for this and subsequent sections.

Term Description

A Alice (sender)

B Bob the Banker (recipient)

P Plaintext message Alice wants to send

BB Big Brother (a trusted central authority)

t Timestamp (to ensure freshness)

RA Random number chosen by Alice

Symmetric key

KA Alice’s secret key (analogous for KB, KBB, etc.)

KA(M ) Message M encrypted/decrypted with Alice’s secret key

Asymmetric keys

DA Alice’s private key (analogous for DB, etc.)

EA Alice’s public key (analogous for EB, etc.)

DA(M) Message M encrypted/decrypted with Alice’s private key

EA(M) Message M encrypted/decrypted with Alice’s public key

Digest

MD(P) Message digest of plaintext P

Figure 8-20. Alice wants to send a message to her banker: a legend of keys and symbols.

When Alice wants to send a signed plaintext message, P, to her banker, Bob, she generates KA(B, RA, t, P), where B is Bob’s identity, RA is a random number chosen by Alice, t is a timestamp to ensure freshness, and KA(B, RA, t, P) is the message encrypted with her key, KA. Then she sends it as depicted in Fig. 8-21. BB sees that the message is from Alice, decrypts it, and sends a message to Bob as shown. The message to Bob contains the plaintext of Alice’s message and also the signed message KBB(A, t, P). Bob now carries out Alice’s request.
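The evidentiary core of this protocol, BB’s signed statement KBB(A, t, P), can be sketched with an HMAC standing in for BB’s secret-key encryption. The key and messages below are hypothetical; this is a sketch of the idea, not the protocol itself:

```python
# KBB(A, t, P) modeled as an HMAC tag under BB's secret key: only BB can
# create or verify it, so it serves as BB's testimony that Alice sent P at t.
import hashlib
import hmac

K_BB = b"big-brother-secret-key"          # hypothetical key, known only to BB

def bb_vouch(sender: str, t: float, plaintext: bytes) -> bytes:
    msg = f"{sender}|{t}|".encode() + plaintext
    return hmac.new(K_BB, msg, hashlib.sha256).digest()

def bb_confirm(sender: str, t: float, plaintext: bytes, tag: bytes) -> bool:
    # What BB does for the judge: recompute the tag and compare.
    return hmac.compare_digest(bb_vouch(sender, t, plaintext), tag)

tag = bb_vouch("Alice", 1700000000.0, b"transfer the money to Bob")
assert bb_confirm("Alice", 1700000000.0, b"transfer the money to Bob", tag)
```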

1. Alice → BB: A, KA(B, RA, t, P)
2. BB → Bob: KB(A, RA, t, P, KBB(A, t, P))

Figure 8-21. Digital signatures with Big Brother.

What happens if Alice later denies sending the message? Step 1 is that everyone sues everyone (at least, in the United States). Finally, when the case comes to court and Alice vigorously denies sending Bob the disputed message, the judge


will ask Bob how he can be sure that the disputed message came from Alice and not from Trudy. Bob first points out that BB will not accept a message from Alice unless it is encrypted with KA, so there is no possibility of Trudy sending BB a false message from Alice without BB detecting it immediately.

Bob then dramatically produces Exhibit A: KBB(A, t, P). Bob says that this is a message signed by BB that proves Alice sent P to Bob. The judge then asks BB (whom everyone trusts) to decrypt Exhibit A. When BB testifies that Bob is telling the truth, the judge decides in favor of Bob. Case dismissed.

One potential problem with the signature protocol of Fig. 8-21 is Trudy replaying either message. To minimize this problem, timestamps are used throughout. Furthermore, Bob can check all recent messages to see if RA was used in any of them. If so, the message is discarded as a replay. Note that based on the timestamp, Bob will reject very old messages. To guard against instant replay attacks, Bob just checks the RA of every incoming message to see if such a message has been received from Alice in the past hour. If not, Bob can safely assume this is a new request.

8.7.2 Public-Key Signatures

A structural problem with using symmetric-key cryptography for digital signatures is that everyone has to agree to trust Big Brother. Furthermore, Big Brother gets to read all signed messages. The most logical candidates for running the Big Brother server are the government, the banks, the accountants, and the lawyers. Unfortunately, none of these inspire total confidence in all citizens. Hence, it would be nice if signing documents did not require a trusted authority.

Fortunately, public-key cryptography can make an important contribution in this area. Let us assume that the public-key encryption and decryption algorithms have the property that E(D(P)) = P, in addition, of course, to the usual property that D(E(P)) = P. (RSA has this property, so the assumption is not unreasonable.) Assuming that this is the case, Alice can send a signed plaintext message, P, to Bob by transmitting EB(DA(P)). Note carefully that Alice knows her own (private) key, DA, as well as Bob’s public key, EB, so constructing this message is something Alice can do.

When Bob receives the message, he transforms it using his private key, as usual, yielding DA(P), as shown in Fig. 8-22. He stores this text in a safe place and then applies EA to get the original plaintext. To see how the signature property works, suppose that Alice subsequently denies having sent the message P to Bob. When the case comes up in court, Bob can produce both P and DA(P). The judge can easily verify that Bob indeed has a valid message encrypted by DA by simply applying EA to it. Since Bob does not know what Alice’s private key is, the only way Bob could have acquired a message encrypted by it is if Alice did indeed send it. While in jail for perjury and fraud, Alice will have much time to devise interesting new public-key algorithms.
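A minimal sketch of this signature property with a hypothetical toy RSA key (unpadded textbook RSA; real signatures sign a digest and use padding):

```python
# E(D(P)) = P in action: sign with the private key, verify with the public one.
# Toy parameters: EA = (17, 3233) is public, DA = (2753, 3233) is private.
n, e, d = 3233, 17, 2753

P = 1234                       # message block, 0 <= P < n
signed = pow(P, d, n)          # DA(P): only the private-key holder can make this
assert pow(signed, e, n) == P  # anyone can apply EA and recover P
```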

Alice’s computer: P → [DA, Alice’s private key] → DA(P) → [EB, Bob’s public key] → EB(DA(P)), which travels over the transmission line.
Bob’s computer: EB(DA(P)) → [DB, Bob’s private key] → DA(P) → [EA, Alice’s public key] → P.

Figure 8-22. Digital signatures using public-key cryptography.

Although using public-key cryptography for digital signatures is an elegant scheme, there are problems that are related to the environment in which they operate rather than to the basic algorithm. For one thing, Bob can prove that a message was sent by Alice only as long as DA remains secret. If Alice discloses her secret key, the argument no longer holds because anyone could have sent the message, including Bob himself.

The problem might arise, for example, if Bob is Alice’s stockbroker. Suppose that Alice tells Bob to buy a certain stock or bond. Immediately thereafter, the price drops sharply. To repudiate her message to Bob, Alice runs to the police claiming that her home was burglarized and the computer holding her key was stolen. Depending on the laws in her state or country, she may or may not be legally liable, especially if she claims not to have discovered the break-in until getting home from work, several hours after it allegedly happened.

Another problem with the signature scheme is what happens if Alice decides to change her key. Doing so is clearly legal, and it is probably a good idea to do so periodically. If a court case later arises, as described above, the judge will apply the current EA to DA(P) and discover that it does not produce P. Bob will look pretty stupid at this point.

In principle, any public-key algorithm can be used for digital signatures. The de facto industry standard is the RSA algorithm. Many security products use it. However, in 1991, NIST proposed using a variant of the El Gamal public-key algorithm for its new Digital Signature Standard (DSS). El Gamal gets its security from the difficulty of computing discrete logarithms, rather than from the difficulty of factoring large numbers.

As usual when the government tries to dictate cryptographic standards, there was an uproar. DSS was criticized for being

1. Too secret (NSA designed the protocol for using El Gamal).

2. Too slow (10 to 40 times slower than RSA for checking signatures).

3. Too new (El Gamal had not yet been thoroughly analyzed).

4. Too insecure (fixed 512-bit key).

In a subsequent revision, the fourth point was rendered moot when keys up to 1024 bits were allowed. Nevertheless, the first two points remain valid.

8.7.3 Message Digests

One criticism of signature methods is that they often couple two distinct functions: authentication and secrecy. Often, authentication is needed but secrecy is not. Also, getting an export license is often easier if the system in question provides only authentication but not secrecy. Below we will describe an authentication scheme that does not require encrypting the entire message.

This scheme is based on the idea of a one-way hash function that takes an arbitrarily long piece of plaintext and from it computes a fixed-length bit string. This hash function, MD, often called a message digest, has four important properties:

1. Given P, it is easy to compute MD(P).

2. Given MD(P), it is effectively impossible to find P.

3. Given P, no one can find P′ such that MD(P′) = MD(P).

4. A change to the input of even 1 bit produces a very different output.

To meet criterion 3, the hash should be at least 128 bits long, preferably more. To meet criterion 4, the hash must mangle the bits very thoroughly, not unlike the symmetric-key encryption algorithms we have seen.
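Property 4, the avalanche effect, is easy to observe with an off-the-shelf digest such as SHA-256 (standing in here for the generic MD):

```python
# Flipping a single input bit ('e' = 0x65 vs. 'd' = 0x64 in the last byte)
# changes roughly half of the 256 output bits of SHA-256.
import hashlib

h1 = int(hashlib.sha256(b"message").hexdigest(), 16)
h2 = int(hashlib.sha256(b"messagd").hexdigest(), 16)
flipped = bin(h1 ^ h2).count("1")   # number of differing output bits
```

Typically around 128 of the 256 bits differ, even though the inputs differ in only one bit.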

Computing a message digest from a piece of plaintext is much faster than encrypting that plaintext with a public-key algorithm, so message digests can be used to speed up digital signature algorithms. To see how this works, consider the signature protocol of Fig. 8-21 again. Instead of signing P with KBB(A, t, P), BB now computes the message digest by applying MD to P, yielding MD(P). BB then encloses KBB(A, t, MD(P)) as the fifth item in the list encrypted with KB that is sent to Bob, instead of KBB(A, t, P). If a dispute arises, Bob can produce both P and KBB(A, t, MD(P)). After Big Brother has decrypted it for the judge, Bob has MD(P), which is guaranteed to be genuine, and the alleged P. However, since it is effectively impossible for Bob to find any other message that gives this hash, the judge will easily be convinced that Bob is telling the truth. Using message digests in this way saves both encryption time and message transport costs.

Message digests work in public-key cryptosystems, too, as shown in Fig. 8-23. Here, Alice first computes the message digest of her plaintext. She then signs the message digest and sends both the signed digest and the plaintext to Bob. If Trudy replaces P along the way, Bob will see this when he computes MD(P).
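This hash-then-sign idea can be sketched with a toy RSA key standing in for Alice’s DA/EA and SHA-256 standing in for MD. The digest is reduced mod n only because the toy key is tiny; real systems use full-size keys and standardized padding:

```python
# Hash-then-sign sketch: Alice signs MD(P), not P itself.
import hashlib

n, e, d = 3233, 17, 2753                  # toy key: EA = (e, n), DA = (d, n)

def md(P: bytes) -> int:
    # SHA-256 digest, squeezed into the toy key's range (illustration only).
    return int.from_bytes(hashlib.sha256(P).digest(), "big") % n

P = b"buy 1 ton of gold"
sig = pow(md(P), d, n)                    # DA(MD(P)), sent along with P
assert pow(sig, e, n) == md(P)            # Bob recomputes MD(P), applies EA
```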

SHA-1, SHA-2 and SHA-3

A variety of message digest functions have been proposed. For a long time, one of the most widely used functions was SHA-1 (Secure Hash Algorithm 1) (NIST, 1993). Before we commence our explanation, it is important to realize that


Alice → Bob: P, DA(MD(P))

Figure 8-23. Digital signatures using message digests.

SHA-1 has been broken since 2017 and is now being phased out by many systems, but more about this later. Like all message digests, SHA-1 operates by mangling bits in a sufficiently complicated way that every output bit is affected by every input bit. SHA-1 was developed by NSA and blessed by NIST in FIPS 180-1. It processes input data in 512-bit blocks, and it generates a 160-bit message digest. A typical way for Alice to send a nonsecret but signed message to Bob is illustrated in Fig. 8-24. Here, her plaintext message is fed into the SHA-1 algorithm to get a 160-bit SHA-1 hash. Alice then signs the hash with her RSA private key and sends both the plaintext message and the signed hash to Bob.

Alice’s plaintext message M (arbitrary length) → SHA-1 algorithm → 160-bit SHA-1 hash of M, H → RSA algorithm (with Alice’s private key, DA) → signed hash DA(H). Both M and DA(H) are sent to Bob.

Figure 8-24. Use of SHA-1 and RSA for signing nonsecret messages.

After receiving the message, Bob computes the SHA-1 hash himself and also applies Alice’s public key to the signed hash to get the original hash, H. If the two agree, the message is considered valid. Since there is no way for Trudy to modify the (plaintext) message while it is in transit and produce a new one that hashes to H, Bob can easily detect any changes Trudy has made to the message. For messages whose integrity is important but whose contents are not secret, the scheme of Fig. 8-24 is widely used. For a relatively small cost in computation, it guarantees that any modifications made to the plaintext message in transit can be detected with very high probability.

New versions of SHA-1 have been developed that produce hashes of 224, 256, 384, and 512 bits, respectively. Collectively, these versions are called SHA-2. Not only are these hashes longer than SHA-1 hashes, but the digest function has been changed to combat some potential weaknesses of SHA-1. The weaknesses are serious. In 2017, SHA-1 was broken by a team of researchers from Google and the CWI research center in Amsterdam. Specifically, the researchers were able to generate hash collisions, essentially killing the security of SHA-1. Not surprisingly, the attack led to an increased interest in SHA-2.

In 2006, the National Institute of Standards and Technology (NIST) started organizing a competition for a new hash standard, which is now known as SHA-3. The competition closed in 2012. Three years later, the new SHA-3 standard (‘‘Keccak’’) was officially published. Interestingly, NIST does not suggest that we all dump SHA-2 in the trash and switch to SHA-3 because there are no successful attacks on SHA-2 yet. Even so, it is good to have a drop-in replacement lying around, just in case.

8.7.4 The Birthday Attack

In the world of crypto, nothing is ever what it seems to be. One might think that it would take on the order of 2^m operations to subvert an m-bit message digest. In fact, 2^(m/2) operations will often do using a birthday attack, an approach published by Yuval (1979) in his now-classic paper ‘‘How to Swindle Rabin.’’

Remember from our earlier discussion of the DNS birthday attack that if there is some mapping between inputs and outputs with n inputs (people, messages, etc.) and k possible outputs (birthdays, message digests, etc.), there are n(n − 1)/2 input pairs. If n(n − 1)/2 > k, the chance of having at least one match is pretty good. Thus, approximately, a match is likely for n > √k. This result means that a 64-bit message digest can probably be broken by generating about 2^32 messages and looking for two with the same message digest.
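The effect is easy to demonstrate by deliberately weakening a digest (SHA-256 truncated to 24 bits here, a purely illustrative toy): with only 2^24 possible outputs, a collision typically appears after a few thousand messages, not 2^24.

```python
# Birthday attack in miniature against a truncated 24-bit digest.
import hashlib
from itertools import count

def weak_md(msg: bytes) -> bytes:
    return hashlib.sha256(msg).digest()[:3]    # keep only 24 bits

seen = {}                                      # digest -> first message seen
for i in count():
    m = f"message {i}".encode()
    dgst = weak_md(m)
    if dgst in seen:
        break                                  # two messages, same digest
    seen[dgst] = m
# seen[dgst] and m are now distinct messages with identical 24-bit digests
```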

Let us look at a practical example. The Department of Computer Science at State University has one position for a tenured faculty member and two candidates, Tom and Dick. Tom was hired two years before Dick, so he goes up for review first. If he gets it, Dick is out of luck. Tom knows that the department chairperson, Marilyn, thinks highly of his work, so he asks her to write him a letter of recom- mendation to the Dean, who will decide on Tom’s case. Once sent, all letters be- come confidential.

Marilyn tells her secretary, Ellen, to write the Dean a letter, outlining what she wants in it. When it is ready, Marilyn will review it, compute and sign the 64-bit digest, and send it to the Dean. Ellen can send the letter later by email.

Unfortunately for Tom, Ellen is romantically involved with Dick and would like to do Tom in, so she writes the following letter with the 32 bracketed options:

Dear Dean Smith,

This [letter | message] is to give my [honest | frank] opinion of Prof. Tom Wilson, who is [a candidate | up] for tenure [now | this year]. I have [known | worked with] Prof. Wilson for [about | almost] six years. He is an [outstanding | excellent] researcher of great [talent | ability] known [worldwide | internationally] for his [brilliant | creative] insights into [many | a wide variety of] [difficult | challenging] problems.


He is also a [highly | greatly] [respected | admired] [teacher | educator]. His students give his [classes | courses] [rave | spectacular] reviews. He is [our | the Department’s] [most popular | best-loved] [teacher | instructor].

[In addition | Additionally] Prof. Wilson is a [gifted | effective] fund raiser. His [grants | contracts] have brought a [large | substantial] amount of money into [the | our] Department. [This money has | These funds have] [enabled | permitted] us to [pursue | carry out] many [special | important] programs, [such as | for example] your State 2025 program. Without these funds we would [be unable | not be able] to continue this program, which is so [important | essential] to both of us. I strongly urge you to grant him tenure.

Unfortunately for Tom, as soon as Ellen finishes composing and typing in this letter, she also writes a second one:

Dear Dean Smith,

This [letter | message] is to give my [honest | frank] opinion of Prof. Tom Wil- son, who is [a candidate | up] for tenure [now | this year]. I have [known | worked with] Tom for [about | almost] six years. He is a [poor | weak] researcher not well known in his [field | area]. His research [hardly ever | rarely] shows [insight in | understanding of] the [key | major] problems of [the | our] day.

Furthermore, he is not a [respected | admired] [teacher | educator]. His students give his [classes | courses] [poor | bad ] reviews. He is [our | the Department’s] least popular [teacher | instructor], known [mostly | primarily] within [the | our] Department for his [tendency | propensity] to [ridicule | embarrass] students

[foolish | imprudent] enough to ask questions in his classes.

[In addition | Additionally] Tom is a [poor | marginal] fund raiser. His [grants | contracts] have brought only a [meager | insignificant] amount of money into [the | our] Department. Unless new [money is | funds are] quickly located, we may have to cancel some essential programs, such as your State 2025 program. Unfortunately, under these [conditions | circumstances] I cannot in good [conscience | faith] recommend him to you for [tenure | a permanent position].

Now Ellen programs her computer to compute the 2^32 message digests of each letter overnight. Chances are, one digest of the first letter will match one digest of the second. If not, she can add a few more options and try again tonight. Suppose that she finds a match. Call the ‘‘good’’ letter A and the ‘‘bad’’ one B.

Ellen now emails letter A to Marilyn for approval. Letter B she keeps secret, showing it to no one. Marilyn, of course, approves it, computes her 64-bit message digest, signs the digest, and emails the signed digest off to Dean Smith. Independently, Ellen emails letter B to the Dean (not letter A, as she is supposed to).

After getting the letter and signed message digest, the Dean runs the message digest algorithm on letter B, sees that it agrees with what Marilyn sent him, and fires Tom. The Dean does not realize that Ellen managed to generate two letters with the same message digest and sent him a different one than the one Marilyn saw and approved. (Optional ending: Ellen tells Dick what she did. Dick is appalled


and breaks off the affair. Ellen is furious and confesses to Marilyn. Marilyn calls the Dean. Tom gets tenure after all.)

With SHA-2, the birthday attack is difficult because even at the ridiculous speed of 1 trillion digests per second, it would take over 32,000 years to compute all 2^80 digests of two letters with 80 variants each, and even then a match is not guaranteed. However, with a cloud of 1,000,000 chips working in parallel, 32,000 years becomes 2 weeks.

8.8 MANAGEMENT OF PUBLIC KEYS

Public-key cryptography makes it possible for people who do not share a common key in advance to nevertheless communicate securely. It also makes signing messages possible without the existence of a trusted third party. Finally, signed message digests make it possible for the recipient to verify the integrity of received messages easily and securely.

However, there is one problem that we have glossed over a bit too quickly: if Alice and Bob do not know each other, how do they get each other’s public keys to start the communication process? The obvious solution—put your public key on your Web site—does not work, for the following reason. Suppose that Alice wants to look up Bob’s public key on his Web site. How does she do it? She starts by typing in Bob’s URL. Her browser then looks up the DNS address of Bob’s home page and sends it a GET request, as shown in Fig. 8-25. Unfortunately, Trudy intercepts the request and replies with a fake home page, probably a copy of Bob’s home page except for the replacement of Bob’s public key with Trudy’s public key. When Alice now encrypts her first message with ET, Trudy decrypts it, reads it, re-encrypts it with Bob’s public key, and sends it to Bob, who is none the wiser that Trudy is reading his incoming messages. Worse yet, Trudy could modify the messages before re-encrypting them for Bob. Clearly, some mechanism is needed to make sure that public keys can be exchanged securely.

1. Alice → Trudy: GET Bob’s home page
2. Trudy → Alice: fake home page with ET
3. Alice → Trudy: ET(Message)
4. Trudy → Bob: EB(Message)

Figure 8-25. A way for Trudy to subvert public-key encryption.

8.8.1 Certificates

As a first attempt at distributing public keys securely, we could imagine a KDC (Key Distribution Center) available online 24 hours a day to provide public keys on demand. One of the many problems with this solution is that it is not scalable, and the key distribution center would rapidly become a bottleneck. Also, if it ever went down, Internet security would suddenly grind to a halt.

For these reasons, people have developed a different solution, one that does not require the key distribution center to be online all the time. In fact, it does not have to be online at all. Instead, what it does is certify the public keys belonging to people, companies, and other organizations. An organization that certifies public keys is now called a CA (Certification Authority).

As an example, suppose that Bob wants to allow Alice and other people he does not know to communicate with him securely. He can go to the CA with his public key along with his passport or driver’s license and ask to be certified. The CA then issues a certificate similar to the one in Fig. 8-26 and signs its SHA-2 hash with the CA’s private key. Bob then pays the CA’s fee and gets a document containing the certificate and its signed hash (ideally not sent over unreliable channels).

I hereby certify that the public key

19836A8B03030CF83737E3837837FC3s87092827262643FFA82710382828282A belongs to

Robert John Smith

12345 University Avenue

Berkeley, CA 94702

Birthday: July 4, 1958

Email: bob@superdupernet.com

SHA-2 hash of the above certificate signed with the CA’s private key

Figure 8-26. A possible certificate and its signed hash.

The fundamental job of a certificate is to bind a public key to the name of a principal (individual, company, etc.). Certificates themselves are not secret or protected. Bob might, for example, decide to put his new certificate on his Web site, with a link on the main page saying: click here for my public-key certificate. The resulting click would return both the certificate and the signature block (the signed SHA-2 hash of the certificate).

Now let us run through the scenario of Fig. 8-25 again. When Trudy intercepts Alice’s request for Bob’s home page, what can she do? She can put her own certificate and signature block on the fake page, but when Alice reads the contents of the certificate she will immediately see that she is not talking to Bob because Bob’s name is not in it. Trudy can modify Bob’s home page on the fly, replacing Bob’s public key with her own. However, when Alice runs the SHA-2 algorithm on the certificate, she will get a hash that does not agree with the one she gets when she applies the CA’s well-known public key to the signature block. Since Trudy does not have the CA’s private key, she has no way of generating a signature block that contains the hash of the modified Web page with her public key on it. In this way, Alice can be sure she has Bob’s public key and not Trudy’s or someone else’s.


And as we promised, this scheme does not require the CA to be online for verification, thus eliminating a potential bottleneck.
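The signature check Alice performs can be sketched with textbook RSA over a toy modulus. All numbers below are illustrative assumptions (real CAs use 2048-bit or larger keys with proper padding), and SHA-256 stands in for the chapter's generic SHA-2:

```python
import hashlib

# Toy RSA key pair for the CA (n = 61 * 53; e * d = 1 mod phi(n)).
# These sizes are for illustration only.
N, E, D = 3233, 17, 2753

def cert_hash(cert: str) -> int:
    # Hash the certificate text and reduce it into the toy modulus.
    return int.from_bytes(hashlib.sha256(cert.encode()).digest(), "big") % N

def ca_sign(cert: str) -> int:
    # The CA "encrypts" the hash with its private key: the signature block.
    return pow(cert_hash(cert), D, N)

def verify(cert: str, signature: int) -> bool:
    # Anyone can apply the CA's well-known public key to the signature
    # block and compare the result with a freshly computed hash.
    return pow(signature, E, N) == cert_hash(cert)

cert = "I hereby certify that the public key 19836A8B... belongs to Robert John Smith"
sig = ca_sign(cert)
assert verify(cert, sig)                                 # genuine certificate checks out
assert not verify(cert.replace("Robert", "Trudy"), sig)  # tampering is detected
```

Because Trudy lacks D, she cannot produce a signature that verifies under the CA's public key for any certificate of her own making.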

While the standard function of a certificate is to bind a public key to a principal, a certificate can also be used to bind a public key to an attribute. For example, a certificate could say: ‘‘This public key belongs to someone over 18.’’ It could be used to prove that the owner of the private key was not a minor and thus allowed to access material not suitable for children, and so on, but without disclosing the owner’s identity. Typically, the person holding the certificate would send it to the Web site, principal, or process that cared about age. That site, principal, or process would then generate a random number and encrypt it with the public key in the certificate. If the owner were able to decrypt it and send it back, that would be proof that the owner indeed had the attribute stated in the certificate. Alternatively, the random number could be used to generate a session key for the ensuing conversation.
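The proof-of-possession exchange just described can be sketched with textbook RSA over a toy modulus (the key sizes are illustrative assumptions; real keys are 2048 bits or more):

```python
import secrets

# Toy RSA key pair bound to the attribute certificate.
n, e = 3233, 17   # public key listed inside the "over 18" certificate
d = 2753          # private key held only by the certificate's owner

# The site that cares about the attribute issues a random challenge
# and encrypts it with the public key taken from the certificate.
challenge = secrets.randbelow(n)
encrypted = pow(challenge, e, n)

# Only the holder of the matching private key can recover the challenge,
# proving possession of the attribute without revealing any identity.
response = pow(encrypted, d, n)
assert response == challenge
```

Note that the verifier learns nothing about who the owner is, only that the owner controls the private key matching the certified attribute.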

Another example of where a certificate might contain an attribute is in an object-oriented distributed system. Each object normally has multiple methods. The owner of the object could provide each customer with a certificate giving a bit map of which methods the customer is allowed to invoke and binding the bit map to a public key using a signed certificate. Again, if the certificate holder can prove possession of the corresponding private key, he will be allowed to perform the methods in the bit map. This approach has the property that the owner’s identity need not be known, a property useful in situations where privacy is important.

8.8.2 X.509

If everybody who wanted something signed went to the CA with a different kind of certificate, managing all the different formats would soon become a problem. To solve this problem, a standard for certificates has been devised and approved by the International Telecommunication Union (ITU). The standard is called X.509 and is in widespread use on the Internet. It has gone through three versions since the initial standardization in 1988. We will discuss version 3.

X.509 has been heavily influenced by the OSI world, borrowing some of its worst features (e.g., naming and encoding). Surprisingly, IETF went along with X.509, even though in nearly every other area, from machine addresses to transport protocols to email formats, IETF generally ignored OSI and tried to do it right. The IETF version of X.509 is described in RFC 5280.

At its core, X.509 is a way to describe certificates. The primary fields in a certificate are listed in Fig. 8-27. The descriptions given there should provide a general idea of what the fields do. For additional information, please consult the standard itself or RFC 5280.

For example, if Bob works in the loan department of the Money Bank, his X.500 address might be

/C=US/O=MoneyBank/OU=Loan/CN=Bob/


Field                  Meaning

Version                Which version of X.509
Serial number          This number plus the CA’s name uniquely identifies the certificate
Signature algorithm    The algorithm used to sign the certificate
Issuer                 X.500 name of the CA
Validity period        The starting and ending times of the validity period
Subject name           The entity whose key is being certified
Public key             The subject’s public key and the ID of the algorithm using it
Issuer ID              An optional ID uniquely identifying the certificate’s issuer
Subject ID             An optional ID uniquely identifying the certificate’s subject
Extensions             Many extensions have been defined
Signature              The certificate’s signature (signed by the CA’s private key)

Figure 8-27. The basic fields of an X.509 certificate.

where C is for country, O is for organization, OU is for organizational unit, and CN is for common name. CAs and other entities are named in a similar way. A substantial problem with X.500 names is that if Alice is trying to contact bob@moneybank.com and is given a certificate with an X.500 name, it may not be obvious to her that the certificate refers to the Bob she wants. Fortunately, starting with version 3, DNS names are now permitted instead of X.500 names, so this problem may eventually vanish.
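Splitting such an X.500-style name into its component attributes is mechanical; a minimal sketch (the handling of attributes beyond the four shown is an assumption for illustration):

```python
def parse_x500(name: str) -> dict:
    """Split an X.500-style name such as /C=US/O=MoneyBank/OU=Loan/CN=Bob/
    into its attribute-value pairs."""
    parts = [p for p in name.strip("/").split("/") if p]
    return dict(part.split("=", 1) for part in parts)

dn = parse_x500("/C=US/O=MoneyBank/OU=Loan/CN=Bob/")
assert dn == {"C": "US", "O": "MoneyBank", "OU": "Loan", "CN": "Bob"}
```

The parsed form makes the mismatch problem concrete: nothing in this dictionary tells Alice that CN=Bob is the same principal as bob@moneybank.com.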

Certificates are encoded using OSI ASN.1 (Abstract Syntax Notation 1), which is sort of like a struct in C, except with an extremely peculiar and verbose notation. More information about X.509 is given by Ford and Baum (2000).

8.8.3 Public Key Infrastructures

Having a single CA to issue all the world’s certificates obviously would not work. It would collapse under the load and be a central point of failure as well. A possible solution might be to have multiple CAs, all run by the same organization and all using the same private key to sign certificates. While this would solve the load and failure problems, it introduces a new problem: key leakage. If there were dozens of servers spread around the world, all holding the CA’s private key, the chance of the private key being stolen or otherwise leaking out would be greatly increased. Since the compromise of this key would ruin the world’s electronic security infrastructure, having a single central CA is very risky.

In addition, which organization would operate the CA? It is hard to imagine any authority that would be accepted worldwide as legitimate and trustworthy. In some countries, people would insist that it be a government, while in other countries they would insist that it not be a government.


For these reasons, a different way of certifying public keys has evolved. It goes under the general name of PKI (Public Key Infrastructure). In this section, we will summarize how it works in general, although there have been many proposals, so the details will probably evolve over time.

A PKI has multiple components, including users, CAs, certificates, and directories. What the PKI does is provide a way of structuring these components and define standards for the various documents and protocols. A particularly simple form of PKI is a hierarchy of CAs, as depicted in Fig. 8-28. In this example, we have shown three levels, but in practice, there might be fewer or more. The top-level CA, the root, certifies second-level CAs, which we here call RAs (Regional Authorities) because they might cover some geographic region, such as a country or continent. This term is not standard, though; in fact, no term is really standard for the different levels of the tree. These, in turn, certify the real CAs, which issue the X.509 certificates to organizations and individuals. When the root authorizes a new RA, it generates an X.509 certificate stating that it has approved the RA, includes the new RA’s public key in it, signs it, and hands it to the RA. Similarly, when an RA approves a new CA, it produces and signs a certificate stating its approval and containing the CA’s public key.

(a) A hierarchical PKI:

                      Root
                     /    \
                RA 1        RA 2
               /    \     /   |   \
            CA 1  CA 2  CA 3 CA 4  CA 5

(b) A chain of certificates:

    ‘‘RA 2 is approved. Its public key is 47383AE349...’’  -- Root’s signature
    ‘‘CA 5 is approved. Its public key is 6384AF863B...’’  -- RA 2’s signature

Figure 8-28. (a) A hierarchical PKI. (b) A chain of certificates.

Our PKI works like this. Suppose that Alice needs Bob’s public key in order to communicate with him, so she looks for and finds a certificate containing it, signed by CA 5. But Alice has never heard of CA 5. For all she knows, CA 5 might be Bob’s 10-year-old daughter. She could go to CA 5 and say: ‘‘Prove your legitimacy.’’ CA 5 will respond with the certificate it got from RA 2, which contains CA 5’s public key. Now armed with CA 5’s public key, she can verify that Bob’s certificate was indeed signed by CA 5 and is thus legal.

Unless RA 2 is Bob’s 12-year-old son. So, the next step is for her to ask RA 2 to prove it is legitimate. The response to her query is a certificate signed by the root and containing RA 2’s public key. Now Alice is sure she has Bob’s public key.


But how does Alice find the root’s public key? Magic. It is assumed that everyone knows the root’s public key. For example, her browser might have been shipped with the root’s public key built in.

Bob is a friendly sort of guy and does not want to cause Alice a lot of work. He knows that she will have to check out CA 5 and RA 2, so to save her some trouble, he collects the two needed certificates and gives her the two certificates along with his. Now she can use her own knowledge of the root’s public key to verify the top-level certificate and the public key contained therein to verify the second one. Alice does not need to contact anyone to do the verification. Because the certificates are all signed, she can easily detect any attempts to tamper with their contents. A chain of certificates going back to the root like this is sometimes called a chain of trust or a certification path. The technique is widely used in practice.
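The chain walk Alice performs can be sketched with toy RSA signatures. Every key pair and certificate string below is an illustrative assumption (real chains carry X.509-encoded certificates and full-size keys):

```python
import hashlib

def h(msg: str, n: int) -> int:
    # Hash a certificate and reduce it into the signer's modulus.
    return int.from_bytes(hashlib.sha256(msg.encode()).digest(), "big") % n

def sign(msg, priv):              # priv = (n, d)
    n, d = priv
    return pow(h(msg, n), d, n)

def check(msg, sig, pub):         # pub = (n, e)
    n, e = pub
    return pow(sig, e, n) == h(msg, n)

# Tiny illustrative RSA key pairs for the three authorities.
root_pub, root_priv = (3233, 17), (3233, 2753)   # 61 * 53
ra2_pub,  ra2_priv  = (3599, 7),  (3599, 2983)   # 59 * 61
ca5_pub,  ca5_priv  = (4757, 13), (4757, 1777)   # 67 * 71

# Each certificate binds a name to a public key; its issuer signs it.
ra2_cert = f"RA 2 is approved. Its public key is {ra2_pub}"
ca5_cert = f"CA 5 is approved. Its public key is {ca5_pub}"
bob_cert = "Bob's public key is ..."
chain = [
    (ra2_cert, sign(ra2_cert, root_priv), ra2_pub),
    (ca5_cert, sign(ca5_cert, ra2_priv), ca5_pub),
    (bob_cert, sign(bob_cert, ca5_priv), None),
]

# Alice verifies top-down, starting from the root key she already trusts.
trusted = root_pub
for cert, sig, next_pub in chain:
    assert check(cert, sig, trusted), "chain broken"
    trusted = next_pub   # the key certified at this level verifies the next
print("Bob's certificate verified via the chain of trust")
```

No network contact is needed: the loop uses only the certificates Bob supplied and the root key built into Alice's software.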

Of course, we still have the problem of who is going to run the root. The solution is not to have a single root, but to have many roots, each with its own RAs and CAs. In fact, modern browsers come preloaded with the public keys for over 100 roots, sometimes referred to as trust anchors. In this way, having a single worldwide trusted authority can be avoided.

But there is now the issue of how the browser vendor decides which purported trust anchors are reliable and which are sleazy. It all comes down to the user trusting the browser vendor to make wise choices and not simply approve all trust anchors willing to pay its inclusion fee. Most browsers allow users to inspect the root keys (usually in the form of certificates signed by the root) and delete any that seem shady. For more information on PKIs, see Stapleton and Epstein (2016).

Directories

Another issue for any PKI is where certificates (and their chains back to some known trust anchor) are stored. One possibility is to have each user store his or her own certificates. While doing this is safe (i.e., there is no way for users to tamper with signed certificates without detection), it is also inconvenient. One alternative that has been proposed is to use DNS as a certificate directory. Before contacting Bob, Alice probably has to look up his IP address using DNS, so why not have DNS return Bob’s entire certificate chain along with his IP address?

Some people think this is the way to go, but others would prefer dedicated directory servers whose only job is managing X.509 certificates. Such directories could provide lookup services by using properties of the X.500 names. For example, in theory, such a directory service could answer queries like ‘‘Give me a list of all people named Alice who work in sales departments anywhere in the U.S.’’

Revocation

The real world is full of certificates, too, such as passports and drivers’ licenses. Sometimes these certificates can be revoked; for example, a driver’s license can be revoked for drunken driving and other driving offenses. The same problem occurs in the digital world: the grantor of a certificate may decide to revoke it because the person or organization holding it has abused it in some way. It can also be revoked if the subject’s private key has been exposed or, worse yet, the CA’s private key has been compromised. Thus, a PKI needs to deal with the issue of revocation. The possibility of revocation complicates matters.

A first step in this direction is to have each CA periodically issue a CRL (Certificate Revocation List) giving the serial numbers of all certificates that it has revoked. Since certificates contain expiry times, the CRL need only contain the serial numbers of certificates that have not yet expired. Once its expiry time has passed, a certificate is automatically invalid, so no distinction is needed between those that just timed out and those that were actually revoked. In both cases, they cannot be used any more.

Unfortunately, introducing CRLs means that a user who is about to use a certificate must now acquire the CRL to see if the certificate has been revoked. If it has been, it should not be used. However, even if the certificate is not on the list, it might have been revoked just after the list was published. Thus, the only way to really be sure is to ask the CA. And on the next use of the same certificate, the CA has to be asked again, since the certificate might have been revoked a few seconds ago.
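The check a user must perform before trusting a certificate can be sketched as follows (the record layout, serial numbers, and dates are assumptions for illustration):

```python
from datetime import datetime, timezone

def is_usable(serial, expiry, crl, now=None):
    """A certificate is usable only if it has not yet expired AND its
    serial number is absent from the CA's latest revocation list."""
    now = now or datetime.now(timezone.utc)
    if now > expiry:
        return False          # timed out: invalid anyway, CRL need not list it
    return serial not in crl  # revoked certificates are listed by serial

crl = {1442, 1503}            # serial numbers this CA has revoked (hypothetical)
future = datetime(2099, 1, 1, tzinfo=timezone.utc)
past = datetime(2000, 1, 1, tzinfo=timezone.utc)

assert is_usable(7001, future, crl)       # unexpired and not revoked
assert not is_usable(1442, future, crl)   # revoked
assert not is_usable(7001, past, crl)     # expired, regardless of the CRL
```

Even a certificate that passes this check might have been revoked just after the CRL was published, which is exactly the staleness problem discussed above.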

Another complication is that a revoked certificate could conceivably be reinstated, for example, if it was revoked for nonpayment of some fee that has since been paid. Having to deal with revocation (and possibly reinstatement) eliminates one of the best properties of certificates, namely, that they can be used without having to contact a CA.

Where should CRLs be stored? A good place would be the same place the certificates themselves are stored. One strategy is for the CA to actively push out CRLs periodically and have the directories process them by simply removing the revoked certificates. If directories are not used for storing certificates, the CRLs can be cached at various places around the network. Since a CRL is itself a signed document, if it is tampered with, that tampering can be easily detected.

If certificates have long lifetimes, the CRLs will be long, too. For example, if credit cards are valid for 5 years, the list of outstanding revocations will be much longer than if new cards are issued every 3 months. A standard way to deal with long CRLs is to issue a master list infrequently, but issue updates to it more often. Doing this reduces the bandwidth needed for distributing the CRLs.
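The master-list-plus-updates scheme can be sketched like this. The sequence-number check and the set representation are assumptions for illustration (real delta CRLs are defined in the X.509/PKIX standards):

```python
# A master CRL issued infrequently, plus small delta updates issued often.
master = {"serials": {1442, 1503}, "seq": 10}
deltas = [
    {"base_seq": 10, "revoke": {1600}, "reinstate": set()},
    {"base_seq": 10, "revoke": {1777}, "reinstate": {1503}},  # fee paid: reinstated
]

def current_crl(master, deltas):
    """Apply delta updates, in order, on top of the master list."""
    revoked = set(master["serials"])
    for d in deltas:
        # Guard against applying a delta built on a different master list.
        assert d["base_seq"] == master["seq"], "delta built on a different master"
        revoked |= d["revoke"]
        revoked -= d["reinstate"]
    return revoked

assert current_crl(master, deltas) == {1442, 1600, 1777}
```

Clients download the large master list rarely and only fetch the small deltas in between, which is where the bandwidth saving comes from.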

8.9 AUTHENTICATION PROTOCOLS

Authentication is the technique by which a process verifies that its communication partner is who it is supposed to be and not an imposter. Verifying the identity of a remote process in the face of a malicious, active intruder is surprisingly difficult and requires complex protocols based on cryptography. In this section, we will study some of the many authentication protocols that are used on insecure computer networks.

As an aside, some people confuse authorization with authentication. Authentication deals with the question of whether you are actually communicating with a specific process. Authorization is concerned with what that process is permitted to do. For example, say a client process contacts a file server and says: ‘‘I am Mirte’s process and I want to delete the file cookbook.old.’’ From the file server’s point of view, two questions must be answered:

1. Is this actually Mirte’s process (authentication)?

2. Is Mirte allowed to delete cookbook.old (authorization)?

Only after both of these questions have been unambiguously answered in the affirmative can the requested action take place. The former question is really the key one. Once the file server knows to whom it is talking, checking authorization is just a matter of looking up entries in local tables or databases. For this reason, we will concentrate on authentication in this section.

The general model that essentially all authentication protocols use is this. Alice starts out by sending a message either to Bob or to a trusted KDC, which is expected to be honest. Several other message exchanges follow in various directions. As these messages are being sent, Trudy may intercept, modify, or replay them in order to trick Alice and Bob or just to gum up the works.

Nevertheless, when the protocol has been completed, Alice is sure she is talking to Bob and Bob is sure he is talking to Alice. Furthermore, in most of the protocols, the two of them will also have established a secret session key for use in the upcoming conversation. In practice, for performance reasons, all data traffic is encrypted using symmetric-key cryptography (typically AES), although public-key cryptography is widely used for the authentication protocols themselves and for establishing the session key.

The point of using a new, randomly chosen session key for each new connection is to minimize the amount of traffic that gets sent with the users’ secret keys or public keys, to reduce the amount of ciphertext an intruder can obtain, and to minimize the damage done if a process crashes and its core dump (memory printout after a crash) falls into the wrong hands. Hopefully, the only key present then will be the session key. All the permanent keys should have been carefully zeroed out after the session was established.

8.9.1 Authentication Based on a Shared Secret Key

For our first authentication protocol, we will assume that Alice and Bob already share a secret key, KAB. This shared key might have been agreed upon on the telephone or in person, but, in any event, not on the (insecure) network.


This protocol is based on a principle found in many authentication protocols: one party sends a random number to the other, who then transforms it in a special way and returns the result. Such protocols are called challenge-response protocols. In this and subsequent authentication protocols, the following notation will be used:

A, B are the identities of Alice and Bob.
Ri’s are the challenges, where i identifies the challenger.
Ki’s are keys, where i indicates the owner.
KS is the session key.

The message sequence for our first shared-key authentication protocol is illustrated in Fig. 8-29. In message 1, Alice sends her identity, A, to Bob in a way that Bob understands. Bob, of course, has no way of knowing whether this message came from Alice or from Trudy, so he chooses a challenge, a large random number, RB, and sends it back to ‘‘Alice’’ as message 2, in plaintext. Alice then encrypts the message with the key she shares with Bob and sends the ciphertext, KAB(RB), back in message 3. When Bob sees this message, he immediately knows that it came from Alice because Trudy does not know KAB and thus could not have generated it. Furthermore, since RB was chosen randomly from a large space (say, 128-bit random numbers), it is very unlikely that Trudy would have seen RB and its response in an earlier session. It is equally unlikely that she could guess the correct response to any challenge.

1. Alice -> Bob: A
2. Bob -> Alice: RB
3. Alice -> Bob: KAB(RB)
4. Alice -> Bob: RA
5. Bob -> Alice: KAB(RA)

Figure 8-29. Two-way authentication using a challenge-response protocol.
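The five-message exchange can be sketched in a few lines, assuming a pre-shared key. The protocol encrypts each challenge with KAB; here HMAC-SHA256 stands in as the keyed transform, which illustrates the message flow equally well:

```python
import hmac, hashlib, secrets

# Assumption: K_AB was agreed upon off-line, never sent over the network.
K_AB = b"key agreed upon off-line"

def respond(key: bytes, challenge: bytes) -> bytes:
    # Keyed transform of a challenge: only a holder of the key can compute it.
    return hmac.new(key, challenge, hashlib.sha256).digest()

# Message 1: Alice -> Bob: A          (identity claim, plaintext)
# Message 2: Bob -> Alice: R_B        (Bob's challenge, plaintext)
r_b = secrets.token_bytes(16)
# Message 3: Alice -> Bob: K_AB(R_B)  (only Alice could have computed this)
msg3 = respond(K_AB, r_b)
assert hmac.compare_digest(msg3, respond(K_AB, r_b))   # Bob now trusts Alice
# Message 4: Alice -> Bob: R_A        (Alice's challenge, plaintext)
r_a = secrets.token_bytes(16)
# Message 5: Bob -> Alice: K_AB(R_A)
msg5 = respond(K_AB, r_a)
assert hmac.compare_digest(msg5, respond(K_AB, r_a))   # Alice now trusts Bob
```

Because each challenge is fresh and random, a recorded response from an earlier session is useless to an eavesdropper.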

At this point, Bob is sure he is talking to Alice, but Alice is not sure of anything. For all Alice knows, Trudy might have intercepted message 1 and sent back RB in response. Maybe Bob died last night. To find out to whom she is talking, Alice picks a random number, RA, and sends it to Bob as plaintext, in message 4. When Bob responds with KAB(RA), Alice knows she is talking to Bob. If they wish to establish a session key now, Alice can pick one, KS, and send it to Bob encrypted with KAB.

The protocol of Fig. 8-29 contains five messages. Let us see if we can be clever and eliminate some of them. One approach is illustrated in Fig. 8-30. Here Alice initiates the challenge-response protocol instead of waiting for Bob to do it. Similarly, while he is responding to Alice’s challenge, Bob sends his own. The entire protocol can be reduced to three messages instead of five.

1. Alice -> Bob: A, RA
2. Bob -> Alice: RB, KAB(RA)
3. Alice -> Bob: KAB(RB)

Figure 8-30. A shortened two-way authentication protocol.

Is this new protocol an improvement over the original one? In one sense it is: it is shorter. Unfortunately, it is also wrong. Under certain circumstances, Trudy can defeat this protocol by using what is known as a reflection attack. In particular, Trudy can break it if it is possible to open multiple sessions with Bob at once. This situation would be true, for example, if Bob is a bank and is prepared to accept many simultaneous connections from automated teller machines at once.

Trudy’s reflection attack is shown in Fig. 8-31. It starts out with Trudy claiming she is Alice and sending RT. Bob responds, as usual, with his own challenge, RB. Now Trudy is stuck. What can she do? She does not know KAB(RB).

1. Trudy -> Bob: A, RT         (first session)
2. Bob -> Trudy: RB, KAB(RT)   (first session)
3. Trudy -> Bob: A, RB         (second session)
4. Bob -> Trudy: RB2, KAB(RB)  (second session)
5. Trudy -> Bob: KAB(RB)       (first session)

Figure 8-31. The reflection attack.

She can open a second session with message 3, supplying the RB taken from message 2 as her challenge. Bob calmly encrypts it and sends back KAB(RB) in message 4. The messages belonging to the second session are marked in Fig. 8-31. Now Trudy has the missing information, so she can complete the first session and abort the second one. Bob is now convinced that Trudy is Alice, so when she asks for her bank account balance, he gives it to her without question. Then when she asks him to transfer it all to a secret bank account in Switzerland, he does so without a moment’s hesitation.
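The attack can be simulated directly: if Bob answers challenges in any number of parallel sessions, Trudy can use one session as an oracle for the other. HMAC-SHA256 stands in for encryption with KAB, as an illustrative assumption:

```python
import hmac, hashlib, secrets

K_AB = b"key shared by Alice and Bob"   # Trudy never learns this key

def bob(challenge):
    """Bob answers any session's challenge with K_AB(challenge) and issues
    a fresh challenge of his own; he accepts many sessions at once."""
    answer = hmac.new(K_AB, challenge, hashlib.sha256).digest()
    return answer, secrets.token_bytes(16)

# Session 1: Trudy claims to be Alice and sends her challenge R_T.
r_t = secrets.token_bytes(16)
_, r_b = bob(r_t)              # Bob replies with K_AB(R_T) and challenge R_B

# Trudy cannot compute K_AB(R_B) herself, so she opens a SECOND session
# and reflects Bob's own challenge R_B back at him.
k_ab_rb, _ = bob(r_b)          # Bob obligingly answers his own question

# Back in session 1, Trudy presents the answer Bob just handed her.
expected = hmac.new(K_AB, r_b, hashlib.sha256).digest()   # what Bob verifies
assert k_ab_rb == expected     # Bob accepts: Trudy is "authenticated" as Alice
```

The fix is not a stronger cipher; the flaw is in the protocol logic, which the design rules below address.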

The moral of this story is:

Designing a correct authentication protocol is much harder than it looks.

The following four general rules often help the designer avoid common pitfalls:

1. Have the initiator prove who she is before the responder has to. This avoids Bob giving away valuable information before Trudy has to give any evidence of who she is.

2. Have the initiator and responder use different keys for proof, even if this means having two shared keys, KAB and K'AB.

3. Have the initiator and responder draw their challenges from different sets. For example, the initiator must use even numbers and the responder must use odd numbers.

4. Make the protocol resistant to attacks involving a second parallel session in which information obtained in one session is used in a different one.

If even one of these rules is violated, the protocol can frequently be broken. Here, all four rules were violated, with disastrous consequences.

Now let us take a closer look at Fig. 8-29. Surely that protocol is not subject to a reflection attack? Maybe. It is quite subtle. Trudy was able to defeat our protocol by using a reflection attack because it was possible to open a second session with Bob and trick him into answering his own questions. What would happen if Alice were a general-purpose computer that also accepted multiple sessions, rather than a person at a computer? Let us take a look at what Trudy can do.

To see how Trudy’s attack works, see Fig. 8-32. Alice starts out by announcing her identity in message 1. Trudy intercepts this message and begins her own session with message 2, claiming to be Bob. Again, the session 2 messages are marked. Alice responds to message 2 by saying in message 3: ‘‘You claim to be Bob? Prove it.’’ At this point, Trudy is stuck because she cannot prove she is Bob. What does Trudy do now? She goes back to the first session, where it is her turn to send a challenge, and sends the RA she got in message 3. Alice kindly responds to it in message 5, thus supplying Trudy with the information she needs to send in message 6 in session 2. At this point, Trudy is basically home free because she has successfully responded to Alice’s challenge in session 2. She can now cancel session 1, send over any old number for the rest of session 2, and she will have an authenticated session with Alice in session 2.

But Trudy is a perfectionist, and she really wants to show off her considerable skills. Instead of sending any old number over to complete session 2, she waits

1. Alice -> Trudy: A             (first session)
2. Trudy -> Alice: B             (second session)
3. Alice -> Trudy: RA            (second session)
4. Trudy -> Alice: RA            (first session)
5. Alice -> Trudy: KAB(RA)       (first session)
6. Trudy -> Alice: KAB(RA)       (second session)
7. Alice -> Trudy: RA2           (first session)
8. Trudy -> Alice: RA2           (second session)
9. Alice -> Trudy: KAB(RA2)      (second session)
10. Trudy -> Alice: KAB(RA2)     (first session)

Figure 8-32. A reflection attack on the protocol of Fig. 8-29.

until Alice sends message 7, Alice’s challenge for session 1. Of course, Trudy does not know how to respond, so she uses the reflection attack again, sending back RA2 as message 8. Alice conveniently encrypts RA2 in message 9. Trudy now switches back to session 1 and sends Alice the number she wants in message 10, conveniently copied from what Alice sent in message 9. At this point, Trudy has two fully authenticated sessions with Alice.

This attack has a somewhat different result than the attack on the three-message protocol that we saw in Fig. 8-31. This time, Trudy has two authenticated connections with Alice. In the previous example, she had one authenticated connection with Bob. Again here, if we had applied all the general authentication protocol rules discussed earlier, this attack could have been stopped. For a detailed discussion of these kinds of attacks and how to thwart them, see Bird et al. (1993). They also show how it is possible to systematically construct protocols that are provably correct. The simplest such protocol is nevertheless fairly complicated, so we will now show a different class of protocol that also works.

The new authentication protocol is shown in Fig. 8-33 (Bird et al., 1993). It uses an HMAC (Hashed Message Authentication Code), which guarantees the integrity and authenticity of a message. A simple, yet powerful HMAC consists of a hash over the message plus the shared key. By sending the HMAC along with the rest of the message, no attacker is able to change or spoof the message: changing any bit would lead to an incorrect hash, and generating a valid hash is not possible without the key. HMACs are attractive because they can be generated very efficiently (faster than running SHA-2 and then running RSA on the result).
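Computing and checking an HMAC takes a few lines with Python's standard library. The key and message here are illustrative; note that the hmac module implements the nested construction of RFC 2104, slightly more elaborate than the simple hash-plus-key idea sketched above:

```python
import hmac, hashlib

key = b"shared secret"
message = b"transfer 100 euros to account 12345"

# The sender transmits the message together with its HMAC-SHA256 tag.
tag = hmac.new(key, message, hashlib.sha256).digest()

# The receiver recomputes the HMAC with the same key and compares the
# tags in constant time; any change to the message fails the check.
assert hmac.compare_digest(tag, hmac.new(key, message, hashlib.sha256).digest())
forged = hmac.new(key, b"transfer 100 euros to account 99999", hashlib.sha256).digest()
assert not hmac.compare_digest(tag, forged)
```

Using hmac.compare_digest rather than == avoids leaking information through comparison timing.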


Security is a broad topic and covers a multitude of sins. In its simplest form, it is concerned with making sure that nosy people cannot read, or worse yet, secretly modify messages intended for other recipients. It is also concerned with attackers who try to subvert essential network services such as BGP or DNS, render links or network services unavailable, or access remote services that they are not authorized to use. Another topic of interest is how to tell whether that message purportedly from the IRS ‘‘Pay by Friday, or else’’ is really from the IRS and not from the Mafia. Security additionally deals with the problems of legitimate messages being captured and replayed, and with people later trying to deny that they sent certain messages.

Most security problems are intentionally caused by malicious people trying to gain some benefit, get attention, or harm someone. A few of the most common perpetrators are listed in Fig. 8-1. It should be clear from this list that making a network secure involves a lot more than just keeping it free of programming errors. It involves outsmarting often intelligent, dedicated, and sometimes well-funded adversaries. Measures that will thwart casual attackers will have little impact on the serious ones.

In an article in USENIX ;Login:, James Mickens of Microsoft (and now a professor at Harvard University) argued that you should distinguish between everyday attackers and, say, sophisticated intelligence services. If you are worried about garden-variety adversaries, you will be fine with common sense and basic security measures. Mickens eloquently explains the distinction:

‘‘If your adversary is the Mossad, you’re gonna die and there’s nothing that you can do about it. The Mossad is not intimidated by the fact that you employ https://. If the Mossad wants your data, they’re going to use a drone to replace your cellphone with a piece of uranium that’s shaped like a cellphone, and when you die of tumors filled with tumors, they’re going to hold a press conference and say ‘‘It wasn’t us’’ as they wear t-shirts that say ‘‘IT WAS DEFINITELY US’’ and then they’re going to buy all of your stuff at your estate sale so that they can directly look at the photos of your vacation instead of reading your insipid emails about them.’’

Mickens’ point is that sophisticated attackers have advanced means to compromise your systems and stopping them is very hard. In addition, police records show that the most damaging attacks are often perpetrated by insiders bearing a grudge. Security systems should be designed accordingly.


Adversary Goal

Student To have fun snooping on people’s email

Cracker To test someone’s security system; steal data

Sales rep To claim to represent all of Europe, not just Andorra

Corporation To discover a competitor’s strategic marketing plan

Ex-employee To get revenge for being fired

Accountant To embezzle money from a company

Stockbroker To deny a promise made to a customer by email

Identity thief To steal credit card numbers for sale

Government To learn an enemy’s military or industrial secrets

Terrorist To steal biological warfare secrets

Figure 8-1. Some people who may cause security problems, and why.

8.1 FUNDAMENTALS OF NETWORK SECURITY

The classic way to deal with network security problems is to distinguish three essential security properties: confidentiality, integrity, and availability. The common abbreviation, CIA, is perhaps a bit unfortunate, given that the other common expansion of that acronym has not been shy in violating those properties in the past. Confidentiality has to do with keeping information out of the grubby little hands of unauthorized users. This is what often comes to mind when people think about network security. Integrity is all about ensuring that the information you received was really the information sent and not something that an adversary modified. Availability deals with preventing systems and services from becoming unusable due to crashes, overload situations, or deliberate misconfigurations. Good examples of attempts to compromise availability are the denial-of-service attacks that frequently wreak havoc on high-value targets such as banks, airlines, and the local high school during exam time. In addition to the classic triumvirate of confidentiality, integrity, and availability that dominates the security domain, other issues also play important roles. In particular, authentication deals with determining whom you are talking to before revealing sensitive information or entering into a business deal. Finally, nonrepudiation deals with signatures: how do you prove that your customer really placed an electronic order for 10 million left-handed doohickeys at 89 cents each when he later claims the price was 69 cents? Or maybe he claims he never placed any order after seeing that a Chinese firm is flooding the market with left-handed doohickeys for 49 cents.

All these issues occur in traditional systems, too, but with some significant differences. Integrity and secrecy are achieved by using registered mail and locking documents up. Robbing the mail train is harder now than it was in Jesse James’ day. Also, people can usually tell the difference between an original paper document and a photocopy, and it often matters to them. As a test, make a photocopy of a valid check. Try cashing the original check at your bank on Monday. Now try cashing the photocopy of the check on Tuesday. Observe the difference in the bank’s behavior.

As for authentication, people authenticate other people by various means, including recognizing their faces, voices, and handwriting. Proof of signing is handled by signatures on letterhead paper, raised seals, and so on. Tampering can usually be detected by handwriting, ink, and paper experts. None of these options are available electronically. Clearly, other solutions are needed.

Before getting into the solutions themselves, it is worth spending a few moments considering where in the protocol stack network security belongs. There is probably no one single place. Every layer has something to contribute. In the physical layer, wiretapping can be foiled by enclosing transmission lines (or better yet, optical fibers) in sealed metal tubes containing an inert gas at high pressure. Any attempt to drill into a tube will release some gas, reducing the pressure and triggering an alarm. Some military systems use this technique.

In the data link layer, packets on a point-to-point link can be encrypted as they leave one machine and decrypted as they enter another. All the details can be handled in the data link layer, with higher layers oblivious to what is going on. This solution breaks down when packets have to traverse multiple routers, however, because packets have to be decrypted at each router, leaving them vulnerable to attacks from within the router. Also, it does not allow some sessions to be protected (e.g., those involving online purchases by credit card) and others not. Nevertheless, link encryption, as this method is called, can be added to any network easily and is often useful.

In the network layer, firewalls can be deployed to prevent attack traffic from entering or leaving networks. IPsec, a protocol for IP security that encrypts packet payloads, also functions at this layer. At the transport layer, entire connections can be encrypted end-to-end, that is, process to process. Problems such as user authentication and nonrepudiation are often handled at the application layer, although occasionally (e.g., in the case of wireless networks), user authentication can take place at lower layers. Since security applies to all layers of the network protocol stack, we dedicate an entire chapter of the book to this topic.

8.1.1 Fundamental Security Principles

While addressing security concerns in all layers of the network stack is certainly necessary, it is very difficult to determine when you have addressed them sufficiently and whether you have addressed them all. In other words, guaranteeing security is hard. Instead, we try to improve security as much as we can by consistently applying a set of security principles. Classic security principles were formulated as early as 1975 by Jerome Saltzer and Michael Schroeder:


1. Principle of economy of mechanism. This principle is sometimes paraphrased as the principle of simplicity. Complex systems tend to have more bugs than simple systems. Moreover, users may not understand them well and use them in a wrong or insecure way. Simple systems are good systems. For instance, PGP (Pretty Good Privacy, see Sec. 8.11) offers powerful protection for email. However, many users find it cumbersome in practice and so far it has not gained very widespread adoption. Simplicity also helps to minimize the attack surface (all the points where an attacker may interact with the system to try to compromise it). A system that offers a large set of functions to untrusted users, each implemented by many lines of code, has a large attack surface. If a function is not really needed, leave it out.

2. Principle of fail-safe defaults. Say you need to organize the access to a resource. It is better to make explicit rules about when one can access the resource than to try to identify the conditions under which access to the resource should be denied. Phrased differently: a default of lack of permission is safer.
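The principle of fail-safe defaults can be made concrete in a few lines of code: access is granted only if an explicit rule allows it, so an empty or incomplete rule set denies everything automatically. (This is a minimal illustrative sketch; the user names, resource names, and `ALLOW_RULES` structure are made up for the example.)

```python
# Default-deny access check: permission exists only if explicitly granted.
# An unknown user, resource, or action falls through to "deny" automatically.

ALLOW_RULES = {
    ("alice", "payroll.db"): {"read"},
    ("bob", "payroll.db"): {"read", "write"},
}

def is_allowed(user, resource, action):
    """Return True only if an explicit rule grants the action."""
    return action in ALLOW_RULES.get((user, resource), set())
```

Note the design choice: adding a new user requires an explicit grant, so forgetting a rule fails safe (access denied) rather than failing open.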

3. Principle of complete mediation. Every access to every resource should be checked for authority. It implies that we must have a way to determine the source of a request (the requester).

4. Principle of least authority. This principle, often known as POLA, states that any (sub)system should have just enough authority (privilege) to perform its task and no more. Thus, if attackers compromise such a system, they gain only the bare minimum of privilege.

5. Principle of privilege separation. Closely related to the previous point: it is better to split up the system into multiple POLA-compliant components than to have a single component with all the privileges combined. Again, if one component is compromised, the attackers will be limited in what they can do.

6. Principle of least common mechanism. This principle is a little trickier and states that we should minimize the amount of mechanism common to more than one user and depended on by all users. Think of it this way: if we have a choice between implementing a network routine in the operating system, where its global variables are shared by all users, or in a user-space library which, to all intents and purposes, is private to the user process, we should opt for the latter. The shared data in the operating system may well serve as an information path between different users. We shall see an example of this in the section on TCP connection hijacking.


7. Principle of open design. This states plain and simple that the design should not be secret and generalizes what is known as Kerckhoffs’ principle in cryptography. In 1883, the Dutch-born Auguste Kerckhoffs published two journal articles on military cryptography which stated that a cryptosystem should be secure even if everything about the system, except the key, is public knowledge. In other words, do not rely on ‘‘security by obscurity,’’ but assume that the adversary immediately gains familiarity with your system and knows the encryption and decryption algorithms.

8. Principle of psychological acceptability. The final principle is not a technical one at all. Security rules and mechanisms should be easy to use and understand. Again, many implementations of PGP protection for email fail this principle. However, acceptability entails more. Besides the usability of the mechanism, it should also be clear why the rules and mechanisms are necessary in the first place.

An important factor in ensuring security is also the concept of isolation. Isolation guarantees the separation of components (programs, computer systems, or even entire networks) that belong to different security domains or have different privileges. All interaction that takes place between the different components is mediated with proper privilege checks. Isolation, POLA, and a tight control of the flow of information between components allow the design of strongly compartmentalized systems.

Network security comprises concerns in the domain of systems and engineering as well as concerns rooted in theory, math, and cryptography. A good example of the former is the classic ping of death, which allowed attackers to crash hosts all over the Internet by using fragmentation options in IP to craft ICMP echo request packets larger than the maximum allowed IP packet size. Since the receiving side never expected such large packets, it reserved insufficient buffer memory for all the data and the excess bytes would overwrite other data that followed the buffer in memory. Clearly, this was a bug, commonly known as a buffer overflow. An example of a cryptography problem is the 40-bit key used in the original WEP encryption for WiFi networks, which could be easily brute-forced by attackers with sufficient computational power.

8.1.2 Fundamental Attack Principles

The easiest way to structure a discussion about the systems aspects of security is to put ourselves in the shoes of the adversary. So, having introduced fundamental aspects of security above, let us now consider the fundamentals of attacks.

From an attacker perspective, the security of a system presents itself as a set of challenges that attackers must solve to reach their objectives. There are multiple ways to violate confidentiality, integrity, availability, or any of the other security properties. For instance, to break confidentiality of network traffic, an attacker may break into a system to read the data directly, trick the communicating parties into sending data without encryption and capture it, or, in a more ambitious scenario, break the encryption. All of these are used in practice and all of them consist of multiple steps. We will dive deep into the fundamentals of attacks in Sec. 8.2. As a preview, let us consider the various steps and approaches attackers may use.

1. Reconnaissance. Alexander Graham Bell once said: ‘‘Preparation is the key to success,’’ and thus it is for attackers also. The first thing you do as an attacker is to get to know as much about your target as you can. In case you plan to attack by means of spam or social engineering, you may want to spend some time sifting through the online profiles of the people you want to trick into giving up information, or even engage in some old-fashioned dumpster diving. In this chapter, however, we limit ourselves to technical aspects of attacks and defenses. Reconnaissance in network security is about discovering information that helps the attacker. Which machines can we reach from the outside? Using which protocols? What is the topology of the network? What services run on which machines? Et cetera. We will discuss reconnaissance in Sec. 8.2.1.

2. Sniffing and Snooping. An important step in many network attacks concerns the interception of network packets. Certainly if sensitive information is sent ‘‘in the clear’’ (without encryption), the ability to intercept network traffic is very useful for the attacker, but even encrypted traffic can be useful—to find out the MAC addresses of communicating parties, who talks to whom and when, etc. Moreover, an attacker needs to intercept the encrypted traffic to break the encryption. Since an attacker has access to other people’s network traffic, the ability to sniff indicates that at least the principles of least authority and complete mediation are not sufficiently enforced. Sniffing is easy on a broadcast medium such as WiFi, but how to intercept traffic if it does not even travel over the link to which your computer is connected? Sniffing is the topic of Sec. 8.2.2.

3. Spoofing. Another basic weapon in the hands of attackers is masquerading as someone else. Spoofed network traffic pretends to originate from some other machine. For instance, we can easily transmit an Ethernet frame or IP packet with a different source address, as a means to bypass a defense or launch denial-of-service attacks, because these protocols are very simple. However, can we also do so for complicated protocols such as TCP? After all, if you send a TCP SYN segment to set up a connection to a server with a spoofed IP address, the server will reply with its SYN/ACK segment (the second phase of the connection setup) to that IP address, so unless the attackers are on the same network segment, they will not see the reply. Without that reply, they will not know the sequence number used by the server, and hence, they will not be able to communicate. Spoofing circumvents the principle of complete mediation: if we cannot determine who sent a request, we cannot properly mediate it. In Sec. 8.2.3, we discuss spoofing in detail.

4. Disruption. The third component of our CIA triad, availability, has grown in importance also for attackers, with devastating DoS (Denial of Service) attacks on all sorts of organizations. Moreover, in response to new defenses, these attacks have grown ever more sophisticated. One can argue that DoS attacks abuse the fact that the principle of least common mechanism is not rigorously enforced—there is insufficient isolation. In Sec. 8.2.4, we will look at the evolution of such attacks.

Using these fundamental building blocks, attackers can craft a wide range of attacks. For instance, using reconnaissance and sniffing, attackers may find the address of a potential victim computer and discover that it trusts a server, so that any request coming from that server is automatically accepted. By means of a denial-of-service (disruption) attack, they can bring down the real server to make sure it does not respond to the victim any more and then send spoofed requests that appear to originate from the server. In fact, this is exactly how one of the most famous attacks in the history of the Internet (on the San Diego Supercomputer Center) happened. We will discuss the attack later.

8.1.3 From Threats to Solutions

After discussing the attacker’s moves, we will consider what we can do about them. Since most attacks arrive over the network, the security community quickly realized that the network may also be a good place to monitor for attacks. In Sec. 8.3, we will look at firewalls, intrusion detection systems and similar defenses.

Where Secs. 8.2 and 8.3 address the systems-related issues of attackers getting their grubby little hands on sensitive information or systems, we devote Secs. 8.4–8.9 to the more formal aspects of network security, when we discuss cryptography and authentication. Rooted in mathematics and implemented in computer systems, a variety of cryptographic primitives help ensure that even if network traffic falls into the wrong hands, nothing too bad can happen. For instance, attackers will still not be able to break confidentiality, tamper with the content, or successfully replay a network conversation. There is a lot to say about cryptography, as there are different types of primitives for different purposes (proving authenticity, encryption using public keys, encryption using symmetric keys, etc.) and each type tends to have different implementations. In Sec. 8.4, we introduce the key concepts of cryptography, and Sections 8.5 and 8.6 discuss symmetric-key and public-key cryptography, respectively. We explore digital signatures in Sec. 8.7 and key management in Sec. 8.8.

Sec. 8.9 discusses the fundamental problem of secure authentication. Authentication is what prevents spoofing altogether: the technique by which a process verifies that its communication partner is who it is supposed to be and not an imposter. As security became increasingly important, the community developed a variety of authentication protocols. As we shall see, they tend to build on cryptography.

In the sections following authentication, we survey concrete examples of (often crypto-based) network security solutions. In Sec. 8.10, we discuss network technologies that provide communication security, such as IPsec, VPNs, and wireless security. Section 8.11 looks at the problem of email security, including explanations of PGP (Pretty Good Privacy) and S/MIME (Secure/Multipurpose Internet Mail Extensions). Section 8.12 discusses security in the wider Web domain, with descriptions of secure DNS (DNSSEC), scripting code that runs in browsers, and the Secure Sockets Layer (SSL). As we shall see, these technologies use many of the ideas discussed in the preceding sections.

Finally, we discuss social issues in Sec. 8.13. What are the implications for important rights, such as privacy and freedom of speech? What about copyright and protection of intellectual property? Security is an important topic, so looking at it closely is worthwhile.

Before diving in, we should reiterate that security is an entire field of study in its own right. In this chapter, we focus only on networks and communication, rather than issues related to hardware, operating systems, applications, or users. This means that we will not spend much time looking at bugs and there is nothing here about user authentication using biometrics, password security, buffer overflow attacks, Trojan horses, login spoofing, process isolation, or viruses. All of these topics are covered at length in Chap. 9 of Modern Operating Systems (Tanenbaum and Bos, 2015). The interested reader is referred to that book for the systems aspects of security. Now let us begin our journey.

8.2 THE CORE INGREDIENTS OF AN ATTACK

As a first step, let us consider the fundamental ingredients that make up an attack. Virtually all network attacks follow a recipe that mixes some variants of these ingredients in a clever manner.

8.2.1 Reconnaissance

Say you are an attacker, and one fine morning you decide that you will hack organization X. Where do you start? You do not have much information about the organization and, physically, you are an Internet away from the nearest office, so dumpster diving or shoulder surfing are not options. You can always use social engineering to try and extract sensitive information from employees by sending them emails (spam), or phoning them, or befriending them on social networks, but in this book, we are interested in more technical issues, related to computer networks. For instance, can you find out what computers exist in the organization, how they are connected, and what services they run?

As a starting point, we assume that an attacker has a few IP addresses of machines in the organization: Web servers, name servers, login servers, or any other machines that communicate with the outside world. The first thing the attacker will want to do is explore such a server. Which TCP and UDP ports are open? An easy way to find out is simply to try to set up a TCP connection to each and every port number. If the connection is successful, there was a service listening. For instance, if the server replies on port 25, it suggests an SMTP server is present; if the connection succeeds on port 80, there will likely be a Web server, etc. We can use a similar technique for UDP (e.g., if the target replies on UDP port 53, we know it runs a domain name service because that is the port reserved for DNS).
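The probe just described can be sketched in a few lines of Python: attempt a full TCP connection to each port and record which ones accept. (A minimal sketch for illustration only; real scanners are far more careful about timing, parallelism, and stealth, and scanning machines you do not own is illegal in many jurisdictions.)

```python
import socket

def connect_scan(host, ports, timeout=1.0):
    """Try a full TCP connection to each port; return the ports that accept."""
    open_ports = []
    for port in ports:
        try:
            # The connection succeeds only if a service is listening.
            with socket.create_connection((host, port), timeout=timeout):
                open_ports.append(port)
        except OSError:
            pass  # refused, filtered, or timed out: treat as not open
    return open_ports
```

A call such as `connect_scan("192.0.2.10", [22, 25, 80, 443])` would return the subset of those ports with listening services (the address here is a documentation address, not a real target).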

Port Scanning

Probing a machine to see which ports are active is known as port scanning and may get fairly sophisticated. The technique we described earlier, where an attacker sets up a full TCP connection to the target (a so-called connect scan), is not sophisticated at all. While effective, its major drawback is that it is very visible to the target’s security team. Many servers tend to log successful TCP connections, and showing up in logs during the reconnaissance phase is not what an attacker wants. To avoid this, she can make the connections deliberately unsuccessful by means of a half-open scan. A half-open scan only pretends to set up connections: it sends TCP packets with the SYN flag set to all port numbers of interest and waits for the server to send the corresponding SYN/ACKs for the ports that are open, but it never completes the three-way handshake. Most servers will not log these unsuccessful connection attempts.

If half-open scans are better than connect scans, why do we still discuss the latter? The reason is that half-open scans require more advanced attackers. A full connection to a TCP port is typically possible from most machines using simple tools, such as telnet, that are often available to unprivileged users. For a half-open scan, however, attackers need to determine exactly which packets should and should not be transmitted. Most systems do not have standard tools for nonprivileged users to do this, and only users with administrator privileges can perform a half-open scan.

Connect scans (sometimes referred to as open scans) and half-open scans both assume that it is possible to initiate a TCP connection from an arbitrary machine outside the victim’s network. However, perhaps the firewall does not allow connections to be set up from the attacker’s machine. For instance, it may block all SYN segments. In that case, the attacker may have to resort to more esoteric scanning techniques. For instance, rather than a SYN segment, a FIN scan will send a TCP FIN segment, which is normally used to close a connection. At first sight, this does not make sense because there is no connection to terminate. However, the response to the FIN packet is often different for open ports (with listening services behind them) and closed ports. In particular, many TCP implementations send a TCP RST packet if the port is closed, and nothing at all if it is open. Fig. 8-2 illustrates these three basic scanning techniques.

[Figure 8-2 depicts three client–server exchanges on port 80. (a) Connect scan (SYN, SYN/ACK, ACK): a completed connection implies the port is open. (b) Half-open scan (SYN, SYN/ACK): a SYN/ACK reply implies the port is open. (c) FIN scan (FIN, RST): an RST reply implies the port is closed.]

Figure 8-2. Basic port scanning techniques. (a) Connect scan. (b) Half-open scan. (c) FIN scan.

By this time, you are probably thinking: ‘‘If we can do this with the SYN flag and the FIN flag, can we try some of the other flags?’’ You would be right. Any configuration that leads to different responses for open and closed ports works. Another well-known option is to set many flags at once (FIN, PSH, URG), something known as an Xmas scan (because the packet is lit up like a Christmas tree).

Consider Fig. 8-2(a). If a connection can be established, it means the port is open. Now look at Fig. 8-2(b). A SYN/ACK reply implies the port is open. Finally, we have Fig. 8-2(c). An RST reply means the port is closed.

Probing for open ports is a first step. The next thing the attacker wants to know is exactly what server runs on this port: what software, what version of the software, and on what operating system. For instance, suppose we find that port 8080 is open. This is probably a Web server, although this is not certain. Even if it is a Web server, which one is it: Nginx, Lighttpd, Apache? Suppose an attacker only has an exploit for Apache version 2.4.37, and only on Windows; finding out all these details, known as fingerprinting, is important. Just like in our port scans, we do so by making use of (sometimes subtle) differences in the way these servers and operating systems reply. If all of this sounds complicated, do not worry. Like many complicated things in computer networks, some helpful soul has sat down and implemented all these scanning and fingerprinting techniques for you in friendly and versatile programs such as nmap and zmap.

Traceroute

Knowing which services are active on one machine is fine and dandy, but what about the rest of the machines in the network? Given knowledge of that first IP address, attackers may try to ‘‘poke around’’ to see what else is available. For instance, if the first machine has IP address 130.37.193.191, they might also try 130.37.193.192, 130.37.193.193, and all other possible addresses on the local network. Moreover, they can use programs such as traceroute to find the path toward the original IP address. Traceroute first sends a small batch of UDP packets to the target with the time-to-live (TTL) value set to one, then another batch with the TTL set to two, then a batch with a TTL of three, and so on. The first router lowers the TTL and immediately drops the first batch of packets (because the TTL has now reached zero), sending back an ICMP error message indicating that the packets have outlived their allocated life span. The second router does the same for the second batch of packets, the third for the third batch, until eventually some UDP packets reach the target. By collecting the ICMP error packets and their source IP addresses, traceroute is able to stitch together the overall route. Attackers can use the results to scan even more targets by probing address ranges of routers close to the target, thus obtaining rudimentary knowledge of the network topology.
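The TTL mechanics can be illustrated with a toy simulation (no raw sockets or root privileges needed, and no real packets are sent): each hop decrements the TTL, and whichever hop sees it reach zero ‘‘replies’’ with its own address, which is exactly the information traceroute stitches together.

```python
def simulate_traceroute(path):
    """Given the ordered list of router addresses between us and the target,
    return the addresses discovered by sending probes with TTL 1, 2, 3, ...
    Each router decrements the TTL; the router that sees it hit zero sends
    an ICMP time-exceeded message carrying its own source address."""
    discovered = []
    for ttl in range(1, len(path) + 1):
        remaining = ttl
        for hop in path:
            remaining -= 1
            if remaining == 0:
                discovered.append(hop)  # this hop reports itself
                break
    return discovered
```

Running the simulation on a hypothetical three-hop path reconstructs the hops in order, mirroring how real traceroute output reveals the route one TTL value at a time.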

8.2.2 Sniffing and Snooping (with a Dash of Spoofing)

Many network attacks start with the interception of network traffic. For this attack ingredient, we assume that the attacker has a presence in the victim’s network. For instance, the attacker brings a laptop in range of the victim’s WiFi network, or obtains access to a PC in the wired network. Sniffing on a broadcast medium, such as WiFi or the original Ethernet implementation, is easy: you just tune into the channel at a convenient location and listen for the bits as they come thundering by. To do so, attackers set their network interfaces in promiscuous mode, to make them accept all packets on the channel, even those destined for another host, and use tools such as tcpdump or Wireshark to capture the traffic.

Sniffing in Switched Networks

However, in many networks, things are not so easy. Take modern Ethernet as an example. Unlike its original incarnations, Ethernet today is no longer a proper shared-medium network technology. All communication is switched, and attackers, even if they are connected to the same network segment, will never receive any of the Ethernet frames destined for the other hosts on the segment. Specifically, recall that Ethernet switches are self-learning and quickly build up a forwarding table. The self-learning is simple and effective: as soon as an Ethernet frame from host A arrives at port 1, the switch records that traffic for host A should be sent on port 1. Now it knows that all traffic with host A’s MAC address in the destination field of the Ethernet header should be forwarded on port 1. Likewise, it will send the traffic for host B on port 2, and so on. Once the forwarding table is complete, the switch will no longer send any traffic addressed to host B on any port other than 2. To sniff the traffic of other hosts, attackers must find a way to subvert exactly this behavior.

There are several ways for an attacker to overcome the switching problem. They all use spoofing. Nevertheless, we will discuss them in this section, since the sole goal here is to sniff traffic.

The first is MAC cloning: duplicating the MAC address of the host whose traffic you want to sniff. If you claim to have this MAC address (by sending out Ethernet frames with that address), the switch will duly record this in its table and henceforth send all traffic bound for the victim to your machine instead. Of course, this assumes that you know this address, but you should be able to obtain it from the ARP requests sent by the target, which are, after all, broadcast to all hosts in the network segment. Another complicating factor is that your mapping will be removed from the switch as soon as the original owner of the MAC address starts communicating again, so you will have to repeat this switch-table poisoning constantly.

As an alternative, but in the same vein, attackers can use the fact that the switch table has a limited size and flood the switch with Ethernet frames carrying fake source addresses. The switch does not know the MAC addresses are fake and simply records them until the table is full, evicting older entries to include the new ones if need be. Since the switch then no longer has an entry for the target host, it reverts to broadcast for all traffic towards it. MAC flooding makes your Ethernet behave like a broadcast medium again and party like it is 1979.
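The effect of MAC flooding on a self-learning switch can be shown with a small simulation: once the bounded forwarding table fills up with fake entries, the victim's entry is evicted and the switch falls back to broadcasting the victim's traffic. (A toy model with a four-entry table and FIFO eviction; real switches use hardware CAM tables holding thousands of entries, and eviction policies vary.)

```python
from collections import OrderedDict

class Switch:
    """Toy self-learning switch with a bounded forwarding table."""
    def __init__(self, table_size):
        self.table = OrderedDict()  # MAC address -> port
        self.table_size = table_size

    def receive(self, src_mac, in_port):
        # Learn (or refresh) the sender's location, evicting the
        # oldest entry if the table is full.
        if src_mac in self.table:
            self.table.move_to_end(src_mac)
        elif len(self.table) >= self.table_size:
            self.table.popitem(last=False)
        self.table[src_mac] = in_port

    def forward_port(self, dst_mac):
        # Unknown destination: flood the frame out of every port.
        return self.table.get(dst_mac, "broadcast")
```

After the switch learns the victim on port 2, an attacker on port 7 who sends frames from enough fake source MACs evicts the victim's entry, and the switch starts broadcasting the victim's traffic where the attacker can sniff it.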

Instead of confusing the switch, attackers can also target hosts directly in a so-called ARP spoofing or ARP poisoning attack. Recall from Chap. 5 that the ARP protocol helps a computer find the MAC address corresponding to an IP address. For this purpose, the ARP implementation on a machine maintains a table with mappings from IP to MAC addresses for all hosts that have communicated with this machine (the ARP table). Each entry has a time-to-live (TTL) of, typically, a few tens of minutes. After that, the MAC address of the remote party is silently forgotten, assuming there is no further communication between these parties (in which case the TTL is reset), and all subsequent communication requires an ARP lookup first. The ARP lookup is simply a broadcast message that says something like: ‘‘Folks, I am looking for the MAC address of the host with IP address 192.168.2.24. If this is you, please let me know.’’ The lookup request contains the requester’s MAC address, so host 192.168.2.24 knows where to send the reply, and also the requester’s IP address, so 192.168.2.24 can add the requester’s IP-to-MAC mapping to its own ARP table.


Whenever the attacker sees such an ARP request for host 192.168.2.24, she can race to supply the requester with her own MAC address. In that case, all communication for 192.168.2.24 will be sent to the attacker’s machine. In fact, since ARP implementations tend to be simple and stateless, the attacker can often just send ARP replies even if there was no request at all: the ARP implementation will accept the replies at face value and store the mappings in its ARP table.
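The statelessness that makes ARP vulnerable can be modeled in a few lines: the cache accepts any reply, solicited or not, so a single unsolicited (gratuitous) reply silently rewrites the victim's mapping. (A toy model; the IP and MAC addresses are made up, and real implementations add timers and, sometimes, sanity checks.)

```python
class ArpCache:
    """Toy ARP cache that, like many real implementations, accepts
    replies at face value, whether or not a request was outstanding."""
    def __init__(self):
        self.table = {}  # IP address -> MAC address

    def handle_reply(self, ip, mac):
        self.table[ip] = mac  # no check that we ever asked

    def lookup(self, ip):
        return self.table.get(ip)
```

A legitimate reply installs the correct mapping; one later, unsolicited reply from the attacker overwrites it, and from then on the victim addresses its frames to the attacker's MAC.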

By using this same trick on both communicating parties, the attacker receives all the traffic between them. By subsequently forwarding the frames to the right MAC addresses again, the attacker has installed a stealthy MITM (Man-in-the-Middle) gateway, capable of intercepting all traffic between the two hosts.

8.2.3 Spoofing (beyond ARP)

In general, spoofing means sending bytes over the network with a falsified source address. Besides ARP packets, attackers may spoof any other type of network traffic. For instance, SMTP (Simple Mail Transfer Protocol) is a friendly, text-based protocol that is used everywhere for sending email. It uses the Mail From: header as an indication of the source of an email, but by default it does not check that this email address is correct. In other words, you can put anything you want in this header. All replies will be sent to this address. Incidentally, the content of the Mail From: header is not even shown to the recipient of the email message. Instead, your mail client shows the content of a separate From: header. However, there is no check on this field either, and SMTP allows you to falsify it, so that the email that you send to your fellow students informing them that they failed the course appears to have been sent by the course instructor. If you additionally set the Mail From: header to your own email address, all replies sent by panicking students will end up in your mailbox. What fun you will have! Less innocently, criminals frequently spoof email to send phishing emails from seemingly trusted sources. That email from ‘‘your doctor’’ telling you to click on the link below to get urgent information about your medical test may lead to a site that says everything is normal, but fails to mention that it just downloaded a virus to your computer. The one from ‘‘your bank’’ can be bad for your financial health.

ARP spoofing occurs at the link layer, and SMTP spoofing at the application layer, but spoofing may happen at any layer in the protocol stack. Sometimes, spoofing is easy. For instance, anyone with the ability to craft custom packets can create fake Ethernet frames, IP datagrams, or UDP packets. You only need to change the source address and that is it: these protocols do not have any way to detect the tampering. Other protocols are much more challenging. For instance, in TCP connections the endpoints maintain state, such as the sequence and acknowledgement numbers, that makes spoofing much trickier. Unless the attacker can sniff or guess the appropriate sequence numbers, the spoofed TCP segments will be rejected by the receiver as ‘‘out-of-window.’’ As we shall see later, there are other substantial difficulties as well.

SEC. 8.2 THE CORE INGREDIENTS OF AN ATTACK 745

Even the simple protocols allow attackers to cause a lot of damage. Shortly, we will see how spoofed UDP packets may lead to devastating DoS (Denial-of-Service) attacks. First, however, we consider how spoofing permits attackers to intercept what clients send to a server by spoofing UDP datagrams in DNS.

DNS Spoofing

Since DNS uses UDP for its requests and replies, spoofing should be easy. For instance, just like in the ARP spoofing attack, we could wait for a client to send a lookup request for domain trusted-services.com and then race with the legitimate domain name system to provide a false reply that informs the client that trusted-services.com is located at an IP address owned by us. Doing so is easy if we can sniff the traffic coming from the client (and, thus, see the DNS lookup request to which to respond), but what if we cannot see the request? After all, if we can already sniff the communication, intercepting it via DNS spoofing is not that useful. Also, what if we want to intercept the traffic of many people instead of just one?

The simplest solution, if attackers share the local name server of the victim, is that they send their own request for, say, trusted-services.com, which in turn will trigger the local name server to do a lookup for this IP address on their behalf by contacting the next name server in the lookup process. The attackers immediately ‘‘reply’’ to this request by the local name server with a spoofed reply that appears to come from the next name server. The result is that the local name server stores the falsified mapping in its cache and serves it to the victim when it finally does the lookup for trusted-services.com (and anyone else who may be looking up the same name). Note that even if the attackers do not share the local name server, the attack may still work, if the attacker can trick the victim into doing a lookup request with the attacker-provided domain name. For instance, the attacker could send an email that urges the victim to click on a link, so that the browser will do the name lookup for the attacker. After poisoning the mapping for trusted-services.com, all subsequent lookups for this domain will return the false mapping.

The astute reader will object that this is not so easy at all. After all, each DNS request carries a 16-bit query ID and a reply is accepted only if the ID in the reply matches. But if the attackers cannot see the request, they have to guess the identifier. For a single reply, the odds of getting it right are one in 65,536. On average, an attacker would have to send tens of thousands of DNS replies in a very short time to falsify a single mapping at the local name server, and do so without being noticed. Not easy.

Birthday Attack

There is an easier way that is sometimes referred to as a birthday attack (or birthday paradox, even though strictly speaking it is not a paradox at all). The idea for this attack comes from a technique that math professors often use in their


probability courses. The question is: how many students do you need in a class before the probability of having two people with the same birthday exceeds 50%? Most of us expect the answer to be way over 100. In fact, probability theory says it is just 23. With 23 people, the probability of none of them having the same birthday is:

365/365 × 364/365 × 363/365 × . . . × 343/365 = 0.492703

In other words, the probability of two students celebrating their birthday on the same day is over 50%.

More generally, if there is some mapping between inputs and outputs with n inputs (people, identifiers, etc.) and k possible outputs (birthdays, identifiers, etc.), there are n(n − 1)/2 input pairs. If n(n − 1)/2 > k, the chance of having at least one match is pretty good. Thus, approximately, a match is likely for n > √(2k). The key is that rather than look for a match for one particular student’s birthday, we compare everyone to everyone else, and any match counts.
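The classroom numbers are easy to verify in a few lines of Python (a sketch, using the standard 365-day simplification and ignoring leap years):

```python
import math

def p_no_shared_birthday(n: int) -> float:
    """Probability that n people all have distinct birthdays
    (365 equally likely days, no leap years)."""
    p = 1.0
    for i in range(n):
        p *= (365 - i) / 365
    return p

print(p_no_shared_birthday(23))   # ~0.4927, so a match is already >50% likely
print(math.isqrt(2 * 365))        # 27: the rough n > sqrt(2k) rule of thumb
```

The rule-of-thumb estimate (27) is slightly above the exact 50% threshold (23), as expected for an approximation.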

Using this insight, the attackers first send a few hundred DNS requests for the domain mapping they want to falsify. The local name server will try to resolve each of these requests individually by asking the next-level name server. This is perhaps not very smart, because why would you send multiple queries for the same domain, but few people have argued that name servers are smart, and this is how the popular BIND name server operated for a long time. Anyway, immediately after sending the requests, the attackers also send hundreds of spoofed ‘‘replies’’ for the lookup, each pretending to come from the next-level name server and carrying a different guess for the query ID. The local name server implicitly performs the many-to-many comparison for us because if any reply ID matches that of a request sent by the local name server, the reply will be accepted. Note how this scenario resembles that of the students’ birthdays: the name server compares all requests sent by the local name server with all spoofed replies.
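Under the simplifying assumption that all query IDs are independent and uniform over the 16-bit space, the attacker’s odds can be approximated as follows. The function name and parameter values are ours, chosen for illustration:

```python
def spoof_success_probability(q: int, r: int, id_space: int = 65536) -> float:
    """Approximate chance that at least one of r spoofed replies matches
    one of q outstanding query IDs, assuming independent, uniform IDs
    (a simplification of the real birthday attack)."""
    return 1.0 - (1.0 - q / id_space) ** r

print(spoof_success_probability(1, 1))      # single blind guess: 1/65536
print(spoof_success_probability(300, 300))  # a few hundred each way: ~0.75
```

With only a few hundred requests and a few hundred spoofed replies, the poisoning succeeds roughly three times out of four, which is why the many-to-many variant is so much more dangerous than guessing a single ID.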

By poisoning the local name server for a particular Web site, say, the attackers obtain access to the traffic sent to this site for all clients of the name server. By setting up their own connections to the Web site and then relaying all communication from the clients and all communication from the server, they now serve as a stealthy man-in-the-middle.

Kaminsky Attack

Things may get even worse when attackers poison the mapping not just for a single Web site, but for an entire zone. The attack is known as Dan Kaminsky’s DNS attack and it caused a huge panic among information security officers and network administrators the world over. To see why everybody got their knickers in a twist, we should go into DNS lookups in a little more detail.


Consider a DNS lookup request for the IP address of www.cs.vu.nl. Upon reception of this request, the local name server, in turn, sends a request either to the root name server or, more commonly, to the TLD (top-level domain) name server for the .nl domain. The latter is more common because the IP address of the TLD name server is often already in the local name server’s cache. Figure 8-3 shows this request by the local name server (asking for an ‘‘A record’’ for the domain) in a recursive lookup with query ID 1337.

[Figure 8-3 shows the request fields: UDP source port = x, UDP destination port = 53; Transaction ID = 1337; Number of questions = 1; Flags indicating a standard query with recursion desired (RD = 1); Question: What is the A record of www.cs.vu.nl?]

Figure 8-3. A DNS request for www.cs.vu.nl.
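To make the guessing space concrete, the 12-byte DNS header of Fig. 8-3 can be packed with Python’s struct module following the RFC 1035 layout; the helper name is ours. Note that the 16-bit transaction ID is the only field tying a reply to its request:

```python
import struct

def dns_query_header(txid: int, rd: bool = True) -> bytes:
    """Pack the 12-byte DNS header (RFC 1035). Only the 16-bit
    transaction ID links a reply to its request."""
    flags = 0x0100 if rd else 0x0000   # RD bit = recursion desired
    qdcount, ancount, nscount, arcount = 1, 0, 0, 0
    return struct.pack("!HHHHHH", txid, flags,
                       qdcount, ancount, nscount, arcount)

hdr = dns_query_header(1337)
print(hdr[:2].hex())   # 0539: transaction ID 1337 on the wire
```

The question section (the encoded name www.cs.vu.nl plus type and class) would follow these 12 bytes; it is omitted here for brevity.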

The TLD server does not know the exact mapping, but does know the names of the DNS servers of Vrije Universiteit, which it sends back in a reply, since it does not do recursive lookups, thank you very much. The reply, shown in Fig. 8-4, has a few interesting fields to discuss. First, we observe, without going into details, that the flags indicate explicitly that the server does not want to do recursive lookups, so the remainder of the lookup will be iterative. Second, the query ID of the reply is also 1337, matching that of the lookup. Third, the reply provides the symbolic names of the name servers of the university, ns1.vu.nl and ns2.vu.nl, as NS records. These answers are authoritative and, in principle, suffice for the local name server to complete the query: by first performing a lookup for the A record of one of the name servers and subsequently contacting it, it can ask for the IP address of www.cs.vu.nl. However, doing so means that it will first contact the same TLD name server again, this time to ask for the IP address of the university’s name server, and as this incurs an extra round-trip time, it is not very efficient. To avoid this extra lookup, the TLD name server helpfully provides the IP addresses of the two university name servers as additional records in its reply, each with a short TTL. These additional records are known as DNS glue records and are the key to the Kaminsky attack.

Here is what the attackers will do. First, they send lookup requests for a nonexisting subdomain of the university domain like ohdeardankaminsky.vu.nl. Since the subdomain does not exist, no name server can provide the mapping from its


[Figure 8-4 shows the reply fields: UDP source port = 53, UDP destination port = x (same as in the request!); Transaction ID = 1337; Number of questions = 1; Flags indicating that this is a reply and recursion is not available (RA = 0); Number of answers = 0; Number of resource records of authoritative servers = 2; Number of resource records with additional info = 2; Question: What is the A record of www.cs.vu.nl?; Authoritative servers: ns1.vu.nl and ns2.vu.nl; Additional/glue records: ns1.vu.nl 130.37.129.4 and ns2.vu.nl 130.37.129.5]

Figure 8-4. A DNS reply sent by the TLD name server.

cache. The local name server will instead contact the TLD name server. Immediately after sending the requests, the attackers also send many spoofed replies, pretending to be from the TLD name server, just like in a regular DNS spoofing attack, except this time the reply indicates that the TLD name server does not know the answer (i.e., it does not provide the A record), does not do recursive lookups, and advises the local name server to complete the lookup by contacting one of the university name servers. It may even provide the real names of these name servers. The only things the attackers falsify are the glue records, for which they supply IP addresses that they control. As a result, every lookup for any subdomain of vu.nl will contact the attackers’ name server, which can provide a mapping to any IP address it wants. In other words, the attackers are able to operate as man-in-the-middle for any site in the university domain!

While not all name server implementations were vulnerable to this attack, most of them were. Clearly, the Internet had a problem. An emergency meeting was hastily organized in Microsoft’s headquarters in Redmond. Kaminsky later stated that all of this was shrouded in such secrecy that ‘‘there were people on jets to Microsoft who didn’t even know what the bug was.’’

So how did these clever people solve the problem? The answer is, they didn’t, not really. What they did do is make it harder. Recall that a core problem of these DNS spoofing attacks is that the query ID is only 16 bits, making it possible to guess it, either directly or by means of a birthday attack. A larger query ID makes the attack much less likely to succeed. However, simply changing the format of the DNS protocol message is not so easy and would also break many existing systems.


The solution was to extend the length of the random ID without really extending the query ID, by instead introducing randomness also in the UDP source port. When sending out a DNS request to, say, the TLD name server, a patched name server would pick a random port out of thousands of possible port numbers and use that as the UDP source port. Now the attacker must guess not just the query ID, but also the port number, and do so before the legitimate reply arrives. The 0x20 encoding that we described in Chap. 7 exploits the case-insensitive nature of DNS queries to add even more bits to the transaction ID.
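Back-of-the-envelope arithmetic shows how much these two measures add to the 16 bits of the query ID. The exact ephemeral-port range varies per system, so the numbers below are illustrative assumptions:

```python
import math

txid_bits = 16                        # the DNS transaction ID itself
port_bits = math.log2(64512)          # ephemeral ports 1024-65535 (assumed range)
name = "www.cs.vu.nl"
case_bits = sum(c.isalpha() for c in name)   # one 0x20 bit per letter

total = txid_bits + port_bits + case_bits
print(round(port_bits, 1), case_bits, round(total, 1))   # 16.0 9 41.0
```

Roughly 41 bits of randomness instead of 16: a blind spoofer now needs on the order of a trillion guesses rather than tens of thousands, which is why these mitigations made the attack impractical even though they did not eliminate it.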

Fortunately, DNSSEC provides a more solid defense against DNS spoofing. DNSSEC consists of a collection of extensions to DNS that offer both integrity and origin authentication of DNS data to DNS clients. However, DNSSEC deployment has been extremely slow. The initial work on DNSSEC was conducted in the early 1990s and the first RFC was published by the IETF in 1997; DNSSEC is now starting to see more widespread deployment, as we will discuss later in this chapter.

TCP Spoofing

Compared to the protocols discussed so far, spoofing in TCP is infinitely more complicated. When attackers want to pretend that a TCP segment came from another computer on the Internet, they not only have to guess the port number, but also the correct sequence numbers. Moreover, keeping a TCP connection in good shape while injecting spoofed TCP segments is very complicated. We distinguish between two cases:

1. Connection spoofing. The attacker sets up a new connection, pretending to be someone at a different computer.

2. Connection hijacking. The attacker injects data in a connection that already exists between two parties, pretending to be either of these two parties.

The best-known example of TCP connection spoofing was the attack by Kevin Mitnick against the San Diego Supercomputing Center (SDSC) on Christmas day 1994. It is one of the most famous hacks in history, and the subject of several books and movies. Incidentally, one of them is a fairly big-budget flick called ‘‘Takedown,’’ that is based on a book that was written by the system administrator of the Supercomputing Center. (Perhaps not surprisingly, the administrator in the movie is portrayed as a very cool guy.) We discuss it here because it illustrates the difficulties in TCP spoofing quite well.

Kevin Mitnick had a long history of being an Internet bad boy before he set his sights on SDSC. Incidentally, attacking on Christmas day is generally a good idea because on public holidays there are fewer users and administrators around. After some initial reconnaissance, Mitnick discovered that an (X-terminal) computer in SDSC had a trust relationship with another (server) machine in the same center.


Fig. 8-5(a) shows the configuration. Specifically, the server was implicitly trusted and anyone on the server could log in on the X-terminal as administrator using remote shell (rsh) without the need to enter a password. His plan was to set up a TCP connection to the X-terminal, pretending to be the server, and use it to turn off password protection altogether—in those days, this could be done by writing ‘‘+ +’’ in the .rhosts file.

Doing so, however, was not easy. If Mitnick had sent a spoofed TCP connection setup request (a SYN segment) to the X-terminal with the IP address of the server (step 1 in Fig. 8-5(b)), the X-terminal would have sent its SYN/ACK reply to the actual server, and this reply would have been invisible to Mitnick (step 2 in Fig. 8-5(b)). As a result, he would not know the X-terminal’s initial sequence number (ISN), a more-or-less random number that he would need for the third phase of the TCP handshake (which, as we saw earlier, is the first segment that may contain data). What is worse, upon reception of the SYN/ACK, the server would have immediately responded with an RST segment to terminate the connection setup (step 3 in Fig. 8-5(c)). After all, there must have been a problem, as it never sent a SYN segment.

[Figure 8-5 shows the setup in three panels: (a) the X-terminal trusts the server, so the server can log in without a password; (b) Mitnick sends a spoofed SYN to the X-terminal (step 1), and the SYN/ACK goes to the trusted server, invisible to Mitnick (step 2); (c) the trusted server responds with an RST that terminates the handshake (step 3).]

Figure 8-5. Challenges faced by Kevin Mitnick during the attack on SDSC.

Note that the problem of the invisible SYN/ACK, and hence the missing initial sequence number (ISN), would not have been a problem at all if the ISN had been predictable, for instance, if it started at 0 for every new connection. However, since the ISN was chosen more or less randomly for every connection, Mitnick needed to find out how it was generated in order to predict the number that the X-terminal would use in its invisible SYN/ACK to the server.
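As a hypothetical illustration (the sample numbers and the fixed-increment assumption are ours, not Mitnick’s actual observations), predicting a patterned ISN generator takes only a couple of lines once two consecutive values have been observed:

```python
def predict_next_isn(observed: list[int], modulus: int = 2**32) -> int:
    """If ISNs grow by a fixed per-connection step, as on many 1990s
    network stacks, two observations suffice to predict the third.
    Arithmetic is modulo 2**32, since ISNs wrap around."""
    step = (observed[-1] - observed[-2]) % modulus
    return (observed[-1] + step) % modulus

samples = [2_000_000, 2_128_000]            # made-up probe results
print(predict_next_isn(samples))            # 2256000
```

Modern stacks derive ISNs from a keyed hash over the connection 4-tuple plus a fine-grained clock, which removes exactly this kind of cross-connection pattern.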

To overcome these challenges, Mitnick launched his attack in several steps. First, he interacted extensively with the X-terminal using nonspoofed SYN messages (step 1 in Fig. 8-6(a)). While these TCP connection attempts did not get him access to the machine, they did give him a sequence of ISNs. Fortunately for Kevin, the ISNs were not that random. He stared at the numbers for a while until he found a pattern and was confident that, given one ISN, he would be able to predict the next one. Next, he made sure that the trusted server would not be able to reset his connection attempts by launching a DoS attack that made the server unresponsive (step 2 in Fig. 8-6(b)). Now the path was clear to launch his real attack.


After sending the spoofed SYN packet (step 3 in Fig. 8-6(b)), he predicted the sequence number that the X-terminal would be using in its SYN/ACK reply to the server (step 4 in Fig. 8-6(b)) and used this in the third and final step, where he sent the command echo ‘‘+ +’’ >> .rhosts as data to the port used by the remote shell daemon (step 5 in Fig. 8-6(c)). After that, he could log in from any machine without a password.

[Figure 8-6 shows the attack in three panels: (a) Mitnick probes the X-terminal to guess the ISN (step 1); (b) he makes the trusted server unresponsive with a DoS attack (step 2, so no RST will come) and sends a spoofed SYN (step 3), to which the X-terminal sends a SYN/ACK to the silenced server (step 4); (c) he completes the third phase of the TCP handshake with the guessed ACK number and the data echo ‘‘+ +’’ >> .rhosts (step 5).]

Figure 8-6. Mitnick’s attack.

Since one of the main weaknesses exploited by Mitnick was the predictability of TCP’s initial sequence numbers, the developers of network stacks have since spent much effort on improving the randomness of TCP’s choice for these security-sensitive numbers. As a result, the Mitnick attack is no longer practical. Modern attackers need to find a different way to guess the initial sequence numbers, for instance, the one employed in the connection hijacking attack we describe now.

TCP Connection Hijacking

Compared to connection spoofing, connection hijacking adds even more hurdles to overcome. For now, let us assume that the attackers are able to eavesdrop on an existing connection between two communicating parties (because they are on the same network segment) and therefore know the exact sequence numbers and all other relevant information related to this communication. In a hijacking attack, the aim is to take over an existing connection by injecting data into the stream.

To make this concrete, let us assume that the attacker wants to inject some data into the TCP connection that exists between a client who is logged in to a Web application at a server, with the aim of making either the client or server receive attacker-injected bytes. In our example, the sequence numbers of the last bytes sent by the client and server are 1000 and 12,500, respectively. Assume that all data received so far have been acknowledged and the client and server are not currently sending any data. Now the attacker injects, say, 100 bytes into the TCP stream to the server, by sending a spoofed packet with the client’s IP address and source port, as well as the server’s IP address and destination port. This 4-tuple is enough to make the network stack demultiplex the data to the right socket. In addition, the attacker provides the appropriate sequence number (1001) and acknowledgement number (12,501), so TCP will pass the 100-byte payload to the Web server.


However, there is a problem. After passing the injected bytes to the application, the server will acknowledge them to the client: ‘‘Thank you for the bytes, I am now ready to receive byte number 1101.’’ This message comes as a surprise to the client, who thinks the server is confused. After all, it never sent any data, and still intends to send byte 1001. It promptly tells the server so, by sending an empty segment with sequence number 1001 and acknowledgement number 12,501. ‘‘Wow,’’ says the server, ‘‘thanks, but this looks like an old ACK. By now, I already received the next 100 bytes. Best tell the remote party about this.’’ It resends the ACK (seq = 12,501, ack = 1101), which leads to another ACK by the client, and so on. This phenomenon is known as an ACK storm. It will never stop until one of the ACKs gets lost (because TCP does not retransmit dataless ACKs).
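A toy simulation (our own model, assuming each ACK is independently lost with a fixed probability) shows why the storm eventually dies out, and how long it can be expected to last:

```python
import random

def ack_storm_length(loss_prob: float, rng: random.Random) -> int:
    """Toy model of an ACK storm: each delivered ACK provokes one ACK
    from the peer; the storm ends the first time an ACK is dropped."""
    exchanges = 0
    while rng.random() >= loss_prob:   # ACK delivered: peer answers back
        exchanges += 1
    return exchanges

rng = random.Random(42)
lengths = [ack_storm_length(0.01, rng) for _ in range(1000)]
print(sum(lengths) / len(lengths))     # averages near (1-p)/p = 99 exchanges
```

With a 1% loss rate, the storm lasts about a hundred round trips on average: noisy and wasteful, but self-terminating, exactly as described above.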

How does the attacker quell the ACK storm? There are several tricks and we will discuss all of them. The simplest one is to tear down the connection explicitly by sending an RST segment to the communicating parties. Alternatively, the attacker may be able to use ARP poisoning to cause one of the ACKs to be sent to a nonexisting address, forcing it to get lost. An alternative strategy is to desynchronize the two sides of the connection so much that all data sent by the client will be ignored by the server and vice versa. Doing so by sending lots of data is quite involved, but an attacker can easily accomplish this at the connection setup phase. The idea is as follows. The attacker waits until the client sets up a connection to the server. As soon as the server replies with a SYN/ACK, the attacker sends it an RST packet to terminate the connection, immediately followed by a SYN packet with the same IP address and TCP source port as the ones originally used by the client, but a different client-side sequence number. After the subsequent SYN/ACK by the server, the server and client are both in the established state, but they cannot communicate with each other, because their sequence numbers are so far apart that they are always out-of-window. Instead, the attacker plays the role of man-in-the-middle and relays data between the two parties, able to inject data at will.

Off-Path TCP Exploits

Some of the attacks are very complex and hard to even understand, let alone defend against. In this section, we will look at one of the more complicated ones. In most cases, attackers are not on the same network segment and cannot sniff the traffic between the parties. Attacks in such a scenario are known as off-path TCP exploits and are very tricky to pull off. Even if we ignore the ACK storm, the attacker needs a lot of information to inject data into an existing connection:

1. Even before the actual attack, the attackers should discover that there is a connection between two parties on the Internet to begin with.

2. Then they should determine the port numbers to use.

3. Finally, they need the sequence numbers.


Quite a tall order, if you are on the other side of the Internet, but not necessarily impossible. Decades after the Mitnick attack on SDSC, security researchers discovered a new vulnerability that permitted them to perform an off-path TCP exploit on widely deployed Linux systems. They described their attack in a paper titled ‘‘Off-Path TCP Exploits: Global Rate Limit Considered Dangerous,’’ which is a very apt title, as we shall see. We discuss it here because it illustrates that secret information can sometimes leak in an indirect way.

Ironically, the attack was made possible by a novel feature that was supposed to make the system more secure, not less secure. Recall that we said off-path data injections were very difficult because the attacker had to guess the port numbers and the sequence numbers and getting this right in a brute force attack is unlikely. Still, you just might get it right. Especially since you do not even have to get the sequence number exactly right, as long as the data you send is ‘‘in-window.’’ This means that with some (small) probability, attackers may reset, or inject data into existing connections. In August 2010, a new TCP extension appeared in the form of RFC 5961 to remedy this problem.

RFC 5961 changed how TCP handled the reception of SYN segments, RST segments, and regular data segments. The reason that the vulnerability existed only in Linux is that only Linux implemented the RFC correctly. To explain what it did, we should consider first how TCP worked before the extension. Let us consider the reception of SYN segments first. Before RFC 5961, whenever TCP received a SYN segment for an already existing connection, it would discard the packet if it was out-of-window, but it would reset the connection if it was in-window. The reason is that upon receiving a SYN segment, TCP would assume that the other side had restarted and thus that the existing connection was no longer valid. This is not good, as an attacker only needs to get one SYN segment with a sequence number somewhere in the receiver window to reset a connection. What RFC 5961 proposed instead was to not reset the connection immediately, but first send a challenge ACK to the apparent sender of the SYN. If the packet did come from the legitimate remote peer, it means that it really did lose the previous connection and is now setting up a new one. Upon receiving the challenge ACK, it will therefore send an RST packet with the correct sequence number. The attackers cannot do this since they never received the challenge ACK.

The same story holds for RST segments. In traditional TCP, hosts would drop RST packets if they were out-of-window, and reset the connection if they were in-window. To make it harder to reset someone else’s connection, RFC 5961 proposed to reset the connection immediately only if the sequence number in the RST segment was exactly the one at the start of the receiver window (i.e., the next expected sequence number). If the sequence number is not an exact match, but still in-window, the host does not drop the connection, but sends a challenge ACK. If the sender is legitimate, it will send an RST packet with the right sequence number.

Finally, for data segments, old-style TCP conducts two checks. First, it checks the sequence number. If that is in-window, it also checks the acknowledgement


number. It considers acknowledgement numbers valid as long as they fall in an (enormous) interval. Let us denote the sequence number of the first unacknowledged byte by FUB and the sequence number of the next byte to be sent by NEXT. All packets with acknowledgement numbers in [FUB − 2GB, NEXT] are valid, which is half the ACK number space. This is easy to get right for an attacker! Moreover, if the acknowledgement number also happens to be in-window, it would process the data and advance the window in the usual way. Instead, RFC 5961 says that while we should accept packets with acknowledgement numbers that are (roughly) in-window, we should send challenge ACKs for the ones that are in the window [FUB − 2GB, FUB − MAXWIN], where MAXWIN is the largest window ever advertised by the peer.
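A sketch of the resulting classification of acknowledgement numbers, using 32-bit serial (wraparound) arithmetic. The function and variable names are ours (FUB, NEXT, and MAXWIN follow the text), and the exact boundary handling is a simplification of the RFC:

```python
MOD = 2**32

def in_range(x: int, lo: int, hi: int) -> bool:
    """Is x in [lo, hi] under 32-bit serial (wraparound) arithmetic?"""
    return (x - lo) % MOD <= (hi - lo) % MOD

def classify_ack(ack: int, fub: int, nxt: int, maxwin: int) -> str:
    """fub = first unacknowledged byte, nxt = next byte to send."""
    if in_range(ack, (fub - maxwin) % MOD, nxt):
        return "acceptable"                        # processed normally
    if in_range(ack, (fub - 2**31) % MOD, (fub - maxwin) % MOD):
        return "challenge"                         # triggers a challenge ACK
    return "invalid"                               # silently dropped

print(classify_ack(ack=11_500, fub=10_000, nxt=12_000, maxwin=1_000))  # acceptable
print(classify_ack(ack=5_000, fub=10_000, nxt=12_000, maxwin=1_000))   # challenge
```

The ‘‘challenge’’ band covers almost half the 32-bit ACK space, which is why a blind attacker’s guesses so often provoke a challenge ACK, and why the global rate limit on those ACKs became such a useful signal.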

The designers of the protocol extension quickly recognized that it may lead to a huge number of challenge ACKs, and proposed ACK throttling as a solution. In the implementation of Linux, this meant that it would send at most 100 challenge ACKs per second, across all connections. In other words, a global variable shared by all connections kept track of how many challenge ACKs were sent and if the counter reached 100, it would send no more challenge ACKs for that one-second interval, whatever happened.

All this sounds good, but there is a problem. A single global variable represents shared state that can serve as a side channel for clever attacks. Let us take the first obstacle the attackers must overcome: are the two parties communicating? Recall that a challenge ACK is sent in three scenarios:

1. A SYN segment has the right source and destination IP addresses and port numbers, regardless of the sequence number.

2. An RST segment where the sequence number is in-window.

3. A data segment where additionally the acknowledgement number is in the challenge window.

Let us say that the attackers want to know whether a user at 130.37.20.7 is talking to a Web server (destination port 80) at 37.60.194.64. Since the attackers need not get the sequence number right, they only need to guess the source port number. To do so, they set up their own connection to the Web server and send 100 RST packets in quick succession, in response to which the server sends 100 challenge ACKs, unless it has already sent some challenge ACKs, in which case it would send fewer. However, this is quite unlikely. In addition to the 100 RSTs, the attackers therefore send a spoofed SYN segment, pretending to be the client at 130.37.20.7, with a guessed port number. If the guess is wrong, nothing happens and the attackers will still receive the 100 challenge ACKs. However, if they guessed the port number correctly, we end up in scenario (1), where the server sends a challenge ACK to the legitimate client. But since the server can only send 100 challenge ACKs per second, this means that the attackers receive only 99. In other words, by counting the number of challenge ACKs, the attackers can determine not just that the two hosts


are communicating, but even the (hidden) source port number of the client. Of course, you need quite a few tries to get it right, but this is definitely doable. Also, there are various techniques to make this more efficient.
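The counting trick can be captured in a toy model (our own simplification of the Linux behavior, not real network code): a global per-second budget of 100 challenge ACKs is shared between the attacker’s own connection and the spoofed probe.

```python
class Server:
    """Toy model of Linux's global challenge-ACK budget (100 per second)."""
    def __init__(self, client_port: int, budget: int = 100):
        self.client_port = client_port   # the hidden port the attacker wants
        self.budget = budget

    def spoofed_syn(self, guessed_port: int) -> None:
        """A spoofed SYN for an existing connection silently consumes one
        challenge ACK (sent to the real client) iff the port matches."""
        if guessed_port == self.client_port and self.budget > 0:
            self.budget -= 1

    def attacker_rsts(self, n: int) -> int:
        """n in-window RSTs on the attacker's own connection: each
        remaining unit of budget yields a challenge ACK back."""
        sent = min(n, self.budget)
        self.budget -= sent
        return sent

def probe(server: Server, guess: int) -> bool:
    server.spoofed_syn(guess)                 # race: spoofed SYN goes first
    return server.attacker_rsts(100) < 100    # only 99 ACKs back => port found

print(probe(Server(client_port=40000), guess=12345))  # False: wrong port
print(probe(Server(client_port=40000), guess=40000))  # True: port found
```

One probe tests one port per second, so in practice attackers batch many guessed ports per probe and binary-search the range, which is one of the efficiency techniques alluded to above.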

Once the attackers have the port number, they can move to the next phase of the attack: guessing the sequence and acknowledgement numbers. The idea is quite similar. For the sequence number, the attackers again send 100 legitimate RST packets (spurring the server into sending challenge ACKs) and an additional spoofed RST packet with the right IP addresses and now-known port numbers, as well as a guessed sequence number. If the guess is in-window, we are in scenario 2. Thus, by counting the challenge ACKs the attackers receive, they can determine whether the guess was correct.

Finally, for the acknowledgement number they send, in addition to the 100 RST packets, a data packet with all fields filled in correctly, but with a guess for the acknowledgement number, and apply the same trick. Now the attackers have all the information they need to reset the connection, or inject data.

The off-path TCP attack is a good illustration of three things. First, it shows how crazy complicated network attacks may get. Second, it is an excellent example of a network-based side-channel attack. Such attacks leak important information in an indirect way. In this case, the attackers learned all the connection details by counting something that appears very unrelated. Third, the attack shows that global shared state is the core problem of such side-channel attacks. Side-channel vulnerabilities appear everywhere, in both software and hardware, and in all cases, the root cause is the sharing of some important resource. Of course, we knew this already, as it is a violation of Saltzer and Schroeder’s general principle of least common mechanism which we discussed in the beginning of this chapter. From a security perspective, it is good to remember that often sharing is not caring!

Before we move to the next topic (disruption and denial of service), it is good to know that data injection is not just nice in theory, it is actively used in practice. After the revelations by Edward Snowden in 2013, it became clear that the NSA (National Security Agency) ran a mass surveillance operation. One of its activities was Quantum, a sophisticated network attack that used packet injection to redirect targeted users connecting to popular services (such as Twitter, Gmail, or Facebook) to special servers that would then hack the victims’ computers to give the NSA complete control. NSA denies everything, of course. It almost even denies its own existence. An industry joke goes:

Q: What does NSA stand for?

A: No Such Agency

8.2.4 Disruption

Attacks on availability are known as denial-of-service (DoS) attacks. They occur when a victim receives data it cannot handle and, as a result, becomes unresponsive. There are various reasons why a machine may stop responding:


1. Crashes. The attacker sends content that causes the victim to crash or hang. An example of such an attack was the ping of death we discussed earlier.

2. Algorithmic complexity. The attacker sends data that is crafted specifically to create a lot of (algorithmic) overhead. Suppose a server allows clients to send rich search queries. In that case, an algorithmic complexity attack may consist of a number of complicated regular expressions that incur the worst-case search time for the server.

3. Flooding/swamping. The attacker bombards the victim with such a massive flood of requests or replies that the poor system cannot keep up. Often, but not always, the victim eventually crashes.

Flooding attacks have become a major headache for organizations because these days it is very easy and cheap to carry out large-scale DoS attacks. For a few dollars or euros, you can rent a botnet consisting of many thousands of machines to attack any address you like. If the attack data is sent from a large number of distributed machines, we refer to the attack as a DDoS (Distributed Denial-of-Service) attack. Specialized services on the Internet, known as booters or stressers, offer user-friendly interfaces that help even nontechnical users to launch them.

SYN Flooding

In the old days, DDoS attacks were quite simple. For instance, you would use a large number of hacked machines to launch a SYN flooding attack. All of these machines would send TCP SYN segments to the server, often spoofed to make it appear as if they came from different machines. While the server responded with a SYN/ACK, nobody would complete the TCP handshake, leaving the server dangling. That is quite expensive. A host can only keep a limited number of connections in the half-open state. After that, it no longer accepts new connections.

There are many solutions for SYN flooding attacks. For instance, we may simply drop half-open connections when we reach a limit to give preference to new connections, or reduce the SYN-received timeout. An elegant and very simple solution, supported by many systems today, goes by the name of SYN cookies, also briefly discussed in Chap. 6. Systems protected with SYN cookies use a special algorithm to determine the initial sequence number in such a way that the server does not need to remember anything about a connection until it receives the third packet in the three-way handshake. Recall that a sequence number is 32 bits wide. With SYN cookies, the server chooses the initial sequence number as follows:

1. The top 5 bits are the value of t modulo 32, where t is a slowly incrementing timer (e.g., a timer that increases every 64 seconds).

2. The next 3 bits are an encoding of the MSS (maximum segment size), giving eight possible values for the MSS.


3. The remaining 24 bits are the value of a cryptographic hash over the timestamp t and the source and destination IP addresses and port numbers.

The advantage of this sequence number is that the server can just stick it in a SYN/ACK and forget about it. If the handshake never completes, it is no skin off its back (or off whatever it is the server has on its back). If the handshake does complete, with the acknowledgement containing its own sequence number plus one, the server is able to reconstruct all the state it requires to establish the connection. First, it checks that the cryptographic hash matches a recent value of t and then quickly rebuilds the SYN queue entry using the MSS encoded in the 3 bits. While SYN cookies allow only eight different segment sizes and make the sequence number grow faster than usual, the impact is minimal in practice. What is particularly nice is that the scheme is compatible with normal TCP and does not require the client to support the same extension.
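The three-part construction can be sketched in Python. This is not how any real kernel implements it (production stacks mix in a secret key and use different hash details); the MSS table and the use of SHA-256 here are illustrative assumptions:

```python
import hashlib

# Illustrative MSS table: 8 entries, so an index fits in 3 bits.
MSS_TABLE = [536, 1220, 1440, 1452, 1460, 4312, 8960, 9000]

def syn_cookie(src_ip, src_port, dst_ip, dst_port, mss_index, t):
    """Build a 32-bit initial sequence number: 5 bits of a slow timer,
    3 bits of MSS index, and 24 bits of a hash over the timer tick and
    the connection 4-tuple."""
    tick = t // 64                               # timer ticks every 64 s
    top5 = tick % 32
    data = f"{tick}|{src_ip}:{src_port}|{dst_ip}:{dst_port}".encode()
    h24 = int.from_bytes(hashlib.sha256(data).digest()[:3], "big")
    return (top5 << 27) | (mss_index << 24) | h24

def check_cookie(cookie, src_ip, src_port, dst_ip, dst_port, t):
    """On the final ACK, recompute the cookie for the current and the
    previous timer tick; return the MSS if it matches, else None."""
    mss_index = (cookie >> 24) & 0x7
    for age in (0, 1):                           # accept two recent ticks
        if syn_cookie(src_ip, src_port, dst_ip, dst_port,
                      mss_index, t - age * 64) == cookie:
            return MSS_TABLE[mss_index]
    return None

cookie = syn_cookie("130.37.20.7", 55555, "37.60.194.64", 80, 4, 1000)
assert check_cookie(cookie, "130.37.20.7", 55555, "37.60.194.64", 80, 1000) == 1460
assert check_cookie(cookie, "130.37.20.7", 55556, "37.60.194.64", 80, 1000) is None
```

Because everything the server needs is recoverable from the sequence number echoed back in the client’s ACK, no per-connection state has to be kept for half-open connections.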

Of course, it is still possible to launch a DDoS attack even in the presence of SYN cookies by completing the handshake, but this is more expensive for the attackers (as their own machines have limits on open TCP connections also), and more importantly, it prevents TCP attacks with spoofed IP addresses.

Reflection and Amplification in DDoS Attacks

However, TCP-based DDoS attacks are not the only game in town. In recent years, more and more of the large-scale DDoS attacks have used UDP as the transport protocol. Spoofing UDP packets is typically easy. Moreover, with UDP it is possible to trick legitimate servers on the Internet into launching so-called reflection attacks on a victim. In a reflection attack, the attacker sends a request with a spoofed source address to a legitimate UDP service, for instance, a name server. The server will then reply to the spoofed address. If we do this from a large number of servers, the deluge of UDP reply packets is more than likely to take down the victim. Reflection attacks have two main advantages.

1. By adding the extra level of indirection, the attacker makes it difficult for the victim to block the senders somewhere in the network (after all, the senders are all legitimate servers).

2. Many services can amplify the attack by sending large replies to small requests.

These amplification-based DDoS attacks have been responsible for some of the largest volumes of DDoS attack traffic in history, easily reaching into the Terabit-per-second range. What the attacker must do for a successful amplification attack is to look for publicly accessible services with a large amplification factor: for instance, where one small request packet becomes a large reply packet or, better still, multiple large reply packets. The byte amplification factor represents the relative gain in bytes, while the packet amplification factor represents the relative gain in packets. Figure 8-7 shows the amplification factors for several popular protocols. While these numbers may look impressive, it is good to remember that these are averages and individual servers may have even higher ones. Interestingly, DNSSEC, the protocol that was intended to fix the security problems of DNS, has a much higher amplification factor than plain old DNS, exceeding 100 for some servers. Not to be outdone, misconfigured memcached servers (fast in-memory databases) clocked an amplification factor well exceeding 50,000 during a massive amplification attack of 1.7 Tbps in 2018.

Protocol      Byte amplification      Packet amplification
NTP                556.9                      3.8
DNS                 54.6                      2.1
Bittorrent           3.8                      1.6

Figure 8-7. Amplification factors for popular protocols.
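As a sketch of how these two factors are computed, consider a single hypothetical request of 60 bytes that draws three UDP reply packets (the sizes below are invented for illustration):

```python
def amplification_factors(request_bytes, reply_packet_sizes):
    """Byte amplification: total reply bytes per request byte.
    Packet amplification: reply packets per (single) request packet."""
    return sum(reply_packet_sizes) / request_bytes, len(reply_packet_sizes)

# Hypothetical name-server query: a 60-byte request, three replies.
byte_f, pkt_f = amplification_factors(60, [1400, 1400, 476])
assert abs(byte_f - 54.6) < 1e-9   # comparable to the DNS row of Fig. 8-7
assert pkt_f == 3
```

From the attacker’s point of view, every spoofed request costs 60 bytes but delivers roughly 3.3 KB to the victim, which is what makes reflection so attractive.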

Defending against DDoS Attacks

Defending against such enormous streams of traffic is not easy, but several defenses exist. One fairly straightforward technique is to block traffic close to the source. The most common way to do so is using a technique called egress filtering, whereby a network device such as a firewall blocks all outgoing packets whose source IP addresses do not correspond to those inside the network to which it is attached. This, of course, requires the firewall to know what packets could possibly arrive with a particular source IP address, which is typically only possible at the edge of the network; for example, a university network might know all IP address ranges on its campus network and could thus block outgoing traffic from any IP address that it did not own. The dual of egress filtering is ingress filtering, whereby a network device filters all incoming traffic whose source addresses claim to be internal.
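A minimal sketch of an egress filter, assuming a hypothetical campus that owns two /16 prefixes (the prefixes are made up for illustration):

```python
import ipaddress

# Prefixes assigned to the internal network (hypothetical campus ranges).
INTERNAL = [ipaddress.ip_network("130.37.0.0/16"),
            ipaddress.ip_network("192.87.0.0/16")]

def egress_allowed(src_ip):
    """Only let a packet out if its source address actually belongs to
    one of our own prefixes; anything else is presumably spoofed."""
    addr = ipaddress.ip_address(src_ip)
    return any(addr in net for net in INTERNAL)

assert egress_allowed("130.37.20.7")        # our own host: forward it
assert not egress_allowed("37.60.194.64")   # spoofed source: drop it
```

Deployed at every network edge, a check like this would make spoofed-source flooding and reflection attacks far harder to launch.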

Another measure we can take is to try and absorb the DDoS attack with spare capacity. Doing so is expensive and, for all but the biggest players, may be unaffordable on an individual basis. Fortunately, there is no reason to do this individually. By pooling resources that can be used by many parties, even smaller players can afford DDoS protection. Like insurance, the assumption is that not everybody will be attacked at the same time.

So what insurance will you get? Several organizations offer to protect your Web site by means of cloud-based DDoS protection, which uses the strength of the cloud to scale up capacity as and when needed to fend off DoS attacks. At its core, the defense consists of the cloud shielding and even hiding the IP address of the real server. All requests are sent to proxies in the cloud that filter out the malicious traffic as best they can (although doing so may not be so easy for advanced attacks) and forward the benign requests to the real server. If the number of requests or the amount of traffic for a specific server increases, the cloud will allocate more resources to handling these packets. In other words, the cloud ‘‘absorbs’’ the flood of data. Typically, it may also operate as a scrubber to sanitize the data. For instance, it may remove overlapping TCP segments or weird combinations of TCP flags, and serve in general as a WAF (Web Application Firewall).

To relay the traffic via the cloud-based proxies, Web site owners can choose between several options with different price tags. If they can afford it, they can opt for BGP blackholing. In this case, the assumption is that the Web site owner controls an entire /24 block of (256) addresses. The idea is that the owner simply withdraws the BGP announcements for that block from its own routers. Instead, the cloud-based security provider starts announcing this block from its network, so that all traffic for the server will go to the cloud first. However, not everybody has entire network blocks to play around with, or can afford the cost of BGP rerouting. For them, there is the more economical option of DNS rerouting. In this case, the Web site’s administrators change the DNS mappings in their name servers to point to servers in the cloud, rather than the real server. In either case, visitors will send their packets to the proxies owned by the cloud-based security provider first, and these cloud-based proxies subsequently forward the packets to the real server.

DNS rerouting is easier to implement, but the security guarantees of the cloud-based security provider are only strong if the real IP address of the server remains hidden. If the attackers obtain this address, they can bypass the cloud and attack the server directly. Unfortunately, there are many ways in which the IP address may leak. Like FTP, some Web applications send the IP address to the remote party in-band, so there is not a lot one can do in those cases. Alternatively, attackers could look at historical DNS data to see what IP addresses were registered for the server in the past. Several companies collect and sell such historical DNS data.

8.3 FIREWALLS AND INTRUSION DETECTION SYSTEMS

The ability to connect any computer, anywhere, to any other computer, anywhere, is a mixed blessing. For individuals at home, wandering around the Internet is lots of fun. For corporate security managers, it is a nightmare. Most companies have large amounts of confidential information online—trade secrets, product development plans, marketing strategies, financial analyses, tax records, etc. Disclosure of this information to a competitor could have dire consequences.

In addition to the danger of information leaking out, there is also a danger of information leaking in. In particular, viruses, worms, and other digital pests can breach security, destroy valuable data, and waste large amounts of administrators’ time trying to clean up the mess they leave. Often they are imported by careless employees who want to play some nifty new game.

Consequently, mechanisms are needed to keep ‘‘good’’ bits in and ‘‘bad’’ bits out. One method is to use encryption, which protects data in transit between secure sites. However, it does nothing to keep digital pests and intruders from getting onto the company’s LAN. To see how to accomplish this goal, we need to look at firewalls.

8.3.1 Firewalls

Firewalls are just a modern adaptation of that old medieval security standby: digging a wide and deep moat around your castle. This design forced everyone entering or leaving the castle to pass over a single drawbridge, where they could be inspected by the I/O police. With networks, the same trick is possible: a company can have many LANs connected in arbitrary ways, but all traffic to or from the company is forced through an electronic drawbridge (firewall), as shown in Fig. 8-8. No other route exists.

Figure 8-8. A firewall protecting an internal network.

The firewall acts as a packet filter. It inspects each and every incoming and outgoing packet. Packets meeting some criterion described in rules formulated by the network administrator are forwarded normally. Those that fail the test are unceremoniously dropped.

The filtering criterion is typically given as rules or tables that list sources and destinations that are acceptable, sources and destinations that are blocked, and default rules about what to do with packets coming from or going to other machines. In the common case of a TCP/IP setting, a source or destination might consist of an IP address and a port. Ports indicate which service is desired. For example, TCP port 25 is for mail, and TCP port 80 is for HTTP. Some ports can simply be blocked outright. For example, a company could block incoming packets for all IP addresses combined with TCP port 79. It was once popular for the Finger service to look up people’s email addresses, but it is barely used today due to its role in a now-infamous (accidental) attack on the Internet in 1988.

Other ports are not so easily blocked. The difficulty is that network administrators want security but cannot cut off communication with the outside world. That arrangement would be much simpler and better for security, but there would be no end to user complaints about it. This is where arrangements such as the DMZ (DeMilitarized Zone) shown in Fig. 8-8 come in handy. The DMZ is the part of the company network that lies outside of the security perimeter. Anything goes here. By placing a machine such as a Web server in the DMZ, computers on the Internet can contact it to browse the company Web site. Now the firewall can be configured to block incoming TCP traffic to port 80 so that computers on the Internet cannot use this port to attack computers on the internal network. To allow the Web server to be managed, the firewall can have a rule to permit connections between internal machines and the Web server.

Firewalls have become much more sophisticated over time in an arms race with attackers. Originally, firewalls applied a rule set independently for each packet, but it proved difficult to write rules that allowed useful functionality but blocked all unwanted traffic. Stateful firewalls map packets to connections and use TCP/IP header fields to keep track of connections. This allows for rules that, for example, allow an external Web server to send packets to an internal host, but only if the internal host first establishes a connection with the external Web server. Such a rule is not possible with stateless designs that must either pass or drop all packets from the external Web server.
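The connection-tracking idea can be sketched in a few lines (a toy model; real stateful firewalls also track sequence numbers, timeouts, and TCP state, none of which is modeled here, and the addresses are invented):

```python
# Outbound SYNs create connection state; inbound packets are accepted
# only if they belong to a connection an internal host initiated.
connections = set()

def handle(direction, src, sport, dst, dport, syn=False):
    """Return True if the packet is allowed through."""
    if direction == "out":
        if syn:
            connections.add((src, sport, dst, dport))  # remember who we called
        return True                                    # outbound allowed here
    # Inbound: allow only replies on an established (mirrored) 4-tuple.
    return (dst, dport, src, sport) in connections

assert handle("out", "10.0.0.5", 44321, "203.0.113.7", 80, syn=True)
assert handle("in", "203.0.113.7", 80, "10.0.0.5", 44321)      # reply: pass
assert not handle("in", "203.0.113.7", 80, "10.0.0.9", 44321)  # unsolicited: drop
```

A stateless filter, having no `connections` table, could only choose between always passing or always dropping packets from the external server.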

Another level of sophistication up from stateful processing is for the firewall to implement application-level gateways. This processing involves the firewall looking inside packets, beyond even the TCP header, to see what the application is doing. With this capability, it is possible to distinguish HTTP traffic used for Web browsing from HTTP traffic used for peer-to-peer file sharing. Administrators can write rules to spare the company from peer-to-peer file sharing but allow Web browsing that is vital for business. For all of these methods, outgoing traffic can be inspected as well as incoming traffic, for example, to prevent sensitive documents from being emailed outside of the company.

As the above discussion should make abundantly clear, firewalls violate the standard layering of protocols. They are network layer devices, but they peek at the transport and application layers to do their filtering. This makes them fragile. For instance, firewalls tend to rely on standard port numbering conventions to determine what kind of traffic is carried in a packet. Standard ports are often used, but not by all computers, and not by all applications either. Some peer-to-peer applications select ports dynamically to avoid being easily spotted (and blocked). Moreover, encryption hides higher-layer information from the firewall. Finally, a firewall cannot readily talk to the computers that communicate through it to tell them what policies are being applied and why their connection is being dropped. It must simply pretend that it is a broken wire. For these reasons, networking purists consider firewalls to be a blemish on the architecture of the Internet. However, the Internet can be a dangerous place if you are a computer. Firewalls help with that problem, so they are likely to stay.

Even if the firewall is perfectly configured, plenty of security problems still exist. For example, if a firewall is configured to allow in packets from only specific networks (e.g., the company’s other plants), an intruder outside the firewall can spoof the source addresses to bypass this check. If an insider wants to ship out secret documents, he can encrypt them or even photograph them and ship the photos as JPEG files, which bypasses any email filters. And we have not even discussed the fact that, although three-quarters of all attacks come from outside the firewall, the attacks that come from inside the firewall, for example, from disgruntled employees, may be the most damaging (Verizon, 2009).

A different problem with firewalls is that they provide a single perimeter of defense. If that defense is breached, all bets are off. For this reason, firewalls are often used in a layered defense. For example, a firewall may guard the entrance to the internal network, and each computer may also run its own firewall. Readers who think that one security checkpoint is enough clearly have not made an international flight on a scheduled airline recently. As a result, many networks now have multiple levels of firewall, all the way down to per-host firewalls—a simple example of defense in depth. Suffice it to say that in both airports and computer networks, if attackers have to compromise multiple independent defenses, it is much harder for them to breach the entire system.

8.3.2 Intrusion Detection and Prevention

Besides firewalls and scrubbers, network administrators may deploy a variety of other defensive measures, such as intrusion detection systems and intrusion prevention systems, to be described shortly. As the name implies, the role of an IDS (Intrusion Detection System) is to detect attacks—ideally before they can do any damage. For instance, an IDS may generate warnings early on, at the onset of an attack, when it observes port scanning or a brute-force ssh password attack (where an attacker simply tries many popular passwords to try and log in), or when it finds the signature of the latest and greatest exploit in a TCP connection. However, it may also detect attacks only at a later stage, when a system has already been compromised and now exhibits unusual behavior.

We can categorize intrusion detection systems by considering where they work and how they work. A HIDS (Host-based IDS) works on the end-point itself, say a laptop or server, and scans, for instance, the behavior of the software or the network traffic to and from a Web server only on that machine. In contrast, a NIDS (Network IDS) checks the traffic for a set of machines on the network. Both have advantages and disadvantages.

A NIDS is attractive because it protects many machines, with the ability to correlate events associated with different hosts, and does not use up resources on the machines it protects. In other words, the IDS has no impact on the performance of the machines in its protection domain. On the other hand, it is difficult to handle issues that are system specific. As an example, suppose that a TCP connection contains overlapping TCP segments: packet A contains bytes 1–200 while packet B contains bytes 100–300. Clearly, there is overlap between the bytes in the payloads. Let us also assume that the bytes in the overlapping region are different. What is the IDS to do?

The real question is: which bytes will be used by the receiving host? If the host uses the bytes of packet A, the IDS should check these bytes for malicious content and ignore the ones in packet B. However, what if the host instead uses the bytes in packet B? And what if some hosts in the network take the bytes in packet A and some take the bytes in packet B? Even if the hosts are all the same and the IDS knows how they reassemble the TCP streams, there may still be difficulties. Suppose all hosts will normally take the bytes in packet A. If the IDS looks at that packet, it is still wrong if the destination of the packet is two or three network hops away and the TTL value in packet A is 1, so it never even reaches its destination. Tricks that attackers play with the TTL, or with overlapping byte ranges in IP fragments or TCP segments, are called IDS evasion techniques.
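The ambiguity can be demonstrated with a toy byte-level reassembler that supports two overlap policies (a simplified model; real stacks have subtler rules, and the payloads are invented):

```python
def reassemble(segments, policy="first"):
    """segments: list of (offset, bytes). With 'first', the bytes that
    arrived first win; with 'last', later segments overwrite earlier
    ones — two policies real TCP stacks have actually used."""
    buf = {}
    for off, data in segments:
        for i, b in enumerate(data):
            pos = off + i
            if policy == "last" or pos not in buf:
                buf[pos] = b
    return bytes(buf[i] for i in sorted(buf))

# Packet A carries "ATTACK" at offset 0; packet B overlaps it with "XX".
overlapping = [(0, b"ATTACK"), (2, b"XX")]
assert reassemble(overlapping, policy="first") == b"ATTACK"
assert reassemble(overlapping, policy="last") == b"ATXXCK"
```

An IDS that assumes the ‘‘first’’ policy inspects `ATTACK`, but a host applying the ‘‘last’’ policy actually delivers `ATXXCK` to the application, and vice versa: whichever policy the IDS picks, some hosts see different bytes than it checked.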

Another problem with a NIDS is encryption. If the network bytes are no longer decipherable, it becomes much harder for the IDS to determine if they are malicious. This is another example of one security measure (encryption) reducing the protection offered by another (the IDS). As a work-around, administrators may give the encryption keys to the NIDS. This works, but is not ideal, as it creates additional key management headaches. Also, observe that the IDS sees all the network traffic and tends to contain a great many lines of code itself. In other words, it may form a very juicy target for attackers. Break the IDS and you get access to all network traffic!

A host-based IDS’ drawbacks are that it uses resources at each machine on which it runs and that it sees only a small fraction of the events in the network. On the other hand, it does not suffer as much from evasion problems, as it can check the traffic after it has been reassembled by the very network stack of the machine it is trying to protect. Also, in cases such as IPsec, where packets are encrypted and decrypted in the network layer, the IDS may check the data after decryption.

Besides the different locations of an IDS, we also have some choice in how an IDS determines whether something poses a threat. There are two main categories. Signature-based intrusion detection systems use patterns in terms of bytes or sequences of packets that are symptoms of known attacks. If you know that a UDP packet to port 53 with 10 specific bytes at the start of the payload is part of an exploit E, an IDS can easily scan the network traffic for this pattern and raise an alert when it detects it. The alert is specific (‘‘I have detected E’’) and has a high confidence (‘‘I know that it is E’’). However, with a signature-based IDS, you only detect threats that are known and for which a signature is available. Alternatively, an IDS may raise an alert if it sees unusual behavior. For instance, a computer that normally only exchanges SMTP and DNS traffic with a few IP addresses suddenly starts sending HTTP traffic to many completely unknown IP addresses outside the local network. An IDS may classify this as fishy. Since such anomaly-based intrusion detection systems, or anomaly detection systems for short, trigger on any abnormal behavior, they are capable of detecting new attacks as well as old ones. The disadvantage is that the alerts do not carry a lot of explanation. Hearing that ‘‘something unusual happened in the network’’ is much less specific and much less useful than learning that ‘‘the security camera at the gate is now being attacked by the Hajime malware.’’
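A signature-based check is, at its core, just pattern matching, as in this sketch (the port number comes from the example in the text; the signature bytes are invented):

```python
# Hypothetical signature for "exploit E": a UDP payload to port 53 that
# starts with these 10 bytes (the byte values are made up).
SIGNATURE_E = (53, b"\x13\x37\xca\xfe\xba\xbe\x00\x01\x00\x00")

def inspect(dst_port, payload):
    """Return a specific, high-confidence alert when the known pattern
    matches; anything the signature does not cover sails through."""
    port, prefix = SIGNATURE_E
    if dst_port == port and payload.startswith(prefix):
        return "ALERT: exploit E"
    return None

assert inspect(53, b"\x13\x37\xca\xfe\xba\xbe\x00\x01\x00\x00rest") is not None
assert inspect(53, b"benign query") is None
```

The strength and the weakness are both visible here: a match tells you exactly which exploit you are facing, but a brand-new attack with no entry in the signature list is returned as harmless.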

An IPS (Intrusion Prevention System) should not only detect the attack, but also stop it. In that sense, it is a glorified firewall. For instance, when the IPS sees a packet with the Hajime signature, it can drop it on the floor rather than allowing it to reach the security camera. To do so, the IPS should sit on the path towards the target and make decisions about accepting or dropping traffic ‘‘on the fly.’’ In contrast, an IDS may reside elsewhere in the network, as long as we mirror all the traffic so the IDS sees it. Now you may ask: why bother? Why not simply deploy an IPS and be done with the threats entirely? Part of the answer is performance: the processing at the IPS determines the speed of the data transfer. If you have very little time, you may not be able to analyze the data very deeply. More importantly, what if you get it wrong? Specifically, what if your IPS decides a connection contains an attack and drops it, even though it is benign? That is really bad if the connection is important, for example, when your business depends on it. It may be better to raise an alert and let someone look into it, to decide if it really was malicious.

In fact, it is important to know how often your IDS or IPS gets it right. If it raises too many false alerts (false positives), you may end up spending a lot of time and money chasing those. If, on the other hand, it is conservative and often does not raise alerts when attacks do take place (false negatives), attackers may still easily compromise your system. The number of false positives (FPs) and false negatives (FNs) with respect to the true positives (TPs) and true negatives (TNs) determines the usefulness of your protection. We commonly express these properties in terms of precision and recall. Precision is a metric that indicates how many of the alarms that you generated were justified. In mathematical terms: P = TP/(TP + FP). Recall indicates how many of the actual attacks you detected: R = TP/(TP + FN). Sometimes, we combine the two values in what is known as the F-measure: F = 2PR/(P + R). Finally, we are sometimes simply interested in how often an IDS or IPS got things right. In that case, we use the accuracy as a metric: A = (TP + TN)/(TP + TN + FP + FN).
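The four metrics follow directly from the four counts (the evaluation numbers below are a made-up example):

```python
def metrics(tp, fp, tn, fn):
    """Precision, recall, F-measure, and accuracy from the four counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return precision, recall, f_measure, accuracy

# A hypothetical IDS evaluation: 100 real attacks of which 90 were caught,
# plus 30 false alarms over 9900 benign events.
p, r, f, a = metrics(tp=90, fp=30, tn=9870, fn=10)
assert (p, r) == (0.75, 0.9)
assert a == 0.996
```

Note how the accuracy looks excellent (99.6%) even though a quarter of all alarms were false: with rare attacks, accuracy alone is a misleading yardstick.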

While it is always true that high recall and high precision are better than low ones, the numbers of false negatives and false positives are typically somewhat inversely correlated: if one goes down, the other goes up. However, what ranges are acceptable in this trade-off varies from situation to situation. If you are the

Pentagon, you care deeply about not getting compromised. In that case, you may be willing to chase down a few more false positives, as long as you do not have many false negatives. If, on the other hand, you are a school, things may be less critical and you may choose not to spend your money on an administrator who spends most of his working days analyzing false alarms.

There is one final thing we need to explain about these metrics to make you appreciate the importance of false positives. We will use an analogy similar to the one introduced by Stefan Axelsson in an influential paper that explained why intrusion detection is difficult (Axelsson, 1999). Suppose that there is a disease that affects 1 in 100,000 people in practice. Anyone diagnosed with the disease dies within a month. Fortunately, there is a great test to see if someone is infected. The test has 99% accuracy: if a patient is sick (S) the test will be positive (in the medical world a positive test is a bad thing!) in 99% of the cases, while for healthy patients (H), the test will be negative (Neg) in 99% of the cases. One day you take the test and, blow me down, the test is positive (i.e., indicates Pos). The million-dollar question: how bad is this? Phrased differently: should you say goodbye to friends and family, sell everything you own in a yard sale, and live a (short) life of debauchery for the remaining 30-odd days? Or not?

To answer this question we should look at the math. What we are interested in is the probability that you have the disease given that you tested positive: P(S|Pos). What we know is:

P(Pos|S) = 0.99
P(Neg|H) = 0.99
P(S) = 0.00001

To calculate P(S|Pos), we use the famous Bayes theorem:

P(S|Pos) = P(S)P(Pos|S) / P(Pos)

In our case, there are only two possible outcomes for the test and two possible outcomes for you having the disease. In other words,

P(Pos) = P(S)P(Pos|S) + P(H)P(Pos|H)

where P(H) = 1 − P(S) and P(Pos|H) = 1 − P(Neg|H), so that:

P(Pos) = P(S)P(Pos|S) + (1 − P(S))(1 − P(Neg|H)) = 0.00001 × 0.99 + 0.99999 × 0.01

so that

P(S|Pos) = (0.00001 × 0.99) / (0.00001 × 0.99 + 0.99999 × 0.01) ≈ 0.00098


In other words, the probability of you having the disease is less than 0.1%. No need to panic yet. (Unless of course you did prematurely sell all your belongings in an estate sale.)
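The calculation is easy to verify numerically:

```python
# Plugging in the numbers from the text.
p_s = 0.00001            # P(S): prior probability of having the disease
p_pos_s = 0.99           # P(Pos|S): the test catches 99% of the sick
p_neg_h = 0.99           # P(Neg|H): the test clears 99% of the healthy

p_h = 1 - p_s
p_pos_h = 1 - p_neg_h                     # false positive rate, 0.01
p_pos = p_s * p_pos_s + p_h * p_pos_h     # total probability of testing Pos
p_s_pos = p_s * p_pos_s / p_pos           # Bayes' theorem

assert 0.00098 < p_s_pos < 0.001          # just under 0.1%, as in the text
```

Almost the entire denominator comes from the healthy-but-positive term, which is exactly the base-rate effect discussed next.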

What we see here is that the final probability is strongly dominated by the false positive rate P(Pos|H) = 1 − P(Neg|H) = 0.01. The reason is that the number of incidents is so small (P(S) = 0.00001) that all the other terms in the equation hardly count. The problem is referred to as the Base Rate Fallacy. If we substitute ‘‘under attack’’ for ‘‘sick,’’ and ‘‘alert’’ for ‘‘positive test,’’ we see that the base rate fallacy is extremely important for any IDS or IPS solution. It motivates the need for keeping the number of false positives low.

Besides the fundamental security principles by Saltzer and Schroeder, many people have offered additional, often very practical principles. One that is particularly useful to mention here is the pragmatic principle of defense in depth. Often it is a good idea to use multiple complementary techniques to protect a system. For instance, to stop attacks, we may use a firewall and an intrusion detection system and a virus scanner. While no single measure may be foolproof by itself, the idea is that it is much harder to bypass all of them at the same time.

8.4 CRYPTOGRAPHY

Cryptography comes from the Greek words for ‘‘secret writing.’’ It has a long and colorful history going back thousands of years. In this section, we will just sketch some of the highlights, as background information for what follows. For a complete history of cryptography, Kahn’s (1995) book is recommended reading. For a comprehensive treatment of modern security and cryptographic algorithms, protocols, and applications, and related material, see Kaufman et al. (2002). For a more mathematical approach, see Kraft and Washington (2018). For a less mathematical approach, see Esposito (2018).

Professionals make a distinction between ciphers and codes. A cipher is a character-for-character or bit-for-bit transformation, without regard to the linguistic structure of the message. In contrast, a code replaces one word with another word or symbol. Codes are not used any more, although they have a glorious history.

The most successful code ever devised was used by the United States Marine Corps during World War II in the Pacific. They simply had Navajo Marines talking to each other in their native language using specific Navajo words for military terms, for example, chay-da-gahi-nail-tsaidi (literally: tortoise killer) for antitank weapon. The Navajo language is highly tonal, exceedingly complex, and has no written form. And not a single person in Japan knew anything about it. In September 1945, the San Diego Union published an article describing the previously secret use of the Navajos to foil the Japanese, telling how effective it was. The Japanese never broke the code and many Navajo code talkers were awarded


high military honors for extraordinary service and bravery. The fact that the U.S. broke the Japanese code but the Japanese never broke the Navajo code played a crucial role in the American victories in the Pacific.

8.4.1 Introduction to Cryptography

Historically, four groups of people have used and contributed to the art of cryptography: the military, the diplomatic corps, diarists, and lovers. Of these, the military has had the most important role and has shaped the field over the centuries. Within military organizations, the messages to be encrypted have traditionally been given to poorly paid, low-level code clerks for encryption and transmission. The sheer volume of messages prevented this work from being done by a few elite specialists.

Until the advent of computers, one of the main constraints on cryptography had been the ability of the code clerk to perform the necessary transformations, often on a battlefield with little equipment. An additional constraint has been the difficulty in switching over quickly from one cryptographic method to another, since this entails retraining a large number of people. However, the danger of a code clerk being captured by the enemy has made it essential to be able to change the cryptographic method instantly if need be. These conflicting requirements have given rise to the model of Fig. 8-9.

[Figure: the plaintext P is fed into the encryption method E, parametrized by the encryption key K, producing the ciphertext C = EK(P). The ciphertext travels past the intruder (a passive intruder just listens; an active intruder can alter messages) to the decryption method D, which uses the decryption key K to recover the plaintext P.]

Figure 8-9. The encryption model (for a symmetric-key cipher).

The messages to be encrypted, known as the plaintext, are transformed by a function that is parametrized by a key. The output of the encryption process, known as the ciphertext, is then transmitted, often by messenger or radio. We assume that the enemy, or intruder, hears and accurately copies down the complete ciphertext. However, unlike the intended recipient, he does not know what the


decryption key is and so cannot decrypt the ciphertext easily. Sometimes the intruder can not only listen to the communication channel (passive intruder) but can also record messages and play them back later, inject his own messages, or modify legitimate messages before they get to the receiver (active intruder). The art of breaking ciphers, known as cryptanalysis, and the art of devising them (cryptography) are collectively known as cryptology.

It will often be useful to have a notation for relating plaintext, ciphertext, and keys. We will use C = EK(P) to mean that the encryption of the plaintext P using key K gives the ciphertext C. Similarly, P = DK(C) represents the decryption of C to get the plaintext again. It then follows that

DK(EK(P)) = P

This notation suggests that E and D are just mathematical functions, which they are. The only tricky part is that both are functions of two parameters, and we have written one of the parameters (the key) as a subscript, rather than as an argument, to distinguish it from the message.
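As a toy illustration of the notation (not a secure cipher), XOR with a key is its own inverse, so the same function serves as both E and D; the plaintext and key below are made up:

```python
def xor_cipher(data: bytes, key: bytes) -> bytes:
    # XOR is self-inverse: applying the same key twice restores
    # the plaintext, i.e., DK(EK(P)) = P.
    return bytes(b ^ k for b, k in zip(data, key))

P = b"attack"          # made-up plaintext
K = b"secret"          # made-up key, same length as P
C = xor_cipher(P, K)   # C = EK(P)
print(xor_cipher(C, K) == P)   # True: DK(EK(P)) = P
```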

A fundamental rule of cryptography is that one must assume that the cryptanalyst knows the methods used for encryption and decryption. In other words, the cryptanalyst knows how the encryption method, E, and decryption, D, of Fig. 8-9 work in detail. The amount of effort necessary to invent, test, and install a new algorithm every time the old method is compromised (or thought to be compromised) has always made it impractical to keep the encryption algorithm secret. Thinking it is secret when it is not does more harm than good.

This is where the key enters. The key consists of a (relatively) short string that selects one of many potential encryptions. In contrast to the general method, which may only be changed every few years, the key can be changed as often as required. Thus, our basic model is a stable and publicly known general method parametrized by a secret and easily changed key. The idea that the cryptanalyst knows the algorithms and that the secrecy lies exclusively in the keys is called Kerckhoffs’ principle, named after the Dutch-born military cryptographer Auguste Kerckhoffs who first published it in a military journal in 1883 (Kerckhoffs, 1883). Thus, we have

Kerckhoffs’ principle: all algorithms must be public; only the keys are secret.

The nonsecrecy of the algorithm cannot be emphasized enough. Trying to keep the algorithm secret, known in the trade as security by obscurity, never works. Also, by publicizing the algorithm, the cryptographer gets free consulting from a large number of academic cryptologists eager to break the system so they can publish papers demonstrating how smart they are. If many experts have tried to break the algorithm for a long time after its publication and no one has succeeded, it is probably pretty solid. (On the other hand, researchers have found bugs in open source security solutions such as OpenSSL that were over a decade old, so the common belief that ‘‘given enough eyeballs, all bugs are shallow’’ does not always hold in practice.)

Since the real secrecy is in the key, its length is a major design issue. Consider a simple combination lock. The general principle is that you enter digits in sequence. Everyone knows this, but the key is secret. A key length of two digits means that there are 100 possibilities. A key length of three digits means 1000 possibilities, and a key length of six digits means a million. The longer the key, the higher the work factor the cryptanalyst has to deal with. The work factor for breaking the system by exhaustive search of the key space is exponential in the key length. Secrecy comes from having a strong (but public) algorithm and a long key. To prevent your kid brother from reading your email, perhaps even 64-bit keys will do. For routine commercial use, perhaps 256 bits should be used. To keep major governments at bay, much larger keys of at least 256 bits, and preferably more are needed. Incidentally, these numbers are for symmetric encryption, where the encryption and the decryption key are the same. We will discuss the differences between symmetric and asymmetric encryption later.
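The exponential work factor is easy to make concrete; the sketch below assumes a hypothetical attacker speed of one billion key guesses per second:

```python
# Rough illustration (our own numbers): each extra key bit doubles
# the search space, so exhaustive search is exponential in key length.
GUESSES_PER_SEC = 1e9  # hypothetical attacker speed

for bits in (56, 128, 256):
    keys = 2 ** bits
    years = keys / GUESSES_PER_SEC / (3600 * 24 * 365)
    print(f"{bits}-bit key: about {years:.1e} years to try all keys")
```

Even at this speed, a 128-bit key space already takes on the order of 10^22 years to enumerate, which is why key length matters more than attacker hardware.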

From the cryptanalyst’s point of view, the cryptanalysis problem has three principal variations. When he has a quantity of ciphertext and no plaintext, he is confronted with the ciphertext-only problem. The cryptograms that appear in the puzzle section of newspapers pose this kind of problem. When the cryptanalyst has some matched ciphertext and plaintext, the problem is called the known plaintext problem. Finally, when the cryptanalyst has the ability to encrypt pieces of plaintext of his own choosing, we have the chosen plaintext problem. Newspaper cryptograms could be broken trivially if the cryptanalyst were allowed to ask such questions as ‘‘What is the encryption of ABCDEFGHIJKL?’’

Novices in the cryptography business often assume that if a cipher can withstand a ciphertext-only attack, it is secure. This assumption is very naive. In many cases, the cryptanalyst can make a good guess at parts of the plaintext. For example, the first thing many computers say when you boot them up is ‘‘login:’’. Equipped with some matched plaintext-ciphertext pairs, the cryptanalyst’s job becomes much easier. To achieve security, the cryptographer should be conservative and make sure that the system is unbreakable even if his opponent can encrypt arbitrary amounts of chosen plaintext.

Encryption methods have historically been divided into two categories: substitution ciphers and transposition ciphers. We will now deal with each of these briefly as background information for modern cryptography.

8.4.2 Two Fundamental Cryptographic Principles

Although we will study many different cryptographic systems in the pages ahead, two principles underlying all of them are important to understand. Pay attention. You violate them at your peril.

Redundancy

The first principle is that all encrypted messages must contain some redundancy, that is, information not needed to understand the message. An example may make it clear why this is needed. Consider a mail-order company, The Couch Potato (TCP), with 60,000 products. Thinking they are being very efficient, TCP’s programmers decide that ordering messages should consist of a 16-byte customer name followed by a 3-byte data field (1 byte for the quantity and 2 bytes for the product number). The last 3 bytes are to be encrypted using a very long key known only by the customer and TCP.

At first, this might seem secure, and in a sense it is because passive intruders cannot decrypt the messages. Unfortunately, it also has a fatal flaw that renders it useless. Suppose that a recently fired employee wants to punish TCP for firing her. Just before leaving, she takes the customer list with her. She works through the night writing a program to generate fictitious orders using real customer names. Since she does not have the list of keys, she just puts random numbers in the last 3 bytes, and sends hundreds of orders off to TCP.

When these messages arrive, TCP’s computer uses the customers’ name to locate the key and decrypt the message. Unfortunately for TCP, almost every 3-byte message is valid, so the computer begins printing out shipping instructions. While it might seem a bit odd for a customer to order 837 sets of children’s swings or 540 sandboxes, for all the computer knows, the customer might be planning to open a chain of franchised playgrounds. In this way, an active intruder (the ex-employee) can cause a massive amount of trouble, even though she cannot understand the messages her computer is generating.

This problem can be solved by the addition of redundancy to all messages. For example, if order messages are extended to 12 bytes, the first 9 of which must be zeros, this attack no longer works because the ex-employee can no longer generate a large stream of valid messages. The moral of the story is that all messages must contain considerable redundancy so that active intruders cannot send random junk and have it be interpreted as a valid message. Thus we have:

Cryptographic principle 1: Messages must contain some redundancy
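A sketch of the fix, using our own hypothetical encoding of TCP's extended 12-byte order format:

```python
import os

def make_order(quantity: int, product: int) -> bytes:
    # Hypothetical 12-byte order: 9 zero bytes of redundancy,
    # then 1 quantity byte and 2 product-number bytes.
    return bytes(9) + bytes([quantity]) + product.to_bytes(2, "big")

def is_valid(order: bytes) -> bool:
    # A random 12-byte forgery passes this check only with
    # probability 2**-72, so random junk is rejected.
    return len(order) == 12 and order[:9] == bytes(9)

print(is_valid(make_order(3, 4711)))   # True
print(is_valid(os.urandom(12)))        # almost certainly False
```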

However, adding redundancy makes it easier for cryptanalysts to break messages. Suppose that the mail-order business is highly competitive, and The Couch Potato’s main competitor, The Sofa Tuber, would dearly love to know how many sandboxes TCP is selling, so it taps TCP’s phone line. In the original scheme with 3-byte messages, cryptanalysis was nearly impossible because after guessing a key, the cryptanalyst had no way of telling whether it was right because almost every message was technically legal. With the new 12-byte scheme, it is easy for the cryptanalyst to tell a valid message from an invalid one.

In other words, upon decrypting a message, the recipient must be able to tell whether it is valid by simply inspecting the message and perhaps performing a


simple computation. This redundancy is needed to prevent active intruders from sending garbage and tricking the receiver into decrypting the garbage and acting on the ‘‘plaintext.’’

However, this same redundancy makes it much easier for passive intruders to break the system, so there is some tension here. Furthermore, the redundancy should never be in the form of n 0s at the start or end of a message, since running such messages through some cryptographic algorithms gives more predictable results, making the cryptanalysts’ job easier. A CRC polynomial (see Chapter 3) is much better than a run of 0s since the receiver can easily verify it, but it generates more work for the cryptanalyst. Even better is to use a cryptographic hash, a concept we will explore later. For the moment, think of it as a better CRC.

Freshness

The second cryptographic principle is that measures must be taken to ensure that each message received can be verified as being fresh, that is, sent very recently. This measure is needed to prevent active intruders from playing back old messages. If no such measures were taken, our ex-employee could tap TCP’s phone line and just keep repeating previously sent valid messages. Thus,

Cryptographic principle 2: Some method is needed to foil replay attacks

One such measure is including in every message a timestamp valid only for, say, 60 seconds. The receiver can then just keep messages around for 60 seconds and compare newly arrived messages to previous ones to filter out duplicates. Messages older than 60 seconds can be thrown out, since any replays sent more than 60 seconds later will be rejected as too old. The interval should not be too short (e.g., 5 seconds) because the sender’s and receiver’s clocks may be slightly out of sync. Measures other than timestamps will be discussed later.
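A minimal sketch of such a timestamp-plus-duplicate filter (the message format, window length, and function names are ours):

```python
import time

WINDOW = 60   # seconds a message is considered fresh
seen = {}     # message -> timestamp of recently accepted messages

def accept(msg, timestamp, now=None):
    """Reject stale messages and duplicates within the window."""
    now = time.time() if now is None else now
    if now - timestamp > WINDOW:
        return False                      # too old: a late replay
    for m in [m for m, t in seen.items() if now - t > WINDOW]:
        del seen[m]                       # forget expired entries
    if msg in seen:
        return False                      # duplicate within the window
    seen[msg] = timestamp
    return True

print(accept(b"order#1", timestamp=1000.0, now=1010.0))  # True
print(accept(b"order#1", timestamp=1000.0, now=1020.0))  # False (replay)
print(accept(b"order#2", timestamp=1000.0, now=1100.0))  # False (too old)
```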

8.4.3 Substitution Ciphers

In a substitution cipher, each letter or group of letters is replaced by another letter or group of letters to disguise it. One of the oldest known ciphers is the Caesar cipher, attributed to Julius Caesar. With this method, a becomes D, b becomes E, c becomes F, . . . , and z becomes C. For example, attack becomes DWWDFN. In our examples, plaintext will be given in lowercase letters, and ciphertext in uppercase letters.

A slight generalization of the Caesar cipher allows the ciphertext alphabet to be shifted by k letters, instead of always three. In this case, k becomes a key to the general method of circularly shifted alphabets. The Caesar cipher may have fooled Pompey, but it has not fooled anyone since.
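The shifted-alphabet cipher fits in a few lines; this sketch encrypts lowercase-only plaintext into uppercase ciphertext, following the convention in the text:

```python
def shift_cipher(plaintext: str, k: int) -> str:
    """Caesar's cipher generalized to a circular shift of k letters."""
    return "".join(
        chr((ord(c) - ord("a") + k) % 26 + ord("A"))  # ciphertext in caps
        for c in plaintext
    )

print(shift_cipher("attack", 3))  # DWWDFN, as in the text
```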

The next improvement is to have each of the symbols in the plaintext, say, the 26 letters for simplicity, map onto some other letter. For example,


plaintext:  a b c d e f g h i j k l m n o p q r s t u v w x y z
ciphertext: Q W E R T Y U I O P A S D F G H J K L Z X C V B N M

The general system of symbol-for-symbol substitution is called a monoalphabetic substitution cipher, with the key being the 26-letter string corresponding to the full alphabet. For the key just given, the plaintext attack would be transformed into the ciphertext QZZQEA.
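In Python, the 26-letter key above can be applied with a translation table:

```python
import string

key = "QWERTYUIOPASDFGHJKLZXCVBNM"  # the 26-letter key from the text
table = str.maketrans(string.ascii_lowercase, key)

print("attack".translate(table))  # QZZQEA
```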

At first glance, this might appear to be a safe system because although the cryptanalyst knows the general system (letter-for-letter substitution), he does not know which of the 26! ≈ 4 × 10^26 possible keys is in use. In contrast with the Caesar cipher, trying all of them is not a promising approach. Even at 1 nsec per solution, a million cores working in parallel would take 10,000 years to try all the keys.

Nevertheless, given a surprisingly small amount of ciphertext, the cipher can be broken easily. The basic attack takes advantage of the statistical properties of natural languages. In English, for example, e is the most common letter, followed by t, o, a, n, i, etc. The most common two-letter combinations, or digrams, are th, in, er, re, and an. The most common three-letter combinations, or trigrams, are the, ing, and, and ion.

A cryptanalyst trying to break a monoalphabetic cipher would start out by counting the relative frequencies of all letters in the ciphertext. Then he might tentatively assign the most common one to e and the next most common one to t. He would then look at trigrams to find a common one of the form tXe, which strongly suggests that X is h. Similarly, if the pattern thYt occurs frequently, the Y probably stands for a. With this information, he can look for a frequently occurring trigram of the form aZW, which is most likely and. By making guesses at common letters, digrams, and trigrams and knowing about likely patterns of vowels and consonants, the cryptanalyst builds up a tentative plaintext, letter by letter.

Another approach is to guess a probable word or phrase. For example, consider the following ciphertext from an accounting firm (blocked into groups of five characters):

CTBMN BYCTC BTJDS QXBNS GSTJC BTSWX CTQTZ CQVUJ QJSGS TJQZZ MNQJS VLNSX VSZJU JDSTS JQUUS JUBXJ DSKSU JSNTK BGAQJ ZBGYQ TLCTZ BNYBN QJSW

A likely word in a message from an accounting firm is financial. Using our knowledge that financial has a repeated letter (i), with four other letters between their occurrences, we look for repeated letters in the ciphertext at this spacing. We find 12 hits, at positions 6, 15, 27, 31, 42, 48, 56, 66, 70, 71, 76, and 82. However, only two of these, 31 and 42, have the next letter (corresponding to n in the plaintext) repeated in the proper place. Of these two, only 31 also has the a correctly positioned, so we know that financial begins at position 30. From this point on, deducing the key is easy by using the frequency statistics for English text and looking for nearly complete words to finish off.
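The first counting step of such an attack can be sketched directly on the ciphertext above:

```python
from collections import Counter

ciphertext = ("CTBMNBYCTCBTJDSQXBNSGSTJCBTSWXCTQTZCQVUJ"
              "QJSGSTJQZZMNQJSVLNSXVSZJUJDSTSJQUUSJUBXJ"
              "DSKSUJSNTKBGAQJZBGYQTLCTZBNYBNQJSW")

# First step of the attack: count single-letter frequencies and
# tentatively map the most common ciphertext letters to e, t, ...
for letter, count in Counter(ciphertext).most_common(3):
    print(letter, count)   # S tops the list with 16 occurrences
```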

8.4.4 Transposition Ciphers

Substitution ciphers preserve the order of the plaintext symbols but disguise them. Transposition ciphers, in contrast, reorder the letters but do not disguise them. Figure 8-10 depicts a common transposition cipher, the columnar transposition. The cipher is keyed by a word or phrase not containing any repeated letters. In this example, MEGABUCK is the key. The purpose of the key is to order the columns, with column 1 being under the key letter closest to the start of the alphabet, and so on. The plaintext is written horizontally, in rows, padded to fill the matrix if need be. The ciphertext is read out by columns, starting with the column whose key letter is the lowest.

M E G A B U C K
7 4 5 1 2 8 3 6

p l e a s e t r
a n s f e r o n
e m i l l i o n
d o l l a r s t
o m y s w i s s
b a n k a c c o
u n t s i x t w
o t w o a b c d

Plaintext: pleasetransferonemilliondollarstomyswissbankaccountsixtwotwo

Ciphertext: AFLLSKSOSELAWAIATOOSSCTCLNMOMANTESILYNTWRNNTSOWDPAEDOBUOERIRICXB

Figure 8-10. A transposition cipher.

To break a transposition cipher, the cryptanalyst must first be aware that he is dealing with a transposition cipher. By looking at the frequency of E, T, A, O, I, N, etc., it is easy to see if they fit the normal pattern for plaintext. If so, the cipher is clearly a transposition cipher because in such a cipher every letter represents itself, keeping the frequency distribution intact.

The next step is to make a guess at the number of columns. In many cases, a probable word or phrase may be guessed at from the context. For example, suppose that our cryptanalyst suspects that the plaintext phrase milliondollars occurs somewhere in the message. Observe that digrams MO, IL, LL, LA, IR, and OS occur in the ciphertext as a result of this phrase wrapping around. The ciphertext letter O follows the ciphertext letter M (i.e., they are vertically adjacent in column 4) because they are separated in the probable phrase by a distance equal to the key length. If a key of length seven had been used, the digrams MD, IO, LL, LL, IA, OR, and NS would have occurred instead. In fact, for each key length, a different set of digrams is produced in the ciphertext. By hunting for the various possibilities, the cryptanalyst can often easily determine the key length.


The remaining step is to order the columns. When the number of columns, k, is small, each of the k(k − 1) column pairs can be examined in turn to see if its digram frequencies match those for English plaintext. The pair with the best match is assumed to be correctly positioned. Now each of the remaining columns is tentatively tried as the successor to this pair. The column whose digram and trigram frequencies give the best match is tentatively assumed to be correct. The next column is found in the same way. The entire process is continued until a potential ordering is found. Chances are that the plaintext will be recognizable at this point (e.g., if milloin occurs, it is clear what the error is).

Some transposition ciphers accept a fixed-length block of input and produce a fixed-length block of output. These ciphers can be completely described by giving a list telling the order in which the characters are to be output. For example, the cipher of Fig. 8-10 can be seen as a 64-character block cipher. Its output is 4, 12, 20, 28, 36, 44, 52, 60, 5, 13, . . . , 62. In other words, the fourth input character, a, is the first to be output, followed by the twelfth, f, and so on.
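The columnar transposition of Fig. 8-10 can be sketched in a few lines of Python (padding with the filler letters abcd. . . as in the figure):

```python
def columnar_encrypt(plaintext: str, key: str) -> str:
    """Columnar transposition: write the plaintext in rows under the
    key, then read the columns in alphabetical order of the key letters."""
    n = len(key)
    pad = (-len(plaintext)) % n            # fill the last row, as in Fig. 8-10
    plaintext += "abcdefghijklmnopqrstuvwxyz"[:pad]
    rows = [plaintext[i:i + n] for i in range(0, len(plaintext), n)]
    order = sorted(range(n), key=lambda c: key[c])
    return "".join(row[c] for c in order for row in rows).upper()

msg = "pleasetransferonemilliondollarstomyswissbankaccountsixtwotwo"
print(columnar_encrypt(msg, "MEGABUCK"))
# AFLLSKSOSELAWAIATOOSSCTCLNMOMANTESILYNTWRNNTSOWDPAEDOBUOERIRICXB
```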

8.4.5 One-Time Pads

Constructing an unbreakable cipher is actually quite easy; the technique has been known for decades. First, choose a random bit string as the key. Then convert the plaintext into a bit string, for example, by using its ASCII representation. Finally, compute the XOR (eXclusive OR) of these two strings, bit by bit. The resulting ciphertext cannot be broken because in a sufficiently large sample of ciphertext, each letter will occur equally often, as will every digram, every trigram, and so on. This method, known as the one-time pad, is immune to all present and future attacks, no matter how much computational power the intruder has. The reason derives from information theory: there is simply no information in the message because all possible plaintexts of the given length are equally likely.

An example of how one-time pads are used is given in Fig. 8-11. First, message 1, ‘‘I love you.’’ is converted to 7-bit ASCII. Then a one-time pad, pad 1, is chosen and XORed with the message to get the ciphertext. A cryptanalyst could try all possible one-time pads to see what plaintext came out for each one. For example, the one-time pad listed as pad 2 in the figure could be tried, resulting in plaintext 2, ‘‘Elvis lives,’’ which may or may not be plausible (a subject beyond the scope of this book). In fact, for every 11-character ASCII plaintext, there is a one-time pad that generates it. That is what we mean by saying there is no information in the ciphertext: you can get any message of the correct length out of it.
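The argument can be demonstrated directly: given the ciphertext, a suitably chosen pad produces any plaintext of the same length (the messages come from the text; the pads are generated here, so they differ from those in Fig. 8-11):

```python
import secrets

def otp(data: bytes, pad: bytes) -> bytes:
    # XOR with the pad; applying the same pad twice restores the input.
    return bytes(d ^ p for d, p in zip(data, pad))

msg = b"I love you."
pad1 = secrets.token_bytes(len(msg))   # truly random one-time pad
ct = otp(msg, pad1)

# For ANY other plaintext of the same length there exists a pad that
# "decrypts" the ciphertext to it, so the ciphertext carries no
# information about which plaintext was sent.
fake = b"Elvis lives"
pad2 = otp(ct, fake)                   # pad2 = ciphertext XOR fake
print(otp(ct, pad2))   # b'Elvis lives'
print(otp(ct, pad1))   # b'I love you.'
```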

One-time pads are great in theory, but have a number of disadvantages in practice. To start with, the key cannot be memorized, so both sender and receiver must carry a written copy with them. If either one is subject to capture, written keys are clearly undesirable. Additionally, the total amount of data that can be transmitted is limited by the amount of key available. If the spy strikes it rich and discovers a wealth of data, he may find himself unable to transmit them back to headquarters


Message 1:  1001001 0100000 1101100 1101111 1110110 1100101 0100000 1111001 1101111 1110101 0101110
Pad 1:      1010010 1001011 1110010 1010101 1010010 1100011 0001011 0101010 1010111 1100110 0101011
Ciphertext: 0011011 1101011 0011110 0111010 0100100 0000110 0101011 1010011 0111000 0010011 0000101

Pad 2:      1011110 0000111 1101000 1010011 1010111 0100110 1000111 0111010 1001110 1110110 1110110
Plaintext 2: 1000101 1101100 1110110 1101001 1110011 0100000 1101100 1101001 1110110 1100101 1110011

Figure 8-11. The use of a one-time pad for encryption and the possibility of getting any possible plaintext from the ciphertext by the use of some other pad.

because the key has been used up. Another problem is the sensitivity of the method to lost or inserted characters. If the sender and receiver get out of synchronization, all data from then on will appear garbled.

With the advent of computers, the one-time pad might potentially become practical for some applications. The source of the key could be a special DVD that contains several gigabytes of information and, if transported in a DVD movie box and prefixed by a few minutes of video, would not even be suspicious. Of course, at gigabit network speeds, having to insert a new DVD every 30 sec could become tedious. And the DVDs must be personally carried from the sender to the receiver before any messages can be sent, which greatly reduces their practical utility. Also, given that very soon nobody will use DVD or Blu-Ray discs any more, anyone caught carrying around a box of them should perhaps be regarded with suspicion.

Quantum Cryptography

Interestingly, there may be a solution to the problem of how to transmit the one-time pad over the network, and it comes from a very unlikely source: quantum mechanics. This area is still experimental, but initial tests are promising. If it can be perfected and be made efficient, virtually all cryptography will eventually be done using one-time pads since they are provably secure. Below we will briefly explain how this method, quantum cryptography, works. In particular, we will describe a protocol called BB84 after its authors and publication year (Bennett and Brassard, 1984).

Suppose that a user, Alice, wants to establish a one-time pad with a second user, Bob. Alice and Bob are called principals, the main characters in our story. For example, Bob is a banker with whom Alice would like to do business. The names ‘‘Alice’’ and ‘‘Bob’’ have been used for the principals in virtually every paper and book on cryptography since Ron Rivest introduced them many years ago (Rivest et al., 1978). Cryptographers love tradition. If we were to use ‘‘Andy’’ and ‘‘Barbara’’ as the principals, no one would believe anything in this chapter. So be it.

If Alice and Bob could establish a one-time pad, they could use it to communicate securely. The obvious question is: how can they establish it without having


previously exchanged one physically (using DVDs, books, or USB sticks)? We can assume that Alice and Bob are at the opposite ends of an optical fiber over which they can send and receive light pulses. However, an intrepid intruder, Trudy, can cut the fiber to splice in an active tap. Trudy can read all the bits sent in both directions. She can also send false messages in both directions. The situation might seem hopeless for Alice and Bob, but quantum cryptography can shed some new light on the subject.

Quantum cryptography is based on the fact that light comes in microscopic little packets called photons, which have some peculiar properties. Furthermore, light can be polarized by being passed through a polarizing filter, a fact well known to both sunglasses wearers and photographers. If a beam of light (i.e., a stream of photons) is passed through a polarizing filter, all the photons emerging from it will be polarized in the direction of the filter’s axis (e.g., vertically). If the beam is now passed through a second polarizing filter, the intensity of the light emerging from the second filter is proportional to the square of the cosine of the angle between the axes. If the two axes are perpendicular, no photons get through. The absolute orientation of the two filters does not matter; only the angle between their axes counts.

To generate a one-time pad, Alice needs two sets of polarizing filters. Set one consists of a vertical filter and a horizontal filter. This choice is called a rectilinear basis. A basis (plural: bases) is just a coordinate system. The second set of filters is the same, except rotated 45 degrees, so one filter runs from the lower left to the upper right and the other filter runs from the upper left to the lower right. This choice is called a diagonal basis. Thus, Alice has two bases, which she can rapidly insert into her beam at will. In reality, Alice does not have four separate filters, but a crystal whose polarization can be switched electrically to any of the four allowed directions at great speed. Bob has the same equipment as Alice. The fact that Alice and Bob each have two bases available is essential to quantum cryptography.

For each basis, Alice now assigns one direction as 0 and the other as 1. In the example presented below, we assume she chooses vertical to be 0 and horizontal to be 1. Independently, she also chooses lower left to upper right as 0 and upper left to lower right as 1. She sends these choices to Bob as plaintext, fully aware that Trudy will be able to read her message.

Now Alice picks a one-time pad, for example, based on a random number generator (a complex subject all by itself). She transfers it bit by bit to Bob, choosing one of her two bases at random for each bit. To send a bit, her photon gun emits one photon polarized appropriately for the basis she is using for that bit. For example, she might choose bases of diagonal, rectilinear, rectilinear, diagonal, rectilinear, etc. To send her one-time pad of 1001110010100110 with these bases, she would send the photons shown in Fig. 8-12(a). Given the one-time pad and the sequence of bases, the polarization to use for each bit is uniquely determined. Bits sent one photon at a time are called qubits.


[Figure: rows indexed by bit number 0 to 15. (a) Data: 1 0 0 1 1 1 0 0 1 0 1 0 0 1 1 0, sent by Alice as photons polarized in her randomly chosen bases. (b) Bob’s randomly chosen bases. (c) What Bob gets. (d) Correct basis? No Yes No Yes No No No Yes Yes No Yes Yes Yes No Yes No. (e) The resulting one-time pad: 0 1 0 1 1 0 0 1. (f) Trudy’s randomly chosen bases. (g) Trudy’s pad: x 0 x 1 x x x ? 1 x ? ? 0 x ? x (x = not part of the pad, ? = wrong guess).]

Figure 8-12. An example of quantum cryptography.

Bob does not know which bases to use, so he picks one at random for each arriving photon and just uses it, as shown in Fig. 8-12(b). If he picks the correct basis, he gets the correct bit. If he picks the incorrect basis, he gets a random bit because if a photon hits a filter polarized at 45 degrees to its own polarization, it randomly jumps to the polarization of the filter or to a polarization perpendicular to the filter, with equal probability. This property of photons is fundamental to quantum mechanics. Thus, some of the bits are correct and some are random, but Bob does not know which are which. Bob’s results are depicted in Fig. 8-12(c).

How does Bob find out which bases he got right and which he got wrong? He simply tells Alice (in plaintext) which basis he used for each bit and she tells him (also in plaintext) which are right and which are wrong, as shown in Fig. 8-12(d). From this information, both of them can build a bit string from the correct guesses, as shown in Fig. 8-12(e). On the average, this bit string will be half the length of the original bit string, but since both parties know it, they can use it as a one-time pad. All Alice has to do is transmit a bit string slightly more than twice the desired length, and she and Bob will have a one-time pad of the desired length. Done.
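The basis-sifting step can be simulated in a few lines (the bits and bases here are random, not those of Fig. 8-12; "+" and "x" stand for the rectilinear and diagonal bases):

```python
import secrets

n = 64
alice_bits  = [secrets.randbelow(2) for _ in range(n)]
alice_bases = [secrets.choice("+x") for _ in range(n)]  # rectilinear / diagonal
bob_bases   = [secrets.choice("+x") for _ in range(n)]

# Bob measures: the correct basis yields the right bit,
# the wrong basis yields a random bit.
bob_bits = [b if ab == bb else secrets.randbelow(2)
            for b, ab, bb in zip(alice_bits, alice_bases, bob_bases)]

# Sifting (announced in plaintext): keep only positions where the bases matched.
pad_alice = [b for b, ab, bb in zip(alice_bits, alice_bases, bob_bases) if ab == bb]
pad_bob   = [b for b, ab, bb in zip(bob_bits, alice_bases, bob_bases) if ab == bb]

print(pad_alice == pad_bob, len(pad_alice))  # True, roughly n/2 bits
```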

But wait a minute. We forgot Trudy for the moment. Suppose that she is curious about what Alice has to say and cuts the fiber, inserting her own detector and transmitter. Unfortunately for her, she does not know which basis to use for each photon either. The best she can do is pick one at random for each photon, just as Bob does. An example of her choices is shown in Fig. 8-12(f). When Bob later reports (in plaintext) which bases he used and Alice tells him (in plaintext) which ones are correct, Trudy now knows when she got it right and when she got it wrong. In Fig. 8-12, she got it right for bits 0, 1, 2, 3, 4, 6, 8, 12, and 13. But she knows from Alice’s reply in Fig. 8-12(d) that only bits 1, 3, 7, 8, 10, 11, 12, and 14 are part of the one-time pad. For four of these bits (1, 3, 8, and 12), she guessed right and captured the correct bit. For the other four (7, 10, 11, and 14), she guessed wrong and does not know the bit transmitted. Thus, Bob knows the one-time pad starts with 01011001, from Fig. 8-12(e), but all Trudy has is 01?1??0?, from Fig. 8-12(g).

Of course, Alice and Bob are aware that Trudy may have captured part of their one-time pad, so they would like to reduce the information Trudy has. They can do this by performing a transformation on it. For example, they could divide the one-time pad into blocks of 1024 bits, square each one to form a 2048-bit number, and use the concatenation of these 2048-bit numbers as the one-time pad. With her partial knowledge of the bit string transmitted, Trudy has no way to generate its square and so has nothing. The transformation from the original one-time pad to a different one that reduces Trudy’s knowledge is called privacy amplification. In practice, complex transformations in which every output bit depends on every input bit are used instead of squaring.
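The squaring transformation the text uses as its example can be sketched in a few lines. Block sizes here are shrunk to 16 bytes purely for illustration (the text uses 1024-bit blocks), and as noted, real systems use stronger hash-like transformations instead of squaring.

```python
def privacy_amplify(pad: bytes, block_bytes: int = 128) -> bytes:
    """Square each block of the pad, concatenating the results.

    Toy version of the text's example transformation: each block_bytes-long
    block becomes a number, is squared, and doubles in length.
    """
    out = bytearray()
    for i in range(0, len(pad), block_bytes):
        block = int.from_bytes(pad[i:i + block_bytes], "big")
        out += (block * block).to_bytes(2 * block_bytes, "big")
    return bytes(out)

pad = bytes(range(1, 17))                 # a toy 128-bit "one-time pad"
amplified = privacy_amplify(pad, block_bytes=16)
assert len(amplified) == 2 * len(pad)     # each block doubles in length
```

Since every output bit of the square depends on many input bits, Trudy's partial knowledge of the original pad tells her essentially nothing about the amplified one.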

Poor Trudy. Not only does she have no idea what the one-time pad is, but her presence is not a secret either. After all, she must relay each received bit to Bob to trick him into thinking he is talking to Alice. The trouble is, the best she can do is transmit the qubit she received, using the polarization she used to receive it, and about half the time she will be wrong, causing many errors in Bob’s one-time pad.

When Alice finally starts sending data, she encodes it using a heavy forward-error-correcting code. From Bob’s point of view, a 1-bit error in the one-time pad is the same as a 1-bit transmission error. Either way, he gets the wrong bit. If there is enough forward error correction, he can recover the original message despite all the errors, but he can easily count how many errors were corrected. If this number is far more than the expected error rate of the equipment, he knows that Trudy has tapped the line and can act accordingly (e.g., tell Alice to switch to a radio channel, call the police, etc.). If Trudy had a way to clone a photon so she had one photon to inspect and an identical photon to send to Bob, she could avoid detection, but at present no way to clone a photon perfectly is known. And even if Trudy could clone photons, the value of quantum cryptography to establish one-time pads would not be reduced.

Although quantum cryptography has been shown to operate over distances of 60 km of fiber, the equipment is complex and expensive. Still, the idea has promise if it can be made to scale up and become cheaper. For more information about quantum cryptography, see Clancy et al. (2019).

8.5 SYMMETRIC-KEY ALGORITHMS

Modern cryptography uses the same basic ideas as traditional cryptography (transposition and substitution), but its emphasis is different. Traditionally, cryptographers have used simple algorithms. Nowadays, the reverse is true: the object is to make the encryption algorithm so complex and involuted that even if the cryptanalyst acquires vast mounds of enciphered text of his own choosing, he will not be able to make any sense of it at all without the key.

The first class of encryption algorithms we will study in this chapter are called symmetric-key algorithms because they use the same key for encryption and decryption. Fig. 8-9 illustrates the use of a symmetric-key algorithm. In particular, we will focus on block ciphers, which take an n-bit block of plaintext as input and transform it using the key into an n-bit block of ciphertext.

Cryptographic algorithms can be implemented in either hardware (for speed) or software (for flexibility). Although most of our treatment concerns the algorithms and protocols, which are independent of the actual implementation, a few words about building cryptographic hardware may be of interest. Transpositions and substitutions can be implemented with simple electrical circuits. Figure 8-13(a) shows a device, known as a P-box (P stands for permutation), used to effect a transposition on an 8-bit input. If the 8 bits are designated from top to bottom as 01234567, the output of this particular P-box is 36071245. By appropriate internal wiring, a P-box can be made to perform any transposition and do it at practically the speed of light since no computation is involved, just signal propagation. This design follows Kerckhoffs’ principle: the attacker knows that the general method is permuting the bits. What he does not know is which bit goes where.

[Figure content: (a) a P-box permuting 8 input lines; (b) an S-box built from a 3-to-8 decoder, a P-box, and an 8-to-3 encoder; (c) a product cipher passing 12 lines through alternating stages P1, S1–S4, P2, S5–S8, P3, S9–S12, P4.]

Figure 8-13. Basic elements of product ciphers. (a) P-box. (b) S-box. (c) Product.

Substitutions are performed by S-boxes, as shown in Fig. 8-13(b). In this example, a 3-bit plaintext is entered and a 3-bit ciphertext is output. The 3-bit input selects one of the eight lines exiting from the first stage and sets it to 1; all the other lines are 0. The second stage is a P-box. The third stage encodes the selected input line in binary again. With the wiring shown, if the eight octal numbers 01234567 were input one after another, the output sequence would be 24506713.


In other words, 0 has been replaced by 2, 1 has been replaced by 4, etc. Again, by appropriate wiring of the P-box inside the S-box, any substitution can be accomplished. Furthermore, such a device can be built in hardware to achieve great speed, since encoders and decoders have only one or two (subnanosecond) gate delays and the propagation time across the P-box may well be less than 1 picosec.
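In software, the whole decoder/P-box/encoder construction collapses to a lookup table. A minimal sketch of the S-box of Fig. 8-13(b), whose wiring maps the inputs 01234567 to 24506713:

```python
# The S-box of Fig. 8-13(b) as a lookup table: input 01234567 -> 24506713.
SBOX = [2, 4, 5, 0, 6, 7, 1, 3]

def sbox(x: int) -> int:
    # Decoder (one-hot), P-box (rewiring), and encoder compose to a table lookup.
    return SBOX[x]

# To decrypt, the receiver needs the inverse table.
INV_SBOX = [0] * 8
for x, y in enumerate(SBOX):
    INV_SBOX[y] = x

assert [sbox(x) for x in range(8)] == [2, 4, 5, 0, 6, 7, 1, 3]
assert all(INV_SBOX[sbox(x)] == x for x in range(8))
# Any usable substitution must be a permutation, or decryption is impossible.
assert sorted(SBOX) == list(range(8))
```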

The real power of these basic elements only becomes apparent when we cascade a whole series of boxes to form a product cipher, as shown in Fig. 8-13(c). In this example, 12 input lines are transposed (i.e., permuted) by the first stage (P1). In the second stage, the input is broken up into four groups of 3 bits, each of which is substituted independently of the others (S1 to S4). This arrangement shows a method of approximating a larger S-box from multiple, smaller S-boxes. It is useful because small S-boxes are practical for a hardware implementation (e.g., an 8-bit S-box can be realized as a 256-entry lookup table), but large S-boxes become quite unwieldy to build (e.g., a 12-bit S-box would at a minimum need 2^12 = 4096 crossed wires in its middle stage). Although this method is less general, it is still powerful. By including a sufficiently large number of stages in the product cipher, the output can be made to be an exceedingly complicated function of the input.

Product ciphers that operate on k-bit inputs to produce k-bit outputs are common. One common value for k is 256. A hardware implementation usually has at least 20 physical stages, instead of just 7 as in Fig. 8-13(c). A software implementation has a loop with at least eight iterations, each one performing S-box-type substitutions on subblocks of the 64- to 256-bit data block, followed by a permutation that mixes the outputs of the S-boxes. Often there is a special initial permutation and one at the end as well. In the literature, the iterations are called rounds.

8.5.1 The Data Encryption Standard

In January 1977, the U.S. Government adopted a product cipher developed by IBM as its official standard for unclassified information. This cipher, DES (Data Encryption Standard), was widely adopted by the industry for use in security products. It is no longer secure in its original form, but in a modified form it is still used here and there. The original version was controversial because IBM specified a 128-bit key, but after discussions with NSA, IBM ‘‘voluntarily’’ decided to reduce the key length to 56 bits, which cryptographers at the time said was too small.

DES operates essentially as shown in Fig. 8-13(c), but on bigger units. The plaintext (in binary) is broken up into 64-bit units, and each one is encrypted separately by doing permutations and substitutions parametrized by the 56-bit key on each of 16 consecutive rounds. In effect, it is a gigantic monoalphabetic substitution cipher on an alphabet with 64-bit characters (about which more shortly).

As early as 1979, IBM realized that 56 bits was much too short and devised a backward-compatible scheme to increase the key length by having two 56-bit keys used at once, for a total of 112 bits worth of key (Tuchman, 1979). The new scheme, called Triple DES, is still in use and works like this.

[Figure content: (a) triple encryption runs P through E with K1, then D with K2, then E with K1 to produce C; (b) decryption runs C through D with K1, then E with K2, then D with K1 to recover P.]

Figure 8-14. (a) Triple encryption using DES. (b) Decryption.

Obvious questions are: (1) Why two keys instead of three? and (2) Why encryption-decryption-encryption? The answer to both is that if a computer that uses triple DES has to talk to one that uses only single DES, it can set both keys to the same value and then apply triple DES to give the same result as single DES. This design made it easier to phase in triple DES. It is basically obsolete now, but still in use in some change-resistant applications.
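The backward-compatibility trick can be sketched with a toy ‘‘block cipher.’’ Here E and D are just XOR with the key, a deliberately insecure stand-in for DES, but the cancellation argument is the same for any cipher: with K1 = K2, the middle D undoes the first E, leaving a single encryption.

```python
# Toy 'block cipher': XOR with the key (a stand-in for DES, NOT secure).
def E(block: int, key: int) -> int: return block ^ key
def D(block: int, key: int) -> int: return block ^ key   # XOR is its own inverse

def triple_encrypt(p, k1, k2): return E(D(E(p, k1), k2), k1)   # EDE
def triple_decrypt(c, k1, k2): return D(E(D(c, k1), k2), k1)   # DED

p, k1, k2 = 0x12345678, 0xCAFE, 0xBEEF
assert triple_decrypt(triple_encrypt(p, k1, k2), k1, k2) == p
# With k1 == k2 the middle D cancels the first E: EDE collapses to single E.
assert triple_encrypt(p, k1, k1) == E(p, k1)
```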

8.5.2 The Advanced Encryption Standard

As DES began approaching the end of its useful life, even with triple DES, NIST (National Institute of Standards and Technology), the agency of the U.S. Dept. of Commerce charged with approving standards for the U.S. Federal Government, decided that the government needed a new cryptographic standard for unclassified use. NIST was keenly aware of all the controversy surrounding DES and well knew that if it just announced a new standard, everyone knowing anything about cryptography would automatically assume that NSA had built a back door into it so NSA could read everything encrypted with it. Under these conditions, probably no one would have used the standard, and it would have died quietly.

So, NIST took a surprisingly different approach for a government bureaucracy: it sponsored a cryptographic bake-off (contest). In January 1997, researchers from all over the world were invited to submit proposals for a new standard, to be called AES (Advanced Encryption Standard). The bake-off rules were:

1. The algorithm must be a symmetric block cipher.

2. The full design must be public.

3. Key lengths of 128, 192, and 256 bits must be supported.

4. Both software and hardware implementations must be possible.

5. The algorithm must be public or licensed on nondiscriminatory terms.

Fifteen serious proposals were made, and public conferences were organized in which they were presented and attendees were actively encouraged to find flaws in all of them. In August 1998, NIST selected five finalists, primarily on the basis of their security, efficiency, simplicity, flexibility, and memory requirements (important for embedded systems). More conferences were held and more potshots taken at the contestants.

In October 2000, NIST announced that it had selected Rijndael, invented by Joan Daemen and Vincent Rijmen. The name Rijndael, pronounced Rhine-doll (more or less), is derived from the last names of the authors: Rijmen + Daemen. In November 2001, Rijndael became the AES U.S. Government standard, published as FIPS (Federal Information Processing Standard) 197. Owing to the extraordinary openness of the competition, the technical properties of Rijndael, and the fact that the winning team consisted of two young Belgian cryptographers (who were unlikely to have built in a back door just to please NSA), Rijndael has become the world’s dominant cryptographic cipher. AES encryption and decryption is now part of the instruction set for some CPUs.

Rijndael supports key lengths and block sizes from 128 bits to 256 bits in steps of 32 bits. The key length and block length may be chosen independently. However, AES specifies that the block size must be 128 bits and the key length must be 128, 192, or 256 bits. It is doubtful that anyone will ever use 192-bit keys, so de facto, AES has two variants: a 128-bit block with a 128-bit key and a 128-bit block with a 256-bit key.

In our treatment of the algorithm, we will examine only the 128/128 case because this is the commercial norm. A 128-bit key gives a key space of 2^128 ≈ 3 × 10^38 keys. Even if NSA manages to build a machine with 1 billion parallel processors, each being able to evaluate one key per picosecond, it would take such a machine about 10^10 years to search the key space. By then the sun will have burned out, so the folks then present will have to read the results by candlelight.
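The arithmetic behind that estimate is easy to check:

```python
keys         = 2 ** 128                  # size of the 128-bit key space
keys_per_sec = 1e9 * 1e12                # 10^9 processors x 10^12 keys/sec each
seconds      = keys / keys_per_sec
years        = seconds / (365 * 24 * 3600)
print(f"{years:.1e}")                    # prints 1.1e+10, i.e., about 10^10 years
assert 1e9 < years < 1e11
```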

Rijndael

From a mathematical perspective, Rijndael is based on Galois field theory, which gives it some provable security properties. However, it can also be viewed as C code, without getting into the mathematics.

Like DES, Rijndael uses both substitution and permutations, and it also uses multiple rounds. The number of rounds depends on the key size and block size, being 10 for 128-bit keys with 128-bit blocks and moving up to 14 for the largest key or the largest block. However, unlike DES, all operations involve an integral number of bytes, to allow for efficient implementations in both hardware and software. DES is bit oriented and software implementations are slow as a result.

The algorithm has been designed not only for great security, but also for great speed. A good software implementation on a 2-GHz machine should be able to achieve an encryption rate of 700 Mbps, which is fast enough to encrypt over a dozen 4K videos in real time. Hardware implementations are faster still.

8.5.3 Cipher Modes

Despite all this complexity, AES (or DES, or any block cipher for that matter) is basically a monoalphabetic substitution cipher using big characters (128-bit characters for AES and 64-bit characters for DES). Whenever the same plaintext block goes in the front end, the same ciphertext block comes out the back end. If you encrypt the plaintext abcdefgh 100 times with the same DES or AES key, you get the same ciphertext 100 times. An intruder can exploit this property to help subvert the cipher.

Electronic Code Book Mode

To see how this monoalphabetic substitution cipher property can be used to partially defeat the cipher, we will use (triple) DES because it is easier to depict 64-bit blocks than 128-bit blocks, but AES has exactly the same problem. The straightforward way to use DES to encrypt a long piece of plaintext is to break it up into consecutive 8-byte (64-bit) blocks and encrypt them one after another with the same key. The last piece of plaintext is padded out to 64 bits, if need be. This technique is known as ECB mode (Electronic Code Book mode) in analogy with old-fashioned code books where each plaintext word was listed, followed by its ciphertext (usually a five-digit decimal number).

In Fig. 8-15, we have the start of a computer file listing the annual bonuses a company has decided to award to its employees. This file consists of consecutive 32-byte records, one per employee, in the format shown: 16 bytes for the name, 8 bytes for the position, and 8 bytes for the bonus. Each of the sixteen 8-byte blocks (numbered from 0 to 15) is encrypted by (triple) DES.

Name (16 bytes)     Position (8 bytes)   Bonus (8 bytes)
Adams, Leslie       Clerk                $10
Black, Robin        Boss                 $500,000
Collins, Kim        Manager              $100,000
Davis, Bobbie       Janitor              $5

Figure 8-15. The plaintext of a file encrypted as 16 DES blocks.

Leslie just had a fight with the boss and is not expecting much of a bonus. Kim, in contrast, is the boss’ favorite, and everyone knows this. Leslie can get access to the file after it is encrypted but before it is sent to the bank. Can Leslie rectify this unfair situation, given only the encrypted file?

No problem at all. All Leslie has to do is make a copy of the 12th ciphertext block (which contains Kim’s bonus) and use it to replace the fourth ciphertext block (which contains Leslie’s bonus). Even without knowing what the 12th block says, Leslie can expect to have a much merrier Christmas this year. (Copying the eighth ciphertext block is also a possibility, but is more likely to be detected; besides, Leslie is not a greedy person.)
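Leslie's attack can be sketched directly. The per-block ‘‘cipher’’ below is just XOR with a key, a toy stand-in for triple DES, but it has the one property ECB gives the attacker: each 8-byte block is encrypted independently, so ciphertext blocks can be swapped without garbling anything.

```python
KEY = bytes(range(8))                       # toy 64-bit key

def ecb_encrypt(data: bytes) -> bytes:
    # Toy per-block 'cipher' (XOR with the key); stands in for DES in ECB mode.
    return bytes(b ^ KEY[i % 8] for i, b in enumerate(data))

ecb_decrypt = ecb_encrypt                   # XOR is its own inverse

# The file of Fig. 8-15: 32-byte records, blocks numbered 0 to 15.
records = (b"Adams, Leslie   " + b"Clerk   " + b"$10     " +
           b"Black, Robin    " + b"Boss    " + b"$500,000" +
           b"Collins, Kim    " + b"Manager " + b"$100,000" +
           b"Davis, Bobbie   " + b"Janitor " + b"$5      ")
ct = bytearray(ecb_encrypt(records))

# Leslie copies ciphertext block 11 (Kim's bonus) over block 3 (her own).
ct[3 * 8:4 * 8] = ct[11 * 8:12 * 8]

tampered = ecb_decrypt(bytes(ct))
assert tampered[3 * 8:4 * 8] == b"$100,000"   # Leslie's bonus is now Kim's
```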

Cipher Block Chaining Mode

To thwart this type of attack, all block ciphers can be chained in various ways so that replacing a block the way Leslie did will cause the plaintext decrypted starting at the replaced block to be garbage. One method to do so is cipher block chaining. In this method, shown in Fig. 8-16, each plaintext block is XORed with the previous ciphertext block before being encrypted. Consequently, the same plaintext block no longer maps onto the same ciphertext block, and the encryption is no longer a big monoalphabetic substitution cipher. The first block is XORed with a randomly chosen IV (Initialization Vector), which is transmitted (in plaintext) along with the ciphertext.

[Figure content: (a) encryption: each plaintext block Pi is XORed with the previous ciphertext block (P0 with the IV) and fed through encryption box E under the key to produce Ci; (b) decryption: each Ci is fed through decryption box D and the output is XORed with the previous ciphertext block (or the IV for C0) to recover Pi.]

Figure 8-16. Cipher block chaining. (a) Encryption. (b) Decryption.

We can see how cipher block chaining mode works by examining the example of Fig. 8-16. We start out by computing C0 = E(P0 XOR IV). Then we compute C1 = E(P1 XOR C0), and so on. Decryption also uses XOR to reverse the process, with P0 = IV XOR D(C0), and so on. Note that the encryption of block i is a function of all the plaintext in blocks 0 through i − 1, so the same plaintext generates different ciphertext depending on where it occurs. A transformation of the type Leslie made will result in nonsense for two blocks starting at Leslie’s bonus field. To an astute security officer, this peculiarity might suggest where to start the ensuing investigation.

Cipher block chaining also has the advantage that the same plaintext block will not result in the same ciphertext block, making cryptanalysis more difficult. In fact, this is the main reason it is used.
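The chaining equations C_i = E(P_i XOR C_{i-1}) and P_i = D(C_i) XOR C_{i-1} translate directly into code. The block cipher here is again a toy (XOR with the key, not secure); the chaining logic around it is the real CBC construction.

```python
BLOCK = 8
KEY   = bytes(range(8))

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

# Toy block cipher (XOR with the key); a stand-in for DES/AES, NOT secure.
def E(block): return xor(block, KEY)
def D(block): return xor(block, KEY)

def cbc_encrypt(plain: bytes, iv: bytes) -> bytes:
    prev, out = iv, bytearray()
    for i in range(0, len(plain), BLOCK):
        prev = E(xor(plain[i:i + BLOCK], prev))   # C_i = E(P_i XOR C_{i-1})
        out += prev
    return bytes(out)

def cbc_decrypt(cipher: bytes, iv: bytes) -> bytes:
    prev, out = iv, bytearray()
    for i in range(0, len(cipher), BLOCK):
        c = cipher[i:i + BLOCK]
        out += xor(D(c), prev)                    # P_i = D(C_i) XOR C_{i-1}
        prev = c
    return bytes(out)

iv = b"initvec!"
pt = b"abcdefgh" * 4                  # four identical plaintext blocks
ct = cbc_encrypt(pt, iv)
assert cbc_decrypt(ct, iv) == pt
# Identical plaintext blocks no longer give identical ciphertext blocks.
assert ct[0:8] != ct[8:16]
```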

Cipher Feedback Mode

However, cipher block chaining has the disadvantage of requiring an entire 64-bit block to arrive before decryption can begin. For byte-by-byte encryption, cipher feedback mode using (triple) DES is used, as shown in Fig. 8-17. For AES, the idea is exactly the same, only a 128-bit shift register is used. In this figure, the state of the encryption machine is shown after bytes 0 through 9 have been encrypted and sent. When plaintext byte 10 arrives, as illustrated in Fig. 8-17(a), the DES algorithm operates on the 64-bit shift register to generate a 64-bit ciphertext. The leftmost byte of that ciphertext is extracted and XORed with P10. That byte is transmitted on the transmission line. In addition, the shift register is shifted left 8 bits, causing C2 to fall off the left end, and C10 is inserted in the position just vacated at the right end by C9.

[Figure content: a 64-bit shift register holding C2 through C9 is fed through encryption box E under the key; the leftmost byte of the output is selected and XORed with P10 to produce C10 in (a), or with C10 to recover P10 in (b); in both cases C10 is shifted into the right end of the register.]

Figure 8-17. Cipher feedback mode. (a) Encryption. (b) Decryption.

Note that the contents of the shift register depend on the entire previous history of the plaintext, so a pattern that repeats multiple times in the plaintext will be encrypted differently each time in the ciphertext. As with cipher block chaining, an initialization vector is needed to start the ball rolling.

Decryption with cipher feedback mode works the same way as encryption. In particular, the content of the shift register is encrypted, not decrypted, so the selected byte that is XORed with C10 to get P10 is the same one that was XORed with P10 to generate C10 in the first place. As long as the two shift registers remain identical, decryption works correctly. This is illustrated in Fig. 8-17(b).

A problem with cipher feedback mode is that if one bit of the ciphertext is accidentally inverted during transmission, the 8 bytes that are decrypted while the bad byte is in the shift register will be corrupted. Once the bad byte is pushed out of the shift register, correct plaintext will once again be generated thereafter. Thus, the effects of a single inverted bit are relatively localized and do not ruin the rest of the message, but they do ruin as many bits as the shift register is wide.
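Byte-wide cipher feedback can be sketched as follows. The encryption box here is a hash-based stand-in (any keyed pseudorandom function illustrates the mechanics, since CFB only ever runs the box in the encrypt direction); the key, IV, and function names are illustrative.

```python
import hashlib

KEY = b"sixteen byte key"

def E(register: bytes) -> bytes:
    # Stand-in for the DES encryption box. Note that decryption below also
    # calls E, never D: CFB uses the box only in the encrypt direction.
    return hashlib.sha256(KEY + register).digest()[:8]

def cfb_process(data: bytes, iv: bytes, decrypt: bool) -> bytes:
    register, out = iv, bytearray()
    for byte in data:
        keystream_byte = E(register)[0]          # leftmost byte of E(register)
        result = byte ^ keystream_byte
        # The CIPHERTEXT byte is shifted into the register on both sides,
        # which is what keeps the two shift registers in sync.
        cipher_byte = byte if decrypt else result
        register = register[1:] + bytes([cipher_byte])
        out.append(result)
    return bytes(out)

iv = b"randomIV"
pt = b"attack at dawn"
ct = cfb_process(pt, iv, decrypt=False)
assert cfb_process(ct, iv, decrypt=True) == pt
```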

Stream Cipher Mode

Nevertheless, applications exist in which having a 1-bit transmission error mess up 64 bits of plaintext is too large an effect. For these applications, a fourth option, stream cipher mode, exists. It works by encrypting an initialization vector (IV), using a key to get an output block. The output block is then encrypted, using the key to get a second output block. This block is then encrypted to get a third block, and so on. The (arbitrarily large) sequence of output blocks, called the keystream, is treated like a one-time pad and XORed with the plaintext to get the ciphertext, as shown in Fig. 8-18(a). Note that the IV is used only on the first step. After that, the output is encrypted. Also, note that the keystream is independent of the data, so it can be computed in advance, if need be, and is completely insensitive to transmission errors. Decryption is shown in Fig. 8-18(b).

[Figure content: an IV is fed through encryption box E under the key, producing a keystream; the keystream is XORed with the plaintext to give the ciphertext in (a), and with the ciphertext to recover the plaintext in (b).]

Figure 8-18. A stream cipher. (a) Encryption. (b) Decryption.

Decryption occurs by generating the same keystream at the receiving side. Since the keystream depends only on the IV and the key, it is not affected by transmission errors in the ciphertext. Thus, a 1-bit error in the transmitted ciphertext generates only a 1-bit error in the decrypted plaintext.

It is essential never to use the same (key, IV) pair twice with a stream cipher because doing so will generate the same keystream each time. Using the same keystream twice exposes the ciphertext to a keystream reuse attack. Imagine that the plaintext block, P0, is encrypted with the keystream to get P0 XOR K0. Later, a second plaintext block, Q0, is encrypted with the same keystream to get Q0 XOR K0. An intruder who captures both of these ciphertext blocks can simply XOR them together to get P0 XOR Q0, which eliminates the key. The intruder now has the XOR of the two plaintext blocks. If one of them is known or can be reasonably guessed, the other can also be found. In any event, the XOR of two plaintext streams can be attacked by using statistical properties of the message.


For example, for English text, the most common character in the stream will probably be the XOR of two spaces, followed by the XOR of space and the letter ‘‘e’’ and so on. In short, equipped with the XOR of two plaintexts, the cryptanalyst has an excellent chance of deducing both of them.
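The keystream reuse attack takes only a few lines to demonstrate. The keystream bytes and the two messages below are of course invented for illustration.

```python
def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

# A reused keystream (never do this): both messages get the same pad.
keystream = bytes([0x5A, 0x13, 0xC4, 0x8E, 0x21, 0x77, 0x90, 0x0B] * 4)

p = b"meet me at midnight tonight"
q = b"the password is swordfish!!"
c1, c2 = xor(p, keystream), xor(q, keystream)

# The intruder XORs the two ciphertexts: the keystream cancels out.
assert xor(c1, c2) == xor(p, q)
# If one plaintext is known or guessed, the other falls out immediately.
assert xor(xor(c1, c2), p) == q
```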

8.6 PUBLIC-KEY ALGORITHMS

Historically, distributing the keys has always been the weakest link in most cryptosystems. No matter how strong a cryptosystem was, if an intruder could steal the key, the system was worthless. Cryptologists always took for granted that the encryption key and decryption key were the same (or easily derived from one another). But the key had to be distributed to all users of the system. Thus, it seemed as if there was an inherent problem. Keys had to be protected from theft, but they also had to be distributed, so they could not be locked in a bank vault.

In 1976, two researchers at Stanford University, Diffie and Hellman (1976), proposed a radically new kind of cryptosystem, one in which the encryption and decryption keys were so different that the decryption key could not feasibly be derived from the encryption key. In their proposal, the (keyed) encryption algorithm, E, and the (keyed) decryption algorithm, D, had to meet three requirements. These requirements can be stated simply as follows:

1. D(E(P)) = P.

2. It is exceedingly difficult to deduce D from E.

3. E cannot be broken by a chosen plaintext attack.

The first requirement says that if we apply D to an encrypted message, E(P), we get the original plaintext message, P, back. Without this property, the legitimate receiver could not decrypt the ciphertext. The second requirement speaks for itself. The third requirement is needed because, as we shall see in a moment, intruders may experiment with the algorithm to their hearts’ content. Under these conditions, there is no reason that the encryption key cannot be made public.

The method works like this. A person, say, Alice, who wants to receive secret messages, first devises two algorithms meeting the above requirements. The encryption algorithm and Alice’s key are then made public, hence the name public-key cryptography. Alice might put her public key on her home page on the Web, for example. We will use the notation EA to mean the encryption algorithm parameterized by Alice’s public key. Similarly, the (secret) decryption algorithm parameterized by Alice’s private key is DA. Bob does the same thing, publicizing EB but keeping DB secret.

Now let us see if we can solve the problem of establishing a secure channel between Alice and Bob, who have never had any previous contact. Both Alice’s encryption key, EA, and Bob’s encryption key, EB, are assumed to be in publicly readable files. Now Alice takes her first message, P, computes EB(P), and sends it to Bob. Bob then decrypts it by applying his secret key DB [i.e., he computes DB(EB(P)) = P]. No one else can read the encrypted message, EB(P), because the encryption system is assumed to be strong and because it is too difficult to derive DB from the publicly known EB. To send a reply, R, Bob transmits EA(R). Alice and Bob can now communicate securely.

A note on terminology is perhaps useful here. Public-key cryptography requires each user to have two keys: a public key, used by the entire world for encrypting messages to be sent to that user, and a private key, which the user needs for decrypting messages. We will consistently refer to these keys as the public and private keys, respectively, and distinguish them from the secret keys used for conventional symmetric-key cryptography.

8.6.1 RSA

The only catch is that we need to find algorithms that indeed satisfy all three requirements. Due to the potential advantages of public-key cryptography, many researchers are hard at work, and some algorithms have already been published. One good method was discovered by a group at M.I.T. (Rivest et al., 1978). It is known by the initials of the three discoverers (Rivest, Shamir, Adleman): RSA. It has survived all attempts to break it for more than 40 years and is considered very strong. Much practical security is based on it. For this reason, Rivest, Shamir, and Adleman were given the 2002 ACM Turing Award. Its major disadvantage is that it requires keys of at least 2048 bits for good security (versus 256 bits for symmetric-key algorithms), which makes it quite slow.

The RSA method is based on some principles from number theory. We will now summarize how to use the method; for details, consult their paper.

1. Choose two large primes, p and q (say, 1024 bits).

2. Compute n = p × q and z = (p − 1) × (q − 1).

3. Choose a number relatively prime to z and call it d.

4. Find e such that e × d = 1 mod z.

With these parameters computed in advance, we are ready to begin encryption. Divide the plaintext (regarded as a bit string) into blocks, so that each plaintext message, P, falls in the interval 0 ≤ P < n. Do that by grouping the plaintext into blocks of k bits, where k is the largest integer for which 2^k < n is true.

To encrypt a message, P, compute C = P^e (mod n). To decrypt C, compute P = C^d (mod n). It can be proven that for all P in the specified range, the encryption and decryption functions are inverses. To perform the encryption, you need e and n. To perform the decryption, you need d and n. Therefore, the public key consists of the pair (e, n) and the private key consists of (d, n).


The security of the method is based on the difficulty of factoring large numbers. If the cryptanalyst could factor the (publicly known) n, he could then find p and q, and from these z. Equipped with knowledge of z and e, d can be found using Euclid’s algorithm. Fortunately, mathematicians have been trying to factor large numbers for at least 300 years, and the accumulated evidence suggests that it is an exceedingly difficult problem.
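That last step, recovering d from z and e with Euclid's algorithm, can be sketched directly; the numbers are the ones from the book's Fig. 8-19 example (z = 20, e = 3, d = 7).

```python
def modinv(e: int, z: int) -> int:
    """Extended Euclid: find d with e * d = 1 (mod z)."""
    old_r, r = e, z
    old_s, s = 1, 0
    while r:
        q = old_r // r
        old_r, r = r, old_r - q * r   # remainders shrink toward gcd(e, z)
        old_s, s = s, old_s - q * s   # track the coefficient of e
    assert old_r == 1, "e and z must be relatively prime"
    return old_s % z

# With p = 3, q = 11: z = 20 and e = 3 give d = 7 (3 * 7 = 21 = 1 mod 20).
assert modinv(3, 20) == 7
# Python 3.8+ has this built in:
assert pow(3, -1, 20) == 7
```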

At the time, Rivest and colleagues concluded that factoring a 500-digit number would require 10^25 years using brute force. In both cases, they assumed the best-known algorithm and a computer with a 1-µsec instruction time. With a million chips running in parallel, each with an instruction time of 1 nsec, it would still take 10^16 years. Even if computers continue to get faster by an order of magnitude per decade, it will be many years before factoring a 500-digit number becomes feasible, at which time our descendants can simply choose p and q still larger. However, it will probably not come as a surprise that the attacks have made progress and are now significantly faster.

A trivial pedagogical example of how the RSA algorithm works is given in Fig. 8-19. For this example, we have chosen p = 3 and q = 11, giving n = 33 and z = 20 (since (3 − 1) × (11 − 1) = 20). A suitable value for d is d = 7, since 7 and 20 have no common factors. With these choices, e can be found by solving the equation 7e = 1 (mod 20), which yields e = 3. The ciphertext, C, corresponding to a plaintext message, P, is given by C = P^3 (mod 33). The ciphertext is decrypted by the receiver by making use of the rule P = C^7 (mod 33). The figure shows the encryption of the plaintext ‘‘SUZANNE’’ as an example.

          Plaintext (P)                  Ciphertext (C)                 After decryption
Symbolic  Numeric   P^3       C = P^3 (mod 33)   C^7            C^7 (mod 33)   Symbolic
S         19        6859      28                 13492928512    19             S
U         21        9261      21                 1801088541     21             U
Z         26        17576     20                 1280000000     26             Z
A         01        1         1                  1              01             A
N         14        2744      5                  78125          14             N
N         14        2744      5                  78125          14             N
E         05        125       26                 8031810176     05             E

          (sender’s computation)                 (receiver’s computation)

Figure 8-19. An example of the RSA algorithm.

Because the primes chosen for this example are so small, P must be less than 33, so each plaintext block can contain only a single character. The result is a monoalphabetic substitution cipher, not very impressive. If instead we had chosen p and q ≈ 2^512, we would have n ≈ 2^1024, so each block could be up to 1024 bits or 128 eight-bit characters, versus 8 characters for DES and 16 characters for AES.
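The whole Fig. 8-19 computation fits in a few lines, since Python's built-in pow does modular exponentiation directly. The letter numbering A=1 through Z=26 follows the figure.

```python
p, q = 3, 11
n, z = p * q, (p - 1) * (q - 1)                  # n = 33, z = 20
e, d = 3, 7                                      # e * d = 21 = 1 (mod 20)

def encrypt(P: int) -> int: return pow(P, e, n)  # C = P^e mod n
def decrypt(C: int) -> int: return pow(C, d, n)  # P = C^d mod n

plain  = [ord(c) - ord("A") + 1 for c in "SUZANNE"]   # A=1, ..., Z=26
cipher = [encrypt(P) for P in plain]
assert cipher == [28, 21, 20, 1, 5, 5, 26]            # as in Fig. 8-19
assert [decrypt(C) for C in cipher] == plain
```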


It should be pointed out that using RSA as we have described is similar to using a symmetric algorithm in ECB mode—the same input block gives the same output block. Therefore, some form of chaining is needed for data encryption. However, in practice, most RSA-based systems use public-key cryptography primarily for distributing one-time 128- or 256-bit session keys for use with some symmetric-key algorithm such as AES. RSA is too slow for actually encrypting large volumes of data but is widely used for key distribution.

8.6.2 Other Public-Key Algorithms

Although RSA is still widely used, it is by no means the only public-key algorithm known. The first public-key algorithm was the knapsack algorithm (Merkle and Hellman, 1978). The idea here is that someone owns a very large number of objects, each with a different weight. The owner encodes the message by secretly selecting a subset of the objects and placing them in the knapsack. The total weight of the objects in the knapsack is made public, as is the list of all possible objects and their corresponding weights. The list of objects in the knapsack is kept secret. With certain additional restrictions, the problem of figuring out a possible list of objects with the given weight was thought to be computationally infeasible and formed the basis of the public-key algorithm.

The algorithm’s inventor, Ralph Merkle, was quite sure that this algorithm could not be broken, so he offered a $100 reward to anyone who could break it. Adi Shamir (the ‘‘S’’ in RSA) promptly broke it and collected the reward. Undeterred, Merkle strengthened the algorithm and offered a $1000 reward to anyone who could break the new one. Ronald Rivest (the ‘‘R’’ in RSA) promptly broke the new one and collected the reward. Merkle did not dare offer $10,000 for the next version, so ‘‘A’’ (Leonard Adleman) was out of luck. Nevertheless, the knapsack algorithm is not considered secure and is not used in practice any more.

Other public-key schemes are based on the difficulty of computing discrete logarithms or on elliptic curves (Menezes and Vanstone, 1993). Algorithms that use discrete logarithms have been invented by El Gamal (1985) and Schnorr (1991). Elliptic curves, meanwhile, are based on a branch of mathematics that is not so well known except among the elliptic curve illuminati.

A few other schemes exist, but those based on the difficulty of factoring large numbers, computing discrete logarithms modulo a large prime, and elliptic curves are by far the most important. These problems are thought to be genuinely difficult to solve: mathematicians have been working on them for many years without any great breakthroughs. Elliptic curves in particular enjoy a lot of interest because the elliptic curve discrete logarithm problem is even harder than factorization. The Dutch mathematician Arjen Lenstra proposed a way to compare cryptographic algorithms by computing how much energy you need to break them. According to this calculation, breaking a 228-bit RSA key takes the energy equivalent to that needed to boil less than a teaspoon of water. Breaking an elliptic curve

SEC. 8.6 PUBLIC-KEY ALGORITHMS 791

of that length would require as much energy as you would need to boil all the water on the planet. Paraphrasing Lenstra: with all water evaporated, including that in the bodies of would-be code breakers, the problem would run out of steam.

8.7 DIGITAL SIGNATURES

The authenticity of many legal, financial, and other documents is determined by the presence or absence of an authorized handwritten signature. And photocopies do not count. For computerized message systems to replace the physical transport of paper-and-ink documents, a method must be found to allow documents to be signed in an unforgeable way.

The problem of devising a replacement for handwritten signatures is a difficult one. Basically, what is needed is a system by which one party can send a signed message to another party in such a way that the following conditions hold:

1. The receiver can verify the claimed identity of the sender.

2. The sender cannot later repudiate the contents of the message.

3. The receiver cannot possibly have concocted the message himself.

The first requirement is needed, for example, in financial systems. When a customer’s computer orders a bank’s computer to buy a ton of gold, the bank’s computer needs to be able to make sure that the computer giving the order really belongs to the customer whose account is to be debited. In other words, the bank has to authenticate the customer (and the customer has to authenticate the bank).

The second requirement is needed to protect the bank against fraud. Suppose that the bank buys the ton of gold, and immediately thereafter the price of gold drops sharply. A dishonest customer might then proceed to sue the bank, claiming that he never issued any order to buy gold. When the bank produces the message in court, the customer may deny having sent it. The property that no party to a contract can later deny having signed it is called nonrepudiation. The digital signature schemes that we will now study help provide it.

The third requirement is needed to protect the customer in the event that the price of gold shoots up and the bank tries to construct a signed message in which the customer asked for one bar of gold instead of one ton. In this fraud scenario, the bank just keeps the rest of the gold for itself.

8.7.1 Symmetric-Key Signatures

One approach to digital signatures is to have a central authority that knows everything and whom everyone trusts, say, Big Brother (BB). Each user then chooses a secret key and carries it by hand to BB’s office. Thus, only Alice and BB know Alice’s secret key, KA, and so on. In case you get lost in all the notation, symbols, and subscripts, have a look at Fig. 8-20, which summarizes the most important notations for this and subsequent sections.

Term Description

A Alice (sender)

B Bob the Banker (recipient)

P Plaintext message Alice wants to send

BB Big Brother (a trusted central authority)

t Timestamp (to ensure freshness)

RA Random number chosen by Alice

Symmetric key

KA Alice’s secret key (analogous for KB, KBB, etc.)

KA(M) Message M encrypted/decrypted with Alice’s secret key

Asymmetric keys

DA Alice’s private key (analogous for DB, etc.)

EA Alice’s public key (analogous for EB, etc.)

DA(M) Message M encrypted/decrypted with Alice’s private key

EA(M) Message M encrypted/decrypted with Alice’s public key

Digest

MD(P) Message digest of plaintext P

Figure 8-20. Alice wants to send a message to her banker: a legend to keys and symbols.

When Alice wants to send a signed plaintext message, P, to her banker, Bob, she generates KA(B, RA, t, P), where B is Bob’s identity, RA is a random number chosen by Alice, t is a timestamp to ensure freshness, and KA(B, RA, t, P) is the message encrypted with her key, KA. Then she sends it as depicted in Fig. 8-21. BB sees that the message is from Alice, decrypts it, and sends a message to Bob as shown. The message to Bob contains the plaintext of Alice’s message and also the signed message KBB(A, t, P). Bob now carries out Alice’s request.

1. Alice → BB: A, KA(B, RA, t, P)
2. BB → Bob: KB(A, RA, t, P, KBB(A, t, P))

Figure 8-21. Digital signatures with Big Brother.

What happens if Alice later denies sending the message? Step 1 is that everyone sues everyone (at least, in the United States). Finally, when the case comes to court and Alice vigorously denies sending Bob the disputed message, the judge will ask Bob how he can be sure that the disputed message came from Alice and not from Trudy. Bob first points out that BB will not accept a message from Alice unless it is encrypted with KA, so there is no possibility of Trudy sending BB a false message from Alice without BB detecting it immediately.

Bob then dramatically produces Exhibit A: KBB(A, t, P). Bob says that this is a message signed by BB that proves Alice sent P to Bob. The judge then asks BB (whom everyone trusts) to decrypt Exhibit A. When BB testifies that Bob is telling the truth, the judge decides in favor of Bob. Case dismissed.

One potential problem with the signature protocol of Fig. 8-21 is Trudy replaying either message. To minimize this problem, timestamps are used throughout. Furthermore, Bob can check all recent messages to see if RA was used in any of them. If so, the message is discarded as a replay. Note that based on the timestamp, Bob will reject very old messages. To guard against instant replay attacks, Bob just checks the RA of every incoming message to see if such a message has been received from Alice in the past hour. If not, Bob can safely assume this is a new request.
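Bob's freshness checks (a timestamp window plus a cache of recently seen RA values) can be sketched as follows; the one-hour window and the ReplayDetector name are assumptions for illustration:

```python
import time

MAX_AGE = 3600                     # reject messages older than one hour

class ReplayDetector:
    """Sketch of Bob's freshness check: timestamp window plus nonce cache."""
    def __init__(self):
        self.seen = {}             # RA -> time at which it was received

    def accept(self, ra, t, now=None):
        now = now if now is not None else time.time()
        # Drop cache entries old enough that the timestamp check catches them.
        self.seen = {r: ts for r, ts in self.seen.items() if now - ts < MAX_AGE}
        if now - t > MAX_AGE:      # too old: possibly a replayed old message
            return False
        if ra in self.seen:        # RA reused inside the window: instant replay
            return False
        self.seen[ra] = now
        return True

bob = ReplayDetector()
assert bob.accept(ra=42, t=1000, now=1010)      # fresh message: accepted
assert not bob.accept(ra=42, t=1000, now=1020)  # same RA replayed: rejected
assert not bob.accept(ra=99, t=1000, now=9999)  # stale timestamp: rejected
```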

8.7.2 Public-Key Signatures

A structural problem with using symmetric-key cryptography for digital signatures is that everyone has to agree to trust Big Brother. Furthermore, Big Brother gets to read all signed messages. The most logical candidates for running the Big Brother server are the government, the banks, the accountants, and the lawyers. Unfortunately, none of these inspire total confidence in all citizens. Hence, it would be nice if signing documents did not require a trusted authority.

Fortunately, public-key cryptography can make an important contribution in this area. Let us assume that the public-key encryption and decryption algorithms have the property that E(D(P)) = P, in addition, of course, to the usual property that D(E(P)) = P. (RSA has this property, so the assumption is not unreasonable.) Assuming that this is the case, Alice can send a signed plaintext message, P, to Bob by transmitting EB(DA(P)). Note carefully that Alice knows her own (private) key, DA, as well as Bob’s public key, EB, so constructing this message is something Alice can do.

When Bob receives the message, he transforms it using his private key, as usual, yielding DA(P), as shown in Fig. 8-22. He stores this text in a safe place and then applies EA to get the original plaintext. To see how the signature property works, suppose that Alice subsequently denies having sent the message P to Bob. When the case comes up in court, Bob can produce both P and DA(P). The judge can easily verify that Bob indeed has a valid message encrypted by DA by simply applying EA to it. Since Bob does not know what Alice’s private key is, the only way Bob could have acquired a message encrypted by it is if Alice did indeed send it. While in jail for perjury and fraud, Alice will have much time to devise interesting new public-key algorithms.

Alice’s computer: P → [Alice’s private key, DA] → DA(P) → [Bob’s public key, EB] → EB(DA(P)) → (transmission line) → Bob’s computer: [Bob’s private key, DB] → DA(P) → [Alice’s public key, EA] → P

Figure 8-22. Digital signatures using public-key cryptography.
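The pipeline of Fig. 8-22 depends on the commutativity E(D(P)) = P. With two toy RSA key pairs (the Fig. 8-19 key for Alice and a second, equally tiny key invented here for Bob), the whole round trip fits in a few lines:

```python
# Two toy RSA key pairs (illustrative only; real keys are >= 2048 bits).
nA, eA, dA = 33, 3, 7     # Alice: p = 3, q = 11 (the key from Fig. 8-19)
nB, eB, dB = 55, 23, 7    # Bob:   p = 5, q = 11 (a second tiny key)

P = 19                    # the letter 'S' in the numbering of Fig. 8-19

# Alice: sign with her private key, then encrypt the result for Bob.
signed = pow(P, dA, nA)              # DA(P)
sent = pow(signed, eB, nB)           # EB(DA(P))

# Bob: decrypt with his private key, keep DA(P) as evidence, recover P.
evidence = pow(sent, dB, nB)         # DA(P), stored in a safe place
recovered = pow(evidence, eA, nA)    # EA(DA(P)) = P
assert evidence == signed and recovered == P
```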

Although using public-key cryptography for digital signatures is an elegant scheme, there are problems that are related to the environment in which they operate rather than to the basic algorithm. For one thing, Bob can prove that a message was sent by Alice only as long as DA remains secret. If Alice discloses her secret key, the argument no longer holds because anyone could have sent the message, including Bob himself.

The problem might arise, for example, if Bob is Alice’s stockbroker. Suppose that Alice tells Bob to buy a certain stock or bond. Immediately thereafter, the price drops sharply. To repudiate her message to Bob, Alice runs to the police claiming that her home was burglarized and the computer holding her key was stolen. Depending on the laws in her state or country, she may or may not be legally liable, especially if she claims not to have discovered the break-in until getting home from work, several hours after it allegedly happened.

Another problem with the signature scheme is what happens if Alice decides to change her key. Doing so is clearly legal, and it is probably a good idea to do so periodically. If a court case later arises, as described above, the judge will apply the current EA to DA(P) and discover that it does not produce P. Bob will look pretty stupid at this point.

In principle, any public-key algorithm can be used for digital signatures. The de facto industry standard is the RSA algorithm. Many security products use it. However, in 1991, NIST proposed using a variant of the El Gamal public-key algorithm for its new Digital Signature Standard (DSS). El Gamal gets its security from the difficulty of computing discrete logarithms, rather than from the difficulty of factoring large numbers.

As usual when the government tries to dictate cryptographic standards, there was an uproar. DSS was criticized for being

1. Too secret (NSA designed the protocol for using El Gamal).

2. Too slow (10 to 40 times slower than RSA for checking signatures).

3. Too new (El Gamal had not yet been thoroughly analyzed).

4. Too insecure (fixed 512-bit key).

In a subsequent revision, the fourth point was rendered moot when keys up to 1024 bits were allowed. Nevertheless, the first two points remain valid.

8.7.3 Message Digests

One criticism of signature methods is that they often couple two distinct functions: authentication and secrecy. Often, authentication is needed but secrecy is not. Also, getting an export license is often easier if the system in question provides only authentication but not secrecy. Below we will describe an authentication scheme that does not require encrypting the entire message.

This scheme is based on the idea of a one-way hash function that takes an arbitrarily long piece of plaintext and from it computes a fixed-length bit string. This hash function, MD, often called a message digest, has four important properties:

1. Given P, it is easy to compute MD(P).

2. Given MD(P), it is effectively impossible to find P.

3. Given P, no one can find P′ such that MD(P′) = MD(P).

4. A change to the input of even 1 bit produces a very different output.

To meet criterion 3, the hash should be at least 128 bits long, preferably more. To meet criterion 4, the hash must mangle the bits very thoroughly, not unlike the symmetric-key encryption algorithms we have seen.
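Properties 1 and 4 are easy to observe with a real digest function; a quick check with SHA-256 (a member of the SHA-2 family) from Python's standard library:

```python
import hashlib

h1 = hashlib.sha256(b"SUZANNE").hexdigest()   # property 1: easy to compute
h2 = hashlib.sha256(b"SUZANNF").hexdigest()   # last letter changed

assert h1 != h2
# Property 4: a tiny change to the input flips roughly half of the 256
# output bits (the avalanche effect).
diff = bin(int(h1, 16) ^ int(h2, 16)).count("1")
print(f"{diff} of 256 bits differ")
```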

Computing a message digest from a piece of plaintext is much faster than encrypting that plaintext with a public-key algorithm, so message digests can be used to speed up digital signature algorithms. To see how this works, consider the signature protocol of Fig. 8-21 again. Instead of signing P with KBB(A, t, P), BB now computes the message digest by applying MD to P, yielding MD(P). BB then encloses KBB(A, t, MD(P)) as the fifth item in the list encrypted with KB that is sent to Bob, instead of KBB(A, t, P). If a dispute arises, Bob can produce both P and KBB(A, t, MD(P)). After Big Brother has decrypted it for the judge, Bob has MD(P), which is guaranteed to be genuine, and the alleged P. However, since it is effectively impossible for Bob to find any other message that gives this hash, the judge will easily be convinced that Bob is telling the truth. Using message digests in this way saves both encryption time and message transport costs.

Message digests work in public-key cryptosystems, too, as shown in Fig. 8-23. Here, Alice first computes the message digest of her plaintext. She then signs the message digest and sends both the signed digest and the plaintext to Bob. If Trudy replaces P along the way, Bob will see this when he computes MD(P).

Alice → Bob: P, DA(MD(P))

Figure 8-23. Digital signatures using message digests.

SHA-1, SHA-2, and SHA-3

A variety of message digest functions have been proposed. For a long time, one of the most widely used functions was SHA-1 (Secure Hash Algorithm 1) (NIST, 1993). Before we commence our explanation, it is important to realize that SHA-1 has been broken since 2017 and is now being phased out by many systems, but more about this later. Like all message digests, SHA-1 operates by mangling bits in a sufficiently complicated way that every output bit is affected by every input bit. SHA-1 was developed by NSA and blessed by NIST in FIPS 180-1. It processes input data in 512-bit blocks, and it generates a 160-bit message digest. A typical way for Alice to send a nonsecret but signed message to Bob is illustrated in Fig. 8-24. Here, her plaintext message is fed into the SHA-1 algorithm to get a 160-bit SHA-1 hash. Alice then signs the hash with her RSA private key and sends both the plaintext message and the signed hash to Bob.

Alice’s plaintext message M (arbitrary length) → SHA-1 algorithm → 160-bit SHA-1 hash of M, H → RSA algorithm with Alice’s private key, DA → signed hash DA(H); both M and DA(H) are sent to Bob.

Figure 8-24. Use of SHA-1 and RSA for signing nonsecret messages.
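The flow of Fig. 8-24 can be sketched with Python's built-in SHA-1 and the toy RSA key of Fig. 8-19; reducing the 160-bit digest modulo the tiny n is purely an illustration trick, since a real system signs the full digest with a key of at least 2048 bits:

```python
import hashlib

# Sketch of Fig. 8-24 with the toy key of Fig. 8-19 (n = 33). A real system
# signs the full digest with a >= 2048-bit key; reducing the 160-bit SHA-1
# digest mod n here only keeps the numbers small.
n, e, d = 33, 3, 7

def toy_digest(msg):
    h = int.from_bytes(hashlib.sha1(msg).digest(), "big")   # 160-bit hash
    return h % n                                            # toy-sized digest

M = b"nonsecret but signed message"
signature = pow(toy_digest(M), d, n)      # Alice signs the hash with DA

# Bob recomputes the hash and applies Alice's public key to the signature.
assert pow(signature, e, n) == toy_digest(M)   # hashes agree: message valid
```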

After receiving the message, Bob computes the SHA-1 hash himself and also applies Alice’s public key to the signed hash to get the original hash, H. If the two agree, the message is considered valid. Since there is no way for Trudy to modify the (plaintext) message while it is in transit and produce a new one that hashes to H, Bob can easily detect any changes Trudy has made to the message. For messages whose integrity is important but whose contents are not secret, the scheme of Fig. 8-24 is widely used. For a relatively small cost in computation, it guarantees that any modifications made to the plaintext message in transit can be detected with very high probability.

New versions of SHA-1 have been developed that produce hashes of 224, 256, 384, and 512 bits, respectively. Collectively, these versions are called SHA-2. Not only are these hashes longer than SHA-1 hashes, but the digest function has been changed to combat some potential weaknesses of SHA-1. The weaknesses are serious. In 2017, SHA-1 was broken by a team of researchers from Google and the CWI research center in Amsterdam. Specifically, the researchers were able to generate hash collisions, essentially killing the security of SHA-1. Not surprisingly, the attack led to an increased interest in SHA-2.

In 2006, the National Institute of Standards and Technology (NIST) started organizing a competition for a new hash standard, which is now known as SHA-3. The competition closed in 2012. Three years later, the new SHA-3 standard (‘‘Keccak’’) was officially published. Interestingly, NIST does not suggest that we all dump SHA-2 in the trash and switch to SHA-3 because there are no successful attacks on SHA-2 yet. Even so, it is good to have a drop-in replacement lying around, just in case.

8.7.4 The Birthday Attack

In the world of crypto, nothing is ever what it seems to be. One might think that it would take on the order of 2^m operations to subvert an m-bit message digest. In fact, 2^(m/2) operations will often do using a birthday attack, in an approach published by Yuval (1979) in his now-classic paper ‘‘How to Swindle Rabin.’’

Remember, from our earlier discussion of the DNS birthday attack, that if there is some mapping between inputs and outputs with n inputs (people, messages, etc.) and k possible outputs (birthdays, message digests, etc.), there are n(n − 1)/2 input pairs. If n(n − 1)/2 > k, the chance of having at least one match is pretty good. Thus, approximately, a match is likely for n > √k. This result means that a 64-bit message digest can probably be broken by generating about 2^32 messages and looking for two with the same message digest.
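The square-root estimate is easy to verify empirically. The sketch below deliberately truncates SHA-256 to a 16-bit digest, so a collision between two different messages turns up after roughly √(2^16) = 256 attempts on average rather than 2^16:

```python
import hashlib

def tiny_digest(msg):
    """A deliberately weak 16-bit digest: the first 2 bytes of SHA-256."""
    return hashlib.sha256(msg).digest()[:2]

def find_collision():
    seen = {}                              # digest -> first message seen
    i = 0
    while True:                            # terminates by the pigeonhole
        msg = b"letter variant %d" % i     # principle within 2^16 + 1 tries
        d = tiny_digest(msg)
        if d in seen:
            return seen[d], msg            # two messages, same digest
        seen[d] = msg
        i += 1

m1, m2 = find_collision()
assert m1 != m2 and tiny_digest(m1) == tiny_digest(m2)
```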

Let us look at a practical example. The Department of Computer Science at State University has one position for a tenured faculty member and two candidates, Tom and Dick. Tom was hired two years before Dick, so he goes up for review first. If he gets it, Dick is out of luck. Tom knows that the department chairperson, Marilyn, thinks highly of his work, so he asks her to write him a letter of recommendation to the Dean, who will decide on Tom’s case. Once sent, all letters become confidential.

Marilyn tells her secretary, Ellen, to write the Dean a letter, outlining what she wants in it. When it is ready, Marilyn will review it, compute and sign the 64-bit digest, and send it to the Dean. Ellen can send the letter later by email.

Unfortunately for Tom, Ellen is romantically involved with Dick and would like to do Tom in, so she writes the following letter with the 32 bracketed options:

Dear Dean Smith,

This [letter | message] is to give my [honest | frank] opinion of Prof. Tom Wilson, who is [a candidate | up] for tenure [now | this year]. I have [known | worked with] Prof. Wilson for [about | almost] six years. He is an [outstanding | excellent] researcher of great [talent | ability] known [worldwide | internationally] for his [brilliant | creative] insights into [many | a wide variety of] [difficult | challenging] problems.

798 NETWORK SECURITY CHAP. 8

He is also a [highly | greatly] [respected | admired] [teacher | educator]. His students give his [classes | courses] [rave | spectacular] reviews. He is [our | the Department’s] [most popular | best-loved] [teacher | instructor].

[In addition | Additionally] Prof. Wilson is a [gifted | effective] fund raiser. His [grants | contracts] have brought a [large | substantial] amount of money into [the | our] Department. [This money has | These funds have] [enabled | permitted] us to [pursue | carry out] many [special | important] programs, [such as | for example] your State 2025 program. Without these funds we would [be unable | not be able] to continue this program, which is so [important | essential] to both of us. I strongly urge you to grant him tenure.

Unfortunately for Tom, as soon as Ellen finishes composing and typing in this letter, she also writes a second one:

Dear Dean Smith,

This [letter | message] is to give my [honest | frank] opinion of Prof. Tom Wilson, who is [a candidate | up] for tenure [now | this year]. I have [known | worked with] Tom for [about | almost] six years. He is a [poor | weak] researcher not well known in his [field | area]. His research [hardly ever | rarely] shows [insight in | understanding of] the [key | major] problems of [the | our] day.

Furthermore, he is not a [respected | admired] [teacher | educator]. His students give his [classes | courses] [poor | bad] reviews. He is [our | the Department’s] least popular [teacher | instructor], known [mostly | primarily] within [the | our] Department for his [tendency | propensity] to [ridicule | embarrass] students

[foolish | imprudent] enough to ask questions in his classes.

[In addition | Additionally] Tom is a [poor | marginal] fund raiser. His [grants | contracts] have brought only a [meager | insignificant] amount of money into [the | our] Department. Unless new [money is | funds are] quickly located, we may have to cancel some essential programs, such as your State 2025 program. Unfortunately, under these [conditions | circumstances] I cannot in good [conscience | faith] recommend him to you for [tenure | a permanent position].

Now Ellen programs her computer to compute the 2^32 message digests of each letter overnight. Chances are, one digest of the first letter will match one digest of the second. If not, she can add a few more options and try again tonight. Suppose that she finds a match. Call the ‘‘good’’ letter A and the ‘‘bad’’ one B.

Ellen now emails letter A to Marilyn for approval. Letter B she keeps secret, showing it to no one. Marilyn, of course, approves it, computes her 64-bit message digest, signs the digest, and emails the signed digest off to Dean Smith. Independently, Ellen emails letter B to the Dean (not letter A, as she is supposed to).

After getting the letter and signed message digest, the Dean runs the message digest algorithm on letter B, sees that it agrees with what Marilyn sent him, and fires Tom. The Dean does not realize that Ellen managed to generate two letters with the same message digest and sent him a different one than the one Marilyn saw and approved. (Optional ending: Ellen tells Dick what she did. Dick is appalled and breaks off the affair. Ellen is furious and confesses to Marilyn. Marilyn calls the Dean. Tom gets tenure after all.)

With SHA-2, the birthday attack is difficult because even at the ridiculous speed of 1 trillion digests per second, it would take over 32,000 years to compute all 2^80 digests of two letters with 80 variants each, and even then a match is not guaranteed. However, with a cloud of 1,000,000 chips working in parallel, 32,000 years becomes 2 weeks.

8.8 MANAGEMENT OF PUBLIC KEYS

Public-key cryptography makes it possible for people who do not share a common key in advance to nevertheless communicate securely. It also makes signing messages possible without the existence of a trusted third party. Finally, signed message digests make it possible for the recipient to verify the integrity of received messages easily and securely.

However, there is one problem that we have glossed over a bit too quickly: if Alice and Bob do not know each other, how do they get each other’s public keys to start the communication process? The obvious solution (put your public key on your Web site) does not work, for the following reason. Suppose that Alice wants to look up Bob’s public key on his Web site. How does she do it? She starts by typing in Bob’s URL. Her browser then looks up the DNS address of Bob’s home page and sends it a GET request, as shown in Fig. 8-25. Unfortunately, Trudy intercepts the request and replies with a fake home page, probably a copy of Bob’s home page except for the replacement of Bob’s public key with Trudy’s public key. When Alice now encrypts her first message with ET, Trudy decrypts it, reads it, re-encrypts it with Bob’s public key, and sends it to Bob, who is none the wiser that Trudy is reading his incoming messages. Worse yet, Trudy could modify the messages before re-encrypting them for Bob. Clearly, some mechanism is needed to make sure that public keys can be exchanged securely.

1. Alice → Trudy: GET Bob’s home page
2. Trudy → Alice: fake home page with ET
3. Alice → Trudy: ET(Message)
4. Trudy → Bob: EB(Message)

Figure 8-25. A way for Trudy to subvert public-key encryption.

8.8.1 Certificates

As a first attempt at distributing public keys securely, we could imagine a KDC (Key Distribution Center) available online 24 hours a day to provide public keys on demand. One of the many problems with this solution is that it is not scalable, and the key distribution center would rapidly become a bottleneck. Also, if it ever went down, Internet security would suddenly grind to a halt. For these reasons, people have developed a different solution, one that does not require the key distribution center to be online all the time. In fact, it does not have to be online at all. Instead, what it does is certify the public keys belonging to people, companies, and other organizations. An organization that certifies public keys is now called a CA (Certification Authority).

As an example, suppose that Bob wants to allow Alice and other people he does not know to communicate with him securely. He can go to the CA with his public key along with his passport or driver’s license and ask to be certified. The CA then issues a certificate similar to the one in Fig. 8-26 and signs its SHA-2 hash with the CA’s private key. Bob then pays the CA’s fee and gets a document containing the certificate and its signed hash (ideally not sent over unreliable channels).

I hereby certify that the public key

19836A8B03030CF83737E3837837FC3s87092827262643FFA82710382828282A belongs to

Robert John Smith

12345 University Avenue

Berkeley, CA 94702

Birthday: July 4, 1958

Email: bob@superdupernet.com

SHA-2 hash of the above certificate signed with the CA’s private key

Figure 8-26. A possible certificate and its signed hash.

The fundamental job of a certificate is to bind a public key to the name of a principal (individual, company, etc.). Certificates themselves are not secret or protected. Bob might, for example, decide to put his new certificate on his Web site, with a link on the main page saying: click here for my public-key certificate. The resulting click would return both the certificate and the signature block (the signed SHA-2 hash of the certificate).

Now let us run through the scenario of Fig. 8-25 again. When Trudy intercepts Alice’s request for Bob’s home page, what can she do? She can put her own certificate and signature block on the fake page, but when Alice reads the contents of the certificate she will immediately see that she is not talking to Bob because Bob’s name is not in it. Trudy can modify Bob’s home page on the fly, replacing Bob’s public key with her own. However, when Alice runs the SHA-2 algorithm on the certificate, she will get a hash that does not agree with the one she gets when she applies the CA’s well-known public key to the signature block. Since Trudy does not have the CA’s private key, she has no way of generating a signature block that contains the hash of the modified Web page with her public key on it. In this way, Alice can be sure she has Bob’s public key and not Trudy’s or someone else’s.


And as we promised, this scheme does not require the CA to be online for verification, thus eliminating a potential bottleneck.

While the standard function of a certificate is to bind a public key to a principal, a certificate can also be used to bind a public key to an attribute. For example, a certificate could say: ‘‘This public key belongs to someone over 18.’’ It could be used to prove that the owner of the private key was not a minor and thus allowed to access material not suitable for children, and so on, but without disclosing the owner’s identity. Typically, the person holding the certificate would send it to the Web site, principal, or process that cared about age. That site, principal, or process would then generate a random number and encrypt it with the public key in the certificate. If the owner were able to decrypt it and send it back, that would be proof that the owner indeed had the attribute stated in the certificate. Alternatively, the random number could be used to generate a session key for the ensuing conversation.
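The challenge-response test just described (encrypt a random number with the public key in the certificate and ask the holder to send it back) can be sketched with the toy key pair of Fig. 8-19:

```python
import secrets

# Toy key pair (illustrative; a real certificate holds a >= 2048-bit key).
n, e, d = 33, 3, 7

# Verifier: encrypt a random challenge with the public key in the certificate.
challenge = secrets.randbelow(n - 2) + 2
sent = pow(challenge, e, n)

# Holder: only the owner of the private key can recover the challenge.
response = pow(sent, d, n)

assert response == challenge     # proof of possession of the private key
```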

Another example of where a certificate might contain an attribute is in an object-oriented distributed system. Each object normally has multiple methods. The owner of the object could provide each customer with a certificate giving a bit map of which methods the customer is allowed to invoke and binding the bit map to a public key using a signed certificate. Again, if the certificate holder can prove possession of the corresponding private key, he will be allowed to perform the methods in the bit map. This approach has the property that the owner’s identity need not be known, a property useful in situations where privacy is important.

8.8.2 X.509

If everybody who wanted something signed went to the CA with a different kind of certificate, managing all the different formats would soon become a problem. To solve this problem, a standard for certificates has been devised and approved by the International Telecommunication Union (ITU). The standard is called X.509 and is in widespread use on the Internet. It has gone through three versions since the initial standardization in 1988. We will discuss version 3.

X.509 has been heavily influenced by the OSI world, borrowing some of its worst features (e.g., naming and encoding). Surprisingly, IETF went along with X.509, even though in nearly every other area, from machine addresses to transport protocols to email formats, IETF generally ignored OSI and tried to do it right. The IETF version of X.509 is described in RFC 5280.

At its core, X.509 is a way to describe certificates. The primary fields in a certificate are listed in Fig. 8-27. The descriptions given there should provide a general idea of what the fields do. For additional information, please consult the standard itself or RFC 2459.

For example, if Bob works in the loan department of the Money Bank, his X.500 address might be

/C=US/O=MoneyBank/OU=Loan/CN=Bob/


Field                Meaning
Version              Which version of X.509
Serial number        This number plus the CA’s name uniquely identifies the certificate
Signature algorithm  The algorithm used to sign the certificate
Issuer               X.500 name of the CA
Validity period      The starting and ending times of the validity period
Subject name         The entity whose key is being certified
Public key           The subject’s public key and the ID of the algorithm using it
Issuer ID            An optional ID uniquely identifying the certificate’s issuer
Subject ID           An optional ID uniquely identifying the certificate’s subject
Extensions           Many extensions have been defined
Signature            The certificate’s signature (signed by the CA’s private key)

Figure 8-27. The basic fields of an X.509 certificate.

where C is for country, O is for organization, OU is for organizational unit, and CN is for common name. CAs and other entities are named in a similar way. A substantial problem with X.500 names is that if Alice is trying to contact bob@moneybank.com and is given a certificate with an X.500 name, it may not be obvious to her that the certificate refers to the Bob she wants. Fortunately, starting with version 3, DNS names are now permitted instead of X.500 names, so this problem may eventually vanish.
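Pulling the components out of such a slash-separated X.500-style name is straightforward; a minimal sketch:

```python
# Sketch: parse a slash-separated X.500-style distinguished name such
# as /C=US/O=MoneyBank/OU=Loan/CN=Bob/ into its components.
def parse_x500(name):
    parts = [p for p in name.split("/") if p]
    return dict(p.split("=", 1) for p in parts)

dn = parse_x500("/C=US/O=MoneyBank/OU=Loan/CN=Bob/")
assert dn == {"C": "US", "O": "MoneyBank", "OU": "Loan", "CN": "Bob"}
```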

Certificates are encoded using OSI ASN.1 (Abstract Syntax Notation 1), which is sort of like a struct in C, except with an extremely peculiar and verbose notation. More information about X.509 is given by Ford and Baum (2000).

8.8.3 Public Key Infrastructures

Having a single CA to issue all the world’s certificates obviously would not work. It would collapse under the load and be a central point of failure as well. A possible solution might be to have multiple CAs, all run by the same organization and all using the same private key to sign certificates. While this would solve the load and failure problems, it introduces a new problem: key leakage. If there were dozens of servers spread around the world, all holding the CA’s private key, the chance of the private key being stolen or otherwise leaking out would be greatly increased. Since the compromise of this key would ruin the world’s electronic security infrastructure, having a single central CA is very risky.

In addition, which organization would operate the CA? It is hard to imagine any authority that would be accepted worldwide as legitimate and trustworthy. In some countries, people would insist that it be a government, while in other countries they would insist that it not be a government.

SEC. 8.8 MANAGEMENT OF PUBLIC KEYS 803

For these reasons, a different way for certifying public keys has evolved. It goes under the general name of PKI (Public Key Infrastructure). In this section, we will summarize how it works in general, although there have been many proposals, so the details will probably evolve over time.

A PKI has multiple components, including users, CAs, certificates, and directories. What the PKI does is provide a way of structuring these components and defining standards for the various documents and protocols. A particularly simple form of PKI is a hierarchy of CAs, as depicted in Fig. 8-28. In this example, we have shown three levels, but in practice, there might be fewer or more. The top level CA, the root, certifies second-level CAs, which we here call RAs (Regional Authorities) because they might cover some geographic region, such as a country or continent. This term is not standard, though; in fact, no term is really standard for the different levels of the tree. These, in turn, certify the real CAs, which issue the X.509 certificates to organizations and individuals. When the root authorizes a new RA, it generates an X.509 certificate stating that it has approved the RA, includes the new RA’s public key in it, signs it, and hands it to the RA. Similarly, when an RA approves a new CA, it produces and signs a certificate stating its approval and containing the CA’s public key.

[Figure: (a) The root, at the top, certifies RA 1 and RA 2; RA 1 certifies CA 1 and CA 2; RA 2 certifies CA 3, CA 4, and CA 5. RA 2’s certificate reads: ‘‘RA 2 is approved. Its public key is 47383AE349. . . Root’s signature.’’ CA 5’s certificate reads: ‘‘CA 5 is approved. Its public key is 6384AF863B. . . RA 2’s signature.’’ (b) The chain of certificates consists of these same two certificates.]

Figure 8-28. (a) A hierarchical PKI. (b) A chain of certificates.

Our PKI works like this. Suppose that Alice needs Bob’s public key in order to communicate with him, so she looks for and finds a certificate containing it, signed by CA 5. But Alice has never heard of CA 5. For all she knows, CA 5 might be Bob’s 10-year-old daughter. She could go to CA 5 and say: ‘‘Prove your legitimacy.’’ CA 5 will respond with the certificate it got from RA 2, which contains CA 5’s public key. Now armed with CA 5’s public key, she can verify that Bob’s certificate was indeed signed by CA 5 and is thus legal.

Unless RA 2 is Bob’s 12-year-old son. So, the next step is for her to ask RA 2 to prove it is legitimate. The response to her query is a certificate signed by the root and containing RA 2’s public key. Now Alice is sure she has Bob’s public key.


But how does Alice find the root’s public key? Magic. It is assumed that everyone knows the root’s public key. For example, her browser might have been shipped with the root’s public key built in.

Bob is a friendly sort of guy and does not want to cause Alice a lot of work. He knows that she will have to check out CA 5 and RA 2, so to save her some trouble, he collects the two needed certificates and gives her the two certificates along with his. Now she can use her own knowledge of the root’s public key to verify the top-level certificate and the public key contained therein to verify the second one. Alice does not need to contact anyone to do the verification. Because the certificates are all signed, she can easily detect any attempts to tamper with their contents. A chain of certificates going back to the root like this is sometimes called a chain of trust or a certification path. The technique is widely used in practice.
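The chain walk Alice performs can be sketched in code. Real X.509 chains use public-key signatures (RSA or ECDSA); the sign/verify pair below is an HMAC stand-in so the control flow is runnable, and all names and keys are invented for this sketch.

```python
# Sketch of verifying a certificate chain back to a trusted root.
# Each cert is (subject, subject_key, issuer, signature); the signature
# stand-in is an HMAC computed with the issuer's key.
import hmac, hashlib

def sign(issuer_key, payload):
    return hmac.new(issuer_key, payload, hashlib.sha256).digest()

def make_cert(subject, subject_key, issuer, issuer_key):
    return (subject, subject_key, issuer, sign(issuer_key, subject + subject_key))

def verify_chain(chain, root_name, root_key):
    """Walk a root-signed-first chain (RA, CA, leaf), checking each signature
    against a key we already trust, and learning each subject's key as we go."""
    known = {root_name: root_key}
    for subject, subject_key, issuer, sig in chain:
        issuer_key = known.get(issuer)
        if issuer_key is None or not hmac.compare_digest(
                sign(issuer_key, subject + subject_key), sig):
            return False
        known[subject] = subject_key
    return True

root_key = b"root-secret"
ra_cert = make_cert(b"RA 2", b"ra2-key", b"Root", root_key)
ca_cert = make_cert(b"CA 5", b"ca5-key", b"RA 2", b"ra2-key")
bob_cert = make_cert(b"Bob", b"bob-key", b"CA 5", b"ca5-key")
assert verify_chain([ra_cert, ca_cert, bob_cert], b"Root", root_key)
```

Tampering with any certificate in the chain makes the corresponding signature check fail, which is exactly the property Alice relies on.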

Of course, we still have the problem of who is going to run the root. The solution is not to have a single root, but to have many roots, each with its own RAs and CAs. In fact, modern browsers come preloaded with the public keys for over 100 roots, sometimes referred to as trust anchors. In this way, having a single worldwide trusted authority can be avoided.

But there is now the issue of how the browser vendor decides which purported trust anchors are reliable and which are sleazy. It all comes down to the user trusting the browser vendor to make wise choices and not simply approve all trust anchors willing to pay its inclusion fee. Most browsers allow users to inspect the root keys (usually in the form of certificates signed by the root) and delete any that seem shady. For more information on PKIs, see Stapleton and Epstein (2016).

Directories

Another issue for any PKI is where certificates (and their chains back to some known trust anchor) are stored. One possibility is to have each user store his or her own certificates. While doing this is safe (i.e., there is no way for users to tamper with signed certificates without detection), it is also inconvenient. One alternative that has been proposed is to use DNS as a certificate directory. Before contacting Bob, Alice probably has to look up his IP address using DNS, so why not have DNS return Bob’s entire certificate chain along with his IP address?

Some people think this is the way to go, but others would prefer dedicated directory servers whose only job is managing X.509 certificates. Such directories could provide lookup services by using properties of the X.500 names. For example, in theory, such a directory service could answer queries like ‘‘Give me a list of all people named Alice who work in sales departments anywhere in the U.S.’’

Revocation

The real world is full of certificates, too, such as passports and drivers’ licenses. Sometimes these certificates can be revoked, for example, drivers’ licenses can be revoked for drunken driving and other driving offenses. The same


problem occurs in the digital world: the grantor of a certificate may decide to revoke it because the person or organization holding it has abused it in some way. It can also be revoked if the subject’s private key has been exposed or, worse yet, the CA’s private key has been compromised. Thus, a PKI needs to deal with the issue of revocation. The possibility of revocation complicates matters.

A first step in this direction is to have each CA periodically issue a CRL (Certificate Revocation List) giving the serial numbers of all certificates that it has revoked. Since certificates contain expiry times, the CRL need only contain the serial numbers of certificates that have not yet expired. Once its expiry time has passed, a certificate is automatically invalid, so no distinction is needed between those that just timed out and those that were actually revoked. In both cases, they cannot be used any more.

Unfortunately, introducing CRLs means that a user who is about to use a certificate must now acquire the CRL to see if the certificate has been revoked. If it has been, it should not be used. However, even if the certificate is not on the list, it might have been revoked just after the list was published. Thus, the only way to really be sure is to ask the CA. And on the next use of the same certificate, the CA has to be asked again, since the certificate might have been revoked a few seconds ago.
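The check a user must make before trusting a certificate can be sketched as follows; the field names and the set-of-serials CRL layout are invented for this sketch.

```python
# Sketch: a certificate is usable only if it has not expired and its
# serial number is absent from the CA's revocation list.
import time

def cert_usable(serial, expiry, crl_serials, now=None):
    now = time.time() if now is None else now
    if now > expiry:
        return False          # timed out: invalid regardless of the CRL
    return serial not in crl_serials

crl = {1441, 1783}            # serial numbers this CA has revoked
assert not cert_usable(1441, expiry=2_000_000_000, crl_serials=crl,
                       now=1_700_000_000)
assert cert_usable(9001, expiry=2_000_000_000, crl_serials=crl,
                   now=1_700_000_000)
```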

Another complication is that a revoked certificate could conceivably be reinstated, for example, if it was revoked for nonpayment of some fee that has since been paid. Having to deal with revocation (and possibly reinstatement) eliminates one of the best properties of certificates, namely, that they can be used without having to contact a CA.

Where should CRLs be stored? A good place would be the same place the certificates themselves are stored. One strategy is for the CA to actively push out CRLs periodically and have the directories process them by simply removing the revoked certificates. If directories are not used for storing certificates, the CRLs can be cached at various places around the network. Since a CRL is itself a signed document, if it is tampered with, that tampering can be easily detected.

If certificates have long lifetimes, the CRLs will be long, too. For example, if credit cards are valid for 5 years, the number of revocations outstanding will be much larger than if new cards are issued every 3 months. A standard way to deal with long CRLs is to issue a master list infrequently, but issue updates to it more often. Doing this reduces the bandwidth needed for distributing the CRLs.
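The master-list-plus-updates scheme amounts to applying each delta's additions and removals (reinstatements) to the base list in order. A sketch, with an invented data layout:

```python
# Sketch: compute the effective CRL from an infrequently issued master
# list plus a sequence of small delta updates, applied in order.
def effective_crl(base, deltas):
    revoked = set(base)
    for delta in deltas:
        revoked |= set(delta.get("revoked", ()))
        revoked -= set(delta.get("reinstated", ()))
    return revoked

base = {10, 11, 12}
deltas = [{"revoked": [13]},
          {"reinstated": [11], "revoked": [14]}]
assert effective_crl(base, deltas) == {10, 12, 13, 14}
```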

8.9 AUTHENTICATION PROTOCOLS

Authentication is the technique by which a process verifies that its communication partner is who it is supposed to be and not an imposter. Verifying the identity of a remote process in the face of a malicious, active intruder is surprisingly difficult and requires complex protocols based on cryptography. In this section, we


will study some of the many authentication protocols that are used on insecure computer networks.

As an aside, some people confuse authorization with authentication. Authentication deals with the question of whether you are actually communicating with a specific process. Authorization is concerned with what that process is permitted to do. For example, say a client process contacts a file server and says: ‘‘I am Mirte’s process and I want to delete the file cookbook.old.’’ From the file server’s point of view, two questions must be answered:

1. Is this actually Mirte’s process (authentication)?

2. Is Mirte allowed to delete cookbook.old (authorization)?

Only after both of these questions have been unambiguously answered in the affirmative can the requested action take place. The former question is really the key one. Once the file server knows to whom it is talking, checking authorization is just a matter of looking up entries in local tables or databases. For this reason, we will concentrate on authentication in this section.

The general model that essentially all authentication protocols use is this. Alice starts out by sending a message either to Bob or to a trusted KDC (Key Distribution Center), which is expected to be honest. Several other message exchanges follow in various directions. As these messages are being sent, Trudy may intercept, modify, or replay them in order to trick Alice and Bob or just to gum up the works.

Nevertheless, when the protocol has been completed, Alice is sure she is talking to Bob and Bob is sure he is talking to Alice. Furthermore, in most of the protocols, the two of them will also have established a secret session key for use in the upcoming conversation. In practice, for performance reasons, all data traffic is encrypted using symmetric-key cryptography (typically AES), although public-key cryptography is widely used for the authentication protocols themselves and for establishing the session key.

The point of using a new, randomly chosen session key for each new connection is to minimize the amount of traffic that gets sent with the users’ secret keys or public keys, to reduce the amount of ciphertext an intruder can obtain, and to minimize the damage done if a process crashes and its core dump (memory printout after a crash) falls into the wrong hands. Hopefully, the only key present then will be the session key. All the permanent keys should have been carefully zeroed out after the session was established.

8.9.1 Authentication Based on a Shared Secret Key

For our first authentication protocol, we will assume that Alice and Bob already share a secret key, KAB. This shared key might have been agreed upon on the telephone or in person, but, in any event, not on the (insecure) network.


This protocol is based on a principle found in many authentication protocols: one party sends a random number to the other, who then transforms it in a special way and returns the result. Such protocols are called challenge-response protocols. In this and subsequent authentication protocols, the following notation will be used:

A, B are the identities of Alice and Bob.
Ri’s are the challenges, where i identifies the challenger.
Ki’s are keys, where i indicates the owner.
KS is the session key.

The message sequence for our first shared-key authentication protocol is illustrated in Fig. 8-29. In message 1, Alice sends her identity, A, to Bob in a way that Bob understands. Bob, of course, has no way of knowing whether this message came from Alice or from Trudy, so he chooses a challenge, a large random number, RB, and sends it back to ‘‘Alice’’ as message 2, in plaintext. Alice then encrypts the message with the key she shares with Bob and sends the ciphertext, KAB(RB), back in message 3. When Bob sees this message, he immediately knows that it came from Alice because Trudy does not know KAB and thus could not have generated it. Furthermore, since RB was chosen randomly from a large space (say, 128-bit random numbers), it is very unlikely that Trudy would have seen RB and its response in an earlier session. It is equally unlikely that she could guess the correct response to any challenge.

[Figure: the five-message exchange between Alice and Bob:
1. Alice -> Bob: A
2. Bob -> Alice: RB
3. Alice -> Bob: KAB(RB)
4. Alice -> Bob: RA
5. Bob -> Alice: KAB(RA)]

Figure 8-29. Two-way authentication using a challenge-response protocol.
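The exchange of Fig. 8-29 can be sketched in runnable form. The book’s KAB(x) denotes symmetric encryption; since Python’s standard library has no block cipher, an HMAC over the challenge stands in for it here, which is an assumption of this sketch.

```python
# Sketch of the five-message protocol of Fig. 8-29, with an HMAC
# standing in for the symmetric transform KAB(x).
import hmac, hashlib, secrets

KAB = b"shared-secret"                       # agreed out of band

def transform(key, challenge):               # stand-in for KAB(R)
    return hmac.new(key, challenge, hashlib.sha256).digest()

# Message 1: Alice -> Bob: A (her identity)
# Message 2: Bob -> Alice: a fresh random challenge RB, in plaintext
RB = secrets.token_bytes(16)
# Message 3: Alice -> Bob: KAB(RB); Bob recomputes it and compares
alice_response = transform(KAB, RB)
assert hmac.compare_digest(alice_response, transform(KAB, RB))
# Trudy, not knowing KAB, cannot forge the response:
assert not hmac.compare_digest(transform(b"guess", RB), alice_response)
# Messages 4 and 5: Alice challenges Bob with RA the same way
RA = secrets.token_bytes(16)
assert hmac.compare_digest(transform(KAB, RA), transform(KAB, RA))
```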

At this point, Bob is sure he is talking to Alice, but Alice is not sure of anything. For all Alice knows, Trudy might have intercepted message 1 and sent back RB in response. Maybe Bob died last night. To find out to whom she is talking, Alice picks a random number, RA, and sends it to Bob as plaintext, in message 4. When Bob responds with KAB(RA), Alice knows she is talking to Bob. If they wish to establish a session key now, Alice can pick one, KS, and send it to Bob encrypted with KAB.

The protocol of Fig. 8-29 contains five messages. Let us see if we can be clever and eliminate some of them. One approach is illustrated in Fig. 8-30. Here


Alice initiates the challenge-response protocol instead of waiting for Bob to do it. Similarly, while he is responding to Alice’s challenge, Bob sends his own. The entire protocol can be reduced to three messages instead of five.

[Figure: the three-message exchange between Alice and Bob:
1. Alice -> Bob: A, RA
2. Bob -> Alice: RB, KAB(RA)
3. Alice -> Bob: KAB(RB)]

Figure 8-30. A shortened two-way authentication protocol.

Is this new protocol an improvement over the original one? In one sense it is: it is shorter. Unfortunately, it is also wrong. Under certain circumstances, Trudy can defeat this protocol by using what is known as a reflection attack. In particular, Trudy can break it if it is possible to open multiple sessions with Bob at once. This situation would be true, for example, if Bob is a bank and is prepared to accept many simultaneous connections from automated teller machines at once.

Trudy’s reflection attack is shown in Fig. 8-31. It starts out with Trudy claiming she is Alice and sending RT . Bob responds, as usual, with his own challenge, RB. Now Trudy is stuck. What can she do? She does not know KAB(RB).

[Figure: the message sequence between Trudy and Bob, with the second-session messages shaded in the original:
1. Trudy -> Bob: A, RT (first session)
2. Bob -> Trudy: RB, KAB(RT) (first session)
3. Trudy -> Bob: A, RB (second session)
4. Bob -> Trudy: RB2, KAB(RB) (second session)
5. Trudy -> Bob: KAB(RB) (first session)]

Figure 8-31. The reflection attack.

She can open a second session with message 3, supplying the RB taken from message 2 as her challenge. Bob calmly encrypts it and sends back KAB(RB) in message 4. We have shaded the messages on the second session to make them stand out. Now Trudy has the missing information, so she can complete the first session and abort the second one. Bob is now convinced that Trudy is Alice, so


when she asks for her bank account balance, he gives it to her without question. Then when she asks him to transfer it all to a secret bank account in Switzerland, he does so without a moment’s hesitation.
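The attack of Fig. 8-31 can be simulated. As in the earlier sketch, an HMAC stands in for the symmetric transform KAB(x), which is an assumption of this sketch; the point is that Trudy never computes KAB(x) herself, she gets Bob to do it for her.

```python
# Sketch of the reflection attack: Bob answers any challenge in his
# message-2 reply, so Trudy opens a second session and feeds Bob's own
# challenge back to him. HMAC stands in for KAB(x).
import hmac, hashlib, secrets

KAB = b"key-trudy-does-not-know"

def kab(x):                        # Bob's side only; Trudy cannot compute this
    return hmac.new(KAB, x, hashlib.sha256).digest()

def bob_message2(peer_challenge):
    """Bob's reply to message 1: his own fresh challenge plus KAB(peer's)."""
    rb = secrets.token_bytes(16)
    return rb, kab(peer_challenge)

def bob_accepts(rb, response):
    return hmac.compare_digest(response, kab(rb))

# Session 1: Trudy sends RT; Bob replies with RB and KAB(RT).
RT = secrets.token_bytes(16)
RB, _ = bob_message2(RT)
# Session 2: Trudy reflects RB as her "challenge"; Bob encrypts it for her.
_, kab_rb = bob_message2(RB)
# Back in session 1: Trudy answers Bob's challenge with Bob's own work.
assert bob_accepts(RB, kab_rb)     # Bob now believes Trudy is Alice
```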

The moral of this story is:

Designing a correct authentication protocol is much harder than it looks.

The following four general rules often help the designer avoid common pitfalls:

1. Have the initiator prove who she is before the responder has to. This avoids Bob giving away valuable information before Trudy has to give any evidence of who she is.

2. Have the initiator and responder use different keys for proof, even if this means having two shared keys, KAB and K′AB.

3. Have the initiator and responder draw their challenges from different sets. For example, the initiator must use even numbers and the responder must use odd numbers.

4. Make the protocol resistant to attacks involving a second parallel session in which information obtained in one session is used in a different one.

If even one of these rules is violated, the protocol can frequently be broken. Here, all four rules were violated, with disastrous consequences.

Now let us take a closer look at Fig. 8-29. Surely that protocol is not subject to a reflection attack? Maybe. It is quite subtle. Trudy was able to defeat our protocol by using a reflection attack because it was possible to open a second session with Bob and trick him into answering his own questions. What would happen if Alice were a general-purpose computer that also accepted multiple sessions, rather than a person at a computer? Let us take a look at what Trudy can do.

To see how Trudy’s attack works, see Fig. 8-32. Alice starts out by announcing her identity in message 1. Trudy intercepts this message and begins her own session with message 2, claiming to be Bob. Again we have shaded the session 2 messages. Alice responds to message 2 by saying in message 3: ‘‘You claim to be Bob? Prove it.’’ At this point, Trudy is stuck because she cannot prove she is Bob. What does Trudy do now? She goes back to the first session, where it is her turn to send a challenge, and sends the RA she got in message 3. Alice kindly responds to it in message 5, thus supplying Trudy with the information she needs to send in message 6 in session 2. At this point, Trudy is basically home free because she has successfully responded to Alice’s challenge in session 2. She can now cancel session 1, send over any old number for the rest of session 2, and she will have an authenticated session with Alice in session 2.

But Trudy is a perfectionist, and she really wants to show off her considerable skills. Instead of sending any old number over to complete session 2, she waits

[Figure: the message sequence between Alice and Trudy, with the second-session messages shaded in the original:
1. Alice -> Trudy: A (first session)
2. Trudy -> Alice: B (second session)
3. Alice -> Trudy: RA (second session)
4. Trudy -> Alice: RA (first session)
5. Alice -> Trudy: KAB(RA) (first session)
6. Trudy -> Alice: KAB(RA) (second session)
7. Alice -> Trudy: RA2 (first session)
8. Trudy -> Alice: RA2 (second session)
9. Alice -> Trudy: KAB(RA2) (second session)
10. Trudy -> Alice: KAB(RA2) (first session)]

Figure 8-32. A reflection attack on the protocol of Fig. 8-29.

until Alice sends message 7, Alice’s challenge for session 1. Of course, Trudy does not know how to respond, so she uses the reflection attack again, sending back RA2 as message 8. Alice conveniently encrypts RA2 in message 9. Trudy now switches back to session 1 and sends Alice the number she wants in message 10, conveniently copied from what Alice sent in message 9. At this point, Trudy has two fully authenticated sessions with Alice.

This attack has a somewhat different result than the attack on the three-message protocol that we saw in Fig. 8-31. This time, Trudy has two authenticated connections with Alice. In the previous example, she had one authenticated connection with Bob. Again here, if we had applied all the general authentication protocol rules discussed earlier, this attack could have been stopped. For a detailed discussion of these kinds of attacks and how to thwart them, see Bird et al. (1993). They also show how it is possible to systematically construct protocols that are provably correct. The simplest such protocol is nevertheless fairly complicated, so we will now show a different class of protocol that also works.

The new authentication protocol is shown in Fig. 8-33 (Bird et al., 1993). It uses an HMAC (Hashed Message Authentication Code), which guarantees the integrity and authenticity of a message. A simple, yet powerful HMAC consists of a hash over the message plus the shared key. By sending the HMAC along with the rest of the message, no attacker is able to change or spoof the message: changing any bit would lead to an incorrect hash, and generating a valid hash is not possible without the key. HMACs are attractive because they can be generated very efficiently (faster than running SHA-2 and then running RSA on the result).
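Computing and checking an HMAC takes only a few lines. Note that Python’s hmac module implements the standard HMAC construction (RFC 2104) rather than the bare hash-of-key-plus-message described above; the key and message below are invented for illustration.

```python
# Computing an HMAC over a message with a shared key, and verifying it.
import hmac, hashlib

key = b"shared-secret"
msg = b"RA, RB, and the rest of the message"

tag = hmac.new(key, msg, hashlib.sha256).hexdigest()
# The receiver recomputes the HMAC and compares in constant time.
assert hmac.compare_digest(tag, hmac.new(key, msg, hashlib.sha256).hexdigest())
# Flipping even one byte of the message invalidates the tag.
assert tag != hmac.new(key, b"X" + msg[1:], hashlib.sha256).hexdigest()
# Without the key, a valid tag cannot be produced.
assert tag != hmac.new(b"wrong-key", msg, hashlib.sha256).hexdigest()
```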
