Economists have not always been so dense about self-control problems. For roughly two centuries, the economists who wrote on this topic knew their Humans. In fact, an early pioneer of what we would now call a behavioral treatment of self-control was none other than the high priest of free-market economics: Adam Smith. When most people think about Adam Smith, they think of his most famous work, The Wealth of Nations. This remarkable book (the first edition was published in 1776) created the foundation for modern economic thinking. Oddly, the best-known phrase in the book, the vaunted “invisible hand,” mentioned earlier, appears only once, treated with a mere flick by Smith. He notes that by pursuing personal profits, the typical businessman is “led by an invisible hand to promote an end which was no part of his intention. Nor is it always the worse for the society that it was no part of it.” Note the guarded language of the second sentence, which is rarely included (or remembered) by those who make use of the famous phrase, or invoke some version of the invisible handwave. “Nor is it always the worse for the society” is hardly the same thing as an assertion that things will turn out for the best. The rest of the massive book takes on almost any economics topic one can think of. For example, Smith provided the underlying theory for my PhD thesis, on the value of a life. He explained how workers had to be paid more to compensate them for taking dirty, risky, or unpleasant jobs. The famous Chicago economist George Stigler was fond of saying that there was nothing new in economics; Adam Smith had said it all. The same can be said of much of behavioral economics. The bulk of Smith’s writings on what we would now consider behavioral economics appeared in his earlier book The Theory of Moral Sentiments, published in 1759. It is here that Smith expounded on self-control. 
Insightfully, he portrayed the topic as a struggle or conflict between our “passions” and what he called our “impartial spectator.” Like most economists who find out that Smith had said it first, I only learned about this formulation after proposing my own version, which we will get to later in this section. The crucial feature of Smith’s conception of our passions is that they are myopic, that is, shortsighted. As he framed it, the problem is that “The pleasure which we are to enjoy ten years hence, interests us so little in comparison with that which we may enjoy to-day.” Adam Smith was not the only early economist to have sensible intuitions about self-control problems. As behavioral economist George Loewenstein has documented, other early treatments of “intertemporal choice”—that is, choices made about the timing of consumption—also stressed the importance of concepts such as “willpower,” a word that had no meaning in the economics being practiced in 1980.* Smith recognized that willpower is necessary to deal with myopia. In 1871, William Stanley Jevons, another economics luminary, refined Smith’s observation about myopia, noting that the preference for present consumption over future consumption diminishes over time. We may care a lot about getting that bowl of ice cream right now rather than tomorrow, but we would scarcely care about a choice between this date next year versus the day before or after. Some early economists viewed any discounting of future consumption as a mistake—a failure of some type. It could be a failure of willpower, or, as Arthur Pigou famously wrote in 1921, it could be a failure of imagination: “Our telescopic faculty is defective and . . . 
we, therefore, see future pleasures, as it were, on a diminished scale.” Irving Fisher provided the first economic treatment of intertemporal choice that might be considered “modern.” In his 1930 classic, The Theory of Interest, he used what have become the basic teaching tools of microeconomics— indifference curves—to show how an individual will choose between consumption at two different points of time, given a market rate of interest. His theory qualifies as modern both in its tools and in the sense that it is normative. He explains what a rational person should do. But Fisher also made clear that he did not think his theory was a satisfactory descriptive model, because it omitted important behavioral factors. For one thing, Fisher believed that time preference depends on an individual’s level of income, with the poor being more impatient than those who are better off. Furthermore, Fisher emphasized that he viewed the impatient behavior exhibited by low-income workers as partly irrational, which he described with vivid examples: “This is illustrated by the story of the farmer who would never mend his leaky roof. When it rained, he could not stop the leak, and when it did not rain, there was no leak to be stopped!” And he frowned upon “those working men who, before prohibition, could not resist the lure of the saloon on the way home Saturday night,” which was then payday. Quite evidently, from Adam Smith in 1776 to Irving Fisher in 1930, economists were thinking about intertemporal choice with Humans in plain sight. Econs began to creep in around the time of Fisher, as he started on the theory of how Econs should behave, but it fell to a twenty-two-year-old Paul Samuelson, then in graduate school, to finish the job. Samuelson, whom many consider to be the greatest economist of the twentieth century, was a prodigy who set out to give economics a proper mathematical foundation. 
He enrolled at the University of Chicago at age sixteen and soon went off to Harvard for graduate school. His PhD thesis had the audacious but accurate title “Foundations of Economic Analysis.” His thesis redid all of economics, with what he considered to be proper mathematical rigor. While in graduate school in 1937, Samuelson knocked off a seven-page paper with the modest title “A Note on the Measurement of Utility.” As the title suggests, he hoped to offer a way to measure that elusive thing Econs always maximize: utility (i.e., happiness or satisfaction). While he was at it, Samuelson formulated what has become the standard economic model of intertemporal choice, the discounted utility model. I will not strain you (or myself) with any attempt to summarize the heart of this paper, but merely extract the essence our story requires. The basic idea is that consumption is worth more to you now than later. If given the choice between a great dinner this week or one a year from now, most of us would prefer the dinner sooner rather than later. Using the Samuelson formulation, we are said to “discount” future consumption at some rate. If a dinner a year from now is only considered to be 90% as good as one right now, we are said to be discounting the future dinner at an annual rate of about 10%. Samuelson’s theory did not have any passions or faulty telescopes, just steady, methodical discounting. The model was so easy to use that even economists of that generation could easily handle the math, and it remains the standard formulation today. This is not to say that Samuelson thought his theory was necessarily a good description of behavior. The last two pages of his short paper are devoted to discussing what Samuelson called the “serious limitations” of the model. Some of them are technical, but one deserves our scrutiny. 
Samuelson correctly notes that if people discount the future at rates that vary over time, then people may not behave consistently, that is, they may change their minds as time moves forward. The specific case he worries about is the same one that worried earlier economists such as Jevons and Pigou, namely, the case where we are most impatient for immediate rewards. To understand how discounting works, suppose there is some good, perhaps the chance to watch a tennis match at Wimbledon. If the match is watched tonight, it would be worth 100 “utils,” the arbitrary units economists use to describe levels of utility or happiness. Consider Ted, who discounts at a constant rate of 10% per year. For him that match would be worth 100 utils this year, 90 next year, then 81, about 73, and so forth. Someone who discounts this way is said to be discounting with an exponential function. (If you don’t know what that term means, don’t worry about it.) Now consider Matthew, who also values that match at 100 today, but at only 70 the following year, then 63 in year three or any time after that. In other words, Matthew discounts anything that he has to wait a year to consume by 30%, applies a further 10% discount for the second year of waiting, and then stops discounting at all (0%). Matthew is viewing the future by looking through Pigou’s faulty telescope, and he sees year 1 and year 2 looking just one-third of a year apart, with no real delay between any dates beyond that. His impression of the future is a lot like the famous New Yorker magazine cover “View of the World from 9th Avenue.” On the cover, looking west from 9th Avenue, the distance to 11th Avenue (two long blocks) is about as far as from 11th Avenue to Chicago, which appears to be about one third of the way to Japan. The upshot is that Matthew finds waiting most painful at the beginning, since it feels longer.
FIGURE 4. View of the World from 9th Avenue. Saul Steinberg, cover of The New Yorker, March 29, 1976 © The Saul Steinberg Foundation / Artists Rights Society (ARS), New York. Cover reprinted with permission of The New Yorker magazine. All rights reserved.
The technical term for discounting of this general form that starts out high and then declines is quasi-hyperbolic discounting. If you don’t know what “hyperbolic” means, that shows good judgment on your part in what words to incorporate in your vocabulary. Just keep the faulty telescope in mind as an image when the term comes up. For the most part I will avoid this term and use the modern phrase present-biased to describe preferences of this type. To see why exponential discounters stick to their plans while hyperbolic (present-biased) discounters do not, let’s consider a simple numerical example. Suppose Ted and Matthew both live in London and are avid tennis fans. Each has won a lottery offering a ticket to a match at Wimbledon, with an intertemporal twist. They can choose among three options. Option A is a ticket to a first-round match this year; in fact, the match is tomorrow. Option B is a quarterfinal match at next year’s tournament. Option C is the final, at the tournament to be held two years from now. All the tickets are guaranteed, so we can leave risk considerations out of our analysis, and Ted and Matthew have identical tastes in tennis. If the matches were all for this year’s tournament, the utilities they would assign to them are as follows: A: 100, B: 150, C: 180. But in order to go to their favorite option C, the final, they have to wait two years. What will they do? If Ted had this choice, he would choose to wait two years and go to the final. He would do so because the value he puts right now on going to the final in two years (its “present value”) is about 146 (81% of 180), which is greater than the present value of A (100) or B (135, or 90% of 150). 
Furthermore, after a year has passed, if Ted is asked whether he wants to change his mind and go to option B, the quarterfinal, he will say no, since 90% of the value of C (162) is still greater than the value of B (150). This is what it means to have time-consistent preferences. Ted will always stick to whatever plan he makes at the beginning, no matter what options he faces. What about Matthew? When first presented with the choice, he would also choose option C, the final. Right now he values A at 100, B at 105 (70% of 150) and C at 113 (63% of 180). But unlike Ted, when a year passes, Matthew will change his mind and switch to B, the quarterfinal, because waiting one year discounts the value of C by 30%, to 126 (70% of 180), which is less than 150, the current value of B. He is time-inconsistent. In telescope terms, referring back to the New Yorker cover, from New York he couldn’t tell that China was any farther than Japan, but if he carried that telescope to Tokyo, he would start to notice that the trip from there to Shanghai is even farther than it was from New York to Chicago. It bothered Samuelson that people might display time inconsistency. Econs should not be making plans that they will later change without any new information arriving, but Samuelson makes it clear that he is aware that such behavior exists. He talks about people taking steps equivalent to removing the bowl of cashews to ensure that their current plans will be followed. For example, he mentions purchasing whole life insurance as a compulsory savings measure. But with this caveat duly noted, he moved on and the rest of the profession followed suit. His discounted utility model with exponential discounting became the workhorse model of intertemporal choice.
FIGURE 5
It may not be fair to pick this particular paper as the tipping point. 
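For readers who like to check the arithmetic, the Ted and Matthew comparison can be reproduced in a few lines of Python. This is only a rough sketch: the names `TED`, `MATTHEW`, `OPTIONS`, `present_values`, and `best_choice` are my own illustrative choices, and the discount factors are simply the ones quoted in the text (Ted: 100%, 90%, 81%; Matthew: 100%, 70%, 63%).

```python
# Cumulative discount factors, indexed by years of waiting (0 = no wait).
# Ted discounts at a constant 10% per year; Matthew discounts 30% for the
# first year of waiting and a further 10% for the second.
TED = [1.0, 0.9, 0.81]
MATTHEW = [1.0, 0.7, 0.63]

# Undiscounted utility of each ticket and the year in which the match is played.
OPTIONS = {"A": (100, 0), "B": (150, 1), "C": (180, 2)}


def present_values(factors, now=0):
    """Value, as seen from year `now`, of each option that is still available."""
    return {
        name: utility * factors[year - now]
        for name, (utility, year) in OPTIONS.items()
        if year >= now
    }


def best_choice(factors, now=0):
    """Option with the highest present value as seen from year `now`."""
    values = present_values(factors, now)
    return max(values, key=values.get)


if __name__ == "__main__":
    for label, factors in (("Ted", TED), ("Matthew", MATTHEW)):
        for now in (0, 1):
            values = {k: round(v, 1) for k, v in present_values(factors, now).items()}
            print(label, "in year", now, values, "picks", best_choice(factors, now))
```

Run as a script, it shows Ted choosing the final both today and a year from now, while Matthew chooses the final today and switches to the quarterfinal once a year has passed, which is the time inconsistency described above.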
For some time, economists had been moving away from the sort of folk psychology that had been common earlier, led by the Italian economist Vilfredo Pareto, who was an early participant in adding mathematical rigor to economics. But once Samuelson wrote down this model and it became widely adopted, most economists developed a malady that Kahneman calls theory-induced blindness. In their enthusiasm about incorporating their newfound mathematical rigor, they forgot all about the highly behavioral writings on intertemporal choice that had come before, even those of Irving Fisher that had appeared a mere seven years earlier. They also forgot about Samuelson’s warnings that his model might not be descriptively accurate. Exponential discounting just had to be the right model of intertemporal choice because Econs would not keep changing their minds, and the world they now studied no longer contained any Humans. This theory-induced blindness now strikes nearly everyone who receives a PhD in economics. The economics training the students receive provides enormous insights into the behavior of Econs, but at the expense of losing common-sense intuition about human nature and social interactions. Graduates no longer realize that they live in a world populated by Humans. Intertemporal choice is not just an abstract concept used in theoretical economics. It plays a vital role in macroeconomics, where it underlies what is called the consumption function, which tells us how the spending of a household varies with its income. Suppose a government has seen its economy plunge into a deep recession and decides to give everyone a one-time tax cut of $1,000 per person. The consumption function tells us how much of the money will be spent and how much will be saved. Economic thinking about the consumption function changed quite dramatically between the mid-1930s and the mid-1950s. 
The way in which models of the consumption function evolved illustrates an interesting feature about how economic theory has developed since the Samuelson revolution began. As economists became more mathematically sophisticated, and their models incorporated those new levels of sophistication, the people they were describing evolved as well. First, Econs became smarter. Second, they cured all their self-control problems. Calculate the present value of Social Security benefits that will start twenty years from now? No problem! Stop by the tavern on the way home on payday and spend the money intended for food? Never! Econs stopped misbehaving. This pattern in the evolution in economic theory can be seen by examining the models of the consumption function proposed by three economist heavyweights: John Maynard Keynes, Milton Friedman, and Franco Modigliani. We can begin with Keynes, who famously advocated just the sort of tax cut used in this example. In his masterwork, The General Theory of Employment, Interest and Money, he proposed a very simple model for the consumption function. He assumed that if a household received some incremental income, it would consume a fixed proportion of that extra income. The term he used to describe the proportion of extra income that would be consumed is the marginal propensity to consume (MPC). Although Keynes thought that the marginal propensity to consume for a given household was relatively constant if its income did not change dramatically, he agreed with his contemporary Irving Fisher that the MPC would vary considerably across socioeconomic classes. Specifically, he thought the propensity to spend would be highest (nearly 100%) for poor families, and decline as income rises. For the rich, a windfall of $1,000 would barely affect consumption at all, so the MPC would be close to zero. 
If we take the case of a middle-class family that saves 5% of any additional income earned, then Keynes predicts that the MPC from a $1,000 windfall would be 95%, or $950. A couple of decades later, in a book published in 1957, Milton Friedman made the plausible observation that households might have the foresight to smooth their consumption over time, so he proposed the permanent income hypothesis. In his model, a family that is saving 5% of its income would not spend $950 extra in the year of the windfall, but instead would spread it out. Specifically, he proposed that households would use a three-year horizon to determine what their permanent income is, and so would divide the extra spending evenly over the next three years. (This implies a discount rate of 33% per year.) That means that in the first year, the family would spend about $950/3, or $317.† The next move up in sophistication came from Franco Modigliani, writing with his student Richard Brumberg. Although his work was roughly contemporaneous with Friedman’s, his model was one step up the economic ladder toward the modern conception of an Econ. Rather than focus on short-term periods such as a year or even three years, Modigliani based his model on an individual’s total lifetime income, and his theory was accordingly called the life-cycle hypothesis. The idea is that people would determine a plan when young about how to smooth their consumption over their lifetime, including retirement and possibly even bequests. In keeping with this lifetime orientation, Modigliani shifted his focus from income to lifetime wealth. To make things simple and concrete, let’s suppose that we are dealing with someone who knows that he will live exactly forty more years and plans to leave no bequests. 
With these simplifying assumptions, the life-cycle hypothesis predicts that the windfall will be consumed evenly over the next forty years, meaning that spending from the windfall will be just $25 per year ($1,000/40), a marginal propensity to consume of only 2.5%, for the rest of his life. Notice that as we go from Keynes to Friedman to Modigliani, the economic agents are thinking further ahead and are implicitly assumed to be able to exert enough willpower to delay consumption, in Modigliani’s case, for decades. We also get wildly different predictions of the share of the windfall that will be immediately spent, from nearly all to hardly any. If we judge a model by the accuracy of its predictions, as advocated by Friedman, then in my judgment the winner among the three models in explaining what people do with temporary changes to their income would be Keynes, modified somewhat in Friedman’s direction to incorporate the natural tendency to smooth out short-run fluctuations.‡ But if instead we choose models by how clever the modeler is, then Modigliani is the winner, and perhaps because economists adopted the “cleverer is better” heuristic, Modigliani’s model was declared best and became the industry standard. But it is hard to be the smartest kid in the class forever, and it is possible to take the model up one more level in sophistication, as shown by Robert Barro, an economist at Harvard. First, he assumes that parents care about the utility of their children and grandchildren, and since those descendants will care about their own grandchildren, their time horizon is effectively forever. So Barro’s agents plan to give bequests to their heirs, and realize that their heirs will do likewise. In this world, the predictions about how much money will be spent depend on where the money comes from. If the $1,000 windfall had come from a lucky night at the casino, Barro would make the same prediction as Modigliani about consumption. 
But if the windfall is a temporary tax cut that is financed by issuing government bonds, then Barro’s prediction changes. The bonds will have to be repaid eventually. The beneficiary of the tax cut understands all this, and realizes that his heir’s taxes will eventually have to go up to pay for the tax cut he is receiving, so he won’t spend any of it. Instead he will increase his bequests by exactly the amount of the tax cut. Barro’s insight is ingenious, but for it to be descriptively accurate we need Econs that are as smart as Barro.§ Where should one stop this analysis? If someone even more brilliant than Barro comes along and thinks of an even smarter way for people to behave, should that too become our latest model of how real people behave? For example, suppose one of Barro’s agents is a closet Keynesian, an idea that Barro would abhor, and he thinks that the tax cut will stimulate the economy enough to pay off the bonds from increased tax revenues; in that case, he will not need to alter his planned bequests. In fact, if the tax cut stimulates the economy enough, he might even be able to reduce his bequests because his heirs will be the beneficiaries of the higher economic growth rate. But notice now we need Econs who are fully conversant with both economic theory and the relevant empirical tests of effects of fiscal policy in order to know which model of the economy to incorporate in their thinking. Clearly, there must be limits to the knowledge and willpower we assume describe the agents in the economy, few of whom are as clever as Robert Barro. The idea of modeling the world as if it consisted of a nation of Econs who all have PhDs in economics is not the way psychologists would think about the problem. This was brought home to me when I gave a talk in the Cornell psychology department. I began my talk by sketching Modigliani’s life-cycle hypothesis. 
My description was straightforward, but to judge from the audience reaction, you would have thought this theory of savings was hilarious. Fortunately, the economist Bob Frank was there. When the bedlam subsided, he assured everyone that I had not made anything up. The psychologists remained stunned in disbelief, wondering how their economics department colleagues could have such wacky views of human behavior.¶ Modigliani’s life-cycle hypothesis, in which people decide how much of their lifetime wealth to consume each period, does not just assume that people are smart enough to make all the necessary calculations (with rational expectations) about how much they will make, how long they will live, and so forth, but also that they have enough self-control to implement the resulting optimal plan. There is an additional unstated assumption: namely, that wealth is fungible. In the model, it does not matter whether the wealth is held in cash, home equity, a retirement plan, or an heirloom painting passed on from a prior generation. Wealth is wealth. We know from the previous chapters on mental accounting that this assumption is no more innocuous or accurate than the assumptions about cognitive abilities and willpower. To relax the assumption that wealth is fungible and incorporate mental accounting into a theory of consumption and savings behavior, Hersh Shefrin and I proposed what we called the behavioral life-cycle hypothesis. We assume that a household’s consumption in a given year will not depend just on its lifetime wealth, but also on the mental accounts in which that wealth is held. The marginal propensity to consume from winning $1,000 in a lottery is likely to be much higher than a similar increase in the value of a household’s retirement holdings. In fact, one study has found that the MPC from an increase in the value of retirement saving can even be negative! 
Specifically, a team of behavioral economists showed that when investors in retirement plans earn high returns, making them richer, they increase their saving rates, most likely because they extrapolate this investment success into the future. To understand the consumption behavior of households, we clearly need to get back to studying Humans rather than Econs. Humans do not have the brains of Einstein (or Barro), nor do they have the self-control of an ascetic Buddhist monk. Rather, they have passions and faulty telescopes, treat various pots of wealth quite differently, and can be influenced by short-run returns in the stock market. We need a model of these kinds of Humans. My favorite version of such a model is the subject of the next chapter.
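For the arithmetically inclined, the first-year spending predictions from the $1,000 windfall example earlier in this chapter can be laid side by side. What follows is a deliberately crude sketch, not a faithful rendering of any of the three theories: the function names and default parameters are my own assumptions, taken from the simplified numbers used in the text (a family that saves 5% of extra income, a three-year horizon for Friedman, forty remaining years and no bequests for Modigliani).

```python
WINDFALL = 1000.0  # one-time tax cut per person, as in the example


def keynes_first_year(windfall, mpc=0.95):
    """Keynes: a fixed share of the extra income is spent right away."""
    return windfall * mpc


def friedman_first_year(windfall, mpc=0.95, horizon_years=3):
    """Friedman: the same extra spending, spread evenly over a three-year horizon."""
    return windfall * mpc / horizon_years


def modigliani_first_year(windfall, years_remaining=40):
    """Modigliani: the whole windfall spread evenly over the remaining lifetime."""
    return windfall / years_remaining


if __name__ == "__main__":
    predictions = {
        "Keynes": keynes_first_year(WINDFALL),
        "Friedman": friedman_first_year(WINDFALL),
        "Modigliani": modigliani_first_year(WINDFALL),
    }
    for name, spending in predictions.items():
        share = spending / WINDFALL
        print(f"{name}: ${spending:,.0f} spent in year one ({share:.1%} of the windfall)")
```

The gap between the three outputs is the point: the same windfall is predicted to be almost entirely spent, about one-third spent, or barely touched in the first year, depending on how much foresight and willpower the model grants its agents.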