Molecular Biology: Introduction to the Cell

Cells and Genomes

The surface of our planet is populated by living things—curious, intricately organized chemical factories that take in matter from their surroundings and use these raw materials to generate copies of themselves. These living organisms appear extraordinarily diverse. What could be more different than a tiger and a piece of seaweed, or a bacterium and a tree? Yet our ancestors, knowing nothing of cells or DNA, saw that all these things had something in common. They called that something “life,” marveled at it, struggled to define it, and despaired of explaining what it was or how it worked in terms that relate to nonliving matter. The discoveries of the past century have not diminished the marvel—quite the contrary. But they have removed the central mystery regarding the nature of life. We can now see that all living things are made of cells: small, membrane-enclosed units filled with a concentrated aqueous solution of chemicals and endowed with the extraordinary ability to create copies of themselves by growing and then dividing in two.

Because cells are the fundamental units of life, it is to cell biology—the study of the structure, function, and behavior of cells—that we must look for answers to the questions of what life is and how it works. With a deeper understanding of cells and their evolution, we can begin to tackle the grand historical problems of life on Earth: its mysterious origins, its stunning diversity, and its invasion of every conceivable habitat. Indeed, as emphasized long ago by the pioneering cell biologist E. B. Wilson, “the key to every biological problem must finally be sought in the cell; for every living organism is, or at some time has been, a cell.”

Despite their apparent diversity, living things are fundamentally similar inside. The whole of biology is thus a counterpoint between two themes: astonishing variety in individual particulars; astonishing constancy in fundamental mechanisms. In this first chapter, we begin by outlining the universal features common to all life on our planet. We then survey, briefly, the diversity of cells. And we see how, thanks to the common molecular code in which the specifications for all living organisms are written, it is possible to read, measure, and decipher these specifications to help us achieve a coherent understanding of all the forms of life, from the smallest to the greatest.

The Universal Features of Cells on Earth

It is estimated that there are more than 10 million—perhaps 100 million—living species on Earth today. Each species is different, and each reproduces itself faithfully, yielding progeny that belong to the same species: the parent organism hands down information specifying, in extraordinary detail, the characteristics that the offspring shall have. This phenomenon of heredity is central to the definition of life: it distinguishes life from other processes, such as the growth of a crystal, or the burning of a candle, or the formation of waves on water, in which orderly structures are generated but without the same type of link between the peculiarities of parents and the peculiarities of offspring. Like the candle flame, the living organism must consume free energy to create and maintain its organization. But life employs the free energy to drive a hugely complex system of chemical processes that are specified by hereditary information.

Most living organisms are single cells. Others, such as ourselves, are vast multicellular cities in which groups of cells perform specialized functions linked by intricate systems of communication. But even for the aggregate of more than 10¹³ cells that form a human body, the whole organism has been generated by cell divisions from a single cell. The single cell, therefore, is the vehicle for all of the hereditary information that defines each species. This cell includes the machinery to gather raw materials from the environment and to construct from them a new cell in its own image, complete with a new copy of its hereditary information. Each and every cell is truly amazing.
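To get a feel for how few rounds of division separate a single cell from a body of roughly ten trillion cells, a back-of-the-envelope calculation helps. The sketch below assumes an idealized picture that the text does not claim: every cell divides in two at each round and no cell ever dies. Real development is far less tidy, so the result is only a lower bound on the number of rounds.

```python
import math

# Order-of-magnitude cell count for a human body, taken from the text.
TARGET_CELLS = 10**13

# Assumption (not from the text): perfectly synchronous doubling, no cell death.
# After n rounds one cell has become 2**n cells, so we need 2**n >= TARGET_CELLS.
doublings = math.ceil(math.log2(TARGET_CELLS))

print(doublings)
```

Under these assumptions, a few dozen successive doublings are enough, which illustrates how much a single founding cell can generate by division alone.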

All Cells Store Their Hereditary Information in the Same Linear Chemical Code: DNA

Computers have made us familiar with the concept of information as a measurable quantity—a million bytes (to record a few hundred pages of text or an image from a digital camera), 600 million bytes for the music on a CD, and so on. Computers have also made us well aware that the same information can be recorded in many different physical forms: the discs and tapes that we used 20 years ago for our electronic archives have become unreadable on present-day machines. Living cells, like computers, store information, and it is estimated that they have been evolving and diversifying for over 3.5 billion years. It is scarcely to be expected that they would all store their information in the same form, or that the archives of one type of cell should be readable by the information-handling machinery of another. And yet it is so. All living cells on Earth store their hereditary information in the form of double-stranded molecules of DNA—long, unbranched, paired polymer chains, formed always of the same four types of monomers. These monomers, chemical compounds known as nucleotides, have nicknames drawn from a four-letter alphabet—A, T, C, G—and they are strung together in a long linear sequence that encodes the genetic information, just as the sequence of 1s and 0s encodes the information in a computer file. We can take a piece of DNA from a human cell and insert it into a bacterium, or a piece of bacterial DNA and insert it into a human cell, and the information will be successfully read, interpreted, and copied. Using chemical methods, scientists have learned how to read out the complete sequence of monomers in any DNA molecule—extending for many millions of nucleotides—and thereby decipher all of the hereditary information that each organism contains.
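The comparison with a computer file can be made concrete. Because each position in a DNA strand holds one of four letters, it carries at most two bits, so four nucleotides fit in one byte. The sketch below packs a sequence into bytes and recovers it again. The particular bit pattern assigned to each letter, and the function names `pack` and `unpack`, are arbitrary choices made for illustration; they are not a standard file format.

```python
# Hypothetical 2-bit code: which bit pattern stands for which letter is arbitrary.
CODE = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
LETTER = {bits: base for base, bits in CODE.items()}


def pack(seq: str) -> bytes:
    """Pack a DNA string into bytes, four nucleotides per byte."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODE[base]
        # Left-align a short final chunk so unpack can read from the high bits.
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out)


def unpack(data: bytes, length: int) -> str:
    """Recover the first `length` nucleotides from packed bytes."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(LETTER[(byte >> shift) & 0b11])
    return "".join(bases[:length])


seq = "GATTACAGG"
packed = pack(seq)
assert unpack(packed, len(seq)) == seq
print(len(seq), "nucleotides ->", len(packed), "bytes")
```

By the same arithmetic, a sequence of one million nucleotides occupies about 250,000 bytes, which places a stretch of DNA on the same scale of measurement as the text and image files mentioned above.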

All Cells Replicate Their Hereditary Information by Templated Polymerization

The mechanisms that make life possible depend on the structure of the double-stranded DNA molecule. Each monomer in a single DNA strand—that is, each nucleotide—consists of two parts: a sugar (deoxyribose) with a phosphate group attached to it, and a base, which may be either adenine (A), guanine (G), cytosine (C), or thymine (T) (Figure 1–2). Each sugar is linked to the next via the phosphate group, creating a polymer chain composed of a repetitive sugar-phosphate backbone with a series of bases protruding from it. The DNA polymer is extended by adding monomers at one end. For a single isolated strand, these monomers can, in principle, be added in any order, because each one links to the next in the same way, through the part of the molecule that is the same for all of them. In the living cell, however, DNA is not synthesized as a free strand in isolation, but on a template formed by a preexisting DNA strand. The bases protruding from the existing strand bind to bases of the strand being synthesized, according to a strict rule defined by the complementary structures of the bases: A binds to T, and C binds to G. This base-pairing holds fresh monomers in place and thereby controls the selection of which one of the four monomers shall be added to the growing strand next. In this way, a double-stranded structure is created, consisting of two exactly complementary sequences of As, Cs, Ts, and Gs. The two strands twist around each other, forming a DNA double helix (Figure 1–2E).

The bonds between the base pairs are weak compared with the sugar-phosphate links, and this allows the two DNA strands to be pulled apart without breakage of their backbones. Each strand then can serve as a template, in the way just described, for the synthesis of a fresh DNA strand complementary to itself—a fresh copy, that is, of the hereditary information (Figure 1–3). In different types of cells, this process of DNA replication occurs at different rates, with different controls to start it or stop it, and different auxiliary molecules to help it along. But the basics are universal: DNA is the information store for heredity, and templated polymerization is the way in which this information is copied throughout the living world.
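The logic of templated polymerization is simple enough to state as a short program. The sketch below treats each strand as a string of letters, builds a new strand one monomer at a time according to the pairing rule (A with T, C with G), and then replicates a double-stranded molecule by separating its strands and using each as a template. It is a deliberate simplification: strands are matched position by position, the direction of each strand is ignored, and none of the enzymes or controls of a real cell are modeled. The function names are illustrative only.

```python
# Base-pairing rule from the text: A binds to T, and C binds to G.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def synthesize(template: str) -> str:
    """Build a new strand, one monomer at a time, against a template strand."""
    new_strand = []
    for base in template:
        # The template base determines which monomer is added next.
        new_strand.append(PAIR[base])
    return "".join(new_strand)


def replicate(duplex):
    """Separate the two strands and use each as a template for a fresh partner."""
    strand_1, strand_2 = duplex
    daughter_a = (strand_1, synthesize(strand_1))
    daughter_b = (synthesize(strand_2), strand_2)
    return daughter_a, daughter_b


parent = ("ATGCCGTA", synthesize("ATGCCGTA"))
daughter_a, daughter_b = replicate(parent)

# Each daughter carries the same information as the parent.
assert daughter_a == parent and daughter_b == parent
```

Note that each daughter molecule in this sketch contains one strand inherited from the parent and one newly synthesized strand, which is exactly what copying by template implies.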

All Cells Transcribe Portions of Their Hereditary Information into the Same Intermediary Form: RNA

To carry out its information-bearing function, DNA must do more than copy itself. It must also express its information, by letting the information guide the synthesis of other molecules in the cell. This expression occurs by a mechanism that is the same in all living organisms, leading first and foremost to the production of two other key classes of polymers: RNAs and proteins. The process (discussed in detail in Chapters 6 and 7) begins with a templated polymerization called transcription, in which segments of the DNA sequence are used as templates for the synthesis of shorter molecules of the closely related polymer ribonucleic acid, or RNA. Later, in the more complex process of translation, many of these RNA molecules direct the synthesis of polymers of a radically different chemical class—the proteins.

In RNA, the backbone is formed of a slightly different sugar from that of DNA—ribose instead of deoxyribose—and one of the four bases is slightly different—uracil (U) in place of thymine (T). But the other three bases—A, C, and G—are the same, and all four bases pair with their complementary counterparts in DNA—the A, U, C, and G of RNA with the T, A, G, and C of DNA. During transcription, the RNA monomers are lined up and selected for polymerization on a template strand of DNA, just as DNA monomers are selected during replication. The outcome is a polymer molecule whose sequence of nucleotides faithfully represents a portion of the cell’s genetic information, even though it is written in a slightly different alphabet—consisting of RNA monomers instead of DNA monomers.
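The pairing rule for transcription can be written out in the same way as the rule for replication. The sketch below maps each base of a DNA template strand to its RNA partner (T to A, A to U, G to C, C to G) and then checks that the resulting transcript reads like the other DNA strand with U in place of T. As before, this is a simplified illustration: strand direction is ignored, and the names `transcribe` and `DNA_TO_RNA` are chosen for this example only.

```python
# RNA partner for each DNA template base, as stated in the text:
# the A, U, C, and G of RNA pair with the T, A, G, and C of DNA.
DNA_TO_RNA = {"T": "A", "A": "U", "G": "C", "C": "G"}


def transcribe(template: str) -> str:
    """Line up RNA monomers against a DNA template strand."""
    return "".join(DNA_TO_RNA[base] for base in template)


template_strand = "TACGGCAT"
other_strand = "ATGCCGTA"  # the DNA strand complementary to the template

rna = transcribe(template_strand)
print(rna)

# The transcript carries the same sequence as the non-template DNA strand,
# written in the RNA alphabet (U instead of T).
assert rna == other_strand.replace("T", "U")
```

The final check makes the point of the paragraph explicit: the transcript is a faithful representation of the DNA information, differing only in its alphabet.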

The same segment of DNA can be used repeatedly to guide the synthesis of many identical RNA molecules. Thus, whereas the cell’s archive of genetic information in the form of DNA is fixed and sacrosanct, these RNA transcripts are mass-produced and disposable. As we shall see, these transcripts function as intermediates in the transfer of genetic information. Most notably, they serve as messenger RNA (mRNA) molecules that guide the synthesis of proteins according to the genetic instructions stored in the DNA.

RNA molecules have distinctive structures that can also give them other specialized chemical capabilities. Because RNA molecules are single-stranded, their backbone is flexible, so that the polymer chain can bend back on itself to allow one part of the molecule to form weak bonds with another part of the same molecule. This occurs when segments of the sequence are locally complementary: a …GGGG… segment, for example, will tend to associate with a …CCCC… segment. These types of internal associations can cause an RNA chain to fold up into a specific shape that is dictated by its sequence. The shape of the RNA molecule, in turn, may enable it to recognize other molecules by binding to them selectively—and even, in certain cases, to catalyze chemical changes in the molecules that are bound. In fact, some chemical reactions catalyzed by RNA molecules are crucial for several of the most ancient and fundamental processes in living cells, and it has been suggested that extensive catalysis by RNA played a central part in the early evolution of life.
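The idea that a sequence dictates where an RNA chain can fold back on itself can be illustrated with a simple search for locally complementary segments. The sketch below scans an RNA string for pairs of short segments that could pair with each other. It rests on several assumptions that go beyond the passage above: paired segments are read in opposite directions along the chain (so the partner of a segment is its reverse complement), only the standard pairs A with U and G with C are considered, and at least three unpaired bases are required between the two segments to form the loop. The function `find_stems` and its parameters are hypothetical, and the approach is far cruder than real structure-prediction methods.

```python
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(segment: str) -> str:
    """Sequence that would pair with `segment` when the chain folds back."""
    return "".join(RNA_PAIR[base] for base in reversed(segment))


def find_stems(rna: str, k: int = 4, min_loop: int = 3):
    """Return (i, j) start positions of length-k segments that could pair.

    Assumption: the two segments must be separated by at least `min_loop`
    unpaired bases, since the chain cannot bend back on itself instantly.
    """
    stems = []
    for i in range(len(rna) - k + 1):
        partner = reverse_complement(rna[i:i + k])
        # Search only downstream of segment i, past the minimum loop.
        j = rna.find(partner, i + k + min_loop)
        while j != -1:
            stems.append((i, j))
            j = rna.find(partner, j + 1)
    return stems


rna = "AAGGGGAUUACCCCAA"
print(find_stems(rna))
```

In this example the GGGG segment and the CCCC segment are reported as potential partners, matching the case described in the text; the bases between them would form the loop of a hairpin.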