History and Methodologies of DNA and RNA Sequencing

Historical Milestones in DNA Sequencing

DNA sequencing emerged in the 1970s, building on the 1953 description of the DNA double helix by James Watson and Francis Crick. The first nucleic acid molecule to be sequenced was an alanine tRNA from yeast (Saccharomyces cerevisiae), reported by Robert Holley and colleagues in 1965. Technological advances followed, including Hamilton Smith's 1970 discovery of type II restriction enzymes and the 1983 development of the polymerase chain reaction (PCR) by Kary B. Mullis. The two foundational sequencing methods, Maxam-Gilbert and Sanger sequencing, were both established between 1975 and 1977. Frederick Sanger shared the 1980 Nobel Prize in Chemistry for his contributions to the field.

Core Principles and Methodology

DNA sequencing is the process of determining the precise order of the four nucleotide bases in a DNA molecule: adenine, guanine, cytosine, and thymine. Because current technology cannot read an entire chromosome from end to end, the DNA is first cut into smaller fragments that serve as templates. In the classical methods, each template yields a nested set of labeled fragments that differ in length by a single base; separating these by size allows the original sequence to be reconstructed. The same approach underlies the sequencing of cDNA libraries, in which mRNA is isolated and converted into DNA by reverse transcriptase, and the resulting cDNA is inserted into bacterial plasmids for purification and sequencing.
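The reverse transcription step can be pictured with a minimal Python sketch. It models only base pairing: the function name `first_strand_cdna` is hypothetical, and second-strand synthesis, primers, and plasmid cloning are deliberately left out.

```python
# Toy model of first-strand cDNA synthesis: reverse transcriptase builds a DNA
# strand complementary to the mRNA template (illustrative only).
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna):
    """Return the first-strand cDNA, written 5'->3', for an mRNA written 5'->3'."""
    # Walk the mRNA from its 3' end and pair each RNA base with its DNA partner,
    # so that the result also reads 5'->3'.
    return "".join(RNA_TO_DNA[base] for base in reversed(mrna.upper()))


cdna = first_strand_cdna("AUGGCCUAA")
print(cdna)
```

The first strand is the reverse complement of the message, which is why a cDNA sequence read from a library has to be interpreted with strand orientation in mind.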

The Sanger (Chain Termination) Method

Sanger sequencing, or the chain termination method, involves the in vitro synthesis of DNA in the presence of chain terminators known as dideoxynucleotides (ddNTPs). These ddNTPs, such as ddATP or ddGTP, lack the $3'$-OH group needed to form a phosphodiester bond with the next nucleotide, so the growing DNA strand terminates wherever one is incorporated. The procedure involves four main steps: heat denaturation of the DNA into single strands, primer annealing and extension, termination by ddNTPs, and separation of the products by gel electrophoresis. The resulting bands are read in order of fragment size to deduce the sequence of the unknown DNA strand.
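The logic of reading a chain termination gel can be sketched in Python. This is an idealized model under stated assumptions, not a description of any laboratory software: every possible termination position is assumed to be represented in its ddNTP reaction, fragment lengths are counted in bases added after the primer, and the function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def synthesized_strand(template):
    """Strand built by the polymerase (5'->3') for a template written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(template))


def termination_lanes(template):
    """For each ddNTP reaction, list the fragment lengths it would produce."""
    lanes = {base: [] for base in "ACGT"}
    # A ddNTP can stop synthesis at every position where its base is incorporated.
    for length, base in enumerate(synthesized_strand(template), start=1):
        lanes[base].append(length)
    return lanes


def read_gel(lanes):
    """Read the bands from shortest to longest fragment across the four lanes."""
    base_at_length = {
        length: base for base, lengths in lanes.items() for length in lengths
    }
    return "".join(base_at_length[length] for length in sorted(base_at_length))


template = "GATTACA"
lanes = termination_lanes(template)
read = read_gel(lanes)       # sequence of the newly synthesized strand
print(lanes)
print(read, synthesized_strand(read))
```

Because the gel reports the newly synthesized strand, the template itself is recovered by taking the reverse complement of the read, which is what the final call does.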

The Maxam-Gilbert Sequencing Method

Developed by Allan Maxam and Walter Gilbert, this method is based on chemical modification of DNA and subsequent cleavage at specific nitrogenous bases. The process requires a purified DNA fragment radioactively labeled with $^{32}\text{P}$ at the $5'$ end. Base-specific chemical treatments generate the breaks: dimethyl sulfate targets guanines ($G$), formic acid targets purines ($A+G$), hydrazine targets pyrimidines ($C+T$), and hydrazine with NaCl targets cytosines ($C$). The fragments are separated by size through gel electrophoresis and visualized by autoradiography on X-ray film, and the sequence is inferred by comparing the band patterns of the four reactions.
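How the four reactions combine into a base call can be illustrated with a short Python sketch. It is a simplified model: each cleavage at position i of the labeled strand is assumed to leave a labeled fragment of i - 1 bases, every target base is assumed to be cleaved in some molecule, and the names `cleavage_lanes` and `read_lanes` are hypothetical.

```python
# Reaction name -> bases it cleaves at, following the four treatments in the text.
REACTIONS = {
    "G": "G",      # dimethyl sulfate
    "A+G": "AG",   # formic acid
    "C+T": "CT",   # hydrazine
    "C": "C",      # hydrazine with NaCl
}


def cleavage_lanes(seq):
    """Labeled-fragment lengths seen in each lane for a 5'-labeled strand."""
    # Cleaving at 0-based index i destroys that base and leaves i labeled bases.
    return {
        name: {i for i, base in enumerate(seq) if base in targets}
        for name, targets in REACTIONS.items()
    }


def read_lanes(lanes, length):
    """Call each base by comparing which lanes show a band at that length."""
    calls = []
    for i in range(length):
        if i in lanes["G"]:
            calls.append("G")   # band in both G and A+G lanes
        elif i in lanes["A+G"]:
            calls.append("A")   # band in A+G lane only
        elif i in lanes["C"]:
            calls.append("C")   # band in both C and C+T lanes
        elif i in lanes["C+T"]:
            calls.append("T")   # band in C+T lane only
        else:
            calls.append("N")   # no band: base cannot be called
    return "".join(calls)


sequence = "GATTACA"
lanes = cleavage_lanes(sequence)
print(read_lanes(lanes, len(sequence)))
```

The sketch shows why two of the reactions are deliberately less specific: adenine and thymine are identified by elimination, as bands present in the combined lane but absent from the guanine-only or cytosine-only lane.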

RNA Sequencing (RNA-Seq)

RNA-Seq is a next-generation sequencing (NGS) technique used to analyze the transcriptome. Extracted RNA is converted into cDNA, which is then fragmented and sequenced on platforms such as Illumina; the resulting expression profiles give insight into cellular function and disease states. Key types include total RNA sequencing, mRNA sequencing, and targeted RNA-Seq. The method offers high sensitivity for detecting low-abundance transcripts and high resolution for identifying alternative splicing, gene fusions, and novel transcripts, without requiring prior sequence knowledge.
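The idea of measuring a transcriptome by fragmenting cDNA and counting reads can be sketched with a toy Python example. Everything here is an assumption made for illustration: the two transcripts and their copy numbers are invented, reads are error-free and non-overlapping, and reads are assigned by exact substring matching rather than by a real aligner.

```python
from collections import Counter


def fragment(cdna, read_length):
    """Cut a cDNA into non-overlapping reads, dropping any short tail."""
    return [
        cdna[i:i + read_length]
        for i in range(0, len(cdna) - read_length + 1, read_length)
    ]


def count_reads(reads, transcripts):
    """Count reads that match exactly one reference transcript."""
    counts = Counter({name: 0 for name in transcripts})
    for read in reads:
        hits = [name for name, seq in transcripts.items() if read in seq]
        if len(hits) == 1:  # ambiguous or unmatched reads are discarded
            counts[hits[0]] += 1
    return dict(counts)


# Hypothetical reference transcripts and sample composition.
transcripts = {"geneA": "ATGGCCATTGTA", "geneB": "ATGCGTACGTTAGC"}
sample = [transcripts["geneA"]] * 3 + [transcripts["geneB"]]

reads = [read for molecule in sample for read in fragment(molecule, 4)]
counts = count_reads(reads, transcripts)
print(counts)
```

Raw counts like these depend on transcript length and sequencing depth as well as abundance, so practical analyses normalize them before comparing genes or samples.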

Applications and Practical Considerations

Sequencing and comparative DNA studies are critical for detecting mutations, made the completion of the Human Genome Project possible, and help improve agriculture through microbial mapping. In forensics, unique DNA patterns from blood, skin, or hair are used to identify individuals, including criminal suspects. In medicine, sequencing identifies genes associated with hereditary or acquired diseases, which can inform gene therapy. While highly effective, DNA analysis raises concerns about the invasion of individual privacy and potential discrimination based on ethnic background or parentage. Additionally, some methods may yield incomplete coverage or fail to detect balanced translocations and inversions.