16S Ribosomal RNA Amplicon Sequencing
Overview of the 16S Ribosomal RNA (rRNA) Gene
Carl Woese and his colleagues were the first to describe bacterial ribosomal genes as "molecular clocks."
These genes are considered molecular clocks because of several uncommon features:
- Universality across the bacterial domain.
- Functional activity and essential cellular functions.
- Extremely conserved structure and nucleotide sequence.
Prokaryotic ribosomes contain three types of ribosomal RNA, classified by their sedimentation coefficients (Svedberg units, S):
- 5S: sequence length of approximately 120 nucleotides.
- 16S: sequence length of approximately 1,500 nucleotides.
- 23S: sequence length of approximately 2,900 nucleotides.
The 16S rRNA gene is the standard for bacterial taxonomic classification because it can be sequenced rapidly and easily while still providing sufficient phylogenetic information.
Structure of the 16S rRNA gene:
- It includes highly conserved regions.
- It contains nine hypervariable regions (V1 to V9) whose sequences differ across the bacterial domain.
- Conservation levels vary: more conserved regions are informative at higher taxonomic levels, while less conserved (variable) regions are informative at lower levels such as genus and species.
Taxonomic Identification:
- Sequence similarity in the 16S rRNA gene is the gold standard for species-level identification.
- A defined range of percent sequence divergence is typically used to delineate the species taxonomic rank.
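The similarity and divergence measures mentioned above can be illustrated with a minimal sketch. It assumes the two 16S rRNA sequences have already been aligned to equal length with "-" as the gap character, and it skips gapped columns; that convention, the function names, and the toy sequences are illustrative assumptions, not part of any particular tool.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: gapped columns are ignored
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def percent_divergence(aligned_a: str, aligned_b: str) -> float:
    """Divergence is simply the complement of identity."""
    return 100.0 - percent_identity(aligned_a, aligned_b)


if __name__ == "__main__":
    # Toy 10-column alignment with a single mismatch in the last column.
    print(percent_identity("ACGTACGTAC", "ACGTACGTAT"))
    print(percent_divergence("ACGTACGTAC", "ACGTACGTAT"))
```

Whether gapped columns count as mismatches changes the result, so a divergence cutoff is only meaningful when the counting convention is stated alongside it.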
Advantages of 16S rRNA Sequencing
The 16S rRNA gene is universally distributed among all bacteria.
The number of available 16S rRNA sequences far exceeds that for any other bacterial gene, which makes comparison and analysis easier.
It provides a reliable metric for measuring phylogenetic relationships across different taxa.
Horizontal gene transfer (HGT) is not considered a significant problem for this gene, ensuring the phylogenetic signal remains linked to the organism's lineage.
The costs of 16S rRNA gene amplification and sequencing are now low.
Disadvantages and Limitations of 16S rRNA Sequencing
16S rRNA gene copy numbers per genome vary; although copy number is usually taxon-specific, it can differ among strains of the same species.
Polymerase Chain Reaction (PCR) amplification biases can occur during library preparation.
16S rRNA gene sequence diversity within a sample tends to inflate overall diversity estimates.
Resolution is often too low to differentiate between very closely related species.
Evolution of the field: As sequencing costs continue to drop, microbiome research is shifting away from 16S rRNA sequencing toward more comprehensive functional representations via whole-genome or shotgun metagenomic sequencing.
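The copy-number limitation listed above is sometimes addressed by dividing each taxon's read count by its 16S rRNA gene copy number before computing relative abundances. The sketch below shows only that arithmetic; the taxon names, read counts, and copy numbers are hypothetical, and real copy numbers would have to come from a curated reference.

```python
def copy_number_corrected_abundance(read_counts, copy_numbers):
    """Relative abundance after dividing read counts by gene copy number.

    read_counts:  dict mapping taxon -> number of 16S reads assigned to it
    copy_numbers: dict mapping taxon -> 16S rRNA gene copies per genome
    """
    corrected = {}
    for taxon, reads in read_counts.items():
        copies = copy_numbers[taxon]  # KeyError if the copy number is unknown
        if copies <= 0:
            raise ValueError(f"copy number for {taxon} must be positive")
        corrected[taxon] = reads / copies
    total = sum(corrected.values())
    if total == 0:
        raise ValueError("no reads to normalize")
    return {taxon: value / total for taxon, value in corrected.items()}


if __name__ == "__main__":
    # Hypothetical example: taxon_A carries 7 copies per genome, taxon_B only 1.
    reads = {"taxon_A": 700, "taxon_B": 300}
    copies = {"taxon_A": 7, "taxon_B": 1}
    print(copy_number_corrected_abundance(reads, copies))
```

In this toy case taxon_A supplies 70% of the reads but only a quarter of the estimated genomes, which shows how uncorrected counts can overstate high-copy taxa.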
Workflow for 16S rRNA Sequencing
A complete 16S rRNA sequencing workflow typically includes four main stages:
- DNA isolation.
- Library preparation.
- Sequencing.
- Data analysis.
Following DNA isolation, the DNA is selectively amplified using PCR with primers that specifically target the 16S rRNA gene.
Sequencing Platforms and Primer Selection
Next-Generation Sequencing (NGS) constraints:
- Common NGS platforms usually cover only a few hundred base pairs (bp) per single read.
- Because the full-length 16S rRNA gene is approximately 1,500 bp, primers are often chosen to target only a portion of the gene.
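The read-length constraint above comes down to simple arithmetic: a paired-end run can reconstruct an amplicon only if the forward and reverse reads overlap enough to be merged. The sketch below assumes a minimum overlap of 20 bp, and the example lengths (300 bp reads, a 460 bp partial-gene amplicon, a 1,500 bp full-length gene) are illustrative values rather than platform specifications.

```python
def paired_end_overlap(amplicon_len: int, read_len: int) -> int:
    """Bases shared by the forward and reverse reads (negative means a gap)."""
    return 2 * read_len - amplicon_len


def reads_span_amplicon(amplicon_len: int, read_len: int, min_overlap: int = 20) -> bool:
    """True if the two reads overlap by at least min_overlap bases."""
    return paired_end_overlap(amplicon_len, read_len) >= min_overlap


if __name__ == "__main__":
    # Illustrative numbers only: a partial-gene amplicon versus the full-length gene.
    for amplicon in (460, 1500):
        overlap = paired_end_overlap(amplicon, 300)
        print(amplicon, overlap, reads_span_amplicon(amplicon, 300))
```

A negative overlap for the full-length case shows why short-read studies target one or a few variable regions instead of the whole gene.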
Full-Length Sequencing:
- The full-length 16S rRNA gene is usually amplified using the primer pair 27F and 1492R.
- The full-length amplicon is then sequenced with either Sanger DNA sequencing or Pacific Biosciences (PacBio) SMRT sequencing.
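Primer targeting can be checked in silico by searching a template for the forward primer and for the reverse complement of the reverse primer. The sketch below expands IUPAC degenerate bases into regular-expression character classes. The primer sequences are commonly cited variants of 27F and 1492R and are included as an assumption; several variants are in circulation, so they should be checked against the protocol actually in use. Only exact matches are considered here (no mismatches allowed).

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")

# Assumed primer variants (5' to 3'); verify against your own protocol.
PRIMER_27F = "AGAGTTTGATCMTGGCTCAG"
PRIMER_1492R = "TACGGYTACCTTGTTACGACTT"


def reverse_complement(seq: str) -> str:
    """Reverse complement that also handles IUPAC degenerate bases."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def primer_regex(primer: str) -> "re.Pattern[str]":
    """Compile a primer into a regex that matches every base it can represent."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


def find_amplicon(template: str, forward: str, reverse: str):
    """Return the primer-to-primer region of the template, or None if absent."""
    template = template.upper()
    fwd = primer_regex(forward).search(template)
    if fwd is None:
        return None
    # The reverse primer anneals to the opposite strand, so search the
    # template strand for its reverse complement downstream of the forward site.
    rev = primer_regex(reverse_complement(reverse)).search(template, fwd.end())
    if rev is None:
        return None
    return template[fwd.start():rev.end()]
```

Calling find_amplicon(sequence, PRIMER_27F, PRIMER_1492R) returns the predicted amplicon for a template that carries both sites, and None otherwise.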
High-Throughput Sequencing:
- Various high-throughput platforms sequence different lengths of DNA, so each system requires a suitable pair of PCR primers.
- Regions V1 to V3: identified as the most useful for distinguishing species within Staphylococcus, a clinically important and ubiquitous skin bacterial genus. Consequently, this region is standard for skin microbiome studies.
- Illumina MiSeq: on this platform, the V3 and V4 regions are commonly amplified using limited-cycle PCR.
- Other technologies include pyrosequencing (targeting specific regions or linked to barcodes) and large-scale clonal Sanger sequencing.
Library Preparation and Data Analysis
Library Preparation Steps:
- PCR products are purified, quantified, and pooled.
- Illumina sequencing adapters and dual-index barcodes are added to the amplicon targets.
- Using the full complement of Nextera XT indices, up to 96 libraries can be pooled together for a single sequencing run.
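Dual indexing multiplies the number of distinguishable libraries: every sample receives a unique combination of one index from each of two sets. The sketch below assumes a layout of 12 indices in one set and 8 in the other, which gives 96 combinations. The index labels are hypothetical placeholders, not real adapter names or sequences.

```python
from itertools import product


def dual_index_pairs(i7_indices, i5_indices):
    """Every (i7, i5) combination available for tagging a library."""
    return list(product(i7_indices, i5_indices))


def assign_indices(samples, i7_indices, i5_indices):
    """Give each sample a unique index pair; fail if the run is oversubscribed."""
    pairs = dual_index_pairs(i7_indices, i5_indices)
    if len(samples) > len(pairs):
        raise ValueError(
            f"{len(samples)} samples but only {len(pairs)} index combinations"
        )
    return dict(zip(samples, pairs))


# Hypothetical labels for an assumed 12 x 8 index layout.
I7_LABELS = [f"i7_{n:02d}" for n in range(1, 13)]
I5_LABELS = [f"i5_{n:02d}" for n in range(1, 9)]

if __name__ == "__main__":
    print(len(dual_index_pairs(I7_LABELS, I5_LABELS)))
```

After sequencing, reads are assigned back to their sample by the index pair they carry, which is why each combination may be used only once per run.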
Data Analysis Steps:
- Raw sequences are filtered and trimmed to maintain high quality.
- High-quality sequences are clustered into Operational Taxonomic Units (OTUs).
- OTU clustering is commonly based on a 97% identity threshold between reads.
- Determining OTUs allows for subsequent species annotation, OTU phylogeny, diversity analysis, and other downstream comparative studies.
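The clustering step above can be illustrated with a deliberately simplified greedy sketch. It assumes reads have already been quality-filtered and trimmed to the same length, and it compares them position by position with no alignment. Production clustering tools use alignment-based identity and abundance-sorted input, so this shows the idea rather than how any specific tool works. The 0.97 default mirrors the commonly used 97% threshold.

```python
def identity(read_a: str, read_b: str) -> float:
    """Fraction of matching positions between two equal-length reads."""
    if len(read_a) != len(read_b):
        raise ValueError("this sketch assumes reads trimmed to equal length")
    if not read_a:
        raise ValueError("reads must not be empty")
    matches = sum(a == b for a, b in zip(read_a.upper(), read_b.upper()))
    return matches / len(read_a)


def greedy_otu_clusters(reads, threshold: float = 0.97):
    """Assign each read to the first centroid it matches at >= threshold.

    Returns a dict mapping each centroid sequence to its member reads.
    A read that matches no existing centroid founds a new OTU.
    """
    clusters = {}
    for read in reads:
        for centroid in clusters:
            if identity(read, centroid) >= threshold:
                clusters[centroid].append(read)
                break
        else:
            clusters[read] = [read]
    return clusters


def otu_table(clusters):
    """Read count per OTU, keyed by an arbitrary OTU label."""
    return {f"OTU_{i + 1}": len(members)
            for i, members in enumerate(clusters.values())}
```

Because assignment is greedy, the result depends on input order; sorting reads by decreasing abundance before clustering is a common way to make the most frequent sequences act as centroids.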