RNA-Based Gene Regulation

mRNA Secondary Structures

RNA molecules can fold into secondary structures, which can regulate gene expression. Examples include:

  • Terminator sequence in the 5' UTR of the trp operon (attenuation).
  • Riboswitches.

Some structures can be unwound by helicases, while others directly influence translation or transcription.

Riboswitches

  • Found in bacteria, archaea, fungi, and plants (not animals).
  • Located in the 5' UTR of an mRNA.
  • Bind a ligand (regulatory molecule like ions, nucleotides, or amino acids).
  • Upon binding, the structure of the mRNA changes (refolds), affecting:
    • Transcription: May cause premature termination (in prokaryotes).
    • Translation: May hide or expose the ribosome binding site (in prokaryotes and eukaryotes).
  • Riboswitches contain an aptamer domain that binds the ligand.

Antisense RNA (asRNA)

  • Long non-coding RNAs that are complementary to a target mRNA.
  • Bind to mRNA to prevent translation (can span 50–1000+ nucleotides).
  • May originate from another gene or the non-template strand of the target gene.
  • Found in both prokaryotes and eukaryotes.

RNA Interference (RNAi)

  • Evolved as a defense against double-stranded RNA viruses.
  • Key enzymes and complexes:
    • Dicer: Cuts dsRNA into small interfering RNAs (siRNAs) or microRNAs (miRNAs).
    • RISC (RNA-induced silencing complex): Uses siRNAs/miRNAs to bind complementary mRNA.
      • Binding results in mRNA degradation or translation inhibition.

miRNA (microRNA)

  • ~21–25 nucleotides long, single-stranded RNA.
  • Transcribed from different genes (not from the target).
  • Binds to mRNA with imperfect complementarity, typically blocking translation rather than cleaving the mRNA.
  • Functions in gene regulation.

siRNA (small interfering RNA)

  • ~21–25 nucleotides, derived from the target gene itself.
  • Binds perfectly to the target mRNA.
  • Causes mRNA degradation, leading to no translation.
  • Can also act via RITS (RNA-induced transcriptional silencing) to:
    • Recruit methyltransferase enzymes.
    • Methylate DNA → epigenetic silencing.
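
The contrast between perfect (siRNA-like) and imperfect (miRNA-like) pairing can be illustrated with a minimal sketch that counts mismatches between a small RNA and a target site. The sequences and helper names (`reverse_complement`, `count_mismatches`, `introduce_mismatch`) are hypothetical, and real target recognition involves more than a mismatch count, so this is an illustration only.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def count_mismatches(small_rna: str, target_site: str) -> int:
    """Count positions where small_rna does not pair with target_site.

    Both sequences are written 5'->3' and must have the same length.
    """
    if len(small_rna) != len(target_site):
        raise ValueError("sequences must be the same length")
    perfect_partner = reverse_complement(target_site)
    return sum(1 for a, b in zip(small_rna, perfect_partner) if a != b)


def introduce_mismatch(rna: str, index: int) -> str:
    """Return a copy of rna with the base at index swapped for a different one."""
    replacement = "A" if rna[index] != "A" else "C"
    return rna[:index] + replacement + rna[index + 1:]


# Hypothetical 21-nt target site on an mRNA (made up for illustration).
target = "AUGGCUAGCUAGGAUCCGAUC"
sirna_like = reverse_complement(target)          # pairs at every position
mirna_like = introduce_mismatch(sirna_like, 10)  # one internal mismatch

print(count_mismatches(sirna_like, target))  # 0
print(count_mismatches(mirna_like, target))  # 1
```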

Experimental RNAi

  • Scientists can artificially introduce dsRNA into cells or tissues.
  • Dicer processes this RNA, which activates the RNAi pathway.
  • Used for targeted gene knockdown in research or medicine.
  • Effects are temporary and localized to the treated area.

Long Noncoding RNA (lncRNA)

  • Transcripts ≥200 nucleotides (can be over 10 kb).
  • Can bind to DNA, RNA, or proteins to regulate gene expression.
  • Example: Xist (X-inactive specific transcript).
    • Coats one X chromosome in females.
    • Leads to heterochromatin formation → X inactivation.
    • Inactivation is random (Lyon hypothesis) and happens in early development.
    • Explains mosaic expression of X-linked genes.

Chromosomal Variants and Rearrangements

General Concepts

  • Chromosomal variants are larger-scale mutations compared to point mutations.
  • Can be:
    • Balanced (no net gain/loss of genetic info).
    • Unbalanced (deletion/duplication = gene dosage change).
  • Common forms include duplications, deletions, inversions, and translocations.

Duplication

  • A chromosomal segment is copied and inserted.
  • Types:
    • Tandem: adjacent to the original.
    • Dispersed: elsewhere on the chromosome or genome.
  • Can create copy number variants (CNVs) and paralogous genes.
  • Increases gene dosage, which may alter phenotype.
  • Paralogs: duplicated genes within a species that can diverge and evolve new functions.

Deletion

  • Loss of a chromosome segment.
  • May be:
    • Visible on a karyotype if large.
    • Lethal if homozygous.
  • May expose recessive alleles in heterozygotes (pseudodominance).

Inversion

  • A chromosome segment is flipped 180°.
  • Types:
    • Paracentric: does not include centromere.
    • Pericentric: includes centromere.
  • In heterozygotes:
    • Crossing over within the inversion can produce:
      • Unbalanced gametes (deletions/duplications).
      • Infertility (3–5% of infertile couples carry inversions).

Translocation

  • Movement of a segment to a nonhomologous chromosome.
  • Types:
    • Reciprocal: exchange between chromosomes.
    • Non-reciprocal: one-way movement.
  • Can create complex pairing during meiosis, leading to unbalanced gametes.
  • Robertsonian translocation: occurs at or near centromeres of acrocentric chromosomes (common in humans).

Mechanisms of Rearrangement

  • Chromosomal breaks and faulty repair.
  • Unequal crossing over.
  • Transposable elements (TEs):
    • DNA transposons: cut and paste via transposase.
    • Retrotransposons: copy and paste via reverse transcription.

Polyploidy

Autopolyploidy

  • Multiple sets of chromosomes from one species (e.g., 3N, 4N).
  • Common in plants; rare and often lethal in mammals.
  • Leads to:
    • Larger phenotypes due to gene dosage.
    • Sterility, especially in triploids.

Allopolyploidy

  • Combines chromosome sets from two or more species.
  • Initial hybrids are often sterile but may become fertile if chromosome pairing is preserved.
  • Requires similar chromosome number and gene order (synteny).

Genomics

Genomics is the study of the content, organization, function, and evolution of genetic material across entire genomes.

Types of Genomics

  • Structural Genomics: Focuses on the sequence and arrangement of genes within a genome.
    • Key questions: What is the full sequence of a genome? Where are genes located, and how are they arranged?
    • Methods: DNA sequencing (e.g., Illumina), bioinformatics for assembly and annotation.
    • Products: Assembled genomes, gene annotations.
  • Functional Genomics: Explores how genetic variation influences phenotypic traits.
    • Key questions: What proteins and RNAs are encoded by genes? When and where are they expressed?
    • Methods: SNP analysis, RNA expression profiling, GWAS, statistical comparisons.
    • Products: Candidate genes linked to traits for further testing.
  • Comparative Genomics: Compares gene content and structure across species to understand evolutionary changes.
    • Looks at similarities and differences in genes and their organization among organisms.

Genome Assembly

A genome assembly is the most current version of the entire genome sequence of an organism.

Steps in Genome Assembly

  1. Shotgun Sequencing:
    • Tissue is collected, DNA is extracted, and fragmented.
    • Short DNA pieces are sequenced using high-throughput technologies like Illumina.
  2. Alignment of Reads:
    • Short reads are aligned using sequence overlaps to build contigs (contiguous sequences).
    • Regions with high read depth (number of times a base is sequenced) have greater alignment confidence.
    • Repetitive sequences are difficult to align due to similar overlapping regions.
  3. Scaffold Construction:
    • Contigs are aligned to known genetic markers and mapped onto chromosomes.
    • Gaps in scaffold sequences are often denoted with “N”s.
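
The overlap step can be sketched with a toy greedy assembler that repeatedly merges the two sequences sharing the longest suffix-prefix overlap. This is a simplified illustration, not how production assemblers work: the reads are made up, the function names are hypothetical, and sequencing errors and reverse-complement reads are ignored.

```python
def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Length of the longest suffix of left that equals a prefix of right."""
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(reads: list[str], min_overlap: int = 3) -> list[str]:
    """Merge the best-overlapping pair until no overlaps remain."""
    contigs = list(reads)
    while True:
        best_size, best_i, best_j = 0, None, None
        for i, left in enumerate(contigs):
            for j, right in enumerate(contigs):
                if i == j:
                    continue
                size = overlap_length(left, right, min_overlap)
                if size > best_size:
                    best_size, best_i, best_j = size, i, j
        if best_size == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


reads = ["ATGGCGT", "GCGTACC", "TACCTGA"]  # made-up short reads
print(greedy_assemble(reads))  # ['ATGGCGTACCTGA']
```

Because the merge relies only on matching ends, two reads from different copies of a repeat look identical to it, which is why repetitive regions are hard to assemble.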

Gene Annotation

Gene annotation identifies and describes genes and functional regions in the genome.

Ab Initio Prediction

  • Uses bioinformatics to search for necessary gene components:
    • Open reading frames (ORFs): DNA sequences starting with ATG (AUG in the mRNA) and ending with TAA, TAG, or TGA, capable of coding proteins.
    • Regulatory motifs:
      • Prokaryotes: promoters, terminators, Shine-Dalgarno sequences.
      • Eukaryotes: TATA box, regulatory promoters, splice site signals (5' and 3'), poly-A signals, and CpG islands (often at gene 5' ends).
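
As a rough illustration of the ORF search, the sketch below scans the three forward reading frames of a DNA string for an ATG followed in frame by a stop codon. The function name `find_orfs` and the `min_codons` threshold are assumptions for this example; real gene finders also scan the reverse strand and model introns and regulatory motifs.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 3) -> list[tuple[int, int, str]]:
    """Return (start, end, sequence) for ORFs on the forward strand.

    start is 0-based; end is exclusive and includes the stop codon.
    min_codons counts the start and stop codons.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(dna) - 2, 3):
            codon = dna[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos  # first in-frame start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                if (pos + 3 - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3, dna[start:pos + 3]))
                start = None  # look for the next ORF in this frame
    return orfs


print(find_orfs("CCATGAAATTTTAGGG"))  # [(2, 14, 'ATGAAATTTTAG')]
```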

Homology-Based Annotation

  • Uses known expressed genes or protein domains to predict new gene functions.
  • Aligns genome sequence to mRNA/cDNA (e.g., from RNA-seq).
  • Identifies only expressed genes in a given tissue.
  • Can also detect conserved protein domains (e.g., zinc finger = DNA-binding protein).

Functional Genomics

Functional genomics identifies genetic variants associated with phenotypes.

Key Goals

  • Associate traits (phenotypes) with specific genetic variants.
  • Understand gene expression: what genes are turned on/off, when, where, and how much.

How Trait Association Works

  1. Choose contrasting groups (differ by phenotype):
    • Family-based studies: within the same lineage.
    • GWAS (Genome-Wide Association Studies): compare unrelated individuals.
    • Quantitative breeding experiments: cross individuals with trait differences to study offspring.
  2. Characterize Genetic Variation:
    • Study SNPs (single nucleotide polymorphisms).
    • SNPs can be inherited or arise from mutations.
    • Used as markers, even if not causative.
  3. Use Statistics to Find Associations:
    • Determine if a SNP is significantly more common in affected individuals.
    • Represented using Manhattan plots:
      • X-axis: SNP position on genome.
      • Y-axis: significance of association (typically plotted as −log10 of the p-value).
      • SNPs in close proximity may be associated due to linkage (inherited together).
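
A minimal sketch of the statistical step for one SNP, assuming a simple allele-count comparison between affected and unaffected groups with a Pearson chi-square test (1 degree of freedom). The counts are invented, the function name is hypothetical, and real GWAS pipelines typically use regression models with covariates and multiple-testing correction.

```python
import math


def allelic_chi_square(case_alt: int, case_ref: int,
                       ctrl_alt: int, ctrl_ref: int) -> tuple[float, float]:
    """Pearson chi-square test (1 d.f.) on a 2x2 table of allele counts.

    Assumes every row and column total is greater than zero.
    """
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    total = case_alt + case_ref + ctrl_alt + ctrl_ref
    row_sums = [sum(row) for row in table]
    col_sums = [case_alt + ctrl_alt, case_ref + ctrl_ref]
    chi2 = 0.0
    for r in range(2):
        for c in range(2):
            expected = row_sums[r] * col_sums[c] / total
            chi2 += (table[r][c] - expected) ** 2 / expected
    # Upper-tail probability of a chi-square variable with 1 d.f.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Invented counts: the alternate allele is seen 60 times in 100 case
# alleles and 40 times in 100 control alleles.
chi2, p = allelic_chi_square(60, 40, 40, 60)
manhattan_y = -math.log10(p)  # height of this SNP on a Manhattan plot
print(chi2, p, manhattan_y)
```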

Technologies to Detect Genetic Variants & Expression

DNA Microarrays (DNA Chips)

  • Detect an individual's SNPs at thousands of known locations.
  • Method:
    • Single-stranded DNA probes fixed on a glass slide.
    • Sample DNA is fragmented, amplified, denatured, and hybridized to probes.
    • Fluorescently labeled nucleotides bind to reveal SNP identity.
  • Example: Affymetrix 6.0 DNA chip (906,600 SNPs on autosomes, X/Y, mitochondria).

Genomic Resequencing

  • No prior SNP knowledge needed.
  • Whole-genome sequencing of an individual using Illumina.
  • Align to a reference genome to find new SNPs and CNVs (Copy Number Variants) based on read depth.
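
The read-depth idea behind CNV detection can be sketched as follows: depth in each genomic window is scaled to the genome-wide median, which is assumed to correspond to the normal diploid copy number. The window depths are invented and `estimate_copy_number` is a hypothetical helper; real CNV callers also correct for biases such as GC content and mappability.

```python
from statistics import median


def estimate_copy_number(window_depths: list[float], ploidy: int = 2) -> list[int]:
    """Estimate copy number per window by scaling depth to the median depth."""
    baseline = median(window_depths)  # assumed to represent `ploidy` copies
    return [round(ploidy * depth / baseline) for depth in window_depths]


# Invented mean read depths for nine consecutive windows.
depths = [30, 31, 29, 15, 14, 30, 46, 44, 30]
print(estimate_copy_number(depths))  # [2, 2, 2, 1, 1, 2, 3, 3, 2]
```

Windows near half the baseline depth suggest a heterozygous deletion, and windows near 1.5 times the baseline suggest a duplication.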

RNA Microarrays

  • Measure gene expression levels across samples.
  • Steps:
    • Extract RNA → reverse transcribe to cDNA.
    • cDNA labeled and hybridized to probes.
    • Fluorescent signal reflects transcription level.
  • Competitive hybridization can be used to compare expression between two samples.

RNA Sequencing (RNA-seq)

  • Provides a quantitative and comprehensive look at transcription.
  • Steps:
    • RNA → cDNA via reverse transcription.
    • cDNA sequenced via Illumina.
    • Aligned to genome:
      • Shows which genes are expressed.
      • Reveals alternative splicing and exon usage.
      • Read depth over a gene reflects its expression level.

Visualizing Gene Expression

Volcano Plots

  • Each point = a gene.
  • X-axis: log-fold change in expression between groups.
  • Y-axis: statistical significance (typically −log10 of the p-value).
  • Shows both magnitude and reliability of gene expression differences.
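
To make the two axes concrete, the sketch below computes the coordinates of a single gene on a volcano plot. It assumes the common conventions of a base-2 log fold change and −log10 of the p-value, which the notes do not specify; the expression values and p-value are invented.

```python
import math


def volcano_point(mean_a: float, mean_b: float, p_value: float) -> tuple[float, float]:
    """Return (log2 fold change of group B relative to A, -log10 p) for one gene."""
    log2_fold_change = math.log2(mean_b / mean_a)
    neg_log10_p = -math.log10(p_value)
    return log2_fold_change, neg_log10_p


# Invented values: mean expression of 10 in group A, 40 in group B, p = 0.001.
x, y = volcano_point(10.0, 40.0, 0.001)
print(x, y)
```

A gene expressed four times higher in group B lands two units right of center, and a smaller p-value places it higher on the plot.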

Heat Maps

  • Grid format with color-coding of expression intensity.
  • Columns: samples (grouped by condition).
  • Rows: genes.
  • Color scale reflects expression level per gene per sample.
  • Useful for detecting expression patterns and clusters.

What Comes After Identifying Trait-Associated Variants

  • Correlation ≠ Causation → test experimentally.

Experimental Validation Techniques

  • CRISPR-Cas9: Edit SNP or sequence to test its role in phenotype.
  • RNAi: Temporarily silence a gene to observe trait change.
  • Transgenic/Mutant Models: Introduce mutations or genes into organisms (e.g., mice, cell lines) to assess effect on phenotype.

Genomics in Clinical Medicine

  • Clinical genetic testing uses trait-associated variants to inform healthcare decisions.

Applications of Clinical Genomics

  • Diagnosis of disease or identifying genetic basis of symptoms.
  • Determining severity of conditions.
  • Personalized medicine: Matching patients with treatments based on genetic profile.
  • Risk prediction: Identifying individuals at increased risk for future disease.

Caveats in Genomic Medicine

  • Most studies overrepresent individuals of European descent.
  • Lack of diversity limits the ability to generalize findings to other populations.
  • Important because allele frequencies, gene-environment interactions, and heterogeneity differ across populations.

Genetic Variation and Evolution

  • Changes to genetic information (mutations) produce variation.
  • Mutations accumulated across generations underlie current genetic diversity.
  • Genetic variation patterns can be used to reconstruct evolutionary relationships.

Influence of Genetic Transmission, Expression, and Change on Populations

  • Genetic study of populations involves:
    • Tracking allele frequencies
    • Understanding speciation
    • Using genetics to explore evolutionary relationships

Population Genetics

  • Population genetics studies how genetic composition changes over time in response to evolutionary forces.
  • A population is a group of individuals of the same species capable of interbreeding.
    • Populations may be physically isolated or may exist across a continuous distribution.

Genotype and Allele Frequencies

  • Used to characterize genetic composition of populations.
  • Comparing frequencies between populations allows detection of population differentiation.
  • Tracking changes over generations helps detect evolution.

Quantifying Genetic Frequencies

Genotypic Frequency

  • Determined from data when genotypes can be distinguished (e.g., codominance or using DNA sequences).

Allelic Frequency

  • Can be calculated from observed genotypes.
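
As a worked example of the calculation, the sketch below counts alleles directly from genotype counts at a locus with two alleles (each homozygote carries two copies of one allele, each heterozygote one copy of each). The genotype labels and counts are invented for illustration.

```python
def allele_frequencies(n_AA: int, n_Aa: int, n_aa: int) -> tuple[float, float]:
    """Return (p, q): frequencies of alleles A and a from genotype counts."""
    total_alleles = 2 * (n_AA + n_Aa + n_aa)  # diploid: two alleles per individual
    p = (2 * n_AA + n_Aa) / total_alleles
    q = (2 * n_aa + n_Aa) / total_alleles
    return p, q


# Invented sample of 100 individuals: 50 AA, 30 Aa, 20 aa.
p, q = allele_frequencies(50, 30, 20)
print(p, q)  # 0.65 0.35
```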

Hardy-Weinberg Equilibrium (HWE)

A model that assumes an idealized population to predict genotype frequencies from allele frequencies.

Assumptions of HWE

  • No migration
  • Random mating
  • No natural selection
  • No mutation
  • Infinitely large population

HWE Equations

  • Allele frequency: p + q = 1
  • Genotype frequencies: p^2 + 2pq + q^2 = 1
    • p^2 = frequency of homozygous dominant genotype
    • 2pq = frequency of heterozygous genotype
    • q^2 = frequency of homozygous recessive genotype
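
The equations can be applied in both directions, as in this minimal sketch: from an allele frequency to expected genotype frequencies, and from the frequency of the recessive phenotype back to the carrier frequency. The function names and numbers are illustrative, and the second calculation is only valid if the population is assumed to be in HWE.

```python
import math


def hwe_genotype_frequencies(p: float) -> dict[str, float]:
    """Expected genotype frequencies under HWE for allele frequency p."""
    q = 1 - p
    return {"AA": p ** 2, "Aa": 2 * p * q, "aa": q ** 2}


def carrier_frequency(recessive_phenotype_freq: float) -> float:
    """Heterozygote frequency 2pq inferred from q^2, assuming HWE."""
    q = math.sqrt(recessive_phenotype_freq)
    p = 1 - q
    return 2 * p * q


expected = hwe_genotype_frequencies(0.65)
print(expected)  # approximately AA 0.4225, Aa 0.455, aa 0.1225

# If 4% of a population shows the recessive phenotype, q = 0.2 and
# about 32% of individuals are expected to be carriers.
print(carrier_frequency(0.04))
```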

Determining Whether a Population Is Evolving

  1. Define a null hypothesis: genotype frequencies are as predicted under HWE.
  2. Compare observed genotype counts with HWE expectations.
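  3. Significant differences (e.g., from a chi-square goodness-of-fit test) mean at least one HWE assumption is violated, which suggests evolutionary change.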
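
The comparison in step 2 is commonly done with a chi-square goodness-of-fit test, which is an assumption here since the notes do not name a test. The sketch below uses invented genotype counts and 1 degree of freedom (three genotype classes, minus one, minus one for the allele frequency estimated from the data).

```python
import math


def hwe_chi_square(n_AA: int, n_Aa: int, n_aa: int) -> tuple[float, float]:
    """Chi-square test of observed genotype counts against HWE expectations.

    Assumes both alleles are present, so every expected count is above zero.
    """
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)  # allele frequency estimated from the sample
    q = 1 - p
    observed = [n_AA, n_Aa, n_aa]
    expected = [p * p * n, 2 * p * q * n, q * q * n]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail, 1 d.f.
    return chi2, p_value


# Invented sample: 50 AA, 30 Aa, 20 aa (fewer heterozygotes than expected).
chi2, p_value = hwe_chi_square(50, 30, 20)
print(round(chi2, 2), p_value < 0.05)  # 11.6 True
```

A p-value below the chosen threshold rejects the null hypothesis of HWE, but it does not say which assumption was violated.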

Deviation from HWE in Real Populations

  • Real populations may not meet all assumptions of HWE.
  • Genomic analysis (e.g., in a Japanese population) showed that only 0.63% of SNPs deviated significantly from HWE.

Processes That Cause Deviation from HWE

Each factor represents a real-world evolutionary force that changes allele or genotype frequencies.

Non-Random Mating

  • Mating is not random with respect to genotype at a given locus. Types:
    • Assortative mating: individuals mate with others of similar phenotype/genotype.
    • Disassortative mating: individuals mate with those of dissimilar phenotype/genotype.
    • Inbreeding: mating between related individuals (a specific case of assortative mating).

Coefficient of Inbreeding (F)

  • Also called the fixation index.
  • Represents the probability that two alleles are identical by descent.
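  • As a probability of identity by descent, F ranges from 0 to 1; when estimated from heterozygosity (F = 1 - H_obs/H_exp), it ranges from –1 (excess of heterozygotes) to 1 (no heterozygotes).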
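
One common way to estimate F from data, assumed here because the notes give no formula, compares observed heterozygosity with the heterozygosity expected under HWE: F = 1 - H_obs / H_exp. The genotype counts are invented.

```python
def inbreeding_coefficient(n_AA: int, n_Aa: int, n_aa: int) -> float:
    """Estimate F as 1 - (observed heterozygosity / expected heterozygosity)."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)
    h_exp = 2 * p * (1 - p)  # heterozygosity expected under HWE
    if h_exp == 0:
        raise ValueError("F is undefined when only one allele is present")
    h_obs = n_Aa / n
    return 1 - h_obs / h_exp


print(inbreeding_coefficient(25, 50, 25))   # 0.0: matches HWE
print(inbreeding_coefficient(50, 0, 50))    # 1.0: no heterozygotes
print(inbreeding_coefficient(0, 100, 0))    # -1.0: all heterozygotes
```

Positive values indicate a deficit of heterozygotes, as expected with inbreeding; negative values indicate an excess.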

Genetic Drift

  • Random changes in allele frequencies due to sampling error in small populations.
  • More pronounced in smaller populations.
  • Founder events and bottleneck events are major causes of drift.

Natural Selection

  • Occurs when individuals with different phenotypes have different survival or reproductive success. Key formulas:
    • Relative Fitness (W): W = (avg. number of offspring of genotype) / (avg. number of offspring of most fit genotype)
      • Values range from 0 to 1
      • Higher W = greater fitness
    • Selection Coefficient (s): s = 1 - W
      • Measures strength of selection against a genotype
      • Values range from 0 to 1
      • Higher s = stronger selection
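
A short worked example of the two formulas, using invented offspring averages for three genotypes. The function name and genotype labels are hypothetical.

```python
def fitness_and_selection(mean_offspring: dict[str, float]) -> dict[str, tuple[float, float]]:
    """Return {genotype: (W, s)} with W relative to the most fit genotype."""
    best = max(mean_offspring.values())
    result = {}
    for genotype, offspring in mean_offspring.items():
        w = offspring / best  # relative fitness
        s = 1 - w             # selection coefficient
        result[genotype] = (w, s)
    return result


# Invented averages: AA leaves 10 offspring, Aa leaves 8, aa leaves 4.
values = fitness_and_selection({"AA": 10, "Aa": 8, "aa": 4})
for genotype, (w, s) in values.items():
    print(genotype, round(w, 2), round(s, 2))
# AA 1.0 0.0
# Aa 0.8 0.2
# aa 0.4 0.6
```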

Migration (Gene Flow)

  • Movement of alleles between populations. Effects:
    • Alters allele frequencies.
    • Prevents genetic divergence between populations.
    • Increases genetic variation within populations.
  • Absence of migration can lead to speciation (genetic isolation).
  • Restoration of gene flow may:
    • Reduce population uniqueness
    • Help spread beneficial alleles (e.g., disease resistance)

Mutation

  • A change in DNA that results in a new allele.
  • Key points:
    • Mutation is the ultimate source of all genetic variation.
    • Can be:
      • Beneficial
      • Neutral
      • Deleterious

Mutation frequency (μ):

  • Probability that an allele is altered by a new mutation.
  • Typical range: μ = 10^-5 to 10^-6 per gene per generation
    • Especially common for loss-of-function mutations