Genetics: Gene Proper Region, Regulation, ORFs, Mutations, and Inheritance (Comprehensive Notes)
Gene proper region, genomic locus, and regulation
- The human genome contains ~22,000 genomic loci. Each locus has a gene proper region that encodes a protein, plus regulatory switches (cis-regulatory elements) that control when, where, and how much protein is made.
- The gene proper region specifies the exact amino acid sequence of the protein; regulatory switches determine expression levels and cell-type specificity.
- If expression is mistuned (too much or too little), phenotypic problems and many diseases can arise; regulation aims for the “Goldilocks” amount of protein production, neither too much nor too little.
- Some proteins are expressed only during embryonic development; others are produced continuously (e.g., certain blood proteins are needed throughout life).
- A genomic locus with normal function has proper DNA sequence in the gene proper region plus properly functioning regulatory switches; disruptions can lead to abnormal function.
- Reminder: mutation refers to changes in the DNA sequence (DNA level). In advanced molecular biology, mRNA mutations can occur, but for this class we assume mutations are at the DNA level.
- The DNA sequence determines the mRNA sequence (through transcription with uracil replacing thymine), and the mRNA sequence determines the amino acid sequence of the protein.
- In eukaryotes, the gene proper region consists of exons (coding DNA) and introns (non-coding DNA). The template strand contains both, so the primary transcript (pre-mRNA) also includes both; the introns are then spliced out to yield the mature mRNA, which contains only the exons.
- The Open Reading Frame (ORF) is the portion of the mRNA that is translated into protein; it begins at the first start codon and continues in non-overlapping triplets (codons) until a stop codon is reached.
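The DNA to pre-mRNA to mature mRNA steps above can be sketched in a few lines. This is a minimal illustration, not a model of real splicing: the toy sequence and exon coordinates are invented, and for simplicity the code works from the coding strand (same sequence as the mRNA, with T in place of U) rather than the template strand.

```python
def transcribe(coding_strand):
    """Return the RNA matching the coding strand, with U replacing T."""
    return coding_strand.upper().replace("T", "U")


def splice(pre_mrna, exons):
    """Join the exon segments (0-based, end-exclusive) and drop the introns."""
    return "".join(pre_mrna[start:end] for start, end in exons)


# Hypothetical toy gene: exon 1, one intron, exon 2.
gene = "ATGGCTGAA" + "GTAAGCAG" + "TTCTAA"
exon_coords = [(0, 9), (17, 23)]  # assumed exon positions for this toy gene

pre_mrna = transcribe(gene)
mature_mrna = splice(pre_mrna, exon_coords)

print(pre_mrna)     # AUGGCUGAAGUAAGCAGUUCUAA (exons and intron)
print(mature_mrna)  # AUGGCUGAAUUCUAA (exons only)
```

The mature mRNA is shorter than the pre-mRNA by exactly the intron length, and it is the mature sequence that carries the ORF.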
Open Reading Frame and translation basics
- A codon is a triplet of nucleotides; each codon encodes one amino acid (except stop codons, which end translation and encode no amino acid).
- Codons are read in sets of three nucleotides, starting with ATG in DNA (AUG in mRNA).
- The ORF starts at the first ATG (AUG) and proceeds three nucleotides at a time to produce amino acids.
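- Example: If the ORF contains N nucleotides, the number of codons is N/3, so the protein is at most N/3 amino acids long (assuming no premature stop codon within the ORF). For instance, an ORF of 1200 nucleotides gives 1200/3 = 400 codons, i.e. a protein of about 400 amino acids (399 if the final stop codon, which encodes no amino acid, is counted among the 1200).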
- The sequence of triplets is non-overlapping and in-frame; as long as the ORF is intact, the full-length protein can be produced.
- A mutation that disrupts the ORF by changing the reading frame is called a frameshift mutation.
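The reading rules above (start at the first AUG, read non-overlapping triplets, stop at a stop codon) can be sketched as follows. The function names and the example sequence are illustrative assumptions; the codon table is the standard genetic code, built from the conventional U, C, A, G ordering.

```python
# Standard genetic code; "*" marks the three stop codons.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_orf(mrna):
    """Translate from the first AUG, codon by codon, until a stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so no ORF
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon ends translation
            break
        protein.append(amino_acid)
    return "".join(protein)


def codon_count(n_nucleotides):
    """Number of complete codons in a stretch of N nucleotides (N/3)."""
    return n_nucleotides // 3


example_mrna = "GGAUGGCUGAAUUCUAAGG"  # hypothetical sequence
print(translate_orf(example_mrna))   # MAEF
print(codon_count(1200))             # 400
```

The two leading and two trailing nucleotides of the example are ignored: translation only covers the stretch from AUG to the stop codon.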
Frameshift mutations and reading frame integrity
- Frameshift mutations occur when the number of nucleotides inserted or deleted is not a multiple of three. This shifts the reading frame and changes all downstream amino acids, which often yields a nonfunctional protein.
- If an insertion or deletion is a multiple of three, the reading frame is preserved but the protein length can increase or decrease by whole amino acids, potentially altering function but not necessarily abolishing it.
- Frameshift mutations are a common cause of loss of function in the affected gene.
- Visual takeaway: frameshifts derail the triplet codon structure downstream of the mutation, producing incorrect amino acids and potentially premature stop codons.
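A small comparison makes the difference between a one-nucleotide and a three-nucleotide insertion concrete. This is only a sketch: the sequence is invented, and the helper assumes the input already starts at the start codon.

```python
# Standard genetic code; "*" marks the three stop codons.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(orf):
    """Translate an ORF that begins at its start codon, stopping at a stop codon."""
    protein = []
    for i in range(0, len(orf) - 2, 3):
        amino_acid = CODON_TABLE[orf[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def insert_at(sequence, position, inserted):
    """Return the sequence with extra nucleotides inserted at a 0-based position."""
    return sequence[:position] + inserted + sequence[position:]


original = "AUGGCUGAAUUCAAAUAA"                 # hypothetical ORF
frameshifted = insert_at(original, 3, "C")      # +1 nucleotide: frame shifts
in_frame = insert_at(original, 3, "CCC")        # +3 nucleotides: frame preserved

print(translate(original))      # MAEFK
print(translate(frameshifted))  # MR (premature stop codon after the shift)
print(translate(in_frame))      # MPAEFK (one extra amino acid, rest unchanged)
```

In this toy case the single-nucleotide insertion changes the second codon and brings a stop codon into frame immediately afterward, while the three-nucleotide insertion only adds one amino acid.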
Gene proper region vs regulatory switches; cis-regulatory elements
- The genomic locus includes both the gene proper region (protein-coding DNA) and regulatory switches (cis-regulatory elements) that modulate transcription.
- Cis-regulatory elements are typically short DNA sequences (often ~50 nucleotides) that contain transcription factor binding sites.
- These switches regulate how much transcript is produced (transcription level) and in which cell types, contributing to tissue-specific expression.
- All cells in the body share the same genome, but different regulatory switches across loci yield different expression patterns across cell types.
- Some loci are active in all cell types; others are active only in a subset of cells.
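One way to picture "same genome, different expression" is a lookup from locus and cell type to an expression level. The locus names, cell types, and levels below are hypothetical placeholders chosen only to illustrate the idea; real regulation depends on which transcription factors are present in each cell.

```python
# Hypothetical regulatory switches: for each locus, the cell types in which
# it is switched on and an arbitrary expression level.
REGULATORY_SWITCHES = {
    "housekeeping_locus": {"liver": 1.0, "neuron": 1.0, "skin": 1.0},
    "liver_specific_locus": {"liver": 1.0},
}


def expression(locus, cell_type):
    """Expression level of a locus in a cell type (0.0 if the switch is off)."""
    return REGULATORY_SWITCHES[locus].get(cell_type, 0.0)


def active_loci(cell_type):
    """All loci whose switches are on in the given cell type."""
    return sorted(
        locus for locus in REGULATORY_SWITCHES if expression(locus, cell_type) > 0
    )


print(active_loci("liver"))   # ['housekeeping_locus', 'liver_specific_locus']
print(active_loci("neuron"))  # ['housekeeping_locus']
```

Every cell type consults the same table (the same genome), yet the set of active loci differs from one cell type to the next.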
Alleles, diploidy, and homologous chromosomes
- Humans are diploid: we have 46 chromosomes in total, organized into 23 pairs.
- One chromosome of each pair is inherited from each parent; these are homologous chromosomes.
- A specific locus on the maternal chromosome and the corresponding locus on the paternal chromosome are alleles of the same gene.
- Alleles can differ in sequence between the two parents, providing genetic variation.
- Alleles can be wild-type (normal function) or mutated (altered function).
- In genetics, we focus on mutations in germline DNA (passed to offspring) for inheritance studies; somatic mutations occur in body cells and are not inherited.
- Two main categories of cells: somatic cells (e.g., skin, hair, eye, kidney cells) and germline cells (gametes and the cells that give rise to them). Somatic mutations can occur in any body cell but are not passed on; only germline mutations can be transmitted to offspring.
- Carriers: individuals with one normal allele and one loss-of-function allele for a given locus; phenotype is usually normal, but they can pass the loss-of-function allele to offspring.
- Neutral mutations: many sequence changes do not affect function or phenotype; they contribute to genetic diversity without impacting fitness.
Wild type, mutated alleles, and functional output
- Wild type (normal) allele produces normal protein with proper function, regulation, and expression timing.
- Mutated allele can lead to loss of function or gain of function, depending on how the sequence change affects protein or regulation.
- Loss of function mutations reduce or abolish the functional output (no protein or nonfunctional protein).
- Gain of function mutations produce new or excess activity; can be toxic or cause disease when activity is inappropriate or excessive.
- Codominance is a traditional term that can be misleading; conceptually, both alleles can contribute functional outputs. The modern view emphasizes functional output: wild-type vs mutated alleles and their relative contributions.
- The terms dominant and recessive are treated as outdated for describing alleles in this course; instead, focus on whether an allele provides normal function, reduced function (loss of function), or increased/altered function (gain of function).
- Epistasis: interactions between alleles at different loci can influence phenotypes, such as eye color being controlled by multiple loci with potential interactions.
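The "functional output" view can be sketched as a toy calculation: each allele contributes expression level times protein activity, and the phenotype depends on whether the combined output of the two alleles falls inside an acceptable range. All numbers, thresholds, and names here are hypothetical (arbitrary units where one wild-type allele contributes 1.0); they are not measured values.

```python
# Hypothetical acceptable range for the combined output of two alleles.
LOWER_THRESHOLD = 1.0   # below this: too little functional protein
UPPER_THRESHOLD = 2.5   # above this: too much or inappropriate activity


def allele_output(expression, activity):
    """Functional output of one allele: how much is made times how well it works."""
    return expression * activity


def phenotype(allele_1, allele_2):
    """Classify the combined output of the two alleles at a locus."""
    total = allele_1 + allele_2
    if total < LOWER_THRESHOLD:
        return "too little (loss of function)"
    if total > UPPER_THRESHOLD:
        return "too much (gain of function)"
    return "normal"


WILD_TYPE = allele_output(expression=1.0, activity=1.0)
CODING_LOF = allele_output(expression=1.0, activity=0.0)      # broken protein
REGULATORY_LOF = allele_output(expression=0.0, activity=1.0)  # intact ORF, switch off
GAIN_OF_FUNCTION = allele_output(expression=1.0, activity=3.0)

print(phenotype(WILD_TYPE, WILD_TYPE))          # normal
print(phenotype(WILD_TYPE, CODING_LOF))         # normal (carrier)
print(phenotype(CODING_LOF, REGULATORY_LOF))    # too little (loss of function)
print(phenotype(WILD_TYPE, GAIN_OF_FUNCTION))   # too much (gain of function)
```

Note that the coding and regulatory loss-of-function alleles give the same output of zero: a switch that never turns on is as damaging as a protein that does not work.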
Inheritance patterns and terminology updates
- Inheritance terminology is evolving. The lecturer notes that traditional terms like autosomal recessive/dominant can be misleading; the emphasis should be on the functional outcomes of alleles (loss of function, gain of function, and wild type).
- Autosomal recessive diseases (classic examples taught in many courses):
  - Sickle cell anemia: the disease phenotype appears when two loss-of-function alleles are present (two mutant alleles with no normal allele). A carrier (one mutant allele, one normal) has a normal phenotype.
  - Phenylketonuria (PKU)
  - Cystic fibrosis (CF)
  - Some others (e.g., certain cases of DBT-related traits) discussed as Mendelian examples.
- Autosomal dominant diseases:
  - Huntington's disease, Achondroplasia (a form of dwarfism)
  - In autosomal dominant cases, one gain-of-function allele is typically enough to produce the phenotype.
  - If both alleles were gain-of-function, the combination may be lethal during embryonic development; affected individuals therefore typically carry one normal and one gain-of-function allele and express the disease phenotype.
- Important distinctions:
  - An allele with gain of function is not necessarily dominant over the normal allele; the presence of a normal allele can modulate viability and phenotype.
  - Loss of function alleles reduce or abolish function; disease occurs when there are two such alleles (autosomal recessive) or when one gain-of-function allele drives the phenotype (autosomal dominant).
- The instructor emphasizes: always relate genotype to functional output and phenotype, rather than relying on the old dominance terminology.
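The expected offspring genotypes for the patterns above can be worked out by pairing each allele from one parent with each allele from the other. The sketch below assumes each parent passes either allele with equal probability; the single-letter allele labels (W for wild type, m for a loss-of-function allele, G for a gain-of-function allele) are notation invented for this example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent_1, parent_2):
    """Map each offspring genotype to its probability.

    Each parent is a two-character string, one character per allele.
    Genotypes are sorted so that 'Wm' and 'mW' count as the same genotype.
    """
    counts = Counter(
        "".join(sorted(a + b)) for a, b in product(parent_1, parent_2)
    )
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


# Two carriers of a loss-of-function allele (autosomal recessive pattern).
print(cross("Wm", "Wm"))
# 1/4 WW (unaffected), 1/2 Wm (carriers), 1/4 mm (affected)

# One parent with a gain-of-function allele, one unaffected parent
# (autosomal dominant pattern).
print(cross("WG", "WW"))
# 1/2 WW (unaffected), 1/2 GW (affected)
```

These are probabilities for each child independently, not guaranteed ratios within a family.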
Examples and applications
- Sickle cell anemia: autosomal recessive; two loss-of-function alleles lead to disease; carriers have normal phenotype but can pass the mutation to offspring.
- Cystic fibrosis: autosomal recessive; loss of function of the CFTR chloride channel leads to thick mucus and related symptoms.
- Phenylketonuria (PKU): autosomal recessive; metabolic disorder due to loss of function of phenylalanine hydroxylase, the enzyme that converts phenylalanine to tyrosine.
- Huntington's disease: autosomal dominant; one gain-of-function allele is enough to cause disease, so heterozygotes are affected.
- Achondroplasia: autosomal dominant; one gain-of-function allele leads to the dwarfism phenotype; two gain-of-function alleles are typically lethal, so affected individuals carry one normal allele.
- Eye color is not a simple Mendelian trait; it is polygenic (at least 16 loci) with epistatic interactions, leading to a broad range of phenotypes. Many eye color variations are phenotypically neutral with respect to overall health.
Key concepts: regulation, diversity, and real-world relevance
- Regulatory switches (cis-regulatory elements) ensure tissue-specific and time-specific expression, enabling diverse protein repertoires from the same genome.
- Genetic diversity arises from sequence variation; most variations are neutral and do not alter function, but some can affect function and lead to disease.
- Epistasis and polygenic traits complicate simple Mendelian inheritance; many traits (like eye color) involve multiple loci and interactions.
- The genome of all somatic cells is the same, but gene expression differs by regulatory patterns across cell types; somatic mutations influence individual lifetime risk (e.g., cancer) but are not inherited.
- The instructor hints at modern technologies (e.g., CRISPR) for potential disease correction, with regulatory and ethical considerations to be discussed later.
Practical implications and study tips
- When evaluating mutations, distinguish between:
  - Location: gene proper region vs regulatory switches
  - Consequence: loss of function vs gain of function vs neutral
  - Inheritance pattern: how many functional alleles are required for normal phenotype
- Be able to describe how a mutation in a regulatory switch could reduce protein production to 0% (loss of function), even if the coding region is intact.
- Remember the definitions in this course: wild type vs mutated allele, loss of function vs gain of function, and how these relate to phenotype, not the old dominant/recessive labels.
- Practice connecting genotype to phenotype across examples: sickle cell, CF, PKU, Huntington's, achondroplasia, and simple polygenic traits like eye color with epistasis.
- Use the ORF concept to explain why frameshift mutations disrupt protein synthesis and lead to nonfunctional proteins.
- Understand that not every mutation is harmful; many are neutral and contribute to genetic diversity and population-level adaptability.
- When analyzing pedigrees, infer potential germline mutations and consider whether the observed pattern aligns with loss-of-function or gain-of-function explanations, keeping in mind the modern terminology.
Key numbers and quick reference
- Codon length: 3 nucleotides per codon; one amino acid per codon.
- ORF length relation: if the ORF has N nucleotides, then the number of codons (and, roughly, amino acids) = N/3.
- Start codon (DNA) = ATG (RNA: AUG).
- Regulatory element length: typically around 50 nucleotides.
- Human genome basics: ~22,000 genomic loci; humans have 46 chromosomes in total, in 23 pairs; two alleles per locus (one from each parent).
- Examples of Mendelian diseases mentioned: sickle cell anemia, phenylketonuria (PKU), cystic fibrosis (CF), Huntington's disease, achondroplasia.
Clarifications and common questions addressed in lecture
- Difference between gene proper region and genomic locus: the gene proper region is the protein-coding DNA; the genomic locus includes regulatory switches and noncoding DNA alongside the gene proper region.
- Why a carrier of an autosomal recessive disease has a normal phenotype: the one normal allele provides sufficient functional output to meet the physiological threshold, so the carrier remains unaffected.
- Why gain-of-function mutations can be harmful even if a normal allele exists: too much or misregulated function can be toxic; in some cases, one gain-of-function allele is enough to cause disease, while two may be embryonically lethal.
- Why traditional terms like recessive/dominant are being de-emphasized in this course: they can obscure the actual biology of how alleles function (wild type vs mutated, loss of function vs gain of function, and regulatory control).
- Concept of evolution and diversity: somatic mutations accumulate but are not passed to offspring; germline mutations shape inheritance and population genetics over generations.
Summary takeaways
- Every genomic locus contains a gene proper region plus regulatory switches; expression is tightly controlled to maintain proper protein levels across development and tissues.
- Mutations occur at the DNA level; they can alter ORFs or regulatory switches and can lead to loss of function or gain of function, influencing disease risk and phenotype.
- Inheritance patterns are best understood through functional output (wild type vs mutated) and the presence/absence of functional alleles, rather than relying on outdated dominant/recessive labels.
- Complex traits often involve multiple loci and interactions (epistasis), and many mutations are neutral yet contribute to genetic diversity and resilience.
- Understanding these concepts lays the groundwork for reading pedigrees, interpreting genetic diseases, and appreciating how genetics informs biology and medicine.