Molecular basis of genetic polymorphisms - comprehensive notes
From Mendel’s traits to genes
Allele = variant at a genetic locus; Mendel studied traits as antagonistic pairs, with each form of a trait corresponding to a different allele.
Alleles underlie phenotypic variation; mutations create new alleles and allelic diversity.
Mutation and inheritance
Mutation is the process whereby genes change from one allelic form to another; new alleles can arise.
Mutations occur randomly, at any time and in any cell of an organism.
Mutations can arise spontaneously during normal DNA replication or be induced by a mutagen.
Only mutations in germline cells can be transmitted to progeny; somatic mutations cannot.
Inherited mutations appear as alleles in populations of individuals; mutations are the source of allelic variation.
Germline vs somatic mutations
Germline mutations: in germinal tissue; passed to offspring; contribute to inherited variation.
Somatic mutations: in somatic tissue; can create a mutant sector within an organism but are not inherited by progeny.
Conceptual representation: germline mutations in germinal tissue lead to mutant progeny; somatic mutations affect only the individual.
Allele frequency and polymorphism
Allele frequency = percentage of the total number of gene copies represented by one allele.
Wild-type allele – frequency ≥ 1%.
Mutant allele – frequency < 1%.
Monomorphic – gene with only one wild-type allele.
Polymorphic – gene with more than one wild-type allele.
Forward mutation – changes wild-type allele to a different allele.
Reverse mutation (reversion) – a new mutation that changes a mutant allele back to the wild-type allele.
Mutations are the source of allelic variation.
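The definitions above can be turned into a small calculation. The following is a minimal sketch, assuming a hypothetical sample of 1,000 gene copies with made-up allele names (A1, A2, a3); only the 1% cutoff comes from the notes.

```python
from collections import Counter

WILD_TYPE_CUTOFF = 0.01  # 1% threshold from the notes


def allele_frequencies(gene_copies):
    """Return {allele: frequency} from a list with one entry per gene copy."""
    counts = Counter(gene_copies)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def classify_gene(gene_copies):
    """Label each allele wild-type or mutant, and the gene mono- or polymorphic."""
    freqs = allele_frequencies(gene_copies)
    labels = {
        allele: ("wild-type" if f >= WILD_TYPE_CUTOFF else "mutant")
        for allele, f in freqs.items()
    }
    n_wild_type = sum(1 for label in labels.values() if label == "wild-type")
    gene_type = "polymorphic" if n_wild_type > 1 else "monomorphic"
    return freqs, labels, gene_type


# Hypothetical sample: 500 diploid individuals = 1,000 gene copies.
sample = ["A1"] * 700 + ["A2"] * 295 + ["a3"] * 5
freqs, labels, gene_type = classify_gene(sample)
print(freqs)
print(labels)
print(gene_type)
```

In this sample A1 and A2 both pass the 1% cutoff, so the gene counts as polymorphic, while a3 (5 of 1,000 copies) is classed as a mutant allele.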
Mutation rates and dynamics
Mutation rate varies from 1 in 1,000 to 1 in 1,000,000,000 per gene per gamete.
The forward mutation rate is almost always higher than the reverse rate.
Mutations can occur during normal DNA replication.
The mutation rate can increase after exposure to a mutagen (e.g., UV light, certain chemicals).
Quantitative reference:
Forward vs reverse mutation rates:
Typical range: 10^{-9} \le \mu \le 10^{-3} \text{ per gene per gamete}, where \mu is the mutation rate (varies by gene and context).
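To get a feel for what these rates mean, the sketch below multiplies a per-gene, per-gamete rate by a number of gametes to give the expected count of gametes carrying a new mutation in that gene. The sample size of one million gametes is an arbitrary assumption chosen for illustration.

```python
def expected_new_mutations(rate_per_gene_per_gamete, n_gametes):
    """Expected number of gametes carrying a new mutation in one gene."""
    return rate_per_gene_per_gamete * n_gametes


# Rates spanning the range quoted in the notes; 1,000,000 gametes is an assumed sample size.
for rate in (1e-3, 1e-6, 1e-9):
    print(rate, expected_new_mutations(rate, 1_000_000))
```

At the high end of the range, about a thousand of the million gametes would carry a new mutation in the gene; at the low end, most samples of this size would contain none.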
Types of DNA mutations
Substitution – a base is replaced by one of the other three bases.
Deletion – loss of a block of one or more DNA base pairs.
Insertion – addition of a block of one or more DNA base pairs.
Inversion – 180° rotation of a segment of DNA within a chromosome.
Reciprocal translocation – parts of nonhomologous chromosomes exchange places.
Chromosomal rearrangements – affect many genes at once.
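The small-scale mutation types listed above can be illustrated on a short sequence. This is a sketch only: the 12-base sequence and the function names are made up for illustration, and the inversion is modeled as the reverse complement of the segment, which is how a 180° rotation reads on a single strand.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def substitute(seq, pos, base):
    """Replace the base at 0-based position pos with one of the other three bases."""
    if seq[pos] == base:
        raise ValueError("a substitution must change the base")
    return seq[:pos] + base + seq[pos + 1:]


def delete(seq, start, length):
    """Remove a block of `length` bases starting at 0-based position start."""
    return seq[:start] + seq[start + length:]


def insert(seq, pos, block):
    """Add a block of bases before 0-based position pos."""
    return seq[:pos] + block + seq[pos:]


def invert(seq, start, end):
    """Rotate seq[start:end] by 180 degrees (reverse complement on one strand)."""
    segment = seq[start:end]
    return seq[:start] + segment.translate(COMPLEMENT)[::-1] + seq[end:]


wild_type = "ATGGCCTTAGGA"  # hypothetical sequence
print(substitute(wild_type, 4, "T"))  # ATGGTCTTAGGA
print(delete(wild_type, 3, 3))        # ATGTTAGGA
print(insert(wild_type, 3, "AAA"))    # ATGAAAGCCTTAGGA
print(invert(wild_type, 3, 9))        # ATGTAAGGCGGA
```

Reciprocal translocations and larger rearrangements involve more than one chromosome and are not modeled here.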
Single nucleotide polymorphisms (SNPs) and polymorphism
SNPs are alleles differing at a single nucleotide position.
Polymorphism = detectable difference at a given locus/gene; this difference is what defines an allele.
DNA sequence differences can be used to infer allelic variation.
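As a rough illustration of how sequence differences reveal alleles, the sketch below compares two already-aligned sequences of equal length and reports the positions where they differ by a single nucleotide. The two sequences are invented for the example, and real data would first need alignment.

```python
def find_snps(allele_1, allele_2):
    """Return (1-based position, base in allele_1, base in allele_2) for each mismatch."""
    if len(allele_1) != len(allele_2):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i + 1, a, b)
        for i, (a, b) in enumerate(zip(allele_1, allele_2))
        if a != b
    ]


allele_1 = "ATGGCCTTAGGA"  # hypothetical allele
allele_2 = "ATGGCTTTAGGA"  # same locus, one base different
print(find_snps(allele_1, allele_2))  # [(6, 'C', 'T')]
```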
From DNA to phenotype: central dogma and alleles
Mendel’s traits are encoded in DNA; organismal traits arise from gene expression.
DNA -> mRNA -> protein; allelic differences at the DNA level can influence mRNA expression and/or protein function, thereby affecting phenotype.
Flow: DNA sequence variations can alter transcription, RNA processing, translation, and protein function.
Gene structure
Prokaryotic/bacterial genes: a promoter regulates transcription of a region containing one or more genes (an operon); transcription produces mRNA and translation yields proteins.
Eukaryotic genes: contain introns and exons that are transcribed; promoters regulate transcription initiation; splicing removes introns to produce mature mRNA.
Key features in gene architecture:
Promoter
Coding sequence
Start codon (ATG) and stop codon (TAA, TAG, or TGA; TGA is the one used in these examples)
Exons (coding segments) and introns (intervening sequences)
Transcriptional and post-transcriptional processing yields mature RNA that is translated into protein.
Gene expression and alleles: examples
Start codon: ATG; stop codon: TGA (TAA and TAG also act as stop codons).
Transcription produces mRNA; splicing produces mature RNA (exons joined, introns removed).
Translation produces nascent protein, which folds into functional protein.
Wild-type allele vs mutant allele: mutations can affect transcription, splicing, translation, or protein folding, leading to altered or nonfunctional proteins.
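The flow from gene to protein, and the effect of a single mutation on it, can be sketched as follows. Everything specific here is an assumption for illustration: the toy gene (two exons and one intron), the exon coordinates, and the deliberately partial codon table that covers only the codons used. The sequence is written as coding-strand DNA, so T stands in for U.

```python
# Partial codon table: only the codons that occur in this toy example.
CODON_TABLE = {"ATG": "M", "GCC": "A", "TGG": "W", "AAA": "K",
               "TAA": "*", "TAG": "*", "TGA": "*"}


def splice(gene, exons):
    """Join exon segments (0-based, end-exclusive coordinates), dropping introns."""
    return "".join(gene[start:end] for start, end in exons)


def translate(coding):
    """Translate from the first ATG until a stop codon; 'X' marks unknown codons."""
    start = coding.find("ATG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(coding) - 2, 3):
        amino_acid = CODON_TABLE.get(coding[i:i + 3], "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical gene: exon 1 + intron + exon 2.
wild_type_gene = "ATGGCC" + "GTAAGTCCAG" + "TGGAAATGA"
exons = [(0, 6), (16, 25)]

# Substitution G -> A in exon 2 turns the TGG codon into the stop codon TGA.
mutant_gene = wild_type_gene[:18] + "A" + wild_type_gene[19:]

print(translate(splice(wild_type_gene, exons)))  # MAWK
print(translate(splice(mutant_gene, exons)))     # MA
```

The mutant allele differs by one base but yields a truncated product, which is one way a mutation can lead to a nonfunctional protein.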
Allelic variation and disease: PKU (phenylketonuria)
PKU overview: buildup of phenylalanine; lack of tyrosine; potential seizures and mental/mood disorders.
Enzymatic cause: Phenylalanine hydroxylase deficiency; leads to phenylalanine buildup and downstream tyrosine depletion.
Biochemical consequence: accumulation of phenylpyruvic acid affecting nervous system development.
Genetic basis: mutations can occur in PAH gene; mutations can be in exons or introns (affecting splicing) and inactivate the gene.
BRCA1 and disease risk
BRCA1: tumor-suppressor gene involved in repairing DNA damage.
Mutations in BRCA1 can disrupt DNA repair and increase cancer risk, particularly breast and ovarian cancer.
Population data: hundreds of BRCA1 mutations have been identified; commonly cited figures are a ~12% baseline lifetime breast cancer risk in the general population versus ~60% for carriers of harmful BRCA1 mutations.
Functional consequences of mutations
Wild-type phenotype arises when two copies of the wild-type allele are present.
Mutant alleles can have several effects:
Loss-of-function (null/amorph) – little or no functional gene product.
Gain-of-function (hypermorphic) – increased level or activity of the normal gene product.
Gain-of-function (neomorphic) – introduces a new function/structure.
Leaky/hypomorphic – partial loss of function; reduced but not abolished activity.
Haploinsufficiency – one wild-type allele is not enough for normal function.
Dominant-negative – mutant allele produces a product that interferes with normal product from the wild-type allele.
Haploinsufficiency and dominance concepts
Haplosufficiency: one wild-type allele provides enough gene product for normal function (e.g., 50% activity can be sufficient).
Haploinsufficiency: one WT allele not enough to maintain normal function.
Dominant-negative: mutant product disrupts function of the normal product in heterozygotes, often seen with multimeric proteins.
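One common way to see why dominant-negative effects are often associated with multimeric proteins is a simple counting model. The sketch below assumes, purely for illustration, that a heterozygote makes equal amounts of wild-type and mutant subunits, that subunits assemble at random, and that a single mutant subunit inactivates the whole complex. None of these assumptions comes from the notes.

```python
def functional_fraction(subunits_per_complex, mutant_share=0.5):
    """Fraction of complexes built only from wild-type subunits under random assembly."""
    return (1 - mutant_share) ** subunits_per_complex


# Monomer, dimer, tetramer in a heterozygote (assumed 50% mutant subunits).
for n in (1, 2, 4):
    print(n, functional_fraction(n))
```

Under these assumptions a monomeric protein retains 50% activity, a dimer 25%, and a tetramer about 6%, so the heterozygote can fall well below the 50% level expected from simple loss of one allele.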
Practical inheritance patterns with enzyme activity examples
Example: recessive loss-of-function in enzyme activity
WT allele (R+) yields active enzyme (e.g., 50 units).
Mutant allele (r) yields little/no active enzyme (0 units).
WT phenotype when total activity ≥ threshold (e.g., 40+ units).
Genotypes: R+R+ (WT), R+r (WT/haplosufficient), rr (mutant phenotype).
Example: dominant loss-of-function and haploinsufficiency
T1T1 (20 units) may be WT; T1T2 (15 units) and T2T2 (10 units) show mutant phenotype.
The T2 allele is dominant; the WT T1 allele is haploinsufficient (one copy not enough for normal function).
Example: haploinsufficiency and dominance illustrated with dosage of protein products.
Example: dominant negative mutations – interactions between mutant and normal gene products cause abnormal phenotypes.
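The two enzyme-activity examples above (recessive loss of function, and haploinsufficiency) follow the same logic: add up the activity contributed by each allele and compare the total with a threshold. The sketch below encodes that logic. The per-allele values for T1 and T2 (10 and 5 units) are inferred from the genotype totals in the notes, and the 20-unit threshold for that example is an assumption, since the notes do not state one.

```python
def phenotype(genotype, units_per_allele, threshold):
    """Return (total activity, phenotype) for a two-allele genotype."""
    total = sum(units_per_allele[allele] for allele in genotype)
    return total, ("wild-type" if total >= threshold else "mutant")


# Recessive loss of function: R+ gives 50 units, r gives 0, threshold 40.
recessive = {"R+": 50, "r": 0}
for genotype in (("R+", "R+"), ("R+", "r"), ("r", "r")):
    print(genotype, phenotype(genotype, recessive, threshold=40))

# Haploinsufficiency: T1 = 10 units, T2 = 5 units (inferred); threshold 20 is assumed.
haploinsufficient = {"T1": 10, "T2": 5}
for genotype in (("T1", "T1"), ("T1", "T2"), ("T2", "T2")):
    print(genotype, phenotype(genotype, haploinsufficient, threshold=20))
```

In the first case the heterozygote clears the threshold, so the mutant allele is recessive; in the second it does not, so the mutant allele behaves as dominant.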
More on mutation outcomes
Loss of function – haplosufficiency vs haploinsufficiency; 50% protein can be enough or not depending on system.
Gain of function – neomorphic mutation introduces a new function or novel structure.
Detection of allelic polymorphism at the molecular level
Techniques: PCR and DNA sequencing; new technologies enable visualization of allelic polymorphism.
Ultimate detection resides at DNA sequence level; polymorphism can be detected from DNA to protein levels.
Analyses performed on the diploid nuclear genome.
PCR and DNA sequencing: workflow and interpretation
PCR amplifies the region of interest, producing enough copies of that region for sequencing and other analyses.
DNA sequencing reveals the exact nucleotide sequence across the amplified region.
Sequencing data supports detection of SNPs, insertions/deletions, and more complex variants.
Example visualization challenges include handling fragment bases and reads; real data often includes multiple reads and alignment considerations.
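Because the analysis is done on a diploid genome, a site covered by multiple reads can show one base (homozygous) or two (heterozygous). The sketch below is a deliberately simplified genotype call from the bases observed at a single position. The 20% minimum allele fraction is an arbitrary assumption used to ignore occasional sequencing errors; real variant callers use statistical models instead.

```python
from collections import Counter


def call_genotype(bases_at_site, min_fraction=0.2):
    """Call a diploid genotype from the bases that reads show at one position."""
    counts = Counter(bases_at_site)
    depth = sum(counts.values())
    if depth == 0:
        raise ValueError("no reads cover this site")
    # Keep only bases seen in at least min_fraction of the reads.
    alleles = sorted(b for b, n in counts.items() if n / depth >= min_fraction)
    if len(alleles) == 1:
        return alleles[0] * 2          # homozygous
    if len(alleles) == 2:
        return "".join(alleles)        # heterozygous
    return "ambiguous"


# Hypothetical pileups: ten reads each, with one likely sequencing error.
print(call_genotype("AAAAAGGGGT"))  # AG
print(call_genotype("AAAAAAAAAC"))  # AA
```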
SNP detection and disease
SNP-based screening can identify carriers or affected individuals.
Example: recessive disease screening using SNP profiles (e.g., AA no disease, AG carrier, GG diseased).
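The recessive screening example above amounts to a lookup from SNP genotype to status. A minimal sketch, using the A/G genotypes from the notes and an invented set of family members:

```python
# Genotype-to-status mapping from the example: G is the disease-associated allele.
STATUS = {"AA": "no disease", "AG": "carrier", "GG": "diseased"}


def screen(genotype):
    """Return the status for a two-letter SNP genotype, ignoring allele order."""
    key = "".join(sorted(genotype.upper()))
    return STATUS.get(key, "unknown genotype")


# Hypothetical family members and their SNP genotypes.
family = {"mother": "AG", "father": "GA", "child 1": "GG", "child 2": "AA"}
for person, genotype in family.items():
    print(person, genotype, screen(genotype))
```

Sorting the two letters means "AG" and "GA" are treated as the same heterozygous genotype.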
How to screen for BRCA1-related breast cancer risk
Gene sequencing is the most comprehensive approach but is expensive; the BRCA1 gene spans more than ~80,000 bp.
SNP detection for common mutations can be cheaper but less comprehensive.
A suggested strategy: screen the affected individual using gene sequencing to identify causal mutation; if a causal SNP is found, use targeted SNP detection to screen at-risk relatives (full gene sequencing not required for relatives if the causal SNP is known).
New technologies in genetics
Next-generation sequencing (NGS) enables massive parallel sequencing at much lower costs; Illumina is a common platform.
Visualization tools (e.g., IGV) help inspect read alignments, coverage, and variant calls across the genome.
Example data: scaffold coordinates; read coverage across genomic regions; long-read vs short-read strategies.
Probability rules in genetics
Product rule (multiplication rule): P(A and B) = P(A) × P(B) for independent events.
Sum rule (addition rule): P(A or B) = P(A) + P(B) − P(A and B).
Conditional probability and binomial probability concepts are used to infer genotype/phenotype proportions in crosses and samples.
Example: independent loci in a dihybrid cross yield probabilities calculated by multiplying individual locus probabilities.
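Both rules can be checked by brute-force enumeration. The sketch below assumes a hypothetical dihybrid cross AaBb × AaBb with independently assorting loci and complete dominance, lists the 16 equally likely gamete combinations, and confirms that the product rule and the sum rule give the same answers as direct counting.

```python
from fractions import Fraction
from itertools import product

# An AaBb parent makes four equally likely gametes (independent assortment assumed).
gametes = ["AB", "Ab", "aB", "ab"]
offspring = list(product(gametes, repeat=2))  # 16 equally likely combinations


def shows_dominant(pair, locus):
    """True if at least one gamete carries the dominant (uppercase) allele at locus."""
    return any(gamete[locus].isupper() for gamete in pair)


def prob(event):
    """Probability of an event by counting offspring that satisfy it."""
    return Fraction(sum(1 for o in offspring if event(o)), len(offspring))


p_a = prob(lambda o: shows_dominant(o, 0))
p_b = prob(lambda o: shows_dominant(o, 1))
p_both = prob(lambda o: shows_dominant(o, 0) and shows_dominant(o, 1))
p_either = prob(lambda o: shows_dominant(o, 0) or shows_dominant(o, 1))

print(p_a, p_b)    # 3/4 3/4
print(p_both)      # 9/16, equal to p_a * p_b (product rule)
print(p_either)    # 15/16, equal to p_a + p_b - p_both (sum rule)
```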
Applying product rule: worked example (as in slides)
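For a genotype spanning loci R, Y, T, and S (the slides use the genotypes RRYYTTSS and rrYYttss):
Compute the probability of the desired genotype at each locus separately, then multiply the per-locus probabilities to get the combined probability of the multi-locus genotype.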
Example calculation shown in slides: product of individual locus probabilities, e.g., 2/4 × 2/4 × 2/4 × 1/4 = 8/256 = 1/32 for a multi-locus genotype.
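The arithmetic in this example can be reproduced by computing each locus separately and multiplying. The slides' exact cross is not given here, so the sketch below assumes a hypothetical cross, RrYyTtSs × RrYyTtSs with target genotype RrYyTtss, chosen because it produces the same per-locus factors (2/4, 2/4, 2/4, 1/4).

```python
from fractions import Fraction
from itertools import product
from math import prod


def locus_prob(parent_1, parent_2, target):
    """P(offspring has genotype `target`) at one locus, e.g. locus_prob('Rr', 'Rr', 'Rr')."""
    # Each parent passes on one of its two alleles with equal probability.
    outcomes = ["".join(sorted(a + b)) for a, b in product(parent_1, parent_2)]
    return Fraction(outcomes.count("".join(sorted(target))), len(outcomes))


# Hypothetical cross, one (parent 1, parent 2, target) triple per locus.
loci = [("Rr", "Rr", "Rr"), ("Yy", "Yy", "Yy"), ("Tt", "Tt", "Tt"), ("Ss", "Ss", "ss")]
per_locus = [locus_prob(p1, p2, target) for p1, p2, target in loci]

print(per_locus)        # per-locus probabilities: 1/2, 1/2, 1/2, 1/4
print(prod(per_locus))  # 1/32
```

Note that `Fraction` reduces 2/4 to 1/2 automatically, so the product prints as 1/32 rather than 8/256.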
Mutually exclusive events and sum rule in genetics
When events are mutually exclusive, probabilities add directly: P(A or B) = P(A) + P(B).
Practice problems (overview of questions given in the slides)
Practice problem #1: Determine the type of individual given a genotype A/A; B/B; c/c; D/d.
Options: A) monohybrid B) dihybrid C) trihybrid D) tetrahybrid
Practice problem #2: Cross genotype with multiple traits; compute proportion of progeny phenotypically identical to the first parent.
Practice problem #3–#4: Similar multi-trait crosses; compute the proportion of progeny phenotypically or genotypically identical to a parent.
Practice problem #5: Classical inheritance problem involving a new mutation and penetrance/homozygosity; interpret allele type from F2 phenotypic ratios.
Practice problem #6: Pedigree-style question about a dihybrid cross for two traits with different dominance relationships; calculate probabilities for specific phenotypes.
Summary of key concepts to remember
Allele and mutation concepts; germline vs somatic inheritance.
Definitions: wild-type, mutant, monomorphic, polymorphic; forward vs reverse mutations; allele frequency.
Mutation rates across genes; effects of mutagens; general patterns of forward > reverse.
Types of DNA mutations and their consequences on gene function and phenotype.
SNPs and polymorphisms; detection methods from DNA to protein level.
Gene structure in eukaryotes vs prokaryotes; transcription, splicing, and translation; regulatory elements.
Allelic variation and disease: PKU and BRCA1 as examples of how mutations cause disease.
Functional consequences of mutations: loss-of-function, gain-of-function, haploinsufficiency, haplosufficiency, dominant-negative, neomorphic.
Central dogma and how mutations can alter gene expression and protein function.
Detection technologies: PCR, DNA sequencing, and emerging high-throughput sequencing; practical screening strategies.
Probability rules in genetics: product rule, sum rule, independent events, conditional probability, and basic binomial concepts.
Practice problems illustrate the application of these concepts to real-world genetic questions.