Molecular Human Identification and Genetic Polymorphisms

Genetic Polymorphisms and Markers

  • Definition of DNA Polymorphism: A DNA sequence difference compared to a reference standard that is present in at least 1% to 2% of a population.
    - Polymorphisms can range in size from a single base to thousands of bases.
    - They may or may not result in phenotypic effects; usually, they are considered normal genetic occurrences.
    - These differences facilitate the molecular tracking of genes with clinical importance.
    - They are distributed throughout the genome; on autosomes, there is approximately one sequence difference for every 1,000 to 1,500 nucleotides.
  • Polymorphic Sequences as Landmarks: If the location of a polymorphic sequence is known, it serves as a landmark or marker for locating other genes or genetic regions.
  • Alleles: Each polymorphic marker has different versions, known as alleles.
  • Variable DNA Sequence Forms:
    - HLA Typing: Human Leukocyte Antigen typing.
    - Transposable Elements:
        - LINEs: Long interspersed nuclear elements.
        - SINEs: Short interspersed nuclear elements.
  • Primary Types of Polymorphic DNA Sequences:
    - RFLP: Restriction fragment length polymorphisms. These change the fragment size of DNA and are typically detected using Southern blotting.
    - VNTR: Variable number tandem repeats. The repeat units are 10 to 50 bp in length and are referred to as minisatellites. The total length of a VNTR region ranges from 500 to 20,000 bp.
    - STR: Short tandem repeats. The repeat units are 1 to 10 bp in length and are referred to as microsatellites.
    - SNP: Single-nucleotide polymorphisms.

Short Tandem Repeats (STR) and Microsatellites

  • Inheritance: STRs are transmitted in a Mendelian fashion, characterized by independent random assortment without linkage. One allele is inherited from each parent.
  • Analysis Methodology: STR typing has been adapted for automated fluorescence analysis.
    - PCR primers flank the specific locus of interest.
    - Alleles are labeled with fluorescent dyes for multiplexing (testing multiple loci simultaneously).
    - Commercial STR kits provide defined loci, minimal artifacts, and equal amplification across alleles.
  • STR Structure and Nomenclature:
    - STRs consist of repeat nucleotide sequences:
        - Dinucleotide: e.g., ATATAT…
        - Trinucleotide: e.g., TAGTAGTAG…
        - Tetranucleotide: e.g., TAGTTAGTTAGT…
        - Pentanucleotide: e.g., TAGGCTAGGCTAGGC…
    - Allele Designation: Alleles are identified by the number of repeats they contain.
        - Example: A trinucleotide sequence repeated four times (TTCTTCTTCTTC) is a "4" allele.
        - Example: The same sequence repeated five times (TTCTTCTTCTTCTTC) is a "5" allele.
        - A heterozygous individual with both would have a genotype of 4,5 at that locus.
  • Visualizing VNTR/STR via Digestion and Blotting:
    - Allele 1: Contains 3 repeat units, resulting in a 200 bp restriction product.
    - Allele 2: Contains 6 repeat units, resulting in a 230 bp restriction product.
    - Allele 3: Contains 9 repeat units, resulting in a 260 bp restriction product.
  • Primer Design and Amplicons:
    - PCR primers allow for multiplexing by targeting amplicons of 100 to 400 bp.
    - Allelic Ladders: Used as benchmarks to identify alleles.
    - Microvariants: Alleles that contain a partial repeat unit. Genotypes are reported as allele pairs (e.g., 7/8 and 7/10); a microvariant allele is conventionally written with a decimal (e.g., 9.3 for nine full repeats plus three bases).
    - STR Resolution: Can be performed via polyacrylamide gel electrophoresis or capillary electrophoresis. Capillary electrophoresis displays peaks on an electropherogram corresponding to specific loci, such as D3S1358, VWA, and FGA, sized against a molecular weight standard.
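
The allele designation and fragment-size examples above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, real genotyping software sizes fragments against an allelic ladder rather than reading raw sequence, and the 10 bp repeat unit is inferred from the three example alleles rather than stated in the notes.

```python
def count_tandem_repeats(sequence, motif):
    """Return the longest run of back-to-back copies of `motif` in `sequence`."""
    best = 0
    for start in range(len(sequence) - len(motif) + 1):
        run = 0
        pos = start
        # Count consecutive copies of the motif beginning at this position.
        while sequence.startswith(motif, pos):
            run += 1
            pos += len(motif)
        best = max(best, run)
    return best


def genotype(allele_seq_a, allele_seq_b, motif):
    """Designate each allele by its repeat count and report the pair in order."""
    counts = (count_tandem_repeats(allele_seq_a, motif),
              count_tandem_repeats(allele_seq_b, motif))
    return tuple(sorted(counts))


def repeat_unit_length(size_a, repeats_a, size_b, repeats_b):
    """Infer the repeat unit length (bp) from two alleles of known size."""
    return (size_b - size_a) / (repeats_b - repeats_a)


# The TTC example: four repeats and five repeats give genotype 4,5.
print(genotype("TTCTTCTTCTTC", "TTCTTCTTCTTCTTC", "TTC"))

# The blotting example: 3 repeats = 200 bp and 6 repeats = 230 bp.
unit = repeat_unit_length(200, 3, 230, 6)
flank = 200 - 3 * unit  # non-repeat sequence shared by every allele
print(unit, flank)
```

Under these assumptions, each added repeat unit lengthens the fragment by 10 bp, which is consistent with the 260 bp product for the 9-repeat allele.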

Forensic and Identity Applications

  • Gender Identification (Amelogenin Locus):
    - The Amelogenin locus is not an STR.
    - The HUMAMEL gene codes for an amelogenin-like protein.
    - It is located at Xp22.1-22.3 and on the Y chromosome.
        - X allele: 212 bp.
        - Y allele: 218 bp.
    - Interpretation: Females (X, X) appear homozygous (one peak/band); males (X, Y) appear heterozygous (two peaks/bands).
  • Paternity Testing:
    - Comparisons are made between the child, the mother, and the alleged father (AF).
    - Paternity Index (PI): The likelihood ratio of paternity for a specific locus.
    - Combined Paternity Index (CPI): The product of all individual PIs.
    - Example Data (Inclusion):
        - Locus D16S539: Child (8, 9), Alleged Father (9, 10), Shared Allele (9), PI = 5.719
        - Locus D5S818: Child (10, 12), Alleged Father (7, 12), Shared Allele (12), PI = 8.932
        - Locus FESFPS: Child (9, 13), Alleged Father (13, 14), Shared Allele (13), PI = 15.41
  • CODIS (Combined DNA Index System):
    - Inspired by the work of Sir Alec Jeffreys (RFLP and DNA fingerprinting).
    - Utilized by the Armed Forces Institute of Pathology.
    - Hierarchy:
        - LDIS: Local DNA Index System.
        - SDIS: State DNA Index System.
        - NDIS: National DNA Index System.
    - Used by organizations like The Innocence Project to exonerate the wrongly convicted.
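
As a worked example of the combined paternity index, the three PI values from the inclusion data above can be multiplied together. The sketch below is illustrative only: the function names are hypothetical, and the probability-of-paternity step with a 0.5 prior is a common convention that is assumed here, not something stated in these notes.

```python
import math

# Paternity index per locus, taken from the example inclusion data.
locus_pi = {"D16S539": 5.719, "D5S818": 8.932, "FESFPS": 15.41}


def combined_paternity_index(paternity_indices):
    """CPI is the product of the individual locus PIs."""
    return math.prod(paternity_indices)


def probability_of_paternity(cpi, prior=0.5):
    """Convert a CPI to a posterior probability (assumed prior of 0.5)."""
    return cpi * prior / (cpi * prior + (1 - prior))


cpi = combined_paternity_index(locus_pi.values())
print(round(cpi, 2))                            # about 787.18
print(round(probability_of_paternity(cpi), 4))  # about 0.9987
```

Because the CPI is a product, every additional informative locus multiplies the evidence, which is why multiplexed STR panels give such large combined values.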

Population Genetics and Probability

  • Hardy-Weinberg Equilibrium: Describes the population frequency of two alleles, p and q, using the formula:
    - $p^2 + 2pq + q^2 = 1.0$
    - Here $p^2$ and $q^2$ are the expected frequencies of the two homozygous genotypes and $2pq$ is the expected frequency of the heterozygous genotype.
    - Assumptions: Large population, random mating, no immigration, no emigration, no mutation, and no natural selection.
    - Allows for the approximation of true allele frequencies in a population, provided the sample assessed is sufficiently large.
  • The Product Rule: Used to calculate the matching probability of STR genotypes across multiple loci. The overall profile frequency is the product of the genotype frequencies at the individual loci.
    - 8 loci: African American (1 in 274,000,000); White American (1 in 114,000,000); Hispanic American (1 in 145,000,000).
    - 9 loci: African American (1 in 5.18 × 10^9); White American (1 in 1.03 × 10^9); Hispanic American (1 in 1.84 × 10^9).
    - 14 loci: African American (1 in 6.11 × 10^17); White American (1 in 9.96 × 10^17); Hispanic American (1 in 1.31 × 10^17).
  • Inclusion/Exclusion criteria: A profile is considered different (excluded) if at least one locus genotype is different.
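
The three ideas in this section fit together as a short calculation: Hardy-Weinberg gives an expected genotype frequency at each locus, the product rule multiplies those frequencies across loci, and two profiles are excluded as a match when any locus differs. The following is a minimal sketch; the function names and the allele frequencies are hypothetical illustration values, not population data.

```python
import math


def genotype_frequency(p, q=None):
    """Hardy-Weinberg expectation: p^2 for a homozygote, 2pq for a heterozygote."""
    return p * p if q is None else 2 * p * q


def profile_frequency(genotype_frequencies):
    """Product rule: multiply the genotype frequencies of independent loci."""
    return math.prod(genotype_frequencies)


def is_excluded(profile_a, profile_b):
    """Profiles differ (exclusion) if any locus typed in both has a different genotype."""
    shared = profile_a.keys() & profile_b.keys()
    return any(sorted(profile_a[locus]) != sorted(profile_b[locus]) for locus in shared)


# Hypothetical allele frequencies for a three-locus profile.
freqs = [
    genotype_frequency(0.2, 0.1),    # heterozygote: 2pq
    genotype_frequency(0.3),         # homozygote: p^2
    genotype_frequency(0.25, 0.05),  # heterozygote: 2pq
]
match_probability = profile_frequency(freqs)
print(f"about 1 in {1 / match_probability:,.0f}")

evidence = {"D16S539": (9, 10), "D5S818": (7, 12)}
suspect = {"D16S539": (10, 9), "D5S818": (7, 12)}
print(is_excluded(evidence, suspect))  # allele order within a genotype is ignored
```

The product rule assumes the loci assort independently, which is why the loci in an STR panel are chosen to be unlinked.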

Uniparental Inheritance: Y-STR and Mitochondrial DNA

  • Y-STRs:
    - Follow paternal inheritance.
    - The profile is referred to as a haplotype.
    - Uses: Forensic analysis of mixed samples (male/female), lineage studies (e.g., the Hemings & Jefferson case), and population studies.
  • Mitochondrial DNA (mtDNA) Polymorphisms:
    - Follow maternal inheritance.
    - All maternal relatives share the same mitochondrial sequence.
    - Mitochondrial Genome: Total size is 16,569 bp.
    - Hypervariable Regions: HV1 (342 bp) and HV2 (268 bp).
    - Unrelated individuals have an average of 8.5 base differences in these regions.
    - Applications: Legal exclusion of individuals or confirmation of maternal lineage (e.g., the Anastasia of Russia case).

Single Nucleotide Polymorphisms (SNPs) and HapMap

  • SNP Characteristics: Single-nucleotide differences between DNA sequences.
  • Frequency: One SNP occurs approximately every 1,250 bp in the human genome.
  • Detection Methods: Sequencing, melt curve analysis, and other molecular techniques.
  • Biological Impact: 99% have no biological effect; approximately 60,000 are located within genes.
  • Inheritance: SNPs are inherited in organized blocks called haplotypes.
  • HapMap Project: The Human Haplotype Mapping Project aims to identify SNP haplotypes throughout the human genome for mapping genes, identification, and chimerism analysis.

Clinical Monitoring: Engraftment and Chimerism Testing

  • Monitoring Bone Marrow Transplants:
    - Autologous Transplant: Recipient receives their own purged cells.
    - Allogeneic Transplant: Recipient receives donor cells.
  • Chimerism: A recipient with donor marrow is considered a chimera.
  • Analysis Phases:
    - Pre-transplant Informative Analysis: STRs are scanned to find "informative loci" where donor alleles differ from recipient alleles.
    - Post-transplant Engraftment Analysis: Monitoring for complete chimerism, mixed chimerism, or graft failure.
  • Calculations for Engraftment:
    - Peak areas are measured in fluorescence units or by densitometry.
    - $A(R)$ = Area under recipient-specific peaks.
    - $A(D)$ = Area under donor-specific peaks.
    - % Recipient DNA Formula:
        - $\%\ \text{Recipient DNA} = \frac{A(R)}{A(R) + A(D)} \times 100$
    - % Donor DNA Formula:
        - $\%\ \text{Donor DNA} = \frac{A(D)}{A(R) + A(D)} \times 100$
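
The two analysis phases can be sketched as follows: first pick out loci where donor and recipient carry alleles the other lacks, then apply the percentage formulas to the donor-specific and recipient-specific peak areas. This is a minimal illustration; the function names, genotypes, and peak areas are hypothetical, and real analyses also have to handle stutter peaks and shared alleles, which are ignored here.

```python
def informative_loci(donor, recipient):
    """Map each informative locus to (donor-only alleles, recipient-only alleles)."""
    result = {}
    for locus in donor.keys() & recipient.keys():
        donor_only = donor[locus] - recipient[locus]
        recipient_only = recipient[locus] - donor[locus]
        if donor_only or recipient_only:
            result[locus] = (donor_only, recipient_only)
    return result


def percent_recipient(area_recipient, area_donor):
    """% Recipient DNA = A(R) / (A(R) + A(D)) x 100."""
    total = area_recipient + area_donor
    if total == 0:
        raise ValueError("no donor- or recipient-specific peak area measured")
    return 100 * area_recipient / total


def percent_donor(area_recipient, area_donor):
    """% Donor DNA = A(D) / (A(R) + A(D)) x 100."""
    return 100 - percent_recipient(area_recipient, area_donor)


# Hypothetical pre-transplant genotypes (locus -> set of alleles).
donor = {"D5S818": {10, 12}, "VWA": {16, 17}}
recipient = {"D5S818": {10, 12}, "VWA": {16, 18}}
print(informative_loci(donor, recipient))  # only VWA distinguishes the two

# Hypothetical post-transplant peak areas at the informative locus.
print(percent_recipient(500, 4500), percent_donor(500, 4500))
```

In this example the recipient-specific peak accounts for 10% of the signal, which would be read as mixed chimerism rather than complete chimerism.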

Clinical Molecular Disorders: Nucleotide Repeat Expansions

  • Fragile X Syndrome:
    - Associated with the FMR-1 gene and intellectual disability (historically termed mental retardation).
    - Expansion: CGG repeat.
        - Normal: 5 to 55 repeats.
        - Carrier (Premutation): 56 to 200 repeats.
        - Full Mutation: > 200 repeats (often associated with methylation).
    - Detection: PCR can detect premutations (50 to 90 repeat range), but because of the large size of full mutations, Southern blot is required for definitive detection.
  • Huntington Disease:
    - Neurological degenerative disorder associated with the Huntingtin gene.
    - Typically manifests when patients are in their 40s to 50s.
    - A child of an affected parent has a 50% chance of inheriting the disorder.
    - Expansion: CAG repeat.
        - Normal: 10 to 29 repeats (80 to 170 bp amplicon).
        - Huntington Disease: > 40 repeats.
    - Detection: Labeled PCR primers and autoradiogram of polyacrylamide gels.
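
The repeat-count thresholds listed for both disorders can be expressed as a simple lookup. The sketch below uses only the ranges given in these notes; the function names are hypothetical, and counts that fall outside the listed ranges (for example, 30 to 40 CAG repeats) are deliberately left unclassified rather than guessed.

```python
def classify_fragile_x(cgg_repeats):
    """Classify an FMR-1 CGG repeat count using the ranges in these notes."""
    if cgg_repeats > 200:
        return "full mutation"
    if cgg_repeats >= 56:
        return "premutation (carrier)"
    if cgg_repeats >= 5:
        return "normal"
    return "outside the ranges listed"


def classify_huntington(cag_repeats):
    """Classify a Huntingtin CAG repeat count using the ranges in these notes."""
    if cag_repeats > 40:
        return "Huntington disease range"
    if 10 <= cag_repeats <= 29:
        return "normal"
    return "outside the ranges listed"


for count in (30, 56, 200, 201):
    print(count, classify_fragile_x(count))

for count in (20, 35, 45):
    print(count, classify_huntington(count))
```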