Genetics and Humans
Agenda
Genetics and Humans - A. Human genetic experiments
i. Pedigrees (case-study)
ii. Environment and Genetics - Twins
iii. De novo mutations
iv. Linkage
v. Association studies (cross-sectional) - i. Application: polygenic diseases - GWAS
ii. Historical analysis
Learning Outcomes
Describe how trio studies work to discover de novo mutations.
Describe what type of mutations trio studies were designed to identify.
Explain how the concept of linkage disequilibrium is applied to GWAS.
Extrapolate information from GWAS graphs.
Compare and contrast the benefits of GWAS to pedigree analysis.
Identify when to use pedigree analysis vs GWAS.
Analyze and describe the possible pitfalls and drawbacks of a GWAS study.
Investigation into Pedigrees and Genetic Analysis
Only a small percentage of the genetic basis of disease has been identified through pedigrees.
Reason for limited identification:
Pedigrees primarily reveal monogenic traits: traits caused by a mutation in a single gene with a large effect size and high penetrance, which makes the inheritance pattern clear across generations. These are often Mendelian disorders.
Pedigree analysis requires comprehensive multigenerational family data, including detailed clinical and sequence data, to accurately track inheritance patterns and the segregation of disease-causing alleles. This limits its utility for complex, polygenic diseases and for sporadic cases.
Pedigree Analysis
Focuses on monogenic diseases (single gene disorders) and typically requires:
Large effect size mutations: These mutations have a significant and observable impact on phenotype, allowing for clear identification of affected individuals within a family tree.
Generational family data: Essential for tracking inheritance patterns (e.g., autosomal dominant, recessive, X-linked) and performing segregation analysis to identify the genetic locus responsible.
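The inheritance patterns tracked in a pedigree follow from simple allele-transmission probabilities. The sketch below is a minimal illustration for a single autosomal locus with two alleles; the function name `offspring_genotypes` and the allele labels `A` and `a` are hypothetical choices made for this example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def offspring_genotypes(parent1, parent2):
    """Return {genotype: probability} for one autosomal locus.

    Each parent is a two-character string of alleles, e.g. "Aa".
    Every combination of one allele from each parent is equally likely.
    """
    counts = Counter(
        "".join(sorted(a + b)) for a, b in product(parent1, parent2)
    )
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


# Autosomal dominant pattern: affected heterozygote (Aa) x unaffected (aa).
dominant_cross = offspring_genotypes("Aa", "aa")

# Autosomal recessive pattern: two unaffected carriers (Aa x Aa).
carrier_cross = offspring_genotypes("Aa", "Aa")

print(dominant_cross)
print(carrier_cross)
```

In the first cross half of the offspring are expected to carry the dominant allele; in the second, one quarter are expected to be homozygous for the recessive allele. Segregation analysis compares ratios like these with what is observed across the generations of a family.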
Types of Genetic Diseases:
Monogenic disorders: Caused by mutations in a single gene, often following Mendelian inheritance patterns (e.g., Huntington's disease, cystic fibrosis).
De novo mutations: These are new mutations that appear for the first time in an individual and are not inherited from either parent. They can be a cause of both monogenic and polygenic conditions.
Polygenic traits/diseases: Complex traits or diseases influenced by multiple genes, each contributing a small effect size, and often interacting with environmental factors. Pedigree analysis is generally less effective for these due to the dispersed genetic influence.
Mechanisms of Mutations
Points of Mutation Occurrence:
Mutations can occur during various stages, most notably gametogenesis (formation of sperm and egg cells) or early embryonic development. Mutations during gametogenesis can be passed to offspring if the affected gamete is involved in fertilization. Mutations during early embryonic development can result in a mosaic individual if they occur post-zygotically.
Replication Errors:
DNA replication is remarkably accurate, with an estimated error rate of approximately $1 \times 10^{-10}$ errors per nucleotide per replication cycle. Even so, the errors that do occur can produce point mutations, insertions, or deletions.
The cumulative effect of the immense number of cell divisions that occur during embryonic development and continuously throughout an individual's life provides numerous opportunities for DNA replication errors and environmental damage to introduce new mutations into somatic and germline cells.
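A rough back-of-the-envelope calculation shows why even a very low error rate matters over many divisions. The sketch below uses the error rate quoted above; the diploid genome size of about 6.2 billion base pairs is an assumption that does not come from these notes, and the function name is hypothetical. It counts replication errors only and ignores environmental damage.

```python
ERROR_RATE_PER_BP = 1e-10   # from the text: errors per nucleotide per replication
DIPLOID_GENOME_BP = 6.2e9   # assumption: ~3.1e9 bp haploid human genome, times two


def expected_replication_errors(divisions,
                                rate=ERROR_RATE_PER_BP,
                                genome_bp=DIPLOID_GENOME_BP):
    """Expected number of new replication errors along one cell lineage."""
    return divisions * rate * genome_bp


for divisions in (1, 10, 100, 1000):
    expected = expected_replication_errors(divisions)
    print(f"{divisions:>5} divisions: about {expected:.1f} expected new mutations")
```

Under these assumptions a single division introduces fewer than one new mutation on average, but the expected count grows in proportion to the number of divisions in a lineage.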
Types of De Novo Mutations
Normal/Sporadic De Novo Mutation:
This refers to the standard occurrence of a new mutation in an individual, not inherited from parents. If it's a dominant mutation, it can directly cause a phenotype.
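Homogeneous De Novo Mutation:
Occurs at the very start of embryonic development, before the first cell division (in the one-cell zygote, or in the gamete itself). Consequently, the mutation is present in every cell of the resulting organism, and the individual is phenotypically indistinguishable from one who inherited the mutation. A mutation that first appears in one cell at the two-cell stage or later is carried by only part of the embryo.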
Mosaic De Novo Mutation:
Occurs post-zygotically, after the first cell divisions of embryonic development. The mutation is therefore present in only a subset of cells, leading to variability in symptom severity and tissue distribution depending on when and where the mutation arose during cell divisions.
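The link between the timing of a mutation and the degree of mosaicism can be sketched with a simplified model. It assumes, purely for illustration, that the mutation arises in exactly one cell, that it is heterozygous, and that every cell present at that moment contributes equally to the final body; real development does not follow these assumptions strictly. The function names are hypothetical.

```python
from fractions import Fraction


def mutant_cell_fraction(cells_at_mutation):
    """Fraction of cells carrying a mutation that arose in one cell
    when the embryo consisted of `cells_at_mutation` cells."""
    if cells_at_mutation < 1:
        raise ValueError("the embryo must contain at least one cell")
    return Fraction(1, cells_at_mutation)


def expected_allele_fraction(cells_at_mutation):
    """Expected share of sequencing reads showing the mutant allele,
    assuming a heterozygous mutation in diploid cells."""
    return mutant_cell_fraction(cells_at_mutation) / 2


for stage in (1, 2, 4, 8, 16):
    print(stage, mutant_cell_fraction(stage), expected_allele_fraction(stage))
```

In this model a mutation at the one-cell stage reaches every cell, while each later division halves the share of cells that carry it, which is one way to see why later mutations tend to produce patchier, more variable phenotypes.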
Familial and De Novo Inheritance
For a de novo recessive mutation to manifest a phenotype:
If a new mutation is recessive, it typically requires a second event (like Loss-of-Heterozygosity) or the presence of another affected allele (either inherited or another de novo event) to cause a phenotype, as a single recessive allele usually does not lead to disease.
Loss-of-Heterozygosity (LOH):
Describes the loss of the normal allele at a gene locus in a cell that was previously heterozygous (carrying one normal and one mutated allele). This can occur through mechanisms such as deletion, non-disjunction, or mitotic recombination, and it effectively unmasks a recessive mutation, for example one in a tumor suppressor gene.
Identification of De Novo Mutations
Techniques to identify de novo mutations:
High-throughput sequencing technologies, such as whole-exome sequencing (WES) or whole-genome sequencing (WGS), are applied in trio studies to compare genome sequences across individuals, specifically the affected child and both biological parents. This allows for the precise identification of variants present in the child but absent in both parents.
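The core comparison can be sketched in a few lines. This is a minimal illustration rather than a real variant-calling pipeline: it assumes each person's genotypes are already available as a dictionary that maps a variant, written as (chromosome, position, reference allele, alternate allele), to the number of alternate alleles carried. The function name and the toy data are hypothetical.

```python
def find_de_novo_candidates(child, mother, father):
    """Return variants carried by the child but absent from both parents."""
    return sorted(
        variant
        for variant, alt_count in child.items()
        if alt_count > 0
        and mother.get(variant, 0) == 0
        and father.get(variant, 0) == 0
    )


# Hypothetical toy genotypes: alternate-allele count per variant.
child = {
    ("chr1", 1000, "A", "G"): 1,   # also present in the mother: inherited
    ("chr2", 5000, "C", "T"): 1,   # absent from both parents: candidate
    ("chr3", 750, "G", "A"): 2,    # one copy from each parent: inherited
}
mother = {("chr1", 1000, "A", "G"): 1, ("chr3", 750, "G", "A"): 1}
father = {("chr3", 750, "G", "A"): 1}

candidates = find_de_novo_candidates(child, mother, father)
print(candidates)
```

Variants that survive this filter are only candidates. In practice each one would still need checks on read depth and genotype quality in all three individuals before being accepted as a true de novo mutation.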
Use of Trio Studies in Genetic Analysis
Trio studies focus on small family units (mother, father, and an affected child) for genetic investigations. Compared with simply checking the child against general population reference genomes, this approach significantly improves accuracy by:
Filtering inherited variants: Distinguishing de novo mutations from common inherited polymorphisms.
Reducing false positives: Identifying sequencing errors by checking for Mendelian inconsistencies (variants that violate expected inheritance patterns) in the trio.
Pinpointing causality: Providing strong evidence for a de novo mutation as the likely cause of a genetic disorder, particularly in cases where parents are unaffected.
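The Mendelian-inconsistency check mentioned in the list above can be sketched for a single biallelic autosomal site. Genotypes are coded as the number of alternate alleles (0, 1, or 2); this coding and the function names are assumptions made for the example.

```python
def transmissible_alleles(alt_count):
    """Alternate-allele counts (0 or 1) that a parent with the given
    genotype can pass on to a child."""
    return {0: {0}, 1: {0, 1}, 2: {1}}[alt_count]


def is_mendelian_consistent(child, mother, father):
    """True if the child's genotype can be formed from one allele
    transmitted by each parent."""
    return any(
        m + f == child
        for m in transmissible_alleles(mother)
        for f in transmissible_alleles(father)
    )


# (child, mother, father) genotypes at hypothetical sites.
sites = [(1, 1, 0), (2, 1, 1), (1, 0, 0), (0, 2, 0)]
for child_gt, mother_gt, father_gt in sites:
    status = "consistent" if is_mendelian_consistent(child_gt, mother_gt, father_gt) else "inconsistent"
    print(child_gt, mother_gt, father_gt, status)
```

An inconsistent site is either a genuine de novo event or a genotyping error. A site such as (1, 0, 0) fits a single new mutation, whereas one such as (0, 2, 0) would require the child to have lost an allele that the mother must transmit, so it more plausibly points to a technical artifact.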
Polygenic Traits and Diseases
It is increasingly acknowledged that very few, if any, human traits or diseases are purely monogenic. Most exhibit a polygenic component and are influenced by environmental factors.
Complexity arises because traits are often influenced by multiple genes, which may exhibit duplications, redundancies (where multiple genes perform similar functions), and intricate epistatic interactions (where the effect of one gene is modified by one or several other genes). This leads to a continuum of phenotypic expression rather than clear-cut Mendelian categories.
Genome-Wide Association Studies (GWAS)
Definition:
GWAS is a powerful approach that systematically surveys genetic variants across the genomes of thousands or even hundreds of thousands of unrelated individuals. It compares the variants (primarily Single Nucleotide Polymorphisms, or SNPs) carried by a large group with a specific disease (cases) against those carried by a group without the disease (controls) to identify variants that are statistically associated with the disease.
Focus:
Primarily focuses on identifying genetic variants associated with polygenic traits and common complex diseases where each contributing mutation or variant has a small effect size, but collectively, they increase disease risk.
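For a single SNP, the case-control comparison comes down to a test on a 2x2 table of allele counts. The sketch below uses a Pearson chi-square test with one degree of freedom, which is one common choice and not necessarily the test used in any particular study. The allele counts are hypothetical, and the genome-wide significance threshold of 5e-8 is a widely used convention that does not come from these notes.

```python
import math

GENOME_WIDE_THRESHOLD = 5e-8  # assumption: conventional GWAS significance cutoff


def allelic_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Pearson chi-square statistic and p-value (1 degree of freedom)
    for a 2x2 table of allele counts."""
    table = [[case_alt, case_ref], [control_alt, control_ref]]
    row_totals = [sum(row) for row in table]
    col_totals = [case_alt + control_alt, case_ref + control_ref]
    n = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def odds_ratio(case_alt, case_ref, control_alt, control_ref):
    """Odds of carrying the alternate allele in cases relative to controls."""
    return (case_alt * control_ref) / (case_ref * control_alt)


# Hypothetical counts: 2,000 cases and 2,000 controls, two alleles each.
counts = (1400, 2600, 1200, 2800)
chi2, p = allelic_chi_square(*counts)
effect = odds_ratio(*counts)
print(f"chi2={chi2:.2f}  p={p:.2e}  OR={effect:.2f}  -log10(p)={-math.log10(p):.2f}")
print("genome-wide significant:", p < GENOME_WIDE_THRESHOLD)
```

The -log10(p) value is the quantity usually plotted on the vertical axis of a Manhattan plot. Because the same test is repeated for a very large number of SNPs, the threshold is far stricter than the usual 0.05, which is part of why small effect sizes call for such large sample sizes.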
Linkage Disequilibrium
Key Concept:
Linkage Disequilibrium (LD) refers to the non-random association of alleles at different loci (gene locations) on a chromosome. When two alleles appear together in a population more often (or less often) than would be expected by random chance (i.e., if they were in linkage equilibrium), they are said to be in LD.
Mechanism: LD arises when two loci are physically close on a chromosome such that recombination events between them are rare. It can also be influenced by selection, mutation, migration, or population admixture.
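The idea of alleles appearing together more or less often than expected by chance can be quantified. Two standard measures are the coefficient D, which is the observed haplotype frequency minus the frequency expected under independence, and the squared correlation r^2. The sketch below computes both from a list of haplotypes; the 0/1 allele coding, the function name, and the toy haplotype counts are assumptions made for illustration.

```python
def linkage_disequilibrium(haplotypes):
    """Compute D and r^2 for two biallelic loci.

    `haplotypes` is a list of (allele_at_locus_1, allele_at_locus_2)
    pairs, with alleles coded 0 or 1. Both loci must be polymorphic.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n          # allele 1 frequency, locus 1
    p_b = sum(b for _, b in haplotypes) / n          # allele 1 frequency, locus 2
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                             # observed minus expected
    r2 = d ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, r2


# Hypothetical sample in which the two "1" alleles tend to travel together.
linked = [(1, 1)] * 40 + [(0, 0)] * 40 + [(1, 0)] * 10 + [(0, 1)] * 10
# Hypothetical sample in linkage equilibrium: all four haplotypes equally common.
unlinked = [(1, 1)] * 25 + [(0, 0)] * 25 + [(1, 0)] * 25 + [(0, 1)] * 25

print(linkage_disequilibrium(linked))
print(linkage_disequilibrium(unlinked))
```

A D of zero corresponds to linkage equilibrium. Values of r^2 close to 1 mean that knowing the allele at one locus almost determines the allele at the other.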
Application in GWAS: GWAS leverages LD. Instead of needing to test every single causal variant directly, researchers can genotype a smaller, representative set of genetic markers (like SNPs) across the genome. Because these genotyped markers are in LD with ungenotyped causal variants (variants that directly influence disease risk), the genotyped markers can serve as proxies. This means that if a genotyped SNP is found to be associated with a disease, it likely indicates that a nearby, ungenotyped causal variant (or the genotyped SNP itself) is functionally involved, making the study more efficient and cost-effective.
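The proxy idea can be illustrated with a small numerical sketch. It assumes a hypothetical scenario in which only the causal variant affects disease risk, while a genotyped tag SNP sits on the same haplotype as the causal allele most of the time. All haplotype counts below are invented for illustration, and the shortcut chi-square formula for a 2x2 table is used for brevity.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def allele_counts(haplotype_counts, locus):
    """Count (allele 1, allele 0) at one locus of (tag, causal) haplotypes."""
    ones = sum(n for hap, n in haplotype_counts.items() if hap[locus] == 1)
    total = sum(haplotype_counts.values())
    return ones, total - ones


# Hypothetical haplotype counts, keyed by (tag allele, causal allele).
# The causal allele is enriched in cases; the tag allele accompanies it
# 80% of the time and appears on 10% of non-causal haplotypes.
cases = {(1, 1): 320, (0, 1): 80, (1, 0): 60, (0, 0): 540}
controls = {(1, 1): 240, (0, 1): 60, (1, 0): 70, (0, 0): 630}

TAG, CAUSAL = 0, 1
chi2_causal = chi_square_2x2(*allele_counts(cases, CAUSAL), *allele_counts(controls, CAUSAL))
chi2_tag = chi_square_2x2(*allele_counts(cases, TAG), *allele_counts(controls, TAG))

print(f"causal variant chi2: {chi2_causal:.2f}")
print(f"tag SNP chi2:        {chi2_tag:.2f}")
```

In this toy data the tag SNP still shows an association with disease even though it has no effect of its own, but the signal is weaker than at the causal variant. This is why an associated SNP marks a region to follow up rather than identifying the causal variant directly, and why weaker LD between marker and causal variant demands larger samples.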