Disease Gene Mapping

59 Terms

1

What is disease gene mapping and why is it important?

Disease gene mapping identifies the genomic regions and genes responsible for diseases. It helps understand disease mechanisms, enables molecular diagnosis, guides genetic counselling, and can reveal new targets for treatment.

2

How does disease gene mapping differ for rare and common diseases?

Rare diseases (<1 in 2,000): Usually caused by single-gene (monogenic) mutations. Harder to diagnose (“diagnostic odyssey”). Require specialist care.

Common diseases (>1 in 2,000): Usually polygenic (involving many small-effect variants). Easier to diagnose but have higher public health impact.

Both can be mapped genetically to uncover molecular causes.

3

Define monogenic and polygenic disease.

Monogenic: Caused by mutation in a single gene (e.g. cystic fibrosis, sickle cell anaemia).

Polygenic: Caused by variants in multiple genes plus environmental factors (e.g. type 2 diabetes, hypertension).

4

What are the key genetic concepts underlying disease gene mapping?

Genetic variation

Homologous recombination

Independent assortment

Allele frequencies

These determine how genes and variants are inherited and distributed across populations.

5

What is genetic variation, and what are its effects?

Differences in DNA sequence between individuals in a population. Variants can be inherited or arise from environmental causes (e.g. radiation, drugs).

Effects:

Alters protein function (e.g. missense mutations)

Affects gene regulation (when/where/how a gene is expressed)

Influences phenotype (observable traits and disease risk)

Can also be silent (no effect on phenotype)

6

What are the main mechanisms creating genetic variation?

1. DNA replication errors → mutations, polymerase slippage

2. Independent assortment → random distribution of chromosomes during meiosis

3. Homologous recombination → exchange of chromosome segments between homologous chromosomes

7

What is homologous recombination?

It’s the reciprocal exchange of DNA segments between homologous chromosomes during meiosis, producing new allele combinations.

8

What is the difference between a mutation and a polymorphism?

Mutation: A rare DNA variant (low population frequency).

Polymorphism: A common variant (≥1% frequency).

Both represent DNA changes compared to the reference sequence, but frequency distinguishes them.

9

What is Minor Allele Frequency (MAF)?

The frequency of the less common allele in a population.

MAF < 1% = mutation

MAF ≥ 1% = polymorphism

MAF helps determine if a variant is rare or common and varies across populations.

Example of allele frequencies (APOE rs7412 C>T variant):

C (major allele): 67%

T (minor allele): 33%

Minor allele frequency (MAF) = 0.33
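The MAF calculation and the 1% cut-off above can be sketched in a few lines of Python. This is a minimal illustration, not a standard tool: the function names are hypothetical, and the counts simply restate the 67%/33% example as counts out of 100 alleles.

```python
def minor_allele_frequency(allele_counts):
    """Return (minor_allele, frequency) for a variant given allele counts."""
    total = sum(allele_counts.values())
    # The minor allele is the one with the smallest count.
    minor_allele = min(allele_counts, key=allele_counts.get)
    return minor_allele, allele_counts[minor_allele] / total


def classify_by_maf(maf, threshold=0.01):
    """Apply the card's convention: MAF >= 1% is a polymorphism."""
    return "polymorphism" if maf >= threshold else "mutation"


# Assumed counts: 67 C alleles and 33 T alleles out of 100 sampled.
counts = {"C": 67, "T": 33}
allele, maf = minor_allele_frequency(counts)
label = classify_by_maf(maf)
print(allele, round(maf, 2), label)
```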

10

What does “allele frequency” mean? + Why does allele frequency vary across populations?

It’s the proportion of a particular allele among all copies of a gene in a population.

Allele frequencies vary across populations because of genetic drift, founder effects, natural selection, migration, and differences in ancestry.

11

What is the relationship between allele frequency and effect size in disease?

Rare alleles → tend to have larger effect sizes (can cause monogenic disorders)

Common alleles → tend to have smaller effects (contribute to complex traits)

12

What is meant by ‘mapping’ a disease gene?

It’s the process of identifying the chromosomal location of a gene responsible for a disease, using methods like linkage analysis or GWAS.

13

What are the two main approaches used to map disease genes? + Why do we need different mapping approaches for rare vs. common diseases?

1. Linkage Analysis: Tracks inheritance of disease alleles within families (for rare diseases).

2. Genome-Wide Association Studies (GWAS): Compares variant frequencies between unrelated cases and controls (for common diseases).

Rare diseases have strong single-gene effects → family-based linkage analysis works best.

Common diseases have small multi-gene effects → require population-based GWAS to detect associations.

14

What is meant by genetic linkage?

Genetic linkage refers to the tendency of genes or DNA markers that are physically close together on the same chromosome to be inherited together during meiosis because crossing-over is less likely to occur between them.

15

What is the basic principle of linkage analysis?

Linkage analysis looks for co-segregation of genetic markers with a disease within families.

If a marker allele is inherited by all affected relatives and rarely by unaffected ones, it suggests the marker lies close to the disease gene on the chromosome.

16

What is the difference between “linkage” and “independent assortment”?

Independent assortment: Loci far apart (or on different chromosomes) segregate randomly.

Linkage: Loci close together segregate non-independently, remaining together more often than expected by chance.

17

Describe the two scenarios that illustrate linkage principles.

Disease gene far from marker → High recombination frequency → Independent assortment.

Disease gene near marker → Low recombination → Co-inheritance (non-recombinants dominate).

18

What is a recombinant vs. non-recombinant?

Recombinant: Offspring chromosome that has undergone crossing-over between marker and disease gene.

Non-recombinant: No crossover occurred; the marker and gene stayed together.
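Counting recombinant and non-recombinant offspring gives an estimate of the recombination fraction (often written θ). The sketch below assumes every offspring can be scored unambiguously; the function name and the counts are made up for illustration. Values near 0 suggest tight linkage, while values near 0.5 look like independent assortment.

```python
def recombination_fraction(recombinants, non_recombinants):
    """Estimate theta as the share of offspring that are recombinant."""
    total = recombinants + non_recombinants
    if total == 0:
        raise ValueError("at least one scored offspring is required")
    return recombinants / total


# Hypothetical family data: 2 recombinants among 20 scored offspring.
theta = recombination_fraction(2, 18)
print(theta)
```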

19

What is a haplotype?

A haplotype is a set of alleles at multiple loci on the same chromosome that are inherited together from one parent.

Helps track the transmission of linked genetic variants across generations.

20

How is linkage analysis conducted (basic workflow)?

  1. Collect DNA from multiple family members, both affected and unaffected.

  2. Genotype multiple markers (microsatellites or SNPs) across the genome.

  3. Identify markers that co-segregate with disease in affected relatives.

  4. Markers consistently inherited with the disease pinpoint a candidate genomic region for the causal gene.

21

What are genetic markers?

Known, heritable DNA sequence variants used to track inheritance. Two main types:

Microsatellites (short tandem repeats, STRs)

Single Nucleotide Polymorphisms (SNPs)

Microsatellites are ideal for family-based linkage analysis because of their variability, whereas SNPs are preferred for genome-wide mapping and association studies because of their abundance and ease of automated genotyping.

22

What are microsatellites?

Short tandem repeats of 1–6 bp sequences (e.g. CACACACAC).

Highly polymorphic → many allele sizes.

Detected using fluorescently-labelled PCR primers and capillary electrophoresis.

Commonly used in family-based linkage mapping due to high variability.

23

What are SNPs (Single Nucleotide Polymorphisms)?

DNA variants at a single nucleotide position (e.g. C→T).

Usually biallelic (two possible alleles).

Provide dense coverage across the genome.

Genotyped on microarrays using fluorescent probes.

Used in both linkage and association studies.

24

What does “genome coverage” mean in linkage studies?

It refers to how evenly genetic markers span the genome.

Example: Comparing the CIDR microsatellite panel with the Affymetrix 10k SNP array shows which regions each covers and where gaps remain (e.g. near centromeres and telomeres).

25

What are the steps for building a haplotype in a pedigree? + What is the “critical disease interval”?

  1. List genotypes for each marker in family members.

  2. Deduce which allele each parent passed to each child.

  3. Identify recombination points.

  4. Visualise paternal and maternal haplotypes across loci.

Critical disease interval:

The shared region of the chromosome (shared haplotype) inherited by all affected individuals but not by unaffected ones — narrowing down where the disease gene lies.

This interval contains the causal gene and guides further sequencing.

26

What does “refinement of the critical interval” involve?

Comparing recombination events across multiple affected relatives to define the smallest overlapping segment that must contain the causal gene.

27

Why is sequencing still needed after linkage analysis?

Linkage shows where the gene is located, not what the causal variant is. Sequencing the region identifies the specific mutation.

28

What is linkage mapping using genetic markers?

Linkage mapping uses an observed locus (genetic marker) to make inferences about an unobserved locus (the disease gene).

If a marker is genetically linked to the disease locus, affected relatives will share the same marker alleles more often than expected by chance.

29

What does it mean when a marker is “unlinked” to a disease locus?

If the marker and disease gene are far apart, recombination occurs frequently. Affected family members are no more likely to share the same marker alleles than expected by random segregation.

30

What does it mean when a marker is “linked” to a disease locus?

The marker and disease gene are close together, so recombination is rare. Affected relatives inherit the same marker alleles together more often than by chance, revealing a region likely to contain the disease gene.

31

How are haplotypes used in linkage mapping?

Haplotypes show which alleles at different loci were inherited together from each parent. By comparing haplotypes in affected and unaffected relatives, researchers can identify the region of the chromosome that co-segregates with the disease.

32

What does a haplotype represent in a pedigree analysis? + How is a haplotype built in linkage analysis?

Each parent contributes one haplotype (set of marker alleles) to each child. By tracing these through the pedigree, recombination events can be seen where a child inherits a mix of alleles from different parental haplotypes.

How is a haplotype built in linkage analysis?

1. Determine genotypes at multiple marker loci for each family member.

2. Assign which alleles are inherited from the maternal and paternal chromosomes.

3. Align them in order along the chromosome.

4. Detect recombination events and mark the crossover boundaries.
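Step 4 can be illustrated with a small Python sketch. It assumes the phasing in steps 1 to 3 is already done, so one parent's two haplotypes and the haplotype that parent passed to a child are known as ordered lists of marker alleles. The function name and the allele values are hypothetical.

```python
def detect_crossovers(parent_hap_a, parent_hap_b, child_hap):
    """Return (left_index, right_index) marker pairs flanking each crossover.

    Markers where both parental haplotypes carry the same allele are
    uninformative and are skipped.
    """
    crossovers = []
    last_origin, last_index = None, None
    for i, (a, b, c) in enumerate(zip(parent_hap_a, parent_hap_b, child_hap)):
        if a == b:
            continue  # uninformative marker: cannot tell the origin
        if c == a:
            origin = "A"
        elif c == b:
            origin = "B"
        else:
            raise ValueError(f"allele at marker {i} matches neither haplotype")
        if last_origin is not None and origin != last_origin:
            # The child's haplotype switched parental source between markers.
            crossovers.append((last_index, i))
        last_origin, last_index = origin, i
    return crossovers


# Hypothetical alleles at five ordered markers.
hap_a = [1, 3, 2, 5, 4]
hap_b = [2, 4, 2, 6, 1]
child = [1, 3, 2, 6, 1]  # starts on haplotype A, ends on haplotype B
breaks = detect_crossovers(hap_a, hap_b, child)
print(breaks)
```

The crossover is only localised to the interval between the two nearest informative markers, which is why marker density limits how precisely a boundary can be placed.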

33

How do recombination events refine the disease locus?

By comparing recombination breakpoints across multiple affected individuals, researchers can identify the smallest shared haplotype segment that all affected individuals have in common — this defines the critical disease interval.
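As a rough sketch of this idea, suppose each affected individual's copy of the disease haplotype has been summarised as a (start, end) segment bounded by that person's recombination breakpoints. The critical interval is then the overlap of all segments. The function name, the positions, and their units are assumptions for illustration only.

```python
def critical_interval(shared_segments):
    """Intersect (start, end) segments; return None if they do not overlap."""
    if not shared_segments:
        raise ValueError("at least one segment is required")
    start = max(s for s, _ in shared_segments)
    end = min(e for _, e in shared_segments)
    return (start, end) if start < end else None


# Hypothetical segments (arbitrary map units) for three affected relatives.
segments = [(10, 80), (25, 90), (5, 60)]
interval = critical_interval(segments)
print(interval)
```

Each additional affected relative with a breakpoint inside the current interval can only shrink it, which is why larger pedigrees give finer localisation.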

34

What statistical method is used to assess linkage strength? + What does a LOD score represent mathematically?

The LOD score (logarithm of the odds) is a statistical measure used to evaluate whether two loci are likely to be linked versus unlinked.

LOD = log₁₀ (Probability of data if loci are linked / Probability of data if loci are unlinked)

It compares how likely the observed inheritance pattern is under linkage versus random segregation.
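For the simplest textbook case, the ratio above can be computed directly. The sketch below assumes phase-known, fully informative meioses, so the likelihood under linkage at recombination fraction θ is θ^R × (1 − θ)^NR for R recombinants and NR non-recombinants, and the likelihood under no linkage uses θ = 0.5. Real pedigrees usually need dedicated linkage software; the function name and counts here are illustrative.

```python
import math


def lod_score(recombinants, non_recombinants, theta):
    """LOD at a given theta for phase-known meioses (simplified model)."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    n = recombinants + non_recombinants
    likelihood_linked = theta ** recombinants * (1 - theta) ** non_recombinants
    likelihood_unlinked = 0.5 ** n
    if likelihood_linked == 0:
        # A recombinant was observed but theta = 0 says none can occur.
        return float("-inf")
    return math.log10(likelihood_linked / likelihood_unlinked)


# Hypothetical data: 10 informative meioses with no recombinants.
lod = lod_score(0, 10, theta=0.0)
print(round(lod, 2))
```

In practice the score is evaluated over a range of θ values and the maximum is reported.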

35

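What do positive and negative LOD scores indicate? + What is the equivalent p-value for a LOD score of 3?

LOD ≥ 3: Significant evidence for linkage (the data are 1,000 times more likely if the loci are linked than if they are unlinked, i.e. odds of 1000:1).

LOD ≤ -2: Evidence against linkage.

Scores between -2 and 3 are inconclusive and require more data.

A LOD score of 3 corresponds to a pointwise p-value of roughly 0.0001, which is approximately p = 0.05 genome-wide once the many positions tested are taken into account. This is the level considered genome-wide significance for linkage.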

36

Why are LOD scores considered additive?

Data from multiple families linked to the same locus can be combined by summing their LOD scores. A cumulative LOD ≥ 3 across families strengthens the overall evidence for linkage.
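A short sketch of this additivity, using made-up per-family LOD scores evaluated at the same locus and the same recombination fraction. The thresholds are the conventional ones (≥ 3 for linkage, ≤ -2 against); the function name is hypothetical.

```python
def interpret_lod(lod):
    """Classify a LOD score using the conventional thresholds."""
    if lod >= 3:
        return "significant evidence for linkage"
    if lod <= -2:
        return "evidence against linkage"
    return "inconclusive"


# Hypothetical LOD scores from three families at the same marker.
family_lods = [1.2, 0.9, 1.4]
combined_lod = sum(family_lods)

for lod in family_lods:
    print(lod, interpret_lod(lod))
print(round(combined_lod, 2), interpret_lod(combined_lod))
```

No single family reaches the threshold here, but the summed score does. Summing is valid because LOD scores are logarithms of likelihood ratios, and likelihoods from independent families multiply.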

37

What does a LOD score graph look like? + What happens after a linkage peak is identified?

A line graph plotting LOD score (y-axis) against chromosome position (x-axis). Peaks indicate potential linkage regions. A peak ≥ 3 suggests the likely position of the disease gene.

What happens after a linkage peak is identified?

The candidate region is sequenced to pinpoint the exact variant responsible for the disease. Linkage reveals the approximate location; sequencing provides the precise genetic change.

38

What are the main limitations of linkage analysis?

Requires large, multi-generation pedigrees with clear inheritance patterns.

Less effective for complex (polygenic) diseases.

Recombination events can blur boundaries if marker density is low.

Cannot detect variants directly; sequencing is still needed.

39

What does the term “association” mean in genetics?

Association refers to the observation that two factors occur together more often than expected by chance. In genetics, a genetic association occurs when a particular allele or variant is found more frequently in people with a disease (cases) than in people without it (controls).

40

What is a Genome-Wide Association Study (GWAS)? + How does GWAS differ from linkage analysis?

GWAS is a method used to identify common genetic variants across the genome that are statistically associated with a particular trait or disease by comparing large groups of unrelated cases and controls.

GWAS examines whether certain alleles are more common in affected individuals (cases) compared to unaffected individuals (controls). A significantly higher frequency in cases suggests that the variant or nearby variant contributes to disease risk.

  • Linkage analysis: family-based, best for rare/monogenic diseases; identifies large chromosomal regions linked to disease.

  • GWAS: population-based, best for common/polygenic diseases; tests hundreds of thousands of variants across the genome for statistical association.

41

What type of variants are tested in GWAS?

GWAS focuses mainly on Single Nucleotide Polymorphisms (SNPs) because they are abundant, stable, and easy to genotype across the entire genome using DNA microarrays.

What is a SNP microarray and how does it work?

A SNP microarray (or genotyping chip) contains hundreds of thousands of probes, each detecting a specific nucleotide variant.

Each SNP is tested by hybridisation to fluorescently labelled DNA.

• Homozygous allele 1 → one colour signal

• Homozygous allele 2 → another colour

• Heterozygous → mixed colour

The results show each person’s genotype for each SNP.

42

What are the genotype and allele frequencies used for in GWAS?

They are used to calculate whether an allele is significantly more common in cases than controls.

Example: In a case-control study with 1,000 cases and 1,000 controls, allele counts in the two groups are compared using a statistical test such as chi-squared.

43

How is genotype converted to allele counts?

For a biallelic SNP (C/T):

• 5 × CC = 10 C alleles

• 4 × CT = 4 C + 4 T

• 1 × TT = 2 T alleles

Total: C = 14, T = 6

This allows frequency comparisons between groups.
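The same conversion, written as a small Python sketch. The function name is hypothetical; genotypes are assumed to be two-letter strings such as "CT", with each person contributing two alleles.

```python
from collections import Counter


def allele_counts(genotype_counts):
    """Convert genotype counts (e.g. {"CC": 5}) into allele counts."""
    counts = Counter()
    for genotype, n_people in genotype_counts.items():
        for allele in genotype:  # each letter is one allele copy
            counts[allele] += n_people
    return counts


counts = allele_counts({"CC": 5, "CT": 4, "TT": 1})
total = sum(counts.values())
print(dict(counts), counts["T"] / total)
```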

44

What is Minor Allele Frequency (MAF) and why is it important in GWAS?

MAF represents the frequency of the less common allele in a population. It helps determine which variants are common enough to study. GWAS typically focuses on variants with MAF > 1%.

MAFs differ widely between populations, influencing the interpretation of association results.

45

What statistical test is used in GWAS? + What does a p-value represent in GWAS?

The chi-squared (χ²) test is used to compare genotype or allele frequencies between cases and controls. The result is expressed as a p-value.

In GWAS, the p-value is the probability of observing a frequency difference at least as large as the one seen if the variant had no true association with the disease.

  • Low p-value → strong evidence of association.

Results are often displayed as –log₁₀(p-value) for visual clarity (larger peaks = stronger association).
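A minimal sketch of an allelic chi-squared test for one SNP, using only the Python standard library. It assumes a 2×2 table of allele counts (two alleles, cases vs. controls) and a Pearson test with 1 degree of freedom and no continuity correction. The counts are invented: 1,000 cases and 1,000 controls give 2,000 alleles per group.

```python
import math


def allele_chi_squared(case_a, case_b, control_a, control_b):
    """Pearson chi-squared (1 df) and p-value for a 2x2 allele-count table."""
    n = case_a + case_b + control_a + control_b
    numerator = n * (case_a * control_b - case_b * control_a) ** 2
    denominator = (
        (case_a + case_b)
        * (control_a + control_b)
        * (case_a + control_a)
        * (case_b + control_b)
    )
    chi2 = numerator / denominator
    # Survival function of chi-squared with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts of allele C and allele T in each group.
chi2, p = allele_chi_squared(case_a=1200, case_b=800,
                             control_a=1000, control_b=1000)
print(round(chi2, 2), p, round(-math.log10(p), 1))
```

Real GWAS pipelines usually fit regression models with covariates rather than a bare chi-squared test, so this is only the textbook version of the comparison.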

Why are p-values adjusted for multiple testing in GWAS?

Because hundreds of thousands to millions of SNPs are tested simultaneously, some will appear significant by chance alone.

The Bonferroni correction divides the usual significance threshold by the number of independent tests: 0.05 / 1,000,000 = 5 × 10⁻⁸.

Genome-wide significance = p < 5 × 10⁻⁸.
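The threshold follows from dividing a 0.05 significance level by roughly one million independent tests, a commonly used approximation for common variants. A small sketch, with hypothetical SNP names and p-values:

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


# Assumption: about one million independent tests across the genome.
threshold = bonferroni_threshold(0.05, 1_000_000)

# Made-up p-values for three SNPs.
p_values = {"snp_a": 3e-9, "snp_b": 2e-6, "snp_c": 0.04}
significant = [snp for snp, p in p_values.items() if p < threshold]
print(threshold, significant)
```

Note that snp_c would pass an uncorrected 0.05 cut-off but is nowhere near genome-wide significance.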

46

What is a Manhattan plot? + What does the peak in a Manhattan plot indicate?

A graph used to display GWAS results across all chromosomes.

• X-axis: chromosome position

• Y-axis: –log₁₀(p-value)

Each dot = one SNP

Peaks represent SNPs with significant associations.

What does the peak in a Manhattan plot indicate?

The chromosomal region most strongly associated with the disease. The exact causal variant may not be the top SNP itself but another variant in linkage disequilibrium with it.

47

What is a regional association plot?

A zoomed-in plot of one genomic region showing the lead SNP and surrounding variants. Each variant is coloured by its degree of linkage disequilibrium (r²) with the lead SNP, showing how associations cluster together.

It visualises the pattern of association across a small genomic region.

• Each point = a SNP.

• X-axis = genomic position.

• Y-axis = –log₁₀(p-value).

• Colour = strength of LD (r²) with the lead SNP.

This helps identify whether nearby SNPs form a cluster of association.

48

What are the main outputs of a GWAS? + What are the strengths and limitations of GWAS?

Manhattan plot (genome-wide significance results)

Regional association plot (local signal detail)

List of significant SNPs with their p-values, odds ratios, and nearby genes.
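The odds ratio reported for each SNP measures effect size rather than statistical significance. A minimal sketch of an allelic odds ratio from a 2×2 table of allele counts follows; the function name and counts are invented, and published GWAS odds ratios normally come from regression models instead.

```python
def allelic_odds_ratio(case_risk, case_other, control_risk, control_other):
    """Odds of carrying the risk allele in cases divided by odds in controls."""
    return (case_risk * control_other) / (case_other * control_risk)


# Hypothetical allele counts: risk allele vs. other allele in each group.
odds_ratio = allelic_odds_ratio(1200, 800, 1000, 1000)
print(odds_ratio)
```

An odds ratio above 1 means the allele is enriched in cases; values close to 1 are typical of the small effects seen for common variants.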

What are the strengths of GWAS?

Detects common variants associated with complex traits.

Covers the entire genome without prior knowledge of candidate genes.

Can reveal biological pathways involved in disease.

What are the limitations of GWAS?

Requires very large sample sizes.

Significant variants typically explain only a small fraction of heritability.

Does not prove causation (association ≠ function).

Results can vary across populations due to allele frequency differences and LD patterns.

49

What is linkage disequilibrium (LD)?

Linkage disequilibrium (LD) is the non-random association of alleles at two or more loci. It means that certain combinations of genetic variants occur together more often than expected by chance in a population.

50

How is LD different from genetic linkage?

Genetic linkage: Physical proximity of loci on a chromosome; observed in families (inherited together).

Linkage disequilibrium: Statistical association of alleles at loci across the population; not necessarily caused by physical proximity but often related to it.

Summary: Linkage = family-level co-inheritance. LD = population-level correlation of alleles.

51

Why is LD important in GWAS?

GWAS often detects tag SNPs that are not causal themselves but are in strong LD with the true disease-causing variant.

LD allows researchers to identify genomic regions associated with a trait without directly genotyping every variant.

52

What factors influence the extent of LD?

• Recombination rate

• Mutation rate

• Genetic drift and population history

• Natural selection

• Population structure (e.g. bottlenecks or admixture)

LD tends to decay over distance; closer loci have stronger LD.

53

What is the measure of LD strength (r²)?

r² is a statistical measure (0–1) describing how strongly two loci are correlated.

r² = 1 means complete LD (the allele at one locus perfectly predicts the allele at the other).

r² = 0 means no LD (alleles at the two loci are statistically independent in the population).
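For two biallelic loci, r² can be computed from haplotype and allele frequencies using the standard definition D = p(AB) − p(A)·p(B) and r² = D² / [p(A)(1 − p(A))·p(B)(1 − p(B))]. The sketch below uses invented frequencies and a hypothetical function name.

```python
def ld_r_squared(p_ab, p_a, p_b):
    """r^2 between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B
    """
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b  # deviation from independence
    return d ** 2 / denominator


# Alleles always travel together: complete LD.
complete = ld_r_squared(p_ab=0.5, p_a=0.5, p_b=0.5)
# Haplotype frequency equals the product of allele frequencies: no LD.
independent = ld_r_squared(p_ab=0.25, p_a=0.5, p_b=0.5)
print(complete, independent)
```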

54

How is linkage disequilibrium used in fine-mapping?

Fine-mapping uses LD patterns to narrow down the region likely to contain the causal variant by analysing which SNPs remain associated after accounting for correlation between markers.

55

What is meta-analysis in the context of GWAS? + How are meta-analysis results visualised?

Meta-analysis combines GWAS results from multiple cohorts or studies to increase statistical power. It helps confirm true associations that replicate across independent populations.

Results are typically visualised with Manhattan plots showing combined –log₁₀(p-values) across all studies.

Each significant locus that replicates across studies is highlighted as a validated association.
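The card does not say how the studies are combined. One widely used option is a fixed-effect, inverse-variance weighted meta-analysis, sketched below for a single SNP as an assumption rather than the method used in any particular study. The effect sizes (log odds ratios) and standard errors are invented.

```python
import math


def fixed_effect_meta(betas, standard_errors):
    """Inverse-variance weighted fixed-effect meta-analysis for one SNP."""
    weights = [1 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    combined_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    combined_se = math.sqrt(1 / total_weight)
    z = combined_beta / combined_se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return combined_beta, combined_se, p_value


# Hypothetical per-cohort estimates for the same SNP.
betas = [0.10, 0.12, 0.08]
ses = [0.04, 0.05, 0.04]
beta, se, p = fixed_effect_meta(betas, ses)
print(round(beta, 4), round(se, 4), p)
```

The combined standard error is smaller than that of any single cohort, which is the sense in which pooling studies increases statistical power.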

56

What are Manhattan plots used for? + How do you interpret a Manhattan plot?

To display genome-wide statistical results from GWAS or meta-analysis.

They help visualise which chromosomes and regions contain variants with the strongest disease associations.

Each point = one SNP.

X-axis = genomic position, ordered by chromosome number.

Y-axis = –log₁₀(p-value).

Red dotted line = genome-wide significance threshold (p < 5×10⁻⁸).

Peaks above this line suggest regions associated with disease risk.

57

What are the limitations of LD-based association results?

The top SNP may not be the causal variant.

LD patterns differ between populations.

Structural variants and rare variants may be missed.

Experimental validation is required to confirm causality.

58

What is the overall purpose of combining linkage, LD, and GWAS data?

Integrating these approaches helps identify both rare high-effect variants (linkage) and common low-effect variants (GWAS), offering a comprehensive understanding of genetic risk factors for disease.

59

In summary, what are the key takeaways from Disease Gene Mapping?

• Genetic variation underlies disease risk and inheritance.

• Linkage analysis maps rare, high-effect variants using family data.

• GWAS maps common, low-effect variants using population data.

• Statistical tools like LOD scores and p-values measure linkage or association.

• LD connects nearby variants, allowing indirect mapping.

• Sequencing identifies the exact causal mutation.

Combined, these methods form the foundation of modern human genetics and precision medicine.