L7, L8, & L9 - Hypothesis Testing, Genetic Variation in Populations, and Genetic Mapping in Populations

Last updated 1:01 AM on 2/25/25

19 Terms

1

Hypothetico-Deductive Approach

Step 1) Assume a hypothesis is true, e.g. X-linked recessive inheritance for a pedigree with two unaffected parents and two affected offspring (one male, one female).

Step 2) Deduce some necessary consequences, e.g. an affected XX individual must have an affected XY father, because one of her two recessive alleles comes from his only X chromosome.

Step 3) See whether the data are consistent with those deductions. In this case, the father of the affected daughter is unaffected, so the mode of inheritance cannot be X-linked recessive.
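
The three steps above can be sketched as a small consistency check. This is only an illustration: the function name and the tuple layout are made up for this example, and it assumes full penetrance and no new mutations.

```python
def consistent_with_x_linked_recessive(children):
    """Check the deduction from Step 2 against pedigree data.

    children: list of (sex, child_affected, father_affected) tuples,
    where sex is "XX" or "XY". This layout is hypothetical.
    """
    for sex, child_affected, father_affected in children:
        # Deduction: an affected XX child must have an affected father.
        if sex == "XX" and child_affected and not father_affected:
            return False
    return True


# Pedigree from the card: unaffected father, one affected son, one affected daughter.
pedigree = [("XY", True, False), ("XX", True, False)]
print(consistent_with_x_linked_recessive(pedigree))  # prints False
```

The affected daughter with an unaffected father is the observation that rules the hypothesis out; the affected son alone would not.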

2

Null hypothesis

A simple model that makes explicit, quantitative predictions by assigning a probability to every possible outcome. When we analyse the data, we assume that the null hypothesis is true, so we cannot use the data to state the probability that the hypothesis is true. What we can do is check whether the data are consistent with this hypothesis; if they are not, we reject it. The point is that if the data can be explained by a simple model such as this one, then they give us no reason to prefer a more complicated model.

3

p-value

The probability of observing a result at least as different from our expectation as the one we actually observed, if the null hypothesis is true (we are working under the assumption that it is true). It is therefore a measure of how likely data like ours are given the hypothesis we came up with, not a measure of how likely the hypothesis is. It includes the probability of outcomes more extreme than the one observed (so it is the probability of observing an outcome AT LEAST as extreme as the one that was observed). Graphically, this is the area under the chi-square distribution curve to the right of the observed chi-square value.

4

Rejection of the null hypothesis in a regular study

If the p-value is < 0.05, we say that the result is statistically significant and we reject the null hypothesis: data this far from the expectation would arise less than 5% of the time if the null hypothesis were true, so something else is probably going on.

5

Chi-square test statistic

This is a standardised difference from the expectation, used to summarise how far the observed data are from the expected data. It is calculated as the sum of (Observed - Expected)²/Expected over every class of observation.

If the null hypothesis is true, then chi-square values come from a known probability distribution (whose shape depends on the number of independent classes of observation, i.e. the degrees of freedom). The p-value is then the fraction of experiments in which we would observe a chi-square value at least as high as the one we observed.
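
As a rough sketch of the calculation described above, the statistic and its p-value can be computed with the standard library alone. The function names and the cross data are hypothetical; the p-value uses the standard closed-form upper tail of the chi-square distribution for whole-number degrees of freedom.

```python
import math


def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over every class of observation."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_square_p_value(chi2, df):
    """Area under the chi-square curve to the right of chi2 (integer df >= 1)."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum of (x/2)^r / r! for r = 0 .. df/2 - 1
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= chi2 / (2 * r)
            total += term
        return math.exp(-chi2 / 2) * total
    # Odd df: two-sided normal tail plus a finite correction series
    x = math.sqrt(chi2)
    density = math.exp(-chi2 / 2) / math.sqrt(2 * math.pi)
    term, total = x, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= chi2 / (2 * r - 1)
        total += term
    return math.erfc(x / math.sqrt(2)) + 2 * density * total


# Made-up data: 400 offspring, null hypothesis of a 3:1 phenotype ratio.
observed = [290, 110]
expected = [300, 100]
chi2 = chi_square_statistic(observed, expected)
p = chi_square_p_value(chi2, df=len(observed) - 1)
print(chi2, p)
```

With these invented counts the p-value is well above 0.05, so the 3:1 null hypothesis would not be rejected.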

6

Degrees of freedom

The number of classes of observation that can vary independently (number of categories - 1, because once all but one of the counts are known, the last is fixed by the total). This determines the shape of the chi-square distribution. It is also known as the number of independent classes of observation.

7

Mutations

Mutations create genetic diversity. Mutations in somatic cells are not transmissible to offspring (and neither are acquired changes to the body, such as mutilations), whereas mutations in the germline are transmissible.

8

Germline mutations

These are transmissible mutations caused by spontaneous errors in DNA replication (so they happen at a low rate). These mutations can happen anywhere in the genome and they generate new alleles that segregate among offspring.

9

Somatic Mutations

These are mutations in somatic cells. Like other changes acquired by the body (for example physical mutilations), they are NOT transmissible to offspring. They can cause cancer if they affect the cell cycle and cause over-proliferation of cells.

10

Genotype frequency

Describes the fraction of the population that has a certain genotype at a particular SNP (e.g. AA, AT, or TT). P(AA) = (number of individuals in the population with the AA genotype) / (total number of individuals in the population)

11

Allele frequency

Describes the fraction of all allele copies in the population that are a certain allele. For a diploid population, you must multiply the number of individuals by 2 to get the total number of allele copies in the population. To account for the two copies a homozygote carries, you must multiply the number of people with that genotype by 2. Heterozygotes do not need to be multiplied by two because they only have one copy of that allele.
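
The counting rules for genotype and allele frequencies can be sketched as follows. The genotype counts are invented for illustration, and the function names are not from the lecture.

```python
def genotype_frequencies(counts):
    """Fraction of individuals with each genotype."""
    n_individuals = sum(counts.values())
    return {genotype: n / n_individuals for genotype, n in counts.items()}


def allele_frequency(counts, allele):
    """Fraction of all allele copies (2 per diploid individual) that are `allele`."""
    n_individuals = sum(counts.values())
    # genotype.count(allele) is 2 for a homozygote, 1 for a heterozygote, else 0.
    copies = sum(genotype.count(allele) * n for genotype, n in counts.items())
    return copies / (2 * n_individuals)


# Made-up counts at an A/T SNP in a population of 1000 diploid individuals.
counts = {"AA": 360, "AT": 480, "TT": 160}
print(genotype_frequencies(counts))
print(allele_frequency(counts, "A"), allele_frequency(counts, "T"))
```

Here the A allele has (2 x 360 + 480) = 1200 copies out of 2000, so its frequency is 0.6 and the T allele's is 0.4.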

12

Variations

SNPs (single nucleotide polymorphisms): These are single base-pair positions in the genome where more than one base is common enough within a population to be considered normal variation (not a rarity). SNPs in coding regions can be synonymous (silent changes that don’t alter the amino acid and thus the protein being coded for) or nonsynonymous (change an amino acid in the protein).

Indels: These are insertions or deletions that cause polymorphisms in the genome (areas of the genome that vary within a population).

13

Hardy-Weinberg equilibrium

After one generation of random mating, genotype frequencies reach a stable equilibrium defined by the allele frequencies. This characterises the genotype frequencies in populations that are not evolving: allele and genotype frequencies will remain constant if there are no other evolutionary influences. Random mating is required for this to be true (each offspring is formed by randomly sampling a sperm and randomly sampling an egg).

Formulas: Probability of a homozygote = (frequency of that allele)², i.e. p² for one allele and q² for the other, where p + q = 1

Probability of heterozygotes = 2(frequency of allele 1)(frequency of allele 2) or 2pq. The factor of 2 accounts for the two ways that the heterozygote can come about: allele 1 from the sperm and allele 2 from the egg, or allele 2 from the sperm and allele 1 from the egg.
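
A minimal sketch of these formulas for a locus with two alleles, assuming random mating. The function name and the example allele frequency are illustrative only.

```python
def hardy_weinberg_genotype_frequencies(p):
    """Expected genotype frequencies for allele frequencies p and q = 1 - p."""
    q = 1 - p
    return {
        "homozygote_1": p * p,      # p^2
        "heterozygote": 2 * p * q,  # 2pq: two ways to form a heterozygote
        "homozygote_2": q * q,      # q^2
    }


expected = hardy_weinberg_genotype_frequencies(0.6)
print(expected)
```

For p = 0.6 this gives approximately 0.36, 0.48, and 0.16, which sum to 1 as any complete set of genotype frequencies must.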

14

Genetic drift

In finite populations, allele frequencies change by chance because each generation samples only a limited number of alleles from the previous generation. At some points in time, one allele is much more common in the population than the others; at other times they might be equally common, and at others a different allele might become overrepresented. Larger populations experience smaller changes in allele frequencies, while smaller populations experience larger changes. During this drifting, an allele can also be lost (if its frequency reaches 0) or become fixed in the population (if its frequency reaches 100%).

This results in a loss of genetic variation.
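
Drift can be illustrated with a simple simulation in which each generation draws its 2N allele copies at random from the previous generation's allele frequency (often called a Wright-Fisher model, a name not used in these notes). The population sizes, seed, and function name below are arbitrary choices for illustration.

```python
import random


def simulate_drift(p0, n_individuals, generations, rng):
    """Return the allele frequency in each generation, starting from p0."""
    n_copies = 2 * n_individuals  # diploid population
    frequencies = [p0]
    p = p0
    for _generation in range(generations):
        # Each copy in the next generation is this allele with probability p.
        sampled = sum(1 for _copy in range(n_copies) if rng.random() < p)
        p = sampled / n_copies
        frequencies.append(p)
    return frequencies


rng = random.Random(1)  # fixed seed so the run is repeatable
small = simulate_drift(0.5, n_individuals=10, generations=50, rng=rng)
large = simulate_drift(0.5, n_individuals=1000, generations=50, rng=rng)
print(small[-1], large[-1])
```

Plotting `small` and `large` against generation number typically shows the small population wandering much further from 0.5, and sometimes reaching 0 or 1, after which the frequency can no longer change.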

15

Bottleneck event

An event that dramatically decreases the size of the population, leading to a decrease in genetic diversity.

16

Genome Wide Association Study (GWAS)

This is a research method used to identify genetic variants that are associated with specific traits or diseases by scanning the entire genome of many individuals. When conducting a GWAS on an SNP for example, the null hypothesis is that the SNP genotype is unrelated to the phenotype (meaning it is not causing the phenotype or affecting it in any way - they are not associated with each other).

If the result is significant, we can reject this null hypothesis. However, the nature of GWAS makes it hard to identify the causal variant: a large number of variants may be in linkage disequilibrium with the most significant SNP, so it is difficult to know which of them is causal. This is made harder by the fact that many causal variants affect the phenotype, each in a very small way.

Additionally, most of the variants that affect phenotypes are non-coding regulatory variants that act by influencing mRNA levels.

17

Linkage Disequilibrium

Consider two SNPs in a population of diploids. The genotypes at these two nearby loci are not independent (they tend to be correlated/associated because they’re so close to each other) and thus tend to be inherited together. This means that the probability of having, for example, T at locus 1 and A at locus 2 is not simply the product of the two individual allele frequencies.

This phenomenon makes it so that each observed locus is informative about nearby unobserved loci.
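
One way to make "not multiplicative" concrete is to compare the observed frequency of a two-locus combination with the product of the allele frequencies. The sketch below uses two standard summary measures, D and r², which are not named in these notes, and assumes we can observe which alleles sit together on the same chromosome (haplotypes). The counts are invented.

```python
def linkage_disequilibrium(haplotype_counts, allele_1, allele_2):
    """Return (D, r_squared) for allele_1 at locus 1 and allele_2 at locus 2.

    haplotype_counts maps two-letter haplotypes (locus 1 allele, then
    locus 2 allele) to the number of chromosomes carrying them.
    """
    total = sum(haplotype_counts.values())
    p1 = sum(n for h, n in haplotype_counts.items() if h[0] == allele_1) / total
    p2 = sum(n for h, n in haplotype_counts.items() if h[1] == allele_2) / total
    p12 = haplotype_counts.get(allele_1 + allele_2, 0) / total
    d = p12 - p1 * p2  # zero if the two loci are independent
    r_squared = d * d / (p1 * (1 - p1) * p2 * (1 - p2))
    return d, r_squared


# Made-up sample of 100 chromosomes; locus 1 is T/C, locus 2 is A/G.
haplotypes = {"TA": 40, "TG": 10, "CA": 10, "CG": 40}
d, r2 = linkage_disequilibrium(haplotypes, "T", "A")
print(d, r2)
```

In this sample T and A each have frequency 0.5, so independence would predict the TA combination on 25% of chromosomes, but it is seen on 40%. That excess (D of about 0.15) is why knowing the allele at one locus is informative about the other.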

18

GWAS graphs

x-axis: position in the genome

y-axis: -log10(p-value). This means that the higher the y-value, the smaller the p-value. However, if we were to reject the null hypothesis every time that the p-value is less than 0.05, we would end up rejecting the null hypothesis 5% of the time even when it’s actually true, and a GWAS performs a very large number of tests. Because of this, in a GWAS, we use a significance threshold of 0.05 divided by the number of tests; on the graph, a SNP is significant only if its y-value lies above -log10 of this threshold.
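
A short sketch of how the corrected threshold translates onto the y-axis. The number of tests (one million SNPs) and the example p-value are assumed for illustration, and the function names are hypothetical.

```python
import math


def corrected_threshold(alpha, n_tests):
    """Significance threshold after dividing alpha by the number of tests."""
    return alpha / n_tests


def y_value(p_value):
    """Height of a SNP on the GWAS plot."""
    return -math.log10(p_value)


n_tests = 1_000_000  # assumed number of SNPs tested
threshold = corrected_threshold(0.05, n_tests)
line_height = y_value(threshold)

snp_p_value = 1e-9  # made-up result for a single SNP
print(threshold, line_height, y_value(snp_p_value) > line_height)
```

With one million tests the threshold is 5 x 10⁻⁸, which sits at a height of about 7.3 on the plot, so only SNPs above that height are called significant.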

19

Polygenic Score

A sum of the (allele effect x allele dose) across all associated loci, using estimates of the allelic effect at each locus from GWAS data. The allele dose is the number of copies (0, 1, or 2) of the scored allele that the individual carries. For example, locus 1 (A/C) = +0.1 mm for each A allele, locus 2 (T/G) = +0.2 mm for each T allele and locus 3 (G/C) = -0.1 mm for each G allele. If someone has the genotype AC GG CG, then their predicted phenotype is 0.1 + 0 + -0.1 = 0 mm, and if someone has the genotype AA TT CC, their predicted phenotype is 0.2 + 0.4 + 0 = 0.6 mm.

General formula = (allele dose at locus 1 x effect of locus 1) + (allele dose at locus 2 x effect of locus 2) + (allele dose at locus 3 x effect of locus 3) + ... + (allele dose at locus n x effect of locus n)

We use this to predict phenotypes from genotypes. These predictions can be used to selectively breed plants and livestock with certain desired qualities, or to estimate whether individuals or their offspring are likely to develop certain diseases or have certain traits.
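
The worked example on this card can be reproduced with a few lines of code. The data layout and function name are choices made for this sketch; the effect sizes are the ones given in the example.

```python
# (scored allele, effect in mm per copy) at each locus, from the example above.
LOCUS_EFFECTS = [("A", 0.1), ("T", 0.2), ("G", -0.1)]


def polygenic_score(genotypes, locus_effects=LOCUS_EFFECTS):
    """Sum of (allele dose x allele effect) across loci.

    genotypes: one two-letter genotype per locus, e.g. ["AC", "GG", "CG"].
    """
    score = 0.0
    for genotype, (allele, effect) in zip(genotypes, locus_effects):
        dose = genotype.count(allele)  # 0, 1, or 2 copies of the scored allele
        score += dose * effect
    return score


person_1 = polygenic_score(["AC", "GG", "CG"])
person_2 = polygenic_score(["AA", "TT", "CC"])
print(round(person_1, 3), round(person_2, 3))
```

The rounding in the final line only hides floating-point noise; the scores match the 0 mm and 0.6 mm worked out by hand.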