BIOL 300 Midterm Memorize


73 Terms

1
New cards

conditions for binomial test 

  • test whether a population proportion (p) matches a null expectation for the proportion 

  • one variable 

  • categorical variable 

  • two categories (success, fail) 

2
New cards

what is the flow chart for deciding which test to use?

How many variables?

  • ONE 

    • categorical 

      • 2 categories 

        • BINOMIAL TEST 

      • more than 2 categories 

        • x² GOF - PROBABILITY DISTRIBUTION 

    • numerical 

      • x² GOF - POISSON DISTRIBUTION  

  • TWO 

    • sample size assumption met 

      • x² CONTINGENCY TEST 

    • sample size assumption not met 

      • FISHER’S EXACT TEST 
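The flow chart can be sketched as a small R function. This is only an illustration: choose_test() and its argument names are hypothetical and are not part of the course material.

```r
# Hypothetical helper that mirrors the flow chart above.
choose_test <- function(n_variables, variable_type = "categorical",
                        n_categories = 2, assumptions_met = TRUE) {
  if (n_variables == 1) {
    # one variable: numerical counts point to a Poisson-based goodness of fit test
    if (variable_type == "numerical") return("chi-squared GOF (Poisson distribution)")
    # one categorical variable: the number of categories decides
    if (n_categories == 2) return("binomial test")
    return("chi-squared GOF (probability distribution)")
  }
  # two variables: the chi-squared sample size assumption decides
  if (assumptions_met) "chi-squared contingency test" else "Fisher's exact test"
}

choose_test(1, "categorical", n_categories = 2)
choose_test(2, assumptions_met = FALSE)
```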

3
New cards

conditions for x² goodness of fit test

  • use frequency data to test whether a population proportion equals a null hypothesized proportion that comes from a specified probability distribution

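  • use when the sample is too large for a binomial test to be practical (summing the probabilities of the observed and all more extreme outcomes becomes tedious); the x² test is an approximation, so it is less precise than the exact binomial test 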

  • one variable 

  • categorical or discrete numerical variable

  • no more than 20% of categories with expected < 5 & no category with expected < 1 & random sampling 

4
New cards

what are the conditions for the x² contingency test 

  • want to test the association of two or more categorical variables

  • two or more variables

  • categorical variables

  • no more than 20% of cells with expected < 5; no cell with expected < 1; random sampling

5
New cards

what are the conditions for fisher’s exact test?

  • want to test the association of two categorical variables by calculating the exact probability of the sample under the null hypothesis (useful when the assumptions of x² test are violated) 

  • two variables 

  • categorical variables 

  • random sampling 

6
New cards

what are the test statistics for the binomial test, x² goodness of fit test, x² contingency test and the fisher’s exact test

(test statistic: value(s) calculated from the sample that are relevant to testing the null hypothesis)

binomial: p-hat (sample proportion) or x (number of successes of n trials)

  • x is used to calculate p-hat

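x² gof: x² = Σ (observed - expected)² / expected, summed over all categories

x² contingency test: the same x² formula, summed over all cells of the contingency table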

fisher’s exact test: none (not really done by hand)

7
New cards

how to decide over binomial test and x² gof test

  • choose binomial when gof assumption not met 

  • choose gof if too tedious to calculate probabilities of binomial (if gof assumptions are met as well) 

8
New cards

how to decide between x² contingency and fisher

  • choose fisher if x² test assumptions are violated 

9
New cards

which tests have sample assumptions and what are they

  • x² goodness of fit (probability distribution/poisson distribution)  & x² contingency

  • no more than 20% of categories with expected values < 5; no category with expected < 1; random sampling 
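A minimal sketch of how the two expected-frequency rules could be checked in R. The helper name meets_chisq_assumptions() and the expected counts are made up for illustration; random sampling cannot be checked from the numbers and has to come from the study design.

```r
# Hypothetical helper: TRUE when the expected counts satisfy both rules.
meets_chisq_assumptions <- function(expected) {
  share_below_5 <- mean(expected < 5)   # proportion of categories with expected < 5
  share_below_5 <= 0.20 && all(expected >= 1)
}

expected_ok  <- c(12.5, 8.0, 6.2, 5.3, 7.7, 4.1)  # 1 of 6 categories below 5
expected_bad <- c(12.5, 3.0, 2.2, 0.8)            # 3 of 4 below 5, one below 1

meets_chisq_assumptions(expected_ok)
meets_chisq_assumptions(expected_bad)
```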

10
New cards

which tests use degrees of freedom and how do you calculate them?

x² goodness of fit test & x² contingency test 

gof: (# of categories) - (# parameters estimated from data) - 1

contingency: (# columns - 1) * (# rows - 1)

11
New cards

what are the null hypotheses for a binomial test, x² gof test, x² contingency test, and a fisher’s exact test?

binomial: the relative frequency of successes in the population is p0

(p = p0)

gof: the data come from a specified probability distribution (proportional, poisson, binomial)

contingency: the variables are independent 

fisher’s: the variables are independent

12
New cards

what are clear indicators that a x² goodness of fit test should be based on a poisson distribution rather than another distribution (e.g., a proportional distribution)?

poisson 

  • numerical discrete data

  • being asked if events are randomly distributed in time or space 

13
New cards

what is x² (chi-squared statistic)

  • a single, non-negative number that quantifies the total discrepancy between a set of observed frequencies and the corresponding expected frequencies under a specific hypothesis

14
New cards

how does the use of x² compare and contrast in the goodness-of-fit test and the contingency test 

  • same formula, differs in purpose and calculation of expected frequencies 

  • gof: you are comparing your data to an external blueprint (the expected distribution) 

  • contingency: you are comparing your data to an internal standard (what would be expected if the two variables were unrelated, as derived from the table’s own margins) 
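A short R sketch of this contrast, using made-up counts: the formula for x² is shared, and only the source of the expected frequencies differs. The chi2() helper and all variable names are assumptions for illustration.

```r
# Shared formula: sum of (observed - expected)^2 / expected
chi2 <- function(obs, expd) sum((obs - expd)^2 / expd)

# Goodness of fit: expected counts come from proportions stated in the null
observed_gof <- c(30, 50, 20)          # made-up counts
null_props   <- c(0.25, 0.50, 0.25)    # made-up null proportions
expected_gof <- sum(observed_gof) * null_props

# Contingency: expected counts come from the table's own margins
observed_tab <- matrix(c(20, 30, 10, 40), nrow = 2)   # made-up 2 x 2 table
expected_tab <- outer(rowSums(observed_tab), colSums(observed_tab)) / sum(observed_tab)

chi2(observed_gof, expected_gof)
chi2(observed_tab, expected_tab)
```

Both values should agree with the statistic reported by chisq.test() (with correct = FALSE for the 2 x 2 table).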

15
New cards

sample mean formula

Y-bar = (Σ Yi) / n

16
New cards

sample variance formula

s² = Σ (Yi - Y-bar)² / (n - 1)

17
New cards

sample variance shortcut formula

s² = (n / (n - 1)) × ((Σ Yi²) / n - Y-bar²)

18
New cards

sample standard deviation

s = sqrt(Σ (Yi - Y-bar)² / (n - 1))

19
New cards

sample standard deviation short cut

s = sqrt((n / (n - 1)) × ((Σ Yi²) / n - Y-bar²))

20
New cards

coefficient of variation formula

CV = 100% × s / Y-bar

21
New cards

median

the middle observation (if the number of observations is even, average the middle two values)

22
New cards

sample proportion + formula

p-hat = X / n, where X is the number of individuals in the category of interest and n is the sample size

23
New cards

how does adding a constant to every measurement change the sample mean? 
how does multiplying every measurement by a constant change the sample mean?

  • the constant is added to the sample mean 

  • the sample mean is multiplied by the constant 

24
New cards

how does adding a constant to every measurement change the sample variance? 
how does multiplying every measurement by a constant change the sample variance?

  • adding a constant does not change the variance 

  • for multiplication the variance is multiplied by the constant squared (s² * c²) 

25
New cards

how does adding a constant to every measurement change the median? 
how does multiplying every measurement by a constant change the median?

  • the constant is added to the median 

  • the median is multiplied by the constant 

26
New cards

how does adding a constant to every measurement change the IQR? 
how does multiplying every measurement by a constant change the IQR?

  • the IQR is the same when adding 

  • the IQR is multiplied by the absolute value of the constant (the IQR is never negative) 
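These rules for the mean, variance, median and IQR can be checked numerically. A minimal sketch with made-up measurements and constants (y, c_add and c_mult are illustrative names):

```r
y      <- c(2, 4, 4, 7, 9, 10)   # made-up measurements
c_add  <- 3                      # constant added to every measurement
c_mult <- -2                     # constant multiplying every measurement

# Adding a constant shifts the mean and median, leaves variance and IQR unchanged
all.equal(mean(y + c_add),   mean(y) + c_add)
all.equal(median(y + c_add), median(y) + c_add)
all.equal(var(y + c_add),    var(y))
all.equal(IQR(y + c_add),    IQR(y))

# Multiplying scales the mean and median by c, the variance by c^2, the IQR by |c|
all.equal(mean(y * c_mult),   mean(y) * c_mult)
all.equal(median(y * c_mult), median(y) * c_mult)
all.equal(var(y * c_mult),    var(y) * c_mult^2)
all.equal(IQR(y * c_mult),    IQR(y) * abs(c_mult))
```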

27
New cards

standard error of the mean + formula

measures the precision of the sample estimate (y-bar) of the population mean (mu) 

  • assumes that the sample is random 

  • we have to estimate this because we do not normally know the true standard deviation of Y in the population 

estimate: SE(y-bar) = s / sqrt(n)
  • s = sample standard deviation 

  • n = sample size

parameter: sigma(y-bar) = sigma / sqrt(n)
  • sigma is the standard deviation of Y in the population, which we usually do not know 
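A minimal R sketch of the estimated standard error of the mean, s / sqrt(n), using made-up measurements:

```r
y <- c(4.1, 5.3, 6.0, 5.5, 4.8, 6.2, 5.1)   # made-up sample

s  <- sd(y)          # sample standard deviation
n  <- length(y)      # sample size
se <- s / sqrt(n)    # estimated standard error of the mean
se
```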

28
New cards

Pr[A and B] for mutually exclusive events

0

29
New cards

Pr[A or B] for mutually exclusive events

Pr[A or B] = Pr[A] + Pr[B]

  • addition rule 

30
New cards

Pr[A or B] for not mutually exclusive events

Pr[A or B] = Pr[A] + Pr[B] - Pr[A and B]

  • general addition rule 

31
New cards

Pr[A and B] for independent events

Pr[A and B] = Pr[A]Pr[B] 

  • multiplication rule 

32
New cards

Pr[A and B] for dependent events 

Pr[A and B] = Pr[A]Pr[B|A]

  • general multiplication rule 
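The addition and multiplication rules can be tried out with simple arithmetic in R. The die and card scenarios below are made-up illustrations, not examples from the course.

```r
# Addition rule (mutually exclusive): rolling a 1 or a 2 on a fair die
p_one_or_two <- 1/6 + 1/6

# General addition rule: A = even number, B = greater than 3 (both true for 4 and 6)
p_even <- 3/6
p_gt3  <- 3/6
p_even_and_gt3 <- 2/6
p_even_or_gt3  <- p_even + p_gt3 - p_even_and_gt3   # outcomes 2, 4, 5, 6

# Multiplication rule (independent): two sixes in two rolls
p_two_sixes <- (1/6) * (1/6)

# General multiplication rule (dependent): two aces drawn without replacement
p_two_aces <- (4/52) * (3/51)   # Pr[A] * Pr[B | A]
```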

33
New cards

Law of total probability formula

Pr[A] = Σ Pr[B] × Pr[A | B], summed over all mutually exclusive values of B

34
New cards

binomial distribution probability formula

Pr[X successes] = (n choose X) × p^X × (1 - p)^(n - X)

(n choose X) = n! / (X!(n - X)!)
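A minimal sketch comparing the binomial formula written out by hand with R's built-in dbinom(); the values of n, x and p are made up.

```r
n <- 10    # number of trials
x <- 3     # number of successes
p <- 0.5   # probability of success on each trial

by_hand  <- choose(n, x) * p^x * (1 - p)^(n - x)   # (n choose x) p^x (1 - p)^(n - x)
built_in <- dbinom(x, size = n, prob = p)

by_hand
built_in
```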

35
New cards

estimate of the standard error of a proportion formula

SE(p-hat) = sqrt(p-hat × (1 - p-hat) / n)

36
New cards

agresti-coull 95% confidence interval for a proportion 

p’ - 1.96 × sqrt(p’(1 - p’) / (n + 4)) < p < p’ + 1.96 × sqrt(p’(1 - p’) / (n + 4))

p’ = (X + 2)/(n + 4) 
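A hand-calculated sketch of the Agresti-Coull interval in base R, assuming the usual form p' ± 1.96 × sqrt(p'(1 - p') / (n + 4)). The counts X and n are made up.

```r
X <- 15   # made-up number of successes
n <- 50   # made-up sample size

p_prime    <- (X + 2) / (n + 4)                        # adjusted proportion
half_width <- 1.96 * sqrt(p_prime * (1 - p_prime) / (n + 4))

ci <- c(lower = p_prime - half_width, upper = p_prime + half_width)
ci
```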

37
New cards

binomial test p-value formula

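P = 2 × (sum of the probabilities of the observed number of successes and of all more extreme outcomes in the same tail)

  • each probability comes from the binomial formula with p set to the null proportion p0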
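A minimal sketch of this calculation in R, with made-up numbers (14 successes in 18 trials, null proportion 0.5). With p0 = 0.5 the null distribution is symmetric, so doubling one tail agrees with binom.test(); for other null proportions R's definition of "more extreme" can give a slightly different value.

```r
x  <- 14    # made-up observed successes
n  <- 18    # made-up number of trials
p0 <- 0.5   # null proportion

# probability of the observed outcome and everything more extreme in the upper tail
p_tail   <- sum(dbinom(x:n, size = n, prob = p0))
p_manual <- 2 * p_tail

# the built-in exact test for comparison
p_builtin <- binom.test(x, n, p = p0)$p.value

p_manual
p_builtin
```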

38
New cards

what can you do to meet the assumptions of x²

combine categories

39
New cards

when is the null hypothesis rejected for x²

if the observed x² statistic exceeds the critical value of the x² distribution corresponding to the significance level alpha 

40
New cards

what does comparing the variance of the number of successes per block of time or space to the mean number of successes do?

measures the direction of departure from randomness in time and space

  • if the variance is greater than the mean, the successes are clumped

  • if the variance is less than the mean, successes are more evenly distributed than expected by the poisson distribution 
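A quick R sketch of the variance-to-mean comparison, using made-up counts of successes per block:

```r
counts <- c(0, 0, 1, 0, 5, 6, 0, 0, 7, 1)   # made-up successes per block

variance_to_mean <- var(counts) / mean(counts)
variance_to_mean   # greater than 1 suggests clumping, less than 1 suggests even spacing
```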

41
New cards

formula for the x² test statistic 

x² = Σ (observed - expected)² / expected

42
New cards

formula for the poisson distribution

(describes the number of independent events that occur per unit of time or space) 

Pr[X successes] = e^(-mu) × mu^X / X!

  • mu = mean number of successes per unit of time or space

43
New cards

odds of success + formula

probability of success divided by the probability of failure

O = p / (1 - p)

44
New cards

the odds ratio

the odds of a focal outcome in one of two groups (treatment group) divided by the odds of that outcome in the second group (control)

  • used to quantify the magnitude of associations between two categorical variables, each of which has two categories 
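A minimal sketch of the odds and the odds ratio computed by hand in R. The group sizes and counts are made up for illustration.

```r
# Made-up counts: successes out of 40 individuals in each group
p_treatment <- 30 / 40
p_control   <- 20 / 40

odds_treatment <- p_treatment / (1 - p_treatment)   # odds of success, treatment
odds_control   <- p_control / (1 - p_control)       # odds of success, control

odds_ratio <- odds_treatment / odds_control
odds_ratio   # 1 means no association; further from 1 means a stronger association
```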

45
New cards

z-standardization + formula

converts values from any normal distribution with known mean and known standard deviation into standard normal deviates 

Z = (Y - mu) / sigma
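A short R sketch of z-standardization, Z = (Y - mu) / sigma, with made-up values. The last two lines show that the tail probability is the same whether it is computed from Z on the standard normal or from Y on the original scale.

```r
y     <- 130   # made-up observation
mu    <- 100   # known population mean (made up)
sigma <- 15    # known population standard deviation (made up)

z <- (y - mu) / sigma   # standard normal deviate

pnorm(z, lower.tail = FALSE)                          # Pr[Z > z]
pnorm(y, mean = mu, sd = sigma, lower.tail = FALSE)   # Pr[Y > y], same area
```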

46
New cards

what are the steps to perform a binomial test? 

  1. state null and alternative 

  • H0: p = p0

  2. determine the number of successes

  • this is used to compute the test statistic (the sample proportion) that we compare against the null expectation 

  3. calculate p-value using the binomial formula 

  • the p-value is the probability of observing a result as extreme or more extreme than the observed statistic (X), assuming the null hypothesis is true

  • use the binomial formula to calculate the probability of the observed and more extreme outcomes for one of the tails, then multiply by 2 

  4. draw conclusions 

  • based on whether the p-value is larger or smaller than alpha (0.05 if not mentioned) 

47
New cards

what are the steps to perform a x² goodness of fit test based on a probability distribution

  1. state null and alt 

  • null: frequency distribution of the observed data matches a specific probability model or set of expected proportions 

  2. calculate expected frequencies (E)

  • use the rule or probability model stated in the null to determine the expected frequency for each category (expected frequency = total sample size × expected proportion) 

  • the sum of all expected frequencies must equal the sum of all observed frequencies 

  3. ensure that x² assumptions are true by checking expected frequencies 

  • if assumptions are not met combine categories to increase the expected frequencies and then proceed with the new set of categories 

  4. calculate the x² test statistic 

  • x² measures the total discrepancy between the observed counts and the expected counts: x² = Σ (observed - expected)² / expected 

  5. determine the degrees of freedom

  • the df specify which x² theoretical distribution is used as the null distribution 

  • df = (# of categories) - (# parameters estimated from data) - 1

  6. find the p-value and conclude

  • compare the calculated x² statistic to the x² distribution with the calculated df 

  • the p-value is the area in the right tail of the distribution

48
New cards

what is the critical value (in terms of x²)

the threshold on the x² distribution that separates the fail to reject and reject region 

  1. choose the significance level 

  2. look up the values in a x² critical values table 

  • find the row corresponding to your df 

  • find the column corresponding to alpha

  • the intersection is the critical value
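Instead of a printed table, the critical value can also be looked up in R with qchisq(). A minimal sketch; the observed statistic and degrees of freedom are made up.

```r
alpha <- 0.05   # chosen significance level
df    <- 3      # degrees of freedom (made up)

# value with area alpha to its right under the chi-squared distribution
critical_value <- qchisq(1 - alpha, df = df)

chi2_observed <- 9.2                               # made-up test statistic
reject_null   <- chi2_observed > critical_value    # TRUE means reject the null

# equivalent decision via the p-value (area in the right tail)
p_value <- pchisq(chi2_observed, df = df, lower.tail = FALSE)
```

Comparing the statistic with the critical value and comparing the p-value with alpha always lead to the same decision.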

49
New cards

in an x² goodness of fit test (probability distribution) how does the critical value help us draw conclusions

since we only consider the right tail

  • anything that is smaller than the critical value is considered more likely to happen

    • fail to reject the null 

    • p larger than alpha 

  • anything that is larger than the critical value is considered less likely to happen 

    • reject the null 

    • p is smaller than alpha 

50
New cards

what are the steps to perform a x² goodness of fit test based on a poisson distribution?

  1. state the null and alt 

  • null: the number of successes per block of time or space has a poisson distribution 

  2. estimate mean rate of the poisson distribution 

  • since it is usually unknown, it must be estimated from your data 

  • calculate the sample mean of your observed count data. This value is used as the mu parameter in the poisson formula 

  3. calculate the expected frequencies using the poisson formula

  • expected frequency = total sample size × Pr[X successes], where Pr[X successes] = e^(-mu) × mu^X / X!

  4. check assumptions and combine categories if needed

  5. calculate the x² test statistic

  • x² = Σ (observed - expected)² / expected

  6. determine the degrees of freedom

  • df = (# of categories) - 1 - 1, because one parameter (the mean) was estimated from the data

  7. draw conclusions by comparing x² to the x² distribution with the determined df 

  • if calculated x² statistic exceeds the critical value from the x² table, reject the null and conclude that the data do not follow a poisson distribution

51
New cards

what are the steps to perform a x² contingency test? 

  1. state the null and alt

  • null: the two categorical variables are independent

  • alt: the two categorical variables are not independent 

  2. organize the observed data

  • arrange your sample data into a contingency table listing the observed frequencies for every combination of the two categorical variables 

  • count the number of rows and number of columns in your table 

  3. calculate the expected frequency for every cell in the table and check for assumptions 

  • expected frequency = (row total × column total) / grand total

  4. calculate the x² test statistic

  • x² = Σ (observed - expected)² / expected, summed over all cells

  5. determine the degrees of freedom 

  • df = (# rows - 1) * (# columns - 1)

  6. find the p-value and conclude 

  • the p-value is the area in the right tail of the x² distribution with the calculated df

52
New cards

how does a fisher’s exact test work and what are the steps to perform it computationally?

  • an exact test unlike the x² contingency test which is an approximation 

  • has no minimum data requirements 

  • the test’s output allows you to determine if there is a statistically significant association (or lack of independence) between the two variables by providing a p-value 

  • the R function also provides the odds ratio as a measure of the strength of the association 

  1. create a frequency table 

  • test requires the data to be in a contingency table 

  • after reading in data use table() command to create this object

  • explanatory variable should come first 

  2. (optional) set the order of the factor level

  • ensures the odds ratio is calculated with the probability of success in the numerator 

  • default is alphabetical

  3. run the test 

  • use the fisher.test() function providing your created frequency table as the input

  4. interpret the results

  • to get just odds ratio: sex_survive_fisher$estimate 

  • to get just the 95% confidence interval: sex_survive_fisher$conf.int
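A minimal sketch of these steps with a made-up 2 x 2 table. The counts and the name sex_survive_table are assumptions; the table is typed in directly here as a stand-in for calling table() on data that has been read in.

```r
# Made-up counts; rows are the explanatory variable, columns the response
sex_survive_table <- matrix(
  c(1, 9, 7, 3), nrow = 2,
  dimnames = list(sex = c("female", "male"), survived = c("yes", "no"))
)

# Run the exact test on the frequency table
sex_survive_fisher <- fisher.test(sex_survive_table)

sex_survive_fisher$p.value    # p-value for the association
sex_survive_fisher$estimate   # odds ratio
sex_survive_fisher$conf.int   # 95% confidence interval for the odds ratio
```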

53
New cards

what are the steps to perform a x² contingency test computationally?

  1. create a frequency table using table()

  • explanatory variable first

  2. check the x² test assumptions (all expected values are at least 1 and at least 80% are 5 or greater) by using $expected on the result of chisq.test() 

  3. run the test using chisq.test() + correct = FALSE (to prevent the Yates’ continuity correction, which R applies to 2 x 2 tables and which can be overly conservative)

  4. interpret the results

  • returns x² value, df and p-value
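A minimal sketch of the computational workflow with a made-up data frame. The data, the column names, and the object names survival, tab and result are all hypothetical.

```r
# Made-up data: 40 females and 60 males with a yes/no outcome
survival <- data.frame(
  sex      = rep(c("female", "male"), times = c(40, 60)),
  survived = c(rep(c("no", "yes"), times = c(10, 30)),
               rep(c("no", "yes"), times = c(40, 20))),
  stringsAsFactors = TRUE
)

# 1. frequency table, explanatory variable first
tab <- table(survival$sex, survival$survived)

# 2. and 3. run the test without the continuity correction, then check expected counts
result <- chisq.test(tab, correct = FALSE)
result$expected

# 4. chi-squared value, df and p-value
result$statistic
result$parameter
result$p.value
```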

54
New cards

what is the function to run the fisher’s test?

fisher.test(table) 

55
New cards

what is the function to run the x² test 

chisq.test(table, correct = FALSE)

56
New cards

code: changing the position of categories

data$column <- factor(data$column, levels = c("name1", "name2"))
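A small sketch of reordering factor levels, with a made-up data frame d and column outcome (both hypothetical names). The result of factor() has to be assigned back to the column for the new order to stick.

```r
# Made-up data; stringsAsFactors = TRUE turns the text column into a factor
d <- data.frame(outcome = c("survived", "died", "survived"),
                stringsAsFactors = TRUE)

levels(d$outcome)   # default order is alphabetical

# Put "survived" first, then check the new order
d$outcome <- factor(d$outcome, levels = c("survived", "died"))
levels(d$outcome)
```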

57
New cards

levels()

shows you the categories in a column and its order

58
New cards

what should you add when reading in data (esp for a fisher or contingency test) 

stringsAsFactors = TRUE

59
New cards

code: creating a mosaic plot 

mosaicplot(table)

  • optional additions: color = c("colour1", "colour2"), xlab = "", ylab = ""

60
New cards

pulling odds ratio or confidence interval from a fisher test

object$estimate

object$conf.int

61
New cards

what are the steps to perform a x² goodness of fit test computationally? (probability and poisson)

  • x² goodness of fit test compares observed category frequencies to the frequencies predicted by a null hypothesis

  • either a null hypothesis specifies the probability or you estimate a parameter for your data (poisson distribution)

Case 1: probabilities given

  1. get observed frequencies using table()

  2. define expected proportions

  • create a vector containing the expected proportions for each category as specified by your null hypothesis 

  3. check the expected frequencies for test assumptions

  • find the total sample size: sum(MMtable)

  • calculate the expected frequencies: sum(MMtable) * expected_proportions (e.g., 55 * expected_proportions when the sample size is 55)

  • combine categories if needed to meet the assumptions 

  4. run the test by inputting expected_proportions to the chisq.test()

  5. interpret results 
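A minimal sketch of Case 1. MMtable and expected_proportions are the names used in the steps above, but the colour counts and the equal null proportions are made up for illustration.

```r
# Made-up raw data: 55 candies in six colours
mm_colours <- rep(c("blue", "brown", "green", "orange", "red", "yellow"),
                  times = c(10, 8, 9, 12, 7, 9))

# 1. observed frequencies (table() sorts the categories alphabetically)
MMtable <- table(mm_colours)

# 2. expected proportions under a made-up null of equal proportions
expected_proportions <- rep(1 / 6, 6)

# 3. expected frequencies, to check the assumptions
expected_frequencies <- sum(MMtable) * expected_proportions
expected_frequencies

# 4. run the test
gof_result <- chisq.test(MMtable, p = expected_proportions)
gof_result
```

The order of expected_proportions has to match the order of the categories in the table.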

Case 2: Test with estimated parameters (poisson) 

  1. get the observed frequencies using table()

  2. estimate parameters from the data

  • the null hypothesis is that the data follows a poisson distribution, but the mean is unknown; so we must estimate it from the data 

  3. calculate expected probabilities

  • using the estimated parameter use the dpois() function to find the expected probability for each possible outcome 

  4. calculate the expected frequencies and combine categories 

  • get the total sample size: length(column of interest)

  • calculate expected frequencies by multiplying the probabilities by the total sample size: sample size * expected_probability

  • check assumptions and combine

  • create new vectors for your combined observed and expected frequencies

  5. calculate the x² statistic using chisq.test()  and rescale.p = TRUE and $statistic

  6. calculate degrees of freedom manually

  • # of categories - 1 - # of estimated parameters

  7. find the correct p-value using pchisq()

  • give it your x² statistic (q) and your manually calculated df and lower.tail = FALSE to get the p-value (the area in the right tail)

  8. interpret the result
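A minimal sketch of Case 2 with made-up counts of events per block. All object names are hypothetical, and the last category is treated as "4 or more" so that the expected probabilities sum to 1.

```r
# Made-up data: number of events observed in each of 100 blocks
events <- rep(0:4, times = c(20, 30, 25, 15, 10))

# 1. observed frequencies
obs_freq <- as.vector(table(events))

# 2. estimate the Poisson mean from the data
mean_events <- mean(events)

# 3. expected probabilities for 0, 1, 2, 3 and "4 or more"
exp_prob <- c(dpois(0:3, lambda = mean_events),
              ppois(3, lambda = mean_events, lower.tail = FALSE))

# 4. expected frequencies (check that they meet the assumptions)
exp_freq <- length(events) * exp_prob

# 5. chi-squared statistic only; the df reported by chisq.test() would be wrong here
chi2_value <- chisq.test(x = obs_freq, p = exp_freq, rescale.p = TRUE)$statistic

# 6. df = categories - 1 - estimated parameters (one: the mean)
df_manual <- length(obs_freq) - 1 - 1

# 7. p-value from the right tail
p_value <- pchisq(q = chi2_value, df = df_manual, lower.tail = FALSE)
p_value
```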

62
New cards

dpois()

dpois(x, lambda)

  • calculates the probability of getting x successes for a poisson distribution with a mean of lambda

  • given a vector for x, e.g. 0:20, it calculates the probability of 0, 1, 2, … 20 successes given the estimated mean.

63
New cards

sum() vs length()

sum() 

  • looks at the values inside a vector and adds them all together 

  • can help you get sample size 

eg.

  • sum(table(countries$continent))

    • table() creates the frequency table of counts

    • sum() adds those counts to get the sample size

length() 

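  • looks at the items in a vector and tells you how many there are 

  • on a data column this gives the number of observations (the sample size); on a frequency table, e.g. length(table(countries$continent)), it gives the number of categories 

eg.

  • length(countries$continent)

    • length() counts the number of elements in the vector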
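A small sketch of the difference, using a made-up countries data frame (the six rows are illustrative only):

```r
# Made-up data: six countries on three continents
countries <- data.frame(
  continent = c("Africa", "Asia", "Asia", "Europe", "Africa", "Asia"),
  stringsAsFactors = TRUE
)

sum(table(countries$continent))      # adds the counts: sample size
length(countries$continent)          # number of elements: also the sample size
length(table(countries$continent))   # number of entries in the table: categories
```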

64
New cards

how does running a x² test differ between no poisson and poisson?

probability distribution (no estimated parameter): 

chisq.test(x = obs_frequencies, p = exp_proportions)

poisson distribution (estimated parameter): 

chisq.test(x = obs_frequencies, p = exp_frequencies, rescale.p = TRUE)$statistic

followed by 

pchisq(q = chi2_value, df = manually_determined_df, lower.tail = FALSE)

65
New cards

what are the steps to perform a binomial test computationally?

  1. identify the 3 required inputs of binom.test()

  • x: the number of observed successes

  • n: the total number of data points or trials

  • p: the proportion specified by your null hypothesis

  2. run the binom.test()

  3. interpret the results based on p-value

66
New cards

what are the steps to create a confidence interval for a proportion computationally 

  1. load the binom package (to use the Agresti-Coull method) 

  • library(binom) 

  2. run the binom.confint() function with method = "ac"

67
New cards

code: confidence interval of mean

knowt flashcard image
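The code on this card was an image and is not recoverable from the page. One common base-R approach, shown here only as an assumption about what the card might have covered, is the t-based 95% confidence interval from t.test(), which matches mean ± t × SE computed by hand. The measurements are made up.

```r
y <- c(4.1, 5.3, 6.0, 5.5, 4.8, 6.2, 5.1)   # made-up sample

# Built-in: 95% confidence interval for the mean
ci_builtin <- t.test(y)$conf.int

# By hand: mean plus or minus the critical t value times the standard error
n      <- length(y)
se     <- sd(y) / sqrt(n)
t_crit <- qt(0.975, df = n - 1)
ci_manual <- mean(y) + c(-1, 1) * t_crit * se

ci_builtin
ci_manual
```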

68
New cards

code: interquartile range

knowt flashcard image

69
New cards

code: coefficient of variation 

knowt flashcard image

70
New cards

code: bar graph

knowt flashcard image

71
New cards

code: boxplot

knowt flashcard image

72
New cards

code: x² contingency analysis 

knowt flashcard imageknowt flashcard image

73
New cards

code: p-value from x² statistic

knowt flashcard image