BIOL 300 Midterm Memorize

73 Terms

New cards

conditions for binomial test

test whether a population proportion (p) matches a null expectation for the proportion
one variable
categorical variable
two categories (success, fail)

New cards

what is the flow chart for deciding which test to use?

How many variables?

ONE
- categorical
  - 2 categories
    - BINOMIAL TEST
  - 2+ categories
    - x² GOF - PROBABILITY DISTRIBUTION
- numerical
  - x² GOF - POISSON DISTRIBUTION
TWO
- sample size assumption met
  - x² CONTINGENCY TEST
- sample size assumption not met
  - FISHER’S EXACT TEST

New cards

conditions for x² goodness of fit test

use frequency data to test whether a population proportion equals a null hypothesized proportion that comes from a specified probability distribution
use when the sample size is too large for a binomial test to be practical (summing the probabilities of the observed and all more extreme outcomes gets tedious); the x² test is quicker but only an approximation, so it is less precise
one variable
categorical or discrete numerical variable
no more than 20% of categories with expected < 5 & no category with expected < 1 & random sampling

New cards

what are the conditions for the x² contingency test

want to test the association of two or more categorical variables
two or more variables
categorical variables
no more than 20% of categories with expected < 5; no category with expected < 1; random sampling

New cards

what are the conditions for fisher’s exact test?

want to test the association of two categorical variables by calculating the exact probability of the sample under the null hypothesis (useful when the assumptions of x² test are violated)
two variables
categorical variables
random sampling

New cards

what are the test statistics for the binomial test, x² goodness of fit test, x² contingency test and the fisher’s exact test

(test statistic: value(s) calculated from the sample that are relevant to testing the null hypothesis)

binomial: p-hat (sample proportion) or x (number of successes of n trials)

x is used to calculate p-hat

x² gof: x²

x² contingency test: x²

fisher’s exact test: none (not really done by hand)

New cards

how to decide over binomial test and x² gof test

choose binomial when gof assumption not met
choose gof if too tedious to calculate probabilities of binomial (if gof assumptions are met as well)

New cards

how to decide between x² contingency and fisher

choose fisher if x² test assumptions are violated

New cards

which tests have sample assumptions and what are they

x² goodness of fit (probability distribution/poisson distribution) & x² contingency
no more than 20% of categories with expected values < 5; no category with expected < 1; random sampling

New cards

which tests use degrees of freedom and how do you calculate them?

x² goodness of fit test & x² contingency test

gof: (# of categories) - (# parameters estimated from data) - 1

contingency: (# columns - 1) * (# rows - 1)

New cards

what are the null hypotheses for a binomial test, x² gof test, x² contingency test, and a fisher’s exact test?

binomial: the relative frequency of successes in the population is p0

(p = p0)

gof: the data come from a specified probability distribution (proportional, poisson, binomial)

contingency: the variables are independent

fisher’s: the variables are independent

New cards

what are clear indicators that a x² goodness of fit test should be based on a poisson distribution and not some other distribution (proportional distribution)?

poisson

numerical discrete data
being asked if events are randomly distributed in time or space

New cards

what is x² (chi-squared statistic)

a single, non-negative number that quantifies the total discrepancy between a set of observed frequencies and the corresponding expected frequencies under a specific hypothesis

New cards

how does the use of x² compare and contrast in the goodness-of-fit test and the contingency test

same formula, differs in purpose and calculation of expected frequencies
gof: you are comparing your data to an external blueprint (the expected distribution)
contingency: you are comparing your data to an internal standard (what would be expected if the two variables were unrelated, as derived from the table’s own margins)

New cards

sample mean formula

New cards

sample variance formula

New cards

sample variance shortcut formula

New cards

sample standard deviation

New cards

sample standard deviation short cut

New cards

coefficient of variation formula
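
The five formula cards above (sample mean through coefficient of variation) can be written in standard notation as follows. The symbols are assumptions chosen here: Y_i for each measurement, n for the sample size, and the shortcut form is the usual algebraic rearrangement of the variance.

```latex
\bar{Y} = \frac{1}{n}\sum_{i=1}^{n} Y_i
\qquad
s^2 = \frac{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}{n - 1}
\qquad
s^2_{\text{shortcut}} = \frac{\sum_{i=1}^{n} Y_i^2 - n\bar{Y}^2}{n - 1}

s = \sqrt{s^2}
\qquad
s_{\text{shortcut}} = \sqrt{\frac{\sum_{i=1}^{n} Y_i^2 - n\bar{Y}^2}{n - 1}}
\qquad
CV = \frac{s}{\bar{Y}} \times 100\%
```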

New cards

median

the middle observation (if the number of observations is even, average the middle two values)

New cards

sample proportion + formula

New cards

how does adding a constant to all the measurements change the sample mean?
how does multiplying all the measurements by a constant change the sample mean?

the constant is added to the sample mean
the sample mean is multiplied by the constant

New cards

how does adding a constant to all the measurements change the sample variance?
how does multiplying all the measurements by a constant change the sample variance?

adding a constant does not change the variance
for multiplication the variance is multiplied by the constant squared (s² * c²)

New cards

how does adding a constant to all the measurements change the median?
how does multiplying all the measurements by a constant change the median?

the constant is added to the median
the median is multiplied by the constant

New cards

how does adding a constant to all the measurements change the IQR?
how does multiplying all the measurements by a constant change the IQR?

the IQR is the same when adding
the IQR is multiplied by the absolute value of the constant (so it stays positive)
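
The four cards on adding and multiplying by a constant can be checked numerically. This is a minimal sketch in R; the vector `y` and the constants `k` and `m` are made-up values, not course data.

```r
# Hypothetical sample and constants (assumptions for illustration only)
y <- c(2, 4, 4, 7, 9, 12)
k <- 3    # constant added to every measurement
m <- -2   # constant every measurement is multiplied by

# Adding a constant shifts the mean and median, leaves variance and IQR alone
add_mean_ok   <- isTRUE(all.equal(mean(y + k),   mean(y) + k))
add_median_ok <- isTRUE(all.equal(median(y + k), median(y) + k))
add_var_ok    <- isTRUE(all.equal(var(y + k),    var(y)))
add_iqr_ok    <- isTRUE(all.equal(IQR(y + k),    IQR(y)))

# Multiplying scales the mean and median by m, the variance by m^2,
# and the IQR by the absolute value of m
mult_mean_ok   <- isTRUE(all.equal(mean(y * m),   mean(y) * m))
mult_median_ok <- isTRUE(all.equal(median(y * m), median(y) * m))
mult_var_ok    <- isTRUE(all.equal(var(y * m),    var(y) * m^2))
mult_iqr_ok    <- isTRUE(all.equal(IQR(y * m),    IQR(y) * abs(m)))
```

A negative `m` is used on purpose: it shows why the IQR rule needs the absolute value while the mean and median rules do not.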

New cards

standard error of the mean + formula

measures the precision of the sample estimate (y-bar) of the population mean (mu)

assumes that the sample is random
we have to estimate this because we do not normally know the true standard deviation of Y in the population

SE(y-bar) = s / sqrt(n)

s = sample standard deviation
n = sample size

parameter: sigma(y-bar) = sigma / sqrt(n)

but we do not know the standard deviation of the population (sigma), so s is used in its place


New cards

Pr[A and B] for mutually exclusive events

Pr[A and B] = 0

New cards

Pr[A or B] for mutually exclusive events

Pr[A or B] = Pr[A] + Pr[B]

addition rule

New cards

Pr[A or B] for not mutually exclusive events

Pr[A or B] = Pr[A] + Pr[B] - Pr[A and B]

general addition rule

New cards

Pr[A and B] for independent events

Pr[A and B] = Pr[A]Pr[B]

multiplication rule

New cards

Pr[A and B] for dependent events

Pr[A and B] = Pr[A]Pr[B|A]

general multiplication rule

New cards

Law of total probability formula
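
In standard notation, and assuming the events B_i are mutually exclusive and together cover every possibility (the index i is a symbol chosen here), the law can be written as:

```latex
\Pr[A] = \sum_{i} \Pr[B_i]\,\Pr[A \mid B_i]
```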

New cards

binomial distribution probability formula

Pr[X successes] = (n choose X) * p^X * (1 - p)^(n - X)

where (n choose X) = n! / (X!(n - X)!)

New cards

estimate of the standard error of a proportion formula

New cards

agresti-coull 95% confidence interval for a proportion

p’ = (X + 2)/(n + 4)
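
The standard error and Agresti-Coull cards can be sketched together in R. The counts `x` and `n` are made up, the standard error uses the common form sqrt(p-hat(1 - p-hat)/n), and 1.96 is the usual multiplier for a 95% interval; treat all three as assumptions rather than the course's exact convention.

```r
# Hypothetical data: x successes out of n trials
x <- 14
n <- 18

# Sample proportion and its estimated standard error
p_hat    <- x / n
se_p_hat <- sqrt(p_hat * (1 - p_hat) / n)

# Agresti-Coull adjusted proportion: add 2 successes and 2 failures
p_adj <- (x + 2) / (n + 4)

# 95% confidence interval centred on the adjusted proportion
half_width <- 1.96 * sqrt(p_adj * (1 - p_adj) / (n + 4))
ci <- c(lower = p_adj - half_width, upper = p_adj + half_width)
```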

New cards

binomial test p-value formula

P = 2 * (sum of the probabilities of the observed outcome and all more extreme outcomes in one tail)

New cards

what can you do to meet the assumptions of x²

combine categories

New cards

when is the null hypothesis rejected for x²

if the observed x² statistic exceeds the critical value of the x² distribution corresponding to the significance level α

New cards

what does comparing the variance of the number of successes per block of time or space to the mean number of successes do?

measures the direction of departure from randomness in time and space

if the variance is greater than the mean, the successes are clumped
if the variance is less than the mean, successes are more evenly distributed than expected by the poisson distribution
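
A quick way to see the comparison is to compute the variance-to-mean ratio. This is a small sketch in R; `counts` is a made-up vector of successes per block, not real data.

```r
# Hypothetical number of successes in each of 10 blocks of space
counts <- c(0, 0, 1, 0, 5, 0, 7, 0, 0, 6)

# Under a Poisson distribution the variance equals the mean, so the ratio is near 1
variance_to_mean <- var(counts) / mean(counts)

# Ratio above 1 suggests clumping; below 1 suggests more even spacing
pattern <- if (variance_to_mean > 1) "clumped" else "even or random"
```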

New cards

formula for the x² test statistic
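
In standard notation the statistic sums, over every category i, the squared difference between observed and expected frequencies scaled by the expected frequency. The subscript i is a symbol chosen here.

```latex
\chi^2 = \sum_{i} \frac{(\text{Observed}_i - \text{Expected}_i)^2}{\text{Expected}_i}
```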

New cards

formula for the poisson distribution

(describes the number of independent events that occur per unit of time or space)
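
In standard notation, assuming μ is the mean number of events per unit of time or space and X is the number of events:

```latex
\Pr[X \text{ events}] = \frac{e^{-\mu}\,\mu^{X}}{X!}
```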


New cards

odds of success + formula

probability of success divided by the probability of failure

New cards

the odds ratio

the odds of a focal outcome in one of two groups (treatment group) divided by the odds of that outcome in the second group (control)

used to quantify the magnitude of associations between two categorical variables, each of which has two categories
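
The odds and odds ratio definitions can be worked through with two made-up proportions. This is a sketch in R; `p_treatment` and `p_control` are hypothetical values.

```r
# Hypothetical probabilities of the focal outcome in each group
p_treatment <- 0.40
p_control   <- 0.25

# Odds = probability of success / probability of failure
odds_treatment <- p_treatment / (1 - p_treatment)
odds_control   <- p_control / (1 - p_control)

# Odds ratio = odds in the treatment group / odds in the control group
odds_ratio <- odds_treatment / odds_control
```

An odds ratio of 1 would mean the outcome is equally likely in both groups; here the treatment odds are twice the control odds.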

New cards

z-standardization + formula

converts values from any normal distribution with known mean and known standard deviation into standard normal deviates
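
In standard notation, assuming Y is the original value, μ the known mean, and σ the known standard deviation:

```latex
Z = \frac{Y - \mu}{\sigma}
```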

New cards

what are the steps to perform a binomial test?

state null and alternative

H0: p = p0

determine the number of successes

this count is used to compute the test statistic (the sample proportion), which is compared against the null expectation

calculate p-value using the binomial formula

the probability of observing a result as extreme or more extreme than the observed statistic (X) assuming the null hypothesis is true
use the binomial formula to calculate the probability of the observed and more extreme for one of the tails then multiply by 2

draw conclusions

reject the null if the p-value is smaller than alpha (0.05 if not mentioned); otherwise fail to reject
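
The steps above can be sketched in R with made-up numbers. `n`, `x`, and `p0` are hypothetical, and `binom.test()` is included only as a cross-check; its two-sided p-value is computed differently from the doubling rule and agrees here because the null distribution with p0 = 0.5 is symmetric.

```r
# Hypothetical data: 14 successes in 18 trials, H0: p = 0.5
n  <- 18
x  <- 14
p0 <- 0.5

# Test statistic: sample proportion
p_hat <- x / n

# Probability of the observed outcome and all more extreme ones in the upper tail
upper_tail <- sum(dbinom(x:n, size = n, prob = p0))

# Two-tailed p-value: double the one-tail probability (about 0.031)
p_value <- 2 * upper_tail

# Cross-check with the built-in test
p_builtin <- binom.test(x, n, p = p0)$p.value

# Conclusion at alpha = 0.05
reject_null <- p_value < 0.05
```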

New cards

what are the steps to perform a x² goodness of fit test based on a probability distribution

state null and alt

null: frequency distribution of the observed data matches a specific probability model or set of expected proportions

calculate expected frequencies (E)

use the rule or probability model stated in the null to determine the expected frequency for each category

the sum of all expected frequencies must equal the sum of all observed frequencies

ensure that x² assumptions are true by checking expected frequencies

if assumptions are not met combine categories to increase the expected frequencies and then proceed with the new set of categories

calculate the x² test statistic

x² measures the total discrepancy between the observed counts and the expected counts

determine the degrees of freedom

the df specify which x² theoretical distribution is used as the null distribution

find the p-value and conclude

compare the calculated x² statistic to the x² distribution with the calculated df
the p-value is the area in the right tail of the distribution
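
The same steps can be followed by hand in R. This is a sketch with made-up counts; the category names, `observed`, and `null_props` are all hypothetical.

```r
# Hypothetical observed frequencies and null proportions
observed   <- c(A = 30, B = 50, C = 20)
null_props <- c(A = 0.25, B = 0.50, C = 0.25)

# Expected frequencies: total sample size times each null proportion
expected <- sum(observed) * null_props

# Assumption check: no expected < 1, and at most 20% of categories with expected < 5
assumptions_met <- all(expected >= 1) && mean(expected < 5) <= 0.20

# Test statistic: total discrepancy between observed and expected
chi_sq <- sum((observed - expected)^2 / expected)

# Degrees of freedom: categories - parameters estimated from data (0 here) - 1
df <- length(observed) - 0 - 1

# p-value: area in the right tail of the chi-squared distribution
p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)
```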

New cards

what is the critical value (in terms of x²)

the threshold on the x² distribution that separates the fail to reject and reject region

choose the significance level
look up the values in a x² critical values table

find the row corresponding to your df
find the column corresponding to α
the intersection is the critical value
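
Instead of a printed table, the critical value can be looked up with R's `qchisq()`. The values of `alpha` and `df` below are example choices, not fixed by the card.

```r
# Example significance level and degrees of freedom (assumed values)
alpha <- 0.05
df    <- 2

# Critical value: the point with alpha of the distribution's area to its right
critical_value <- qchisq(1 - alpha, df = df)

# A hypothetical observed statistic is compared against the threshold
chi_sq_observed <- 7.1
reject_null <- chi_sq_observed > critical_value
```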

New cards

in an x² goodness of fit test (probability distribution) how does the critical value help us draw conclusions

since we only consider the right tail

anything that is smaller than the critical value is considered more likely to happen
- fail to reject the null
- p larger than alpha
anything that is larger than the critical value is considered less likely to happen
- reject the null
- p is smaller than alpha

New cards

what are the steps to perform a x² goodness of fit test based on a poisson distribution?

state the null and alt

null: the number of successes per block of time or space has a poisson distribution

estimate mean rate of the poisson distribution

since it is usually unknown, it must be estimated from your data
calculate the sample mean of your observed count data. This value is used as the mu parameter in the poisson formula

calculate the expected frequencies using the poisson formula

check assumptions and combine categories if needed

calculate the x² test statistic

determine the degrees of freedom
draw conclusions by comparing x² to the x² distribution with the determined df

if calculated x² statistic exceeds the critical value from the x² table, reject the null and conclude that the data does not follow a poisson distribution

New cards

what are the steps to perform a x² contingency test?

state the null and alt

null: the two categorical variables are independent
alt: the two categorical variables are not independent

organize the observed data

arrange your sample data into a contingency table listing the observed frequencies for every combination of the two categorical variables
count the number of rows and number of columns in your table

calculate the expected frequency for every cell in the table and check for assumptions

calculate the x² test statistic

determine the degrees of freedom

find the p-value and conclude
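
A by-hand version of these steps in R, as a sketch with a made-up 2 x 2 table. The group and outcome labels are hypothetical, and the expected counts use the usual (row total * column total) / grand total rule.

```r
# Hypothetical observed frequencies
observed <- matrix(c(20, 30,
                     30, 20),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(group   = c("treatment", "control"),
                                   outcome = c("success", "fail")))

# Expected frequency per cell under independence
n <- sum(observed)
expected <- outer(rowSums(observed), colSums(observed)) / n

# Assumption check on the expected frequencies
assumptions_met <- all(expected >= 1) && mean(expected < 5) <= 0.20

# Test statistic, degrees of freedom, and right-tail p-value
chi_sq  <- sum((observed - expected)^2 / expected)
df      <- (nrow(observed) - 1) * (ncol(observed) - 1)
p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)
```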

New cards

how does a fisher’s exact test work and what are the steps to perform it computationally?

an exact test unlike the x² contingency test which is an approximation
has no minimum data requirements
the test’s output allows you to determine if there is a statistically significant association (or lack of independence) between the two variables by providing a p-value
the R function also provides the odds ratio as a measure of the strength of the association

create a frequency table

test requires the data to be in a contingency table
after reading in data use table() command to create this object
explanatory variable should come first

(optional) set the order of the factor level

ensures the odds ratio is calculated with the probability of success in the numerator
default is alphabetical

run the test

use the fisher.test() function providing your created frequency table as the input

interpret the results

to get just odds ratio: sex_survive_fisher$estimate
to get just the 95% confidence interval: sex_survive_fisher$conf.int
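
Put together, the workflow might look like the sketch below. The data frame `passengers` and its counts are invented for illustration; only the object name `sex_survive_fisher` comes from the card.

```r
# Hypothetical data: 10 females and 10 males with a survival outcome
passengers <- data.frame(
  sex = factor(rep(c("female", "male"), times = c(10, 10))),
  # Levels set explicitly so "yes" (the success) comes first
  survive = factor(c(rep("yes", 8), rep("no", 2),
                     rep("yes", 3), rep("no", 7)),
                   levels = c("yes", "no"))
)

# Frequency table with the explanatory variable first
sex_survive_table <- table(passengers$sex, passengers$survive)

# Run the test
sex_survive_fisher <- fisher.test(sex_survive_table)

# Pull out the pieces of interest
fisher_p   <- sex_survive_fisher$p.value
fisher_or  <- sex_survive_fisher$estimate
fisher_ci  <- sex_survive_fisher$conf.int
```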

New cards

what are the steps to perform a x² contingency test computationally?

create a frequency table using table()

explanatory variable first

check the x² test assumptions (all expected values are greater than 1 and at least 80% are greater than 5) by using $expected

run the test using chisq.test() with correct = FALSE (to turn off Yates' continuity correction, which can be overly conservative)

interpret the results

returns x² value, df and p-value
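
These steps could be run as in the sketch below. The data frame `trial`, its column names, and its counts are hypothetical.

```r
# Hypothetical data: 40 drug and 60 placebo patients
trial <- data.frame(
  treatment = rep(c("drug", "placebo"), times = c(40, 60)),
  outcome   = c(rep(c("recovered", "sick"), times = c(30, 10)),
                rep(c("recovered", "sick"), times = c(20, 40))),
  stringsAsFactors = TRUE
)

# Frequency table, explanatory variable first
trial_table <- table(trial$treatment, trial$outcome)

# Run the test without the continuity correction
trial_chisq <- chisq.test(trial_table, correct = FALSE)

# Check assumptions using the expected frequencies stored in the result
expected <- trial_chisq$expected
assumptions_met <- all(expected >= 1) && mean(expected < 5) <= 0.20

# Statistic, degrees of freedom, and p-value
chi_sq  <- unname(trial_chisq$statistic)
df      <- unname(trial_chisq$parameter)
p_value <- trial_chisq$p.value
```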

New cards

what is the function to run the fisher’s test?

fisher.test(table)

New cards

what is the function to run the x² test

chisq.test(table, correct = FALSE)

New cards

code: changing the position of categories

factor(data$column, levels = c("name1", "name2"))

New cards

levels()

shows you the categories in a column and its order

New cards

what should you add when reading in data (esp for a fisher or contingency test)

stringsAsFactors = TRUE

New cards

code: creating a mosaic plot

mosaicplot(table)

optional additions: color = c("colour1", "colour2"), xlab = "", ylab = ""

New cards

pulling odds ratio or confidence interval from a fisher test

object$estimate

object$conf.int

New cards

what are the steps to perform a x² goodness of fit test computationally? (probability and poisson)

x² goodness of fit test compares observed category frequencies to the frequencies predicted by a null hypothesis
either a null hypothesis specifies the probability or you estimate a parameter for your data (poisson distribution)

Case 1: probabilities given

get observed frequencies using table()

define expected proportions

create a vector containing the expected proportions for each category as specified by your null hypothesis

check the expected frequencies for test assumptions

find the total sample size: sum(MMtable)
calculate the expected frequencies: 55 * expected_proportions
combine categories to meet expectations

run the test by passing expected_proportions to chisq.test() through its p argument

interpret results
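
Case 1 might look like this in R. The colour counts are made up (chosen to total 55 so they match the card), and equal proportions are an assumed null hypothesis; `MMtable` and `expected_proportions` follow the names used above.

```r
# Hypothetical sample of 55 candies
candy <- rep(c("blue", "green", "red", "yellow"), times = c(20, 10, 15, 10))

# Observed frequencies
MMtable <- table(candy)

# Expected proportions under the null (assumed equal here), in table order
expected_proportions <- c(0.25, 0.25, 0.25, 0.25)

# Expected frequencies and assumption check
expected_frequencies <- sum(MMtable) * expected_proportions
assumptions_met <- all(expected_frequencies >= 1) &&
  mean(expected_frequencies < 5) <= 0.20

# Run the test
MM_chisq <- chisq.test(MMtable, p = expected_proportions)
chi_sq   <- unname(MM_chisq$statistic)
df       <- unname(MM_chisq$parameter)
p_value  <- MM_chisq$p.value
```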

Case 2: Test with estimated parameters (poisson)

get the observed frequencies using table()

estimate parameters from the data

the null hypothesis is that the data follows a poisson distribution, but the mean is unknown; so we must estimate it from the data

calculate expected probabilities

using the estimated parameter use the dpois() function to find the expected probability for each possible outcome

calculate the expected frequencies and combine categories

get the total sample size: length(column of interest)
calculate expected frequencies by multiplying the probabilities by the total sample size: # * expected_probability
check assumptions and combine
create new vectors for your combined observed and expected frequencies
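
Case 2 could be sketched as follows. The `events` vector is invented, and the choice to lump everything above 4 into a "5 or more" category is an assumption made so the expected probabilities sum to 1. The p-value is computed with `pchisq()` directly because `chisq.test()` has no way to know a parameter was estimated from the data, so its degrees of freedom would be one too many.

```r
# Hypothetical counts of events per block (100 blocks)
events <- rep(0:5, times = c(10, 25, 30, 20, 10, 5))

# Observed frequencies for 0, 1, 2, 3, 4, and 5 events
observed <- as.vector(table(events))

# Estimate the Poisson mean from the data
mu_hat <- mean(events)

# Expected probabilities for 0 to 4, with the last category as "5 or more"
expected_probability <- c(dpois(0:4, lambda = mu_hat),
                          ppois(4, lambda = mu_hat, lower.tail = FALSE))

# Expected frequencies and assumption check
n <- length(events)
expected <- n * expected_probability
assumptions_met <- all(expected >= 1) && mean(expected < 5) <= 0.20

# Test statistic; df = categories - 1 estimated parameter - 1
chi_sq  <- sum((observed - expected)^2 / expected)
df      <- length(observed) - 1 - 1
p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)
```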