Sadistic Torture Across Two Semesters
What is a population?
Group of individuals we wish to study (eg. all students at WVC).
What is a parameter?
A number that describes the whole population, such as a population proportion or a population mean. Represented by p (proportion) or μ (mean). e.g. the proportion of all WVC students who work part-time.
What is a census?
A survey of the entire population. This is usually unrealistic because the population is too big.
What is a sample?
Collection of individuals taken from population of interest.
What is a statistic?
A number calculated from the sample, like a median, a sample mean, or a sample proportion. The sample proportion is (number of “yeses”) / (sample size) and is represented by p-hat.
Which value can never be found, statistic or parameter?
Parameter. You can find the value of a statistic by collecting data, but you can only make inferences about the parameter (generalizations about the whole population).
What is sampling bias?
You get a sample that doesn’t accurately represent the whole population. e.g. you only survey a very opinionated group.
What is voluntary-response bias?
Sampling bias where people only respond if they feel strongly about the results (miku fans will respond to surveys about how much they like miku. non-miku fans won’t fill it out at all)
What is nonresponse bias?
People asked to do the survey refuse to fill it out. (might ask for uncomfortable info.)
How do you get a random number from the calculator?
MATH → PRB → 5 → randInt(1, population size, number of results you want)
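If you don't have the calculator handy, here is a rough Python equivalent of the randInt step. The function name `rand_int_list` and the population size of 500 are made up for illustration. Like randInt, it can return the same number twice.

```python
import random

def rand_int_list(low, high, count, seed=None):
    """Return `count` random whole numbers from low to high (inclusive).

    Mimics randInt(low, high, count): repeats are possible.
    """
    rng = random.Random(seed)
    return [rng.randint(low, high) for _ in range(count)]

# Hypothetical example: pick 10 people from a numbered list of 500.
picks = rand_int_list(1, 500, 10, seed=1)
print(picks)
```

If repeats are a problem, `random.sample(range(1, 501), 10)` picks 10 different numbers instead.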
What is the difference between accuracy and precision?
Accurate → how close you are to the target value → measured by how unbiased you are → fixed by getting a random sample
Precise → how close the values are to one another → measured by size of standard error (the smaller the better) → fixed by getting a larger sample
How do you calculate the population proportion based on the sampling distribution?
The mean of the sampling distribution of p-hat is equal to the population proportion p (as long as the samples are random).
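A small simulation sketch of this idea, assuming a made-up population proportion of 0.6 and samples of size 100. It draws lots of random samples, computes p-hat for each one, and checks that the p-hats average out close to p.

```python
import random
from statistics import mean

def simulate_p_hats(p, n, num_samples, seed=0):
    """Draw num_samples random samples of size n and return each sample's p-hat."""
    rng = random.Random(seed)
    p_hats = []
    for _ in range(num_samples):
        # Each individual is a "yes" with probability p.
        yeses = sum(1 for _ in range(n) if rng.random() < p)
        p_hats.append(yeses / n)
    return p_hats

# Assumed values for illustration only.
p_hats = simulate_p_hats(p=0.6, n=100, num_samples=2000)
center = mean(p_hats)
print(center)  # should land close to 0.6
```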
Key to all letters used in stats?
p = population proportion, yeses / total population. Usually a decimal like 0.6. → alternatively, in 1-PropZTest (hypothesis test), it represents p-value.
p̂ = sample proportion, yeses / sample size. Also called p-hat. Usually a decimal like 0.6
x = amount of “yeses”
n = total sample size
Po = assumed population proportion in the null hypothesis. This is what you guess before you do the experiment.
H0 = null hypothesis
HA = alternative hypothesis
μ = population mean
x-bar = sample mean
σ = population standard deviation
s = sample standard deviation
α = significance level
Standard deviation vs. standard error?
Standard deviation = how spread out the individual values in your sample are around the mean
Standard error = how spread out a statistic is from sample to sample. You take the mean (or p-hat) of a bunch of samples. Then you calculate the standard deviation of those values.
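For a sample proportion, the usual textbook shortcut for the standard error is sqrt(p-hat × (1 − p-hat) / n), so you don't actually need a bunch of samples. The notes above don't state this formula, so treat it as an added assumption. The sketch below also shows why a larger sample is more precise: 4× the sample size cuts the standard error in half.

```python
import math

def standard_error(p_hat, n):
    """Standard error of a sample proportion: sqrt(p_hat * (1 - p_hat) / n)."""
    return math.sqrt(p_hat * (1 - p_hat) / n)

# Assumed example values: p-hat = 0.6 with n = 100, then n = 400.
se_small = standard_error(0.6, 100)
se_large = standard_error(0.6, 400)
print(se_small, se_large)
```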
Criteria for CLT (Central Limit Theorem)?
CLT says the sampling distribution is approximately Normal when the conditions below are met. Only then can you run tests on it.
Random sample
Large sample (at least 10 yeses and 10 nos)
Large population (at least 10x sample size)
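The three checks above can be sketched as a quick function. The name `clt_conditions_met` and its arguments are made up for illustration. Whether the sample is random can't be computed, so it is passed in as a yes/no.

```python
def clt_conditions_met(n, p, population_size, is_random=True):
    """Check the three CLT conditions for a proportion.

    n: sample size
    p: proportion used to count expected yeses and nos
    population_size: size of the whole population
    is_random: whether the sample was collected randomly (you decide this)
    """
    expected_yeses = n * p
    expected_nos = n * (1 - p)
    large_sample = expected_yeses >= 10 and expected_nos >= 10
    large_population = population_size >= 10 * n
    return is_random and large_sample and large_population

print(clt_conditions_met(n=100, p=0.6, population_size=5000))  # True
print(clt_conditions_met(n=20, p=0.1, population_size=5000))   # False: only 2 expected yeses
```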
What is a confidence interval?
Interval containing the most plausible values for a parameter. Written like (point estimate) ± (margin of error).
What is margin of error?
z-score * standard error
Multiply the SE by the z-score for your confidence level to get the margin of error:
99% confidence level = 2.58 standard errors
95% confidence level = 1.96 (shortcut method: 2) standard errors
90% confidence level = 1.645 standard errors
80% confidence level = 1.28 standard errors
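The z-scores in this list come from the standard Normal curve. A minimal sketch that looks them up with Python's built-in `statistics.NormalDist` and builds (point estimate) ± (margin of error) for a proportion. The function names and the 60-out-of-100 example are assumptions for illustration, and the standard error uses the usual sqrt(p-hat × (1 − p-hat) / n) shortcut.

```python
import math
from statistics import NormalDist

def z_star(confidence_level):
    """z-score that leaves (1 - confidence_level) / 2 in each tail."""
    tail = (1 - confidence_level) / 2
    return NormalDist().inv_cdf(1 - tail)

def confidence_interval(x, n, confidence_level=0.95):
    """Return (low, high) for a proportion, given x yeses out of n."""
    p_hat = x / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z_star(confidence_level) * se
    return p_hat - margin, p_hat + margin

for level in (0.99, 0.95, 0.90, 0.80):
    print(level, round(z_star(level), 3))

# Hypothetical sample: 60 yeses out of 100.
low, high = confidence_interval(60, 100, 0.95)
print(low, high)
```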
What is a confidence level?
Probability that the confidence intervals created with this process contain the true parameter. Basically: if I create a bunch of confidence intervals, what % of them capture the true value?
Does NOT apply to a single confidence interval. That one either captures the true value (100%) or doesn’t (0%). Confidence level, on the other hand, is usually 0.95.
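The "create a bunch of confidence intervals" idea can be tried out with a simulation. This is only a sketch: the true proportion (0.6) and sample size (200) are made up, and the interval is the usual p-hat ± z × SE. The result should land near the confidence level, not exactly on it.

```python
import math
import random
from statistics import NormalDist

def coverage_rate(p, n, confidence_level, num_intervals, seed=0):
    """Fraction of simulated confidence intervals that capture the true p."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    hits = 0
    for _ in range(num_intervals):
        yeses = sum(1 for _ in range(n) if rng.random() < p)
        p_hat = yeses / n
        margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
        # Each single interval either captures p or it doesn't.
        if p_hat - margin <= p <= p_hat + margin:
            hits += 1
    return hits / num_intervals

rate = coverage_rate(p=0.6, n=200, confidence_level=0.95, num_intervals=2000)
print(rate)  # roughly 0.95
```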
What should you keep in mind for sample size?
Always round up to the nearest whole number
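The notes above don't give the sample size formula, so this sketch assumes the common textbook one: n = (z / margin of error)² × p × (1 − p), with p = 0.5 as the safe guess when you have no estimate. `math.ceil` does the "always round up" part.

```python
import math
from statistics import NormalDist

def required_sample_size(margin_of_error, confidence_level=0.95, p_guess=0.5):
    """Smallest whole-number n that gets the margin of error you want."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    n = (z / margin_of_error) ** 2 * p_guess * (1 - p_guess)
    # Round UP, never down: rounding down would miss the target margin.
    return math.ceil(n)

# Hypothetical goal: 3% margin of error at 95% confidence.
n_needed = required_sample_size(0.03)
print(n_needed)  # 1068
```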
Which proportions can be used to draw conclusions?
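Population proportion. Hypotheses and conclusions are always about the population proportion (p), never the sample proportion (p-hat), which you already know from your data.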
How do you interpret a confidence interval for two populations?
The interval estimates the difference (population 1 − population 2).
(+,+) → Population 1’s value is significantly larger
(-,-) → Population 2’s value is significantly larger
(-,+) → No significant difference between populations (contains 0)
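The sign rule above, written as a tiny function. It assumes the interval is for (population 1 − population 2). The function name and the example intervals are made up.

```python
def interpret_difference_interval(lower, upper):
    """Interpret a confidence interval for (population 1 - population 2)."""
    if lower > 0:
        return "population 1 is significantly larger"   # (+,+)
    if upper < 0:
        return "population 2 is significantly larger"   # (-,-)
    return "no significant difference"                   # interval contains 0

print(interpret_difference_interval(0.02, 0.10))
print(interpret_difference_interval(-0.10, -0.02))
print(interpret_difference_interval(-0.03, 0.05))
```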
What is a significance level?
How okay you are with making a mistake, specifically rejecting the null hypothesis when it is actually true. Usually 0.05, represented by alpha (α).
What is a test statistic?
How many standard errors the observed proportion is above/below the value in the null hypothesis. The farther it is from 0, the more evidence you have against the null. Represented by z (like z-score).
Only use if the data passes CLT.
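A minimal sketch of the z test statistic for one proportion, assuming the standard formula z = (p-hat − Po) / sqrt(Po × (1 − Po) / n), which is not written out in these notes. The example numbers are made up.

```python
import math

def z_test_statistic(x, n, p0):
    """Standard errors between the observed proportion and the null value p0."""
    p_hat = x / n
    # Under the null hypothesis, the standard error is built from p0, not p-hat.
    se_null = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / se_null

# Hypothetical data: 60 yeses out of 100, null hypothesis says Po = 0.5.
z = z_test_statistic(60, 100, 0.5)
print(z)  # about 2 standard errors above the null value
```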
What is a p-value?
The probability of getting data at least as extreme as what you observed, assuming the null hypothesis is true. Represented by p.
Small p-value → z-test statistic far from 0 → data like this would be unlikely if the null were true → reject the null
Large p-value → z-test statistic close to 0 → data like this is pretty likely if the null were true → don’t reject the null
What is the relationship between p-value and significance level?
p < significance level → enough evidence to reject the null
p > significance level → not enough evidence. don’t reject the null
What are the 4 steps for hypothesis testing?
Write the null and alternative hypotheses
Choose a significance level and check CLT
Run 1-PropZTest and find the z (test statistic) and p (p-value).
Decide whether you reject or fail to reject the null hypothesis, and interpret that in context
What are the “tailed” tests?
Right-tailed test: The alternative hypothesis says the proportion is bigger than assumed (p > Po). The right part of the normal curve is shaded, representing the p-value.
Two-tailed test: The alternative hypothesis says the proportion is not equal to what is assumed (p ≠ Po). The p-value is doubled, and is shaded on the end of both sides of the normal curve.
Left-tailed test: The alternative hypothesis says the proportion is smaller than assumed (p < Po). The left part of the normal curve is shaded, representing the p-value.
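The three shaded areas above can be sketched with the standard Normal curve. This assumes you already have a z test statistic. The function names and the tail labels ("right", "left", "two") are made up for illustration, and the decision step uses p < α as the cutoff.

```python
from statistics import NormalDist

def p_value(z, tail):
    """Area under the standard Normal curve for the chosen tail.

    tail: "right" (p > Po), "left" (p < Po), or "two" (p ≠ Po)
    """
    normal = NormalDist()
    if tail == "right":
        return 1 - normal.cdf(z)
    if tail == "left":
        return normal.cdf(z)
    if tail == "two":
        # Double the area of the tail beyond |z|.
        return 2 * (1 - normal.cdf(abs(z)))
    raise ValueError("tail must be 'right', 'left', or 'two'")

def decision(p, alpha=0.05):
    """Compare the p-value to the significance level."""
    return "reject the null" if p < alpha else "fail to reject the null"

# Hypothetical test statistic of z = 2.
for tail in ("right", "two", "left"):
    p = p_value(2.0, tail)
    print(tail, round(p, 4), decision(p))
```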