Explain the difference between ratio, interval, ordinal, and nominal scales of measurement.
Nominal: Categories without any order (e.g., gender, eye color).
Ordinal: Ordered categories, but the distances between adjacent categories are not necessarily equal (e.g., rankings).
Interval: Ordered values with equal intervals but no true zero, so differences are meaningful but ratios are not (e.g., temperature in Celsius).
Ratio: Like interval, but with a true zero point, so ratios are meaningful (e.g., weight, height).
Explain how skew and kurtosis affect measurements of central tendency and/or measures of variability.
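Skew: Makes the distribution asymmetric and pulls the mean toward the longer tail, away from the median. For strongly skewed data the median is the better measure of central tendency.
Kurtosis: Refers to the 'tailedness' of the distribution. Heavier tails mean more extreme values, which inflate the variance and standard deviation.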
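As a rough illustration, the sketch below computes moment-based skewness and excess kurtosis for a small made-up data set and compares the mean with the median. The data values and the helper names `skewness` and `excess_kurtosis` are assumptions chosen for the example.

```python
import statistics

def skewness(xs):
    """Moment-based skewness: the third standardized moment."""
    m = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return sum(((x - m) / sd) ** 3 for x in xs) / len(xs)

def excess_kurtosis(xs):
    """Fourth standardized moment minus 3 (0 for a normal distribution)."""
    m = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return sum(((x - m) / sd) ** 4 for x in xs) / len(xs) - 3

# Hypothetical right-skewed data: most values are small, one is large.
data = [1, 2, 2, 3, 3, 3, 4, 4, 20]

mean = statistics.fmean(data)
median = statistics.median(data)
skew = skewness(data)
kurt = excess_kurtosis(data)
print(mean, median, skew, kurt)
```

With positively skewed data like this, the mean sits above the median, and the single extreme value also produces positive excess kurtosis.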
What is variance? What is the difference between a standard deviation and a standard error of the mean? Explain what a sampling distribution is in your answer.
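Variance: The average squared deviation from the mean.
Standard Deviation: The square root of the variance; it describes the spread of individual scores in the original units.
Standard Error: The standard deviation of the sampling distribution of the mean, estimated as the sample standard deviation divided by the square root of N. It describes how much sample means vary around the population mean.
Sampling Distribution: The distribution of a statistic (here, the sample mean) over repeated samples of the same size from the population.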
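A minimal sketch of how these quantities relate, assuming a simulated normal population (mean 100, SD 15) and a sample size of 25; all of these numbers are illustrative. The standard error estimated from one sample should be close to the standard deviation of many simulated sample means.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: 10,000 values from a normal distribution.
population = [random.gauss(100, 15) for _ in range(10_000)]

n = 25
sample = random.sample(population, n)

sample_var = statistics.variance(sample)  # divides by n - 1
sample_sd = math.sqrt(sample_var)         # spread of individual scores
sem = sample_sd / math.sqrt(n)            # estimated spread of sample means

# Approximate the sampling distribution of the mean by repeated sampling.
sample_means = [statistics.fmean(random.sample(population, n)) for _ in range(2_000)]
sd_of_means = statistics.stdev(sample_means)

print(sample_sd, sem, sd_of_means)
```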
Why does sample variance underestimate the true population variance?
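When deviations are taken from the sample mean and the sum of squares is divided by N, the result underestimates the population variance. The sample mean is the value that minimizes the sum of squared deviations for that sample, so the deviations are smaller on average than they would be around the true population mean. The bias is largest in small samples.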
Explain why N-1 is used as the denominator for sample variance.
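Dividing by N-1 corrects the bias in estimating the population variance from a sample; this is known as Bessel’s correction. Because the sample mean is estimated from the same data, only N-1 of the deviations are free to vary, so N-1 is the degrees of freedom.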
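The bias can be seen in a small simulation. The sketch below assumes a standard normal population (true variance 1), samples of size 5, and 20,000 repetitions; these settings are arbitrary choices for illustration.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

TRUE_VARIANCE = 1.0  # variance of the standard normal population
n = 5
reps = 20_000

biased, unbiased = [], []
for _ in range(reps):
    sample = [random.gauss(0, 1) for _ in range(n)]
    m = statistics.fmean(sample)
    ss = sum((x - m) ** 2 for x in sample)  # squared deviations from the sample mean
    biased.append(ss / n)          # divide by N
    unbiased.append(ss / (n - 1))  # divide by N - 1 (Bessel's correction)

mean_biased = statistics.fmean(biased)
mean_unbiased = statistics.fmean(unbiased)
print(mean_biased, mean_unbiased, TRUE_VARIANCE)
```

Averaged over many samples, the divide-by-N estimate should land near (N-1)/N times the true variance (about 0.8 here), while the divide-by-(N-1) estimate should land near 1.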
Explain what is meant by “efficient,” “unbiased,” “sufficient,” and “resistant” when referring to estimators.
Efficient: Smallest variance among unbiased estimators.
Unbiased: Expected value equals the true population parameter.
Sufficient: Captures all the information in the sample that is relevant to the parameter.
Resistant: Not heavily influenced by outliers (e.g., the median).
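Resistance is easy to check numerically. The sketch below uses made-up scores and adds one hypothetical extreme value, then compares how far the mean and the median move.

```python
import statistics

scores = [10, 11, 12, 13, 14]
with_outlier = scores + [100]  # hypothetical extreme value

# How much each estimator changes when the outlier is added.
mean_shift = statistics.fmean(with_outlier) - statistics.fmean(scores)
median_shift = statistics.median(with_outlier) - statistics.median(scores)
print(mean_shift, median_shift)
```

The median barely moves, while the mean is dragged well toward the outlier, which is why the median is described as resistant and the mean is not.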
How does sample size affect efficiency? What about the standard error of the mean?
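Larger sample sizes make an estimator more precise (its sampling variance shrinks) and reduce the standard error of the mean, which equals the standard deviation divided by the square root of N.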
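A short sketch of the square-root relationship, assuming a hypothetical population standard deviation of 15; the function name `standard_error` is introduced only for this example.

```python
import math

sigma = 15.0  # hypothetical population standard deviation

def standard_error(sigma, n):
    """Standard error of the mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

for n in (25, 100, 400):
    print(n, standard_error(sigma, n))
```

Because N sits under a square root, quadrupling the sample size only halves the standard error.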
Why can you use the normal distribution (z) to estimate a binomial distribution? Why should you not use the normal distribution to estimate the binomial distribution when you have small sample sizes?
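A binomial count is a sum of independent trials, so by the Central Limit Theorem its distribution approaches the normal distribution as the number of trials grows. A common rule of thumb is that Np and N(1-p) should both be at least 5 to 10.
With small samples the binomial distribution is discrete and, unless p is near 0.5, skewed, so the normal approximation is inaccurate.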
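To see how the approximation behaves, the sketch below compares an exact binomial probability with its normal approximation (using a continuity correction) for one large and one small case. The specific values of N, p, and k are assumptions picked for illustration.

```python
import math
from statistics import NormalDist

def binom_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def normal_approx_cdf(k, n, p):
    """Normal approximation to P(X <= k) with a continuity correction."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    return NormalDist(mu, sigma).cdf(k + 0.5)

# Hypothetical cases: a large, balanced sample and a small, skewed one.
large_error = abs(binom_cdf(45, 100, 0.5) - normal_approx_cdf(45, 100, 0.5))
small_error = abs(binom_cdf(0, 10, 0.1) - normal_approx_cdf(0, 10, 0.1))
print(large_error, small_error)
```

The approximation error is tiny for N = 100 with p = 0.5, but much larger for N = 10 with p = 0.1, where Np is only 1.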
What is a null hypothesis? Explain the logic of null hypothesis significance testing.
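The null hypothesis states that there is no effect or difference. Null hypothesis significance testing assumes the null is true, calculates how likely the observed result (or one more extreme) would be under that assumption, and rejects the null if that probability falls below alpha.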
What is a p-value? How do p-values differ from effect sizes? Do sample sizes affect p-values? What about effect sizes?
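P-value: Probability of observing data at least as extreme as the data obtained, assuming the null hypothesis is true.
Effect Size: Measures the magnitude of a difference or relationship, independent of sample size.
Sample size affects p-values: for the same effect, larger samples give smaller p-values. It does not systematically change effect sizes, although larger samples estimate them more precisely.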
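The contrast can be sketched with a one-sample z-test in which the standardized effect size is held fixed and only N changes. The effect size of 0.2 and the sample sizes are hypothetical, and `two_tailed_p` is a helper defined just for this example.

```python
import math
from statistics import NormalDist

def two_tailed_p(d, n):
    """Two-tailed p-value of a one-sample z-test for standardized effect d."""
    z = d * math.sqrt(n)  # z = (mean difference / sigma) * sqrt(n)
    return 2 * (1 - NormalDist().cdf(abs(z)))

d = 0.2  # hypothetical small standardized mean difference
for n in (25, 100, 400):
    print(n, two_tailed_p(d, n))
```

The effect size is identical in all three cases, yet the p-value drops from non-significant to significant as N increases.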
What is the purpose of standardizing a distribution using the z-formula? What do the mean and standard deviation become when you standardize a distribution?
Standardizing converts scores to a common scale (standard deviation units), which allows comparisons across different distributions.
The mean becomes 0, and the standard deviation becomes 1.
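A quick check of this with made-up raw scores, applying z = (x - mean) / SD to each value:

```python
import statistics

scores = [70, 75, 80, 85, 90]  # hypothetical raw scores

mu = statistics.fmean(scores)
sigma = statistics.pstdev(scores)  # population SD of these scores

z_scores = [(x - mu) / sigma for x in scores]
z_mean = statistics.fmean(z_scores)
z_sd = statistics.pstdev(z_scores)
print(z_scores, z_mean, z_sd)
```

Up to floating-point rounding, the standardized scores have a mean of 0 and a standard deviation of 1, while the shape of the distribution is unchanged.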
Why do data need to be normally distributed when conducting a z-test or a t-test?
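Z-tests and t-tests assume that the sampling distribution of the mean is normal, because the p-values and critical values come from the z or t distribution. This holds when the data are normally distributed, and approximately holds for large samples because of the Central Limit Theorem.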
Explain the similarities and/or differences between the z and t distributions. In what instance are the z and t distributions the same?
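Both are symmetric, bell-shaped, and centered at 0, but the t-distribution has heavier tails and its shape depends on the degrees of freedom. It is used when the population standard deviation is estimated from the sample, which matters most for small samples.
They are the same as the degrees of freedom approach infinity; for large samples they are nearly identical.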
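Why do the critical values for a t-test change, unlike those for z-tests, where the critical cut-offs are either 1.64 (one-tailed) or 1.96 (two-tailed) at an alpha of 0.05?
T-distribution critical values depend on the degrees of freedom (N-1). Small samples estimate the population standard deviation less precisely, so the tails are heavier and the critical values are larger; they approach the z values as N increases.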
What is the difference between a one-sample t-test and a z-test? If you are in a situation where either test is possible to conduct, which would you choose and why?
T-test: Used when population standard deviation is unknown.
Z-test: Used when population standard deviation is known.
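If either test is possible, the population standard deviation must be known, so the z-test is generally preferable because it uses the known value rather than a sample estimate. Choose the t-test if the population variance is unknown.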
What are the assumptions for a one-sample t-test? Why are these assumptions necessary? How would violations to these assumptions affect the interpretation of the results of a t-test?
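The assumptions are that the observations are independent, that the data are measured on an interval or ratio scale, and that the population is approximately normally distributed (or the sample is large enough for the Central Limit Theorem to apply). They are necessary for the test statistic to follow the t-distribution. When they are violated, the p-values are inaccurate and the actual Type I error rate can differ from alpha, which can lead to incorrect conclusions.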
Explain the relationship between Type I errors, Type II errors, confidence level, and power. How do you calculate the probability of occurrence for each of these elements in an experiment?
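Type I Error (α): Rejecting a true null hypothesis. Its probability is the alpha level set by the researcher.
Type II Error (β): Failing to reject a false null hypothesis. Its probability depends on alpha, the effect size, the sample size, and the variability.
Confidence Level: 1 - α, the probability of correctly retaining a true null hypothesis.
Power: 1 - β, the probability of correctly rejecting a false null hypothesis (the ability to detect an effect).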
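The four quantities can be sketched for a two-tailed one-sample z-test. The effect size (0.5), sample size (30), and the helper `z_test_power` are assumptions made for this example; a t-test would need the noncentral t-distribution instead.

```python
import math
from statistics import NormalDist

def z_test_power(d, n, alpha=0.05):
    """Power of a two-tailed one-sample z-test for standardized effect d."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = d * math.sqrt(n)  # center of the test statistic if the effect is real
    # Probability that the statistic falls in either rejection region.
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)

alpha = 0.05                                    # Type I error rate, chosen in advance
power = z_test_power(d=0.5, n=30, alpha=alpha)  # hypothetical effect and sample size
beta = 1 - power                                # Type II error rate
confidence_level = 1 - alpha
print(confidence_level, power, beta)
```

When the effect size is set to 0 (the null is true), the same function returns alpha, which reflects that the rejection rate under a true null is the Type I error rate.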
If you did not adjust the critical values for a t-test (that is, you used the same critical values as a z-test – 1.64 or 1.96), would this affect the likelihood of committing a Type I or Type II Error? Explain why or why not.
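Yes, it would increase the likelihood of Type I errors. For small samples the t-distribution critical values are larger than 1.64 or 1.96, so using the z cut-offs makes it too easy to reject the null hypothesis. The Type II error rate would correspondingly decrease.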
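A simulation sketch of this inflation, assuming samples of size 5 drawn from a population in which the null hypothesis is true. The cut-off 2.776 is the two-tailed t critical value for 4 degrees of freedom at an alpha of 0.05, taken from a standard t table; the seed and number of repetitions are arbitrary.

```python
import math
import random
import statistics

random.seed(2)  # fixed seed so the sketch is reproducible

n = 5               # small sample, so df = n - 1 = 4
reps = 20_000
Z_CRIT = 1.96       # two-tailed z cut-off at alpha = 0.05
T_CRIT_DF4 = 2.776  # two-tailed t cut-off at alpha = 0.05, df = 4 (from a t table)

z_rejections = 0
t_rejections = 0
for _ in range(reps):
    # The null hypothesis is true: the population mean really is 0.
    sample = [random.gauss(0, 1) for _ in range(n)]
    t_stat = statistics.fmean(sample) / (statistics.stdev(sample) / math.sqrt(n))
    z_rejections += abs(t_stat) > Z_CRIT
    t_rejections += abs(t_stat) > T_CRIT_DF4

rate_with_z_cutoff = z_rejections / reps
rate_with_t_cutoff = t_rejections / reps
print(rate_with_z_cutoff, rate_with_t_cutoff)
```

With the t cut-off the false-positive rate should stay near 0.05, while with the z cut-off it should be roughly double that.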
What is the purpose of a power analysis? Explain the ways in which power can be increased or improved.
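Power analysis determines the sample size required to detect an effect of a given size with a chosen power and alpha level (a power of 0.80 is a common target).
Power can be increased by increasing the sample size, increasing the effect size (e.g., a stronger manipulation), reducing variability (e.g., more reliable measures), raising the alpha level, or using a one-tailed test when it is justified.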
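A minimal sketch of an a priori power analysis, using the normal-approximation formula N = ((z for alpha + z for power) / d)^2 for a two-tailed one-sample z-test. The function name `required_n` and the effect sizes are hypothetical, and a t-test would need slightly larger samples.

```python
import math
from statistics import NormalDist

def required_n(d, alpha=0.05, power=0.80):
    """Approximate n for a two-tailed one-sample z-test to detect effect d."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_beta = std_normal.inv_cdf(power)
    return math.ceil(((z_alpha + z_beta) / d) ** 2)

for d in (0.2, 0.5, 0.8):  # hypothetical small, medium, and large effects
    print(d, required_n(d))
```

Smaller effects require far larger samples, and raising alpha or lowering the target power reduces the required N.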
How would small sample sizes affect a t-test and its assumptions? What about outliers?
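Small samples give larger standard errors and lower power, make the normality assumption harder to verify, and cannot rely on the Central Limit Theorem if the population is not normal.
Outliers distort the mean and inflate the standard deviation, and the distortion is greater in small samples because each observation carries more weight.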