What is the mean in statistics?
The mean is the arithmetic average of a dataset, calculated by summing all data points and dividing by the total number of observations.
How is the mean sensitive to outliers?
Because every observation enters the sum, a few extreme values can pull the mean far away from where most of the data lies.
What is the median?
The median is the middle value of a sorted dataset or the average of the two middle values if the dataset has an even number of observations.
Why is the median a better measure of central tendency in skewed distributions?
The median is robust to outliers and is less influenced by extreme values.
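A minimal sketch of how one outlier affects the two measures. The salary figures are hypothetical and chosen only for illustration.

```python
import statistics

# Hypothetical salaries in thousands; the second list adds one extreme value.
salaries = [40, 42, 45, 47, 50]
with_outlier = salaries + [1000]

mean_before = statistics.mean(salaries)
median_before = statistics.median(salaries)
mean_after = statistics.mean(with_outlier)
median_after = statistics.median(with_outlier)

print("mean:  ", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)
```

With these numbers the mean moves from 44.8 to 204, while the median only moves from 45 to 46.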
What does mode represent in a dataset?
The mode is the value that appears most frequently in a dataset.
In what scenarios is the mode particularly useful?
The mode is useful for categorical data where mean and median may not make sense.
What does variance measure?
Variance measures the average squared deviation from the mean, quantifying the spread or dispersion in the dataset.
What does a high variance indicate?
High variance indicates that data points are spread out widely around the mean.
What is the standard deviation?
Standard deviation is the square root of variance, which puts it back in the same unit as the original data.
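The two definitions above can be computed by hand as follows. This is a sketch on hypothetical data; it shows both the population form (divide by n) and the sample form (divide by n - 1), since the cards do not say which one is meant.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations
n = len(data)
mean = sum(data) / n

# Squared deviation of each point from the mean.
squared_devs = [(x - mean) ** 2 for x in data]

pop_variance = sum(squared_devs) / n            # population variance
sample_variance = sum(squared_devs) / (n - 1)   # sample variance
pop_std = math.sqrt(pop_variance)               # same unit as the data

# Cross-check against the standard library.
assert math.isclose(pop_variance, statistics.pvariance(data))
assert math.isclose(sample_variance, statistics.variance(data))

print(pop_variance, pop_std, sample_variance)
```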
In a normal distribution, what percentage of values lie within ±1 standard deviation from the mean?
Approximately 68% of values lie within ±1 standard deviation from the mean.
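The 68% figure can be checked numerically from the normal cumulative distribution function. The helper name `share_within` is made up for this sketch.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)

def share_within(k):
    """Probability that a normal value lies within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

within_1sd = share_within(1)
print(round(within_1sd, 4))
```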
What is the Central Limit Theorem (CLT)?
The CLT states that as sample size grows, the sampling distribution of the mean of independent, identically distributed observations approaches a normal distribution, regardless of the shape of the population's distribution, provided the population has finite variance.
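A small simulation sketch of the CLT, assuming an exponential population with rate 1 (strongly skewed, mean 1, standard deviation 1). The sample size, number of repetitions, and seed are arbitrary choices.

```python
import random
import statistics

rng = random.Random(0)   # fixed seed so the run is repeatable
sample_size = 50
n_samples = 2000

# Draw many samples from a skewed population and keep each sample's mean.
sample_means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(sample_size)])
    for _ in range(n_samples)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
expected_sd = 1.0 / sample_size ** 0.5   # sigma / sqrt(n)

# If the sample means are roughly normal, about 68% lie within 1 SD.
share_within_1sd = sum(
    abs(m - mean_of_means) <= sd_of_means for m in sample_means
) / n_samples

print(mean_of_means, sd_of_means, expected_sd, share_within_1sd)
```

The sample means should cluster around 1 with a spread close to 1/sqrt(50), even though individual draws are far from normal.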
What does the Law of Large Numbers state?
The LLN states that as the number of trials or observations increases, the sample average will converge to the population mean.
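As an illustration of the LLN, the sketch below tracks the running average of simulated fair die rolls, whose population mean is 3.5. The checkpoints and seed are arbitrary.

```python
import random

rng = random.Random(1)            # fixed seed so the run is repeatable
checkpoints = (10, 1_000, 100_000)

total = 0
running_avgs = {}
for i in range(1, max(checkpoints) + 1):
    total += rng.randint(1, 6)    # one roll of a fair six-sided die
    if i in checkpoints:
        running_avgs[i] = total / i

for n_rolls, avg in running_avgs.items():
    print(n_rolls, avg)
```

The average after many rolls should sit much closer to 3.5 than the average after only a few.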
What distinguishes Bayesian statistics from Frequentist statistics?
In Bayesian statistics, parameters are treated as random variables and are updated based on observed data, while in Frequentist statistics, parameters are fixed and data is random.
What is Maximum Likelihood Estimation (MLE)?
MLE is a method of estimating model parameters by finding values that maximize the likelihood function.
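A minimal sketch of MLE for a coin's probability of heads, assuming independent flips (a Bernoulli model). It uses a plain grid search rather than calculus or an optimizer, and the data are hypothetical.

```python
import math

flips = [1, 1, 0, 1, 1, 0, 1, 1, 0, 1]   # hypothetical data: 7 heads in 10 flips

def log_likelihood(p, data):
    """Log-likelihood of heads probability p under independent Bernoulli flips."""
    return sum(math.log(p) if x == 1 else math.log(1 - p) for x in data)

# Try candidate values 0.01, 0.02, ..., 0.99 and keep the most likely one.
grid = [i / 100 for i in range(1, 100)]
p_hat = max(grid, key=lambda p: log_likelihood(p, flips))

sample_proportion = sum(flips) / len(flips)
print(p_hat, sample_proportion)
```

For this model the maximizer coincides with the sample proportion of heads, which is the known closed-form result.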
What is a confidence interval?
A confidence interval gives a range of plausible values for an unknown population parameter based on sample data.
What does a 95% confidence interval imply?
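If the sampling and interval-construction procedure were repeated many times, approximately 95% of the resulting intervals would contain the true parameter value. It does not mean that one particular computed interval has a 95% probability of containing the parameter.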
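The repeated-sampling interpretation can be checked by simulation. This sketch assumes a normal population with a known standard deviation, so a z-interval applies; the population values, sample size, and seed are all made up.

```python
import random
from statistics import NormalDist, fmean

rng = random.Random(7)                   # fixed seed so the run is repeatable
true_mean, sigma, n = 10.0, 2.0, 30      # hypothetical population and sample size

z = NormalDist().inv_cdf(0.975)          # about 1.96 for a 95% interval
half_width = z * sigma / n ** 0.5

n_repeats = 2000
hits = 0
for _ in range(n_repeats):
    sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
    m = fmean(sample)
    # Does this interval contain the true mean?
    if m - half_width <= true_mean <= m + half_width:
        hits += 1

coverage = hits / n_repeats
print(coverage)
```

The printed coverage should be close to 0.95, though it will vary slightly with the seed.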
What is the purpose of hypothesis testing?
Hypothesis testing uses observed data to assess whether there is enough evidence to reject a null hypothesis in favor of an alternative.
What does a t-test compare?
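A t-test compares the means of two groups, or one sample mean against a reference value, to assess whether the difference is larger than sampling variation alone would explain.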
What does the p-value indicate?
The p-value is the probability of observing a result at least as extreme as the one actually obtained, assuming the null hypothesis is true.
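To make the p-value concrete, the sketch below uses an exact permutation test on two hypothetical groups. This is a different technique from the t-test, chosen here because it needs only the standard library: under the null hypothesis the group labels are interchangeable, so every way of splitting the pooled data is equally likely.

```python
from itertools import combinations

group_a = [12, 15, 14, 16, 13]   # hypothetical measurements
group_b = [18, 20, 19, 17, 21]
pooled = group_a + group_b

def mean_diff(a, b):
    """Absolute difference between the two group means."""
    return abs(sum(a) / len(a) - sum(b) / len(b))

observed = mean_diff(group_a, group_b)

# Enumerate every split of the pooled data into groups of the original sizes.
indices = range(len(pooled))
n_splits = 0
count_extreme = 0
for chosen in combinations(indices, len(group_a)):
    chosen_set = set(chosen)
    a = [pooled[i] for i in chosen]
    b = [pooled[i] for i in indices if i not in chosen_set]
    n_splits += 1
    if mean_diff(a, b) >= observed:   # at least as extreme as observed
        count_extreme += 1

p_value = count_extreme / n_splits
print(observed, count_extreme, n_splits, p_value)
```

Here only 2 of the 252 possible splits are as extreme as the observed one, so the two-sided p-value is 2/252, roughly 0.008.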
What is the difference between correlation and causation?
Correlation measures the degree to which two variables move together, but does not imply that one causes the other.
What is covariance?
Covariance measures the joint variability of two variables: its sign gives the direction of the relationship, but its magnitude depends on the units of both variables.
How does correlation differ from covariance?
Correlation standardizes covariance, producing values between -1 and 1, making it easier to interpret strength and direction of relationships.
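A short sketch of the difference, using hand-written sample formulas (dividing by n - 1) on hypothetical data. Changing the unit of one variable from hours to minutes rescales the covariance but leaves the correlation unchanged.

```python
import math

def covariance(x, y):
    """Sample covariance (divides by n - 1)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

def correlation(x, y):
    """Pearson correlation: covariance divided by both standard deviations."""
    return covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))

hours = [1, 2, 3, 4, 5]            # hypothetical study time
scores = [2, 4, 5, 4, 5]           # hypothetical quiz scores
minutes = [h * 60 for h in hours]  # same quantity, different unit

cov_hours = covariance(hours, scores)
cov_minutes = covariance(minutes, scores)
corr_hours = correlation(hours, scores)
corr_minutes = correlation(minutes, scores)

print(cov_hours, cov_minutes)      # differ by a factor of 60
print(corr_hours, corr_minutes)    # essentially identical
```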
What is bootstrapping?
Bootstrapping is resampling with replacement to estimate the sampling distribution of a statistic.
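A minimal bootstrap sketch for the sample mean, using a percentile interval (one of several ways to turn bootstrap replicates into an interval). The data, number of resamples, and seed are hypothetical choices.

```python
import random
import statistics

rng = random.Random(42)   # fixed seed so the run is repeatable
data = [3.1, 4.7, 5.0, 5.2, 6.8, 7.3, 8.9, 9.4, 10.1, 12.5]  # hypothetical sample
n_boot = 5000

# Each replicate: resample the data with replacement, then recompute the mean.
boot_means = sorted(
    statistics.fmean(rng.choices(data, k=len(data)))
    for _ in range(n_boot)
)

# Percentile interval: cut off 2.5% of the replicates in each tail.
lower = boot_means[int(0.025 * n_boot)]
upper = boot_means[int(0.975 * n_boot) - 1]

print(statistics.fmean(data), lower, upper)
```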
What is stratified sampling?
Stratified sampling divides the population into non-overlapping subgroups (strata) and samples from each one, often in proportion to its size, to improve estimate accuracy.
What is systematic sampling?
Systematic sampling involves selecting every nth member from a list of the population, usually beginning at a randomly chosen starting point.
What is cluster sampling?
Cluster sampling involves dividing the population into clusters and randomly selecting entire clusters to sample.
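The three sampling schemes can be sketched side by side. The population, field names (`region`, `school`), and function names below are all hypothetical and exist only for this example.

```python
import random

rng = random.Random(3)   # fixed seed so the run is repeatable

# Hypothetical population: 100 people, 60 in "north" and 40 in "south",
# grouped into 10 schools of 10 people each.
population = [
    {"id": i, "region": "north" if i < 60 else "south", "school": i // 10}
    for i in range(100)
]

def stratified_sample(pop, key, fraction):
    """Sample the same fraction from every stratum defined by `key`."""
    strata = {}
    for person in pop:
        strata.setdefault(person[key], []).append(person)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, round(len(members) * fraction)))
    return sample

def systematic_sample(pop, step):
    """Take every `step`-th member, starting from a random offset."""
    start = rng.randrange(step)
    return pop[start::step]

def cluster_sample(pop, key, n_clusters):
    """Pick whole clusters at random and keep everyone in them."""
    cluster_ids = sorted({person[key] for person in pop})
    chosen = set(rng.sample(cluster_ids, n_clusters))
    return [person for person in pop if person[key] in chosen]

strat = stratified_sample(population, "region", 0.10)
syst = systematic_sample(population, 10)
clust = cluster_sample(population, "school", 2)

print(len(strat), len(syst), len(clust))
```

With these settings the stratified sample holds 6 people from the north and 4 from the south, the systematic sample holds 10 evenly spaced people, and the cluster sample holds all 20 members of two schools.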