good stats

CHAPTER 1: Descriptive Stats

Question: If a histogram is skewed right, where is the mean relative to the median?

Answer: Mean is greater than median (mean > median)

Question: If a histogram is skewed left, where is the mean relative to the median?

Answer: Mean is less than median (mean < median)

Question: Which is more resistant to outliers: mean or median?

Answer: Median

Question: What does a boxplot show you that a histogram doesn't?

Answer: Exact five-number summary (Min, Q1, Median, Q3, Max) and outliers

Question: Using the IQR method, a value is an outlier if it falls below what?

Answer: Q1 - 1.5 × IQR

Question: Using the IQR method, a value is an outlier if it falls above what?

Answer: Q3 + 1.5 × IQR
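
Worked example: the two fences above can be computed as in the sketch below. The data set is made up, and the quartile convention (method="inclusive") is an assumption; textbooks and software use several slightly different quartile rules, so the fences may differ a little from a hand calculation.

```python
import statistics

def iqr_fences(data):
    """Return (lower_fence, upper_fence) using the 1.5 * IQR rule."""
    # quantiles(n=4) returns the three quartile cut points: Q1, median, Q3.
    # method="inclusive" is an assumed convention, not the only valid one.
    q1, _median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

sample = [2, 4, 5, 6, 7, 8, 9, 30]  # hypothetical data
lower, upper = iqr_fences(sample)

# A value is flagged when it falls outside either fence.
outliers = [x for x in sample if x < lower or x > upper]
print(lower, upper, outliers)
```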

---

CHAPTER 2: Probability Concepts

Question: If two events are mutually exclusive, can they be independent?

Answer: No, unless one of them has probability zero. Mutually exclusive means P(A∩B)=0, so P(A|B)=0, which does not equal P(A) whenever P(A) > 0

Question: What is the difference between mutually exclusive and independent events?

Answer: Mutually exclusive means events cannot happen together. Independent means one event does not affect the probability of the other

Question: Does P(A|B) equal P(B|A) in general?

Answer: No. P(A|B) = P(A∩B)/P(B) and P(B|A) = P(A∩B)/P(A), so they are equal only when P(A) equals P(B) or when P(A∩B) = 0

Question: If P(A)=0.5, P(B)=0.5, and P(A∩B)=0.25, are A and B independent?

Answer: Yes, because P(A)×P(B)=0.25 which equals P(A∩B)

Question: In a tree diagram, how do you find the probability of reaching a final branch?

Answer: Multiply the probabilities along all branches leading to that outcome

Question: What is the probability of neither A nor B occurring?

Answer: 1 minus P(A union B), which is 1 minus [P(A)+P(B)-P(A∩B)]
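
Worked example: a minimal sketch of the probability rules in this chapter. The function names are hypothetical helpers, and the second set of probabilities (0.2, 0.5, 0.1) is made up to show a case where P(A|B) and P(B|A) differ.

```python
def is_independent(p_a, p_b, p_a_and_b, tol=1e-12):
    """A and B are independent when P(A) * P(B) equals P(A and B)."""
    return abs(p_a * p_b - p_a_and_b) <= tol

def conditional(p_joint, p_given):
    """P(X | Y) = P(X and Y) / P(Y); requires P(Y) > 0."""
    return p_joint / p_given

def p_neither(p_a, p_b, p_a_and_b):
    """P(neither A nor B) = 1 - P(A or B)."""
    return 1 - (p_a + p_b - p_a_and_b)

# The independence card: P(A)=0.5, P(B)=0.5, P(A and B)=0.25.
independent = is_independent(0.5, 0.5, 0.25)
neither = p_neither(0.5, 0.5, 0.25)

# Hypothetical case with P(A) != P(B): the two conditionals differ.
p_a_given_b = conditional(0.1, 0.5)  # P(A and B) / P(B)
p_b_given_a = conditional(0.1, 0.2)  # P(A and B) / P(A)
print(independent, neither, p_a_given_b, p_b_given_a)
```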

---

CHAPTER 3: Discrete Distribution Recognition

Question: You count the number of defective items in a sample of 10 drawn without replacement from a batch of 100. What distribution?

Answer: Hypergeometric (sampling without replacement from a finite population)

Question: You count the number of customers arriving at a store in one hour. What distribution?

Answer: Poisson (counts of events over time or space)

Question: You flip a coin 20 times and count the number of heads. What distribution?

Answer: Binomial (fixed number of trials, independent, constant probability)

Question: You flip a coin until you get your first head. What distribution?

Answer: Geometric (number of trials until first success)

Question: What is the key clue that tells you to use Poisson instead of Binomial?

Answer: No fixed number of trials; counting events over time/area rather than number of attempts
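
Worked example: the four distributions in this chapter differ in their probability mass functions. The sketch below writes each one out with the standard library only. The parameter names are illustrative, and the count of 5 defectives in the batch of 100 is an assumption made for the example, since the card does not give one.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """P(k successes in n independent trials, success probability p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(k events in an interval with mean rate lam)."""
    return exp(-lam) * lam**k / factorial(k)

def geometric_pmf(k, p):
    """P(first success occurs on trial k), k = 1, 2, ..."""
    return (1 - p) ** (k - 1) * p

def hypergeometric_pmf(k, pop_size, pop_successes, draws):
    """P(k successes in draws made without replacement)."""
    return (comb(pop_successes, k)
            * comb(pop_size - pop_successes, draws - k)
            / comb(pop_size, draws))

ten_heads_in_twenty = binomial_pmf(10, 20, 0.5)
first_head_on_third_flip = geometric_pmf(3, 0.5)
# Assumed: 5 of the 100 items are defective; sample of 10.
no_defectives = hypergeometric_pmf(0, 100, 5, 10)
print(ten_heads_in_twenty, first_head_on_third_flip, no_defectives)
```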

---

CHAPTER 4: Normal & Continuous Distributions

Question: For any continuous distribution, what is P(X equals an exact value)?

Answer: Zero. Only intervals have probability for continuous distributions

Question: What are the mean and standard deviation of the standard normal distribution?

Answer: Mean = 0, Standard deviation = 1

Question: About what percentage of data falls within 2 standard deviations of the mean in a Normal distribution?

Answer: Approximately 95%

Question: About what percentage of data falls within 1 standard deviation of the mean in a Normal distribution?

Answer: Approximately 68%
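
Worked example: the 68% and 95% figures can be checked from the Normal distribution itself. For a standard Normal Z, P(|Z| <= k) equals erf(k / sqrt(2)); the helper name below is hypothetical.

```python
from math import erf, sqrt

def prob_within(k):
    """P(|Z| <= k) for a standard Normal Z, i.e. within k standard deviations."""
    return erf(k / sqrt(2))

within_one = prob_within(1)
within_two = prob_within(2)
print(round(within_one, 4), round(within_two, 4))
```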

Question: What type of real-world phenomenon is often modeled by the exponential distribution?

Answer: Waiting times or lifetimes (time until next event occurs)

---

CHAPTER 5: Sampling & Central Limit Theorem

Question: As sample size increases, what happens to the sampling distribution of the sample mean?

Answer: It becomes more Normal (CLT) and narrower (smaller standard error)

Question: Does the Central Limit Theorem apply to the sample median?

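Answer: No, the classical CLT is stated for the sample mean (and sums). The sample median is also approximately Normal for large samples under mild conditions, but that is a separate result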

Question: If the population is Normal, what is the shape of the sample mean's distribution for any sample size?

Answer: Normal (even for n=1)

Question: What is the standard deviation of the sample mean called?

Answer: Standard error

Question: If sample size increases, does the standard error increase or decrease?

Answer: Decrease (standard error = sigma divided by square root of n)
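
Worked example: a small sketch of the formula above, using a made-up sigma of 10. Because n sits under a square root, quadrupling the sample size only halves the standard error.

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)

se_small = standard_error(10, 25)   # hypothetical sigma = 10, n = 25
se_large = standard_error(10, 100)  # same sigma, four times the sample size
print(se_small, se_large)
```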

---

CHAPTER 6: Point Estimation Concepts

Question: What does unbiased mean in plain English?

Answer: On average, the estimator hits the true parameter value (no systematic overestimation or underestimation)

Question: Is the sample variance s-squared unbiased for the population variance sigma-squared?

Answer: Yes, when it is computed with the n - 1 divisor

Question: Is the sample standard deviation s unbiased for the population standard deviation sigma?

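Answer: No. s tends to underestimate sigma, because the square root of an unbiased estimator is not itself unbiased (this is a common trick question)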
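
Worked example: the two answers above can be checked exactly on a tiny case. The sketch below assumes a hypothetical population {1, 2, 3} and lists every equally likely sample of size 2 drawn with replacement, then averages s-squared and s over all of them. It illustrates the claim for one population and is not a proof.

```python
from itertools import product
from statistics import mean, pstdev, pvariance, stdev, variance

population = [1, 2, 3]  # hypothetical tiny population

# All 9 equally likely ordered samples of size 2, drawn with replacement.
samples = list(product(population, repeat=2))

# variance() and stdev() use the n - 1 divisor.
expected_s2 = mean(variance(s) for s in samples)
expected_s = mean(stdev(s) for s in samples)

sigma2 = pvariance(population)  # population variance
sigma = pstdev(population)      # population standard deviation

print(expected_s2, sigma2)  # these agree: s-squared is unbiased
print(expected_s, sigma)    # expected_s is smaller: s is biased low
```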

Question: What does a consistent estimator mean?

Answer: As sample size increases, the estimator gets closer and closer to the true parameter value

---

CHAPTER 7: Confidence Interval Interpretation

Question: True or false: A 95% confidence interval means there is a 95% probability the parameter is in this specific interval.

Answer: False. The parameter is fixed. 95% refers to the long-run capture rate of the method.

Question: If you increase the confidence level from 90% to 99%, what happens to the interval width?

Answer: It gets wider

Question: If you increase the sample size, what happens to the confidence interval width?

Answer: It gets narrower

Question: A 95% confidence interval for the population mean is (10, 20). Does this mean 95% of the data falls between 10 and 20?

Answer: No. The confidence interval is about the mean, not individual data points.

Question: What is the difference between margin of error and interval width?

Answer: The margin of error is half the interval width: Interval width = 2 × Margin of Error
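
Worked example: a minimal sketch of a z-interval for a mean, assuming the population standard deviation sigma is known. The inputs (sample mean 15, sigma 10, n = 16) are made up. It shows the three width facts from this chapter: higher confidence widens the interval, larger n narrows it, and the width is twice the margin of error.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence):
    """Return (lower, upper, margin_of_error) for a known-sigma z-interval."""
    # Critical value: e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    return xbar - margin, xbar + margin, margin

lo90, hi90, m90 = z_interval(15, 10, 16, 0.90)
lo95, hi95, m95 = z_interval(15, 10, 16, 0.95)
lo99, hi99, m99 = z_interval(15, 10, 16, 0.99)
lo95_big, hi95_big, m95_big = z_interval(15, 10, 64, 0.95)  # larger sample

print(hi95 - lo95, 2 * m95)
```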

---

CHAPTER 8: Hypothesis Testing

Question: In plain English, what is a p-value?

Answer: The probability of observing data as extreme as (or more extreme than) what you got, assuming the null hypothesis is true

Question: A small p-value provides evidence for or against the null hypothesis?

Answer: Against the null hypothesis (reject H0 when the p-value is below alpha)

Question: If you fail to reject the null hypothesis, does that mean the null hypothesis is true?

Answer: No. It means there is not enough evidence to conclude it is false.

Question: What is a Type I error?

Answer: False positive - rejecting the null hypothesis when it is actually true

Question: What is a Type II error?

Answer: False negative - failing to reject the null hypothesis when it is actually false

Question: Which type of error is controlled by the significance level alpha?

Answer: Type I error
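
Worked example: a sketch of how a p-value and alpha interact, using a two-sided one-sample z-test with known sigma. The test choice and the numbers (sample mean 52, null mean 50, sigma 5, n = 25) are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_z_p_value(xbar, mu0, sigma, n):
    """P(|Z| >= |z_observed|), computed assuming the null hypothesis is true."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))

alpha = 0.05  # chosen Type I error rate
p_value = two_sided_z_p_value(xbar=52, mu0=50, sigma=5, n=25)  # z = 2
reject_h0 = p_value < alpha
print(round(p_value, 4), reject_h0)
```

With these numbers the result is "reject H0" at alpha = 0.05, but the same p-value would lead to "fail to reject H0" at alpha = 0.01.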

---

CHAPTER 9: Two-Sample Inference

Question: Same subjects measured before and after a treatment — which test should you use?

Answer: Paired t-test

Question: Two separate groups (treatment vs control) with different subjects — which test should you use?

Answer: Two-sample t-test (independent samples)

Question: For a two-proportion z-test, what is the minimum sample size requirement?

Answer: At least 5 successes and 5 failures in each group (some textbooks require at least 10 of each)

---

CHAPTER 12: Regression

Question: A slope of zero means what about the relationship between X and Y?

Answer: There is no linear relationship between X and Y

Question: If R-squared equals 0.64, what percentage of variation in Y is explained by X?

Answer: 64%

Question: If R-squared equals 0.64 and the slope is positive, what is the correlation r?

Answer: 0.8 (the square root of 0.64, with the same sign as the slope)

Question: Which is wider: a confidence interval for the mean response or a prediction interval for a new observation?

Answer: Prediction interval (it is always wider)

Question: What does the residual standard error estimate?

Answer: Sigma, the standard deviation of the errors (the typical distance of points from the regression line)

Question: If the slope is positive, what sign does the correlation r have?

Answer: Positive (same sign as the slope)
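
Worked example: a sketch of simple linear regression on made-up data, showing that r has the same sign as the slope and that R-squared is r squared. The data were chosen so that R-squared comes out to 0.64, matching the card above.

```python
from math import sqrt
from statistics import mean

def slope_and_r(xs, ys):
    """Return (least-squares slope, correlation r) for paired data."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    # Slope and r share the numerator sxy, so they share its sign.
    return sxy / sxx, sxy / sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]  # hypothetical data
ys = [2, 1, 4, 3, 5]

slope, r = slope_and_r(xs, ys)
r_squared = r ** 2

# Flipping the sign of y flips both the slope and r; R-squared is unchanged.
neg_slope, neg_r = slope_and_r(xs, [-y for y in ys])
print(slope, r, r_squared, neg_slope, neg_r)
```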

---

TRAP QUESTIONS (Most Commonly Missed)

Question: True or false: "We accept the null hypothesis" is correct statistical wording.

Answer: False. Always say "fail to reject the null hypothesis" — never "accept"

Question: True or false: A 95% confidence interval means 95% of sample means fall in this interval.

Answer: False. It means 95% of confidence intervals from repeated samples will contain the population mean.

Question: True or false: The sample standard deviation s is an unbiased estimator of the population standard deviation sigma.

Answer: False (this is a very common trick question)