Source: https://www.youtube.com/watch?v=GtV-VYdNt_g&list=PL8dPuuaLjXtNM_Y-bUAhblSAdWRnmBUcr&index=9
What does a regression line in simple linear regression technically represent?
A) The line that visually looks closest to the data
B) The line that minimizes the sum of absolute errors
C) The line that minimizes the sum of squared residuals
D) The line with the steepest possible slope
Correct Answer: C) The line that minimizes the sum of squared residuals
Explanation:
The regression line is calculated using ordinary least squares (OLS), which minimizes the sum of squared residuals, not just visual closeness (A) or absolute errors (B).
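As a minimal sketch of what "minimizes the sum of squared residuals" means, the closed-form least-squares slope and intercept can be computed by hand and compared with nearby lines. The data and the helper names `ols_fit` and `sum_squared_residuals` are made up for illustration and are not from the source.

```python
def ols_fit(xs, ys):
    """Closed-form least-squares slope and intercept for one predictor."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def sum_squared_residuals(xs, ys, slope, intercept):
    """Sum of squared vertical distances from the points to the line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Illustrative data only (an assumption, not taken from the source).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept = ols_fit(xs, ys)
best = sum_squared_residuals(xs, ys, slope, intercept)

# Nudging either coefficient away from the OLS values increases the error.
nudged_slope = sum_squared_residuals(xs, ys, slope + 0.1, intercept)
nudged_intercept = sum_squared_residuals(xs, ys, slope, intercept + 0.5)

print(round(slope, 3), round(intercept, 3), round(best, 3))  # 0.6 2.2 2.4
```

Any line other than the fitted one gives a larger sum of squared residuals on the same data, which is the property the correct answer describes.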
If the slope of a regression line predicting a son’s height from his father’s height is 0.5 (both measured in inches), what does this mean?
A) The son is always exactly half as tall as the father
B) For each additional inch of father’s height, son’s height increases by 0.5 inches on average
C) The variables are strongly correlated
D) The relationship is causal
Correct Answer: B) For each additional inch of father’s height, son’s height increases by 0.5 inches on average
Explanation:
The slope represents the average change in Y for a 1-unit increase in X. It does not imply perfect prediction (A), strength of correlation (C), or causation (D).
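A small sketch of this reading of the slope, assuming a hypothetical fitted line. The intercept of 35 inches and the function name `predict_son_height` are illustrative assumptions; only the slope of 0.5 comes from the question.

```python
def predict_son_height(father_height_in, intercept=35.0, slope=0.5):
    """Hypothetical fitted line; the intercept value is an assumption."""
    return intercept + slope * father_height_in


# Two fathers who differ by one inch: the predicted (average) son height
# differs by exactly the slope.
difference = predict_son_height(71) - predict_son_height(70)
print(difference)  # 0.5
```

The prediction is an average for fathers of that height, so individual sons can still fall above or below it.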
What happens to the slope if we change the units of measurement of one variable (e.g., X from inches to meters)?
A) It stays exactly the same
B) It becomes zero
C) It changes because slope depends on units
D) The correlation changes dramatically
Correct Answer: C) It changes because slope depends on units
Explanation:
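The slope is expressed in units of Y per unit of X, so rescaling one variable rescales the slope (converting X from inches to meters multiplies it by about 39.37). If X and Y are both rescaled by the same factor, the slope is unchanged. Correlation does not change in either case, because it is standardized.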
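The effect of a unit change can be checked numerically. The sketch below uses invented father and son heights (an assumption, not data from the source) and a hypothetical helper `slope_and_r`. It converts X alone, then both variables, from inches to meters.

```python
from math import sqrt


def slope_and_r(xs, ys):
    """Return the least-squares slope and Pearson r for paired data."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sxx, sxy / sqrt(sxx * syy)


INCH_TO_M = 0.0254  # exact conversion factor

# Invented heights in inches, for illustration only.
father_in = [65, 67, 70, 72, 74]
son_in = [67, 68, 70, 71, 73]
father_m = [h * INCH_TO_M for h in father_in]
son_m = [h * INCH_TO_M for h in son_in]

slope_in, r_in = slope_and_r(father_in, son_in)        # inches vs inches
slope_x_m, r_x_m = slope_and_r(father_m, son_in)       # X in meters only
slope_both_m, r_both_m = slope_and_r(father_m, son_m)  # both in meters

print(slope_in, slope_x_m, slope_both_m)
print(r_in, r_x_m, r_both_m)
```

Converting only X divides the slope by 0.0254, converting both variables leaves the slope as it was, and r is the same in all three cases up to floating-point rounding.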
If the correlation coefficient (r) equals 0, what does this imply?
A) There is no relationship at all
B) There is no linear relationship
C) The slope must be zero in the population
D) The variables are independent
Correct Answer: B) There is no linear relationship
Explanation:
r = 0 means no linear relationship, but a nonlinear relationship may still exist. It does not automatically mean independence.
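A quick illustration, assuming a made-up dataset where Y is fully determined by X through a parabola. The helper `pearson_r` is written out here for self-containment and is not from the source.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired data."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Y depends on X exactly (y = x^2), so the variables are not independent,
# yet the symmetric U shape has no linear trend.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print(r)  # 0.0
```

Here knowing X tells you Y exactly, so the variables are clearly dependent even though r is zero.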
Which of the following best describes what correlation measures?
A) Causation between two variables
B) The steepness of a regression line
C) The direction and strength of a linear relationship
D) The prediction accuracy of any model
Correct Answer: C) The direction and strength of a linear relationship
Explanation:
Correlation measures direction (positive/negative) and strength of a linear relationship. It does not measure causation or slope steepness.
Which statement about correlation is TRUE?
A) Correlation changes if units change
B) Correlation ranges from 0 to 1
C) Correlation is unit-free
D) A steep slope guarantees strong correlation
Correct Answer: C) Correlation is unit-free
Explanation:
Correlation is standardized using standard deviations, making it unit-free. It ranges from -1 to 1.
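For reference, the standardization can be written out with the usual sample formula. The symbols are the conventional ones and are not taken from the source: $\bar{x}$ and $\bar{y}$ are sample means, $s_x$ and $s_y$ are sample standard deviations.

```latex
r = \frac{1}{n-1} \sum_{i=1}^{n}
    \left( \frac{x_i - \bar{x}}{s_x} \right)
    \left( \frac{y_i - \bar{y}}{s_y} \right),
\qquad -1 \le r \le 1
```

Each deviation is divided by a standard deviation in the same units, so the units cancel and r is a pure number.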
If r = -0.9, this indicates:
A) A strong positive linear relationship
B) A strong negative linear relationship
C) No relationship
D) A weak relationship
Correct Answer: B) A strong negative linear relationship
Explanation:
The sign indicates direction (negative = the variables move in opposite directions), and a magnitude close to 1 indicates a strong relationship.
What does an R² value of 0.7 mean?
A) 70% of Y is caused by X
B) 70% of the variance in Y is explained by the linear model with X
C) The model predicts perfectly
D) 70% of observations lie exactly on the regression line
Correct Answer: B) 70% of the variance in Y is explained by the linear model with X
Explanation:
R² represents the proportion of variance explained by the linear model, not causation (A) or perfect prediction (C).
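One common way to compute this proportion is 1 minus the ratio of residual variation to total variation. The sketch below assumes made-up data and a hypothetical helper `r_squared`; neither comes from the source.

```python
def r_squared(xs, ys):
    """R^2 for a simple linear regression: 1 - SS_residual / SS_total."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Variation left over after fitting the line.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    # Total variation of Y around its own mean.
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Illustrative data only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r2 = r_squared(xs, ys)
print(round(r2, 3))  # 0.6
```

In this toy example the line accounts for 60% of the variance in Y and the remaining 40% is residual scatter.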
Why does correlation not imply causation?
A) Because correlation is always weak
B) Because a third variable may explain both variables
C) Because regression lines are unreliable
D) Because scatterplots are misleading
Correct Answer: B) Because a third variable may explain both variables
Explanation:
Correlation may occur due to:
A causing B
B causing A
A third variable causing both
Coincidence
If sample data shows a non-zero slope, what must we do before concluding a real relationship exists in the population?
A) Nothing; slope proves relationship
B) Check the steepness visually
C) Perform statistical significance testing
D) Change measurement units
Correct Answer: C) Perform statistical significance testing
Explanation:
A sample slope may occur due to random variation. Statistical testing (e.g., t-test for slope) is needed to infer a population relationship.
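A rough sketch of the t-test for a slope, assuming the standard formula t = slope / standard error with n - 2 degrees of freedom. The data, the helper name `slope_t_statistic`, and the hard-coded critical value (read from a standard t table for 3 degrees of freedom) are assumptions for illustration.

```python
from math import sqrt


def slope_t_statistic(xs, ys):
    """Return the OLS slope and its t statistic (null hypothesis: slope = 0)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    # Standard error of the slope uses the residual variance on n - 2 df.
    standard_error = sqrt(ss_res / (n - 2) / sxx)
    return slope, slope / standard_error


# Illustrative data only: five points with a clearly non-zero sample slope.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, t_stat = slope_t_statistic(xs, ys)

# Two-sided 5% critical value for 3 degrees of freedom (standard t table).
T_CRITICAL_DF3 = 3.182
is_significant = abs(t_stat) > T_CRITICAL_DF3

print(round(slope, 2), round(t_stat, 2), is_significant)  # 0.6 2.12 False
```

The sample slope is 0.6, yet with only five points the test does not reject a population slope of zero, which is why a non-zero sample slope alone is not enough.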
Why is it important to look at a scatterplot even if you know r?
A) Because r tells you nothing
B) Because different datasets can have the same r but very different shapes
C) Because scatterplots change correlation
D) Because R² cannot be calculated without it
Correct Answer: B) Because different datasets can have the same r but very different shapes
Explanation:
The “Datasaurus Dozen” demonstrates that datasets can share nearly identical summary statistics, including correlation, yet have very different patterns. Visual inspection is crucial.
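The same point can be checked on a smaller classic example. The sketch below uses the first two sets of Anscombe's quartet rather than the Datasaurus Dozen itself (a substitution made here because the quartet is short enough to type in). The helper `pearson_r` is written out for self-containment.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired data."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Anscombe's quartet, sets I and II (they share the same x values).
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y_linear = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
y_curved = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]

r_linear = pearson_r(x, y_linear)  # noisy but roughly linear cloud
r_curved = pearson_r(x, y_curved)  # smooth curve with no noise

print(round(r_linear, 2), round(r_curved, 2))  # 0.82 0.82
```

Both sets give r of about 0.82, but one is a scattered linear trend and the other is a clean curve, a difference that only shows up in a plot.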
Which statement about prediction and R² is most accurate?
A) High R² guarantees perfect prediction
B) High R² means no residual error
C) Higher R² generally indicates better predictive fit, but residual variability still exists
D) R² proves causation
Correct Answer: C) Higher R² generally indicates better predictive fit, but residual variability still exists
Explanation:
Even with R² = 0.7, 30% of variance remains unexplained. Prediction is improved but not perfect.
When a scatterplot shows two distinct “blobs” or clusters of points, this most likely suggests:
A) A strong linear relationship
B) The presence of subgroups within the data
C) Perfect correlation
D) Measurement error
Correct Answer: B) The presence of subgroups within the data
Explanation:
Clusters often indicate different underlying groups or processes, as in the Old Faithful eruption example (short vs long eruptions). This may suggest the need for separate analyses.
Which of the following is TRUE regarding nonlinear relationships?
A) If r = 0, there is no relationship of any kind
B) Correlation only measures linear relationships
C) Nonlinear relationships always produce strong r values
D) Regression cannot model any nonlinear pattern
Correct Answer: B) Correlation only measures linear relationships
Explanation:
The Pearson correlation coefficient measures linear association only. A nonlinear relationship can exist even if r ≈ 0.
In a positively correlated scatterplot, most data points tend to fall in which quadrants (when divided by the means)?
A) Upper left and lower right
B) Upper right and lower left
C) Only upper right
D) Evenly in all quadrants
Correct Answer: B) Upper right and lower left
Explanation:
Positive correlation means both variables move together.
Upper right: both above average
Lower left: both below average
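The quadrant picture can be sketched by counting points relative to the two means. The data and the helper name `quadrant_counts` are illustrative assumptions. Points lying exactly on a mean would be counted as "left" or "lower" by this simple rule.

```python
from collections import Counter


def quadrant_counts(xs, ys):
    """Count points in each quadrant formed by the mean of X and the mean of Y."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    counts = Counter()
    for x, y in zip(xs, ys):
        horizontal = "right" if x > mean_x else "left"
        vertical = "upper" if y > mean_y else "lower"
        counts[f"{vertical}_{horizontal}"] += 1
    return counts


# Illustrative positively related data (both means are 3.5).
xs = [1, 2, 3, 4, 5, 6]
ys = [2, 1, 4, 3, 6, 5]

counts = quadrant_counts(xs, ys)

# Sum of products of deviations: the numerator of the correlation.
mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
cross_sum = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

print(dict(counts), cross_sum)
```

Points in the upper right and lower left contribute positive products of deviations, so when they dominate, the sum (and therefore r) is positive.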
The relationship between hours asleep and hours awake (in a 24-hour day) is an example of:
A) Weak positive correlation
B) Perfect positive correlation
C) Perfect negative correlation
D) No correlation
Correct Answer: C) Perfect negative correlation
Explanation:
Because total time is fixed (24 hours), knowing one value allows exact prediction of the other. This creates perfect negative linear correlation (r = -1).
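A short check of this case, assuming a few made-up values for hours asleep. The helper `pearson_r` is written out here and is not from the source.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired data."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hours awake are fully determined by hours asleep: awake = 24 - asleep.
hours_asleep = [5, 6, 7, 8, 9]
hours_awake = [24 - h for h in hours_asleep]

r = pearson_r(hours_asleep, hours_awake)
print(r)  # -1.0
```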
When researchers test many different subsets of data until they find a significant relationship, this increases the risk of:
A) Strong causation
B) Reduced variability
C) Spurious correlation due to multiple comparisons
D) Perfect prediction
Correct Answer: C) Spurious correlation due to multiple comparisons
Explanation:
Searching through many comparisons increases the probability of finding relationships purely by chance. This is sometimes called data dredging or p-hacking.
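A small simulation can make this concrete. It assumes 1,000 comparisons of pure random noise with 20 points each, and uses |r| > 0.444 as the approximate two-sided 5% cutoff for n = 20 from a standard correlation table. All names and settings are illustrative choices, not from the source.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired data."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


random.seed(0)  # fixed seed so the run is repeatable

N_POINTS = 20
N_TESTS = 1000
CRITICAL_R = 0.444  # approximate two-sided 5% cutoff for n = 20

false_hits = 0
for _ in range(N_TESTS):
    # X and Y are generated independently, so any "relationship" is chance.
    xs = [random.gauss(0, 1) for _ in range(N_POINTS)]
    ys = [random.gauss(0, 1) for _ in range(N_POINTS)]
    if abs(pearson_r(xs, ys)) > CRITICAL_R:
        false_hits += 1

print(false_hits, "of", N_TESTS, "unrelated pairs looked significant")
```

Roughly 5% of the unrelated pairs are expected to cross the cutoff, so a researcher who keeps testing subsets will eventually find a "significant" result by chance alone.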
If a scatterplot shows a random cloud of points evenly distributed across quadrants, this most likely indicates:
A) Strong nonlinear relationship
B) Strong negative correlation
C) No linear relationship
D) Perfect prediction
Correct Answer: C) No linear relationship
Explanation:
When points are evenly scattered, correlation will be near zero, indicating no linear association.
The relationship between Celsius and Fahrenheit temperatures results in R² = 1 because:
A) They measure unrelated concepts
B) The relationship is nonlinear
C) One is a perfect linear transformation of the other
D) They are measured in the same units
Correct Answer: C) One is a perfect linear transformation of the other
Explanation:
Fahrenheit is a perfect linear transformation of Celsius (F = 1.8 × C + 32). Therefore, 100% of the variance is explained, leading to R² = 1. This does not imply causation; it reflects a mathematical conversion.
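A quick numerical check, assuming a handful of made-up Celsius readings and the standard conversion F = C × 9/5 + 32. The helper `pearson_r` is written out for self-containment.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired data."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


celsius = [-10, 0, 10, 20, 30]
fahrenheit = [c * 9 / 5 + 32 for c in celsius]

# In simple linear regression, R^2 equals the squared correlation.
r_squared = pearson_r(celsius, fahrenheit) ** 2
print(r_squared)  # 1.0
```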
A non-zero regression coefficient (slope) indicates:
A) A strong relationship between variables
B) A causal relationship
C) Some degree of linear association in the sample
D) Perfect prediction
Correct Answer: C) Some degree of linear association in the sample
Explanation:
A non-zero slope suggests some linear association in the sample, but it does not indicate strength (that’s correlation), causation, or statistical significance.
Which statement best distinguishes slope (m) from correlation (r)?
A) Slope measures strength; correlation measures units
B) Slope depends on units; correlation does not
C) Slope ranges from -1 to 1; correlation does not
D) They always have the same numerical value
Correct Answer: B) Slope depends on units; correlation does not
Explanation:
Slope depends on the measurement units of X and Y.
Correlation is standardized and unit-free.
A stronger linear relationship is indicated when:
A) The regression line is steeper
B) Data points are tightly clustered around the regression line
C) The slope is positive
D) The intercept is large
Correct Answer: B) Data points are tightly clustered around the regression line
Explanation:
The strength of a linear relationship depends on how closely the points cluster around the line, not on steepness or intercept size.
Which of the following best describes R² as discussed conceptually?
A) The percentage of data points on the regression line
B) The proportion of variance explained by the linear model
C) The probability that X causes Y
D) The slope squared
Correct Answer: B) The proportion of variance explained by the linear model
Explanation:
R² represents the proportion of variance in the outcome explained by the predictor in a linear model.
If two datasets have identical slopes but one has much more scatter around the line, the dataset with more scatter will likely have:
A) Higher correlation
B) Lower correlation
C) Identical correlation
D) Perfect correlation
Correct Answer: B) Lower correlation
Explanation:
More scatter means weaker clustering around the line, leading to a lower correlation coefficient.
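This can be sketched with two constructed datasets that share a slope of exactly 2 but differ in scatter. The noise pattern below is hand-picked so that it does not shift the fitted slope. All values and the helper name `slope_and_r` are illustrative assumptions.

```python
from math import sqrt


def slope_and_r(xs, ys):
    """Return the least-squares slope and Pearson r for paired data."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sxx, sxy / sqrt(sxx * syy)


xs = [-2, -1, 0, 1, 2]
# Noise that sums to zero and is uncorrelated with xs, so scaling it
# changes the scatter without changing the fitted slope.
noise = [1, -2, 2, -2, 1]

ys_tight = [2 * x + 0.5 * e for x, e in zip(xs, noise)]  # little scatter
ys_loose = [2 * x + 3.0 * e for x, e in zip(xs, noise)]  # much more scatter

slope_tight, r_tight = slope_and_r(xs, ys_tight)
slope_loose, r_loose = slope_and_r(xs, ys_loose)

print(slope_tight, round(r_tight, 2))  # 2.0 0.96
print(slope_loose, round(r_loose, 2))  # 2.0 0.49
```

Both fits have the same steepness, and only the amount of scatter around the line separates r of about 0.96 from r of about 0.49.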
The relationship between car weight and gas efficiency is typically an example of:
A) Strong positive linear relationship
B) Strong negative linear relationship
C) Perfect correlation
D) No relationship
Correct Answer: B) Strong negative linear relationship
Explanation:
Heavier cars tend to have lower gas efficiency, representing a negative linear association.
If a study finds that watching action movies is correlated with speeding behavior, this type of evidence is best described as:
A) Experimental proof of causation
B) Observational correlation
C) Deterministic prediction
D) Mathematical identity
Correct Answer: B) Observational correlation
Explanation:
The study referenced is observational. Correlation alone does not establish causation without experimental control.