Section 2.5 Cautions about Correlation and Regression

Residuals:

  • A regression line describes the overall pattern of a linear relationship between an explanatory variable and a response variable

  • Deviations from the overall pattern are also important. The vertical distances between the points and the least-squares regression line are called residuals

  • A residual is the difference between an observed value of the response variable and the value predicted by the regression line:

  • residual = observed y - predicted y = y - ŷ
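
The residual definition above can be worked through on a small data set. This is a minimal sketch in plain Python; the helper names `least_squares` and `residuals` and the five data points are made up for illustration and are not part of the original notes.

```python
def least_squares(xs, ys):
    """Return (intercept, slope) of the least-squares line y_hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def residuals(xs, ys):
    """Residual = observed y minus the y predicted by the fitted line."""
    a, b = least_squares(xs, ys)
    return [y - (a + b * x) for x, y in zip(xs, ys)]


# Illustrative data (assumed, not from the notes)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = least_squares(xs, ys)
res = residuals(xs, ys)

print(f"y_hat = {a:.2f} + {b:.2f} x")
print("residuals:", [round(r, 2) for r in res])
print("sum of residuals:", round(sum(res), 10))
```

For a least-squares line fitted with an intercept, the residuals always sum to zero (up to rounding), which is why a residual plot is centered on zero.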

Residual Plots:

  • A residual plot is a scatterplot of the regression residuals against the explanatory variable

  • Residual plots help us assess the fit of a regression line

    • Ideally there should be a “random” scatter around zero

    • Residual patterns suggest deviations from a linear relationship

Things to Look for in Residual Plots:

  • A curved pattern: this demonstrates that the relationship may not be linear

  • Increasing or decreasing spread about the line: this indicates that the prediction of y is less precise where the spread is wider (for larger x if the spread increases, for smaller x if it decreases)

  • Individual points with large residuals: these may indicate some possible outliers in the y direction

  • Individual points that are extreme in the x direction: these may seem normal from the perspective of residuals, but may have great influence on the equation of the least-squares regression line (LSRL)
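
As a rough illustration of the first item in the list above, the sketch below fits a straight line to data that actually follow a curve and prints the residuals. The data (y = x squared) and the helper name `least_squares` are assumptions chosen for illustration only.

```python
def least_squares(xs, ys):
    """Return (intercept, slope) of the least-squares line y_hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    return y_bar - b * x_bar, b


# Curved (quadratic) data, assumed for illustration
xs = [0, 1, 2, 3, 4, 5, 6]
ys = [x ** 2 for x in xs]

a, b = least_squares(xs, ys)
res = [y - (a + b * x) for x, y in zip(xs, ys)]

# Residuals are positive at both ends and negative in the middle:
# a systematic curve rather than a random scatter around zero.
for x, r in zip(xs, res):
    print(f"x = {x}: residual = {r:+.1f}")
```

The residuals here come out as roughly 5, 0, -3, -4, -3, 0, 5. The U shape is the kind of pattern that signals a straight line is the wrong model.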

Outliers and Influential Points:

  • An outlier is an observation that lies outside the overall pattern of the other observations

    • Outliers in the y direction have large residuals

    • Outliers in the x direction are often influential for the least-squares regression line, meaning that the removal of such points would markedly change the equation of the line
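
To make the idea of an influential point concrete, the sketch below fits the line with and without one observation that is extreme in x. The data and the added point (20, 3) are hypothetical, as is the helper name `least_squares`.

```python
def least_squares(xs, ys):
    """Return (intercept, slope) of the least-squares line y_hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    return y_bar - b * x_bar, b


# Hypothetical data without the extreme point
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a0, b0 = least_squares(xs, ys)

# Same data plus one point far out in the x direction
xs_out = xs + [20]
ys_out = ys + [3]
a1, b1 = least_squares(xs_out, ys_out)

# Residuals from the line fitted WITH the extreme point
res_out = [y - (a1 + b1 * x) for x, y in zip(xs_out, ys_out)]

print(f"slope without the point: {b0:.3f}")
print(f"slope with the point:    {b1:.3f}")
print(f"residual of the extreme point: {res_out[-1]:+.3f}")
print(f"largest other residual (abs):  {max(abs(r) for r in res_out[:-1]):.3f}")
```

In this made-up example the slope moves from clearly positive to nearly flat, yet the extreme point's own residual is smaller than several of the others, because the line has been pulled toward it.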

IMPORTANT REMINDER:

  • Correlation and least-squares regression are not resistant: a single outlier or influential point can change them substantially

  • Always plot your data and look for outliers and potentially influential points
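
A small numeric check of the reminder above: the sketch computes the correlation r for a data set with and without one outlying observation. The data and the function name `correlation` are assumptions made for illustration.

```python
import math


def correlation(xs, ys):
    """Pearson correlation coefficient r of two equal-length sequences."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data with a strong positive linear pattern
xs = [1, 2, 3, 4, 5, 6]
ys = [2, 3, 5, 4, 6, 7]
r_clean = correlation(xs, ys)

# Add one observation that lies far from the pattern
r_outlier = correlation(xs + [7], ys + [0])

print(f"r without the outlier: {r_clean:.3f}")
print(f"r with the outlier:    {r_outlier:.3f}")
```

One added point is enough to take r from above 0.9 to close to 0 here, which is what "not resistant" means in practice.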

Lurking Variables:

  • A lurking variable is a variable that is not among the explanatory or
    response variables and yet may influence the interpretation of
    relationships among those variables
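
The sketch below illustrates one way a lurking variable can produce an association. It is built on an invented scenario: a variable z drives both x and y, and neither x nor y affects the other. All numbers and helper names are assumptions. Comparing the residuals of x and y after regressing each on z is one common way to check for this, but only when the lurking variable has actually been measured.

```python
import math


def least_squares(xs, ys):
    """Return (intercept, slope) of the least-squares line y_hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    return y_bar - b * x_bar, b


def residuals(xs, ys):
    """Residuals of ys after a least-squares fit on xs."""
    a, b = least_squares(xs, ys)
    return [y - (a + b * x) for x, y in zip(xs, ys)]


def correlation(xs, ys):
    """Pearson correlation coefficient r."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Invented scenario: z is the lurking variable behind both x and y
z = [1, 2, 3, 4, 5, 6, 7, 8]
noise_x = [1, -1, -1, 1, 1, -1, -1, 1]   # fixed, hand-picked "noise"
noise_y = [1, 1, -1, -1, 1, 1, -1, -1]
x = [2 * zi + e for zi, e in zip(z, noise_x)]
y = [3 * zi + e for zi, e in zip(z, noise_y)]

r_xy = correlation(x, y)

# Remove the part of x and y explained by z, then correlate what is left
r_after_z = correlation(residuals(z, x), residuals(z, y))

print(f"r between x and y:             {r_xy:.3f}")
print(f"r after accounting for z:      {r_after_z:.3f}")
```

In this constructed example x and y are strongly correlated, but the association vanishes once z is accounted for.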

Cautions about Correlation and Regression:

  • Both describe linear relationships

  • Both are affected by outliers

  • Always plot the data before interpreting

  • Beware of extrapolation

    • Use caution in predicting y when x is outside the range
      of observed x’s.

  • Beware of lurking variables

    • These have an important effect on the relationship
      among the variables in a study but are not included in
      the study.

  • Correlation does not imply causation!

    • When you observe an association between two
      variables, always ask yourself if the relationship that
      you see might be due to a lurking variable
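
The extrapolation caution in the list above can be sketched numerically. The example assumes, purely for illustration, that the true relationship is y = x squared and that data were only observed for x from 1 to 5. The helper name `least_squares` is also an assumption.

```python
def least_squares(xs, ys):
    """Return (intercept, slope) of the least-squares line y_hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    return y_bar - b * x_bar, b


def true_y(x):
    """Assumed true relationship (hypothetical): y = x squared."""
    return x ** 2


# Observed range of x is 1..5
xs = [1, 2, 3, 4, 5]
ys = [true_y(x) for x in xs]
a, b = least_squares(xs, ys)


def predict(x):
    return a + b * x


# Prediction error inside the observed range vs. far outside it
error_inside = abs(predict(3) - true_y(3))
error_outside = abs(predict(10) - true_y(10))

print(f"fitted line: y_hat = {a:.1f} + {b:.1f} x")
print(f"error at x = 3  (inside range):  {error_inside:.1f}")
print(f"error at x = 10 (outside range): {error_outside:.1f}")
```

The line is a tolerable summary within the observed x values but misses badly at x = 10, because nothing in the data says the pattern continues unchanged beyond x = 5.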

