Chapter 12 Notes: Association Between I-R Variables & Linear Regression (Part 1)

Chapter Outline

  • Introduction
  • Scattergram
  • Regression and Prediction
  • Finding the Regression Line: Y = a + bX, where a is the Y-intercept and b is the slope
  • Pearson’s r (correlation)
  • Interpreting r^2
  • Other Issues in Regression Analysis

Key Concepts

  • Scattergram (or scatter plot): A graph that displays the relationship between two interval-ratio (I-R) variables.
  • Regression Line: Summarizes the linear relationship between two I-R variables, X and Y. It is used to predict a score on Y from a score on X.
  • Pearson’s r (correlation): A measure of association for interval-ratio variables.

Example: Regression Analysis

Scenario

  • Goal: Determine the relationship between hours of study and exam grade in statistics.
  • Method: A random sample of 16 students is drawn from a large statistics class.

Data

The sample data are shown in the following table:

Case    Hours of Study    Exam Grade
1       5                 64
2       1                 52
3       6                 76
4       3                 71
5       4                 74
6       9                 81
7       11                80
8       14                83
9       2                 69
10      1                 56
11      7                 79
12      4                 93
13      6                 91
14      3                 85
15      8                 88
16      12                96

Scattergram Details

  • Axes:
    • The independent variable (X) is arrayed along the horizontal axis.
    • The dependent variable (Y) is arrayed along the vertical axis.
  • Data Points:
    • Each dot on a scattergram represents a case.
    • The dot is placed at the intersection of the case’s scores on X and Y.

Scattergram Example

  • This scattergram shows the relationship between “hours of study” (X) and “exam grade” (Y) for the 16 students.
  • X axis (horizontal) is “hours of study.”
    • Scores range from 1 to 14.
  • Y axis (vertical) is “exam grade.”
    • Scores range from 52 to 96.
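
As a minimal sketch, the sample can be entered as two Python lists and the axis ranges checked directly. The names `hours`, `grades`, and `points` are illustrative and do not come from the notes.

```python
# Sample data from the table: one entry per case (cases 1-16).
hours = [5, 1, 6, 3, 4, 9, 11, 14, 2, 1, 7, 4, 6, 3, 8, 12]                 # X: hours of study
grades = [64, 52, 76, 71, 74, 81, 80, 83, 69, 56, 79, 93, 91, 85, 88, 96]  # Y: exam grade

# Each (x, y) pair is one dot on the scattergram.
points = list(zip(hours, grades))

# The axis ranges are the smallest and largest score on each variable.
x_range = (min(hours), max(hours))
y_range = (min(grades), max(grades))

print("cases:", len(points))
print("X range:", x_range)
print("Y range:", y_range)
```

Passing `hours` and `grades` to any plotting tool as the horizontal and vertical coordinates would reproduce the scattergram described here.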

Scattergram -- Correlation Patterns

  • The greater the extent to which dots are clustered around a straight line, the stronger the correlation.
  • Positive Correlation: As X increases, Y tends to increase.
  • Negative Correlation: As X increases, Y tends to decrease.
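
The direction and strength of a pattern can also be checked numerically with Pearson's r, which the chapter outline lists as the measure of association for I-R variables. The sketch below uses the standard deviation-score formula for r, which is assumed here rather than taken from this section; the function name `pearson_r` is hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r computed from deviations about the means of X and Y."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations, and the two sums of squares.
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / sqrt(ss_x * ss_y)

hours = [5, 1, 6, 3, 4, 9, 11, 14, 2, 1, 7, 4, 6, 3, 8, 12]
grades = [64, 52, 76, 71, 74, 81, 80, 83, 69, 56, 79, 93, 91, 85, 88, 96]

r = pearson_r(hours, grades)
# The sign of r gives the direction; its distance from 0 gives the strength.
direction = "positive" if r > 0 else "negative" if r < 0 else "none"
print(f"r = {r:.2f} ({direction})")
```

A value of r closer to +1 or -1 corresponds to dots clustered more tightly around a straight line.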

Estimation of Y

  • What is the best way to estimate the dependent variable, Y?
    • Mean value?
    • Predictions based on the regression line?
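
One way to compare the two candidates is to total the squared prediction errors each one makes on the sample. The sketch below assumes the usual least-squares formulas for the slope b and intercept a, which this section does not state; all variable names are illustrative.

```python
hours = [5, 1, 6, 3, 4, 9, 11, 14, 2, 1, 7, 4, 6, 3, 8, 12]
grades = [64, 52, 76, 71, 74, 81, 80, 83, 69, 56, 79, 93, 91, 85, 88, 96]

n = len(hours)
mean_x = sum(hours) / n
mean_y = sum(grades) / n

# Least-squares slope and intercept (standard formulas, assumed here).
b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, grades))
     / sum((x - mean_x) ** 2 for x in hours))
a = mean_y - b * mean_x

# Squared error when every case is predicted with the mean of Y.
sse_mean = sum((y - mean_y) ** 2 for y in grades)

# Squared error when each case is predicted from its own X score.
sse_line = sum((y - (a + b * x)) ** 2 for x, y in zip(hours, grades))

print(f"error using the mean: {sse_mean:.2f}")
print(f"error using the line: {sse_line:.2f}")
```

Whenever X and Y are linearly related in the sample, the regression line leaves less squared error than the mean does, which is why it is the preferred basis for prediction.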

Regression Line

  • A single straight line that comes as close as possible to all data points.
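  • That line minimizes the sum of squared vertical distances between each observed score Y and its predicted score \hat{Y}:
    \sum(Y - \hat{Y})^2
  • The line passes through the point defined by the means of X and Y:
    (\bar{X}, \bar{Y})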
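
Both properties can be checked on the study-hours sample. This is a sketch under the assumption that the line is fitted with the standard least-squares formulas; the helper names `fit_line` and `sse` are hypothetical.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line Y-hat = a + b*X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    a = mean_y - b * mean_x
    return a, b

def sse(xs, ys, a, b):
    """Sum of squared residuals, sum((Y - Y-hat)^2), for the line a + b*X."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

hours = [5, 1, 6, 3, 4, 9, 11, 14, 2, 1, 7, 4, 6, 3, 8, 12]
grades = [64, 52, 76, 71, 74, 81, 80, 83, 69, 56, 79, 93, 91, 85, 88, 96]

a, b = fit_line(hours, grades)
mean_x = sum(hours) / len(hours)
mean_y = sum(grades) / len(grades)

# Property 1: the fitted line passes through (mean of X, mean of Y).
passes_through_means = abs((a + b * mean_x) - mean_y) < 1e-9

# Property 2: moving the slope or intercept away from the fit raises the error.
best = sse(hours, grades, a, b)
worse_slope = sse(hours, grades, a, b + 0.5)
worse_intercept = sse(hours, grades, a + 2.0, b)

print(f"a = {a:.2f}, b = {b:.2f}")
print("passes through the means:", passes_through_means)
print("fit has the smallest error:", best < worse_slope and best < worse_intercept)
```

For the data as tabulated, the fit comes out near a = 65.6 and b = 1.97, so each additional hour of study is associated with a predicted gain of about two points on the exam.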

Direction of Regression Line

  • A positive correlation: regression line rises from left to right.
  • A negative correlation: regression line falls from left to right.