Regression Lines

  • A residual is the vertical distance between an actual data point and the line of best fit (observed y minus the y value predicted by the line)

  • The regression line of y on x is used when x is non-random (controlled) and y is random. In this case the line of x on y is not appropriate, because x is not random: y depends on x, not the other way round

  • The sum of the residuals about the least squares regression line is zero.

  • The least squares regression line is the line which produces the least possible value of the sum of the squares of the residuals

  • The equation of the regression line of y on x is y - ybar = b(x - xbar), where b = Sxy/Sxx. This form applies only to y on x, i.e. when y is random and is to be predicted from a value of x (x may be non-random)

  • The equation of the regression line of x on y is x - xbar = b'(y - ybar), where b' = Sxy/Syy, meaning x is to be predicted from a value of y. In general b' is not the same as the gradient b of the y on x line

  • The goodness of fit of the regression line can be judged by examining the scatter diagram. The coefficient of determination, r² (the square of the product moment correlation coefficient r), gives an estimate of the proportion of the variation in one variable that is explained by the variation in the other variable.

    e.g. if r² = 0.15, then only 15% of the variation in x is explained by the variation in y.

  • If the scatter diagram shows an approximately elliptical cloud of points, it is reasonable to assume that both variables are random and have a bivariate normal distribution
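
The points above can be checked numerically. The following is a minimal sketch in plain Python, not a definitive implementation: the data values and all function names (`regression_y_on_x`, `regression_x_on_y`, `residuals`, `r_squared`) are made up for illustration. It computes both regression lines from Sxy, Sxx and Syy, the residuals about the y on x line, and r².

```python
# Hypothetical data, invented for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]


def mean(values):
    return sum(values) / len(values)


def sums_of_squares(xs, ys):
    """Return (Sxx, Syy, Sxy) about the means."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return s_xx, s_yy, s_xy


def regression_y_on_x(xs, ys):
    """Return (a, b) for y = a + b*x, from y - ybar = b(x - xbar)."""
    s_xx, _, s_xy = sums_of_squares(xs, ys)
    b = s_xy / s_xx
    a = mean(ys) - b * mean(xs)
    return a, b


def regression_x_on_y(xs, ys):
    """Return (c, d) for x = c + d*y, from x - xbar = d(y - ybar)."""
    _, s_yy, s_xy = sums_of_squares(xs, ys)
    d = s_xy / s_yy
    c = mean(xs) - d * mean(ys)
    return c, d


def residuals(xs, ys, a, b):
    """Observed y minus the y predicted by the line y = a + b*x."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]


def r_squared(xs, ys):
    """Coefficient of determination: Sxy^2 / (Sxx * Syy)."""
    s_xx, s_yy, s_xy = sums_of_squares(xs, ys)
    return s_xy ** 2 / (s_xx * s_yy)


a, b = regression_y_on_x(xs, ys)
c, d = regression_x_on_y(xs, ys)
res = residuals(xs, ys, a, b)
sse = sum(e ** 2 for e in res)

# Sum of squared residuals for a slightly different gradient,
# to compare against the least squares value.
sse_other = sum(e ** 2 for e in residuals(xs, ys, a, b + 0.1))

print("y on x: y = %.3f + %.3f x" % (a, b))
print("x on y: x = %.3f + %.3f y" % (c, d))
print("sum of residuals: %.10f" % sum(res))
print("sum of squared residuals: %.4f (other line: %.4f)" % (sse, sse_other))
print("r squared: %.4f" % r_squared(xs, ys))
```

With this data the residuals sum to zero up to floating-point rounding, changing the gradient increases the sum of squared residuals, and the product of the two gradients `b * d` equals r².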