Psych 300A: Midterm 2 Review (Regression)

44 Terms

1

Regression

The straight line that best fits the data; it can be expressed as a mathematical equation

2

What does the regression line let us do with respect to X and Y

We can predict unknown Y scores from given X scores (assuming the X scores fall within the range of the data)

3

How is regression calculated

Y’ = ay + byX (Ŷ is sometimes used in place of Y’)

4

Best fit line

Line that makes predictions about people’s scores that are as close to their true scores as possible

5

Error

Difference between a person’s predicted score and the person’s actual score on the criterion variable

6

What is another name for the regression line

Least squares regression line, because it mathematically minimizes the squared errors of prediction

7

How do we know if a line minimizes errors

We take the sum of squared errors, ∑(Y - Y’)², which is at its minimum for the best-fitting line
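
As a rough illustration of the card above, the sketch below computes the sum of squared errors for a least-squares line and for lines whose intercept and slope have been nudged slightly. The data set and the name `sum_squared_errors` are made up for this example; the coefficients 2.2 and 0.6 are the least-squares values for this particular data.

```python
def sum_squared_errors(a, b, xs, ys):
    """Return sum((Y - Y')**2) for the line Y' = a + b*X."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Made-up illustrative data (not from the course).
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]

# Least-squares intercept and slope for this data set.
best_a, best_b = 2.2, 0.6
best_sse = sum_squared_errors(best_a, best_b, X, Y)

# Nudge the intercept and slope a little in every direction.
nearby_sse = [
    sum_squared_errors(best_a + da, best_b + db, X, Y)
    for da in (-0.1, 0.0, 0.1)
    for db in (-0.1, 0.0, 0.1)
    if (da, db) != (0.0, 0.0)
]

print(round(best_sse, 4))
print(all(s > best_sse for s in nearby_sse))
```

Every nudged line should give a larger sum of squared errors than the least-squares line, which is what "minimum" means here.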

8

by

Slope: the constant that gives the change in predicted Y for each one-unit change in X

9

How is by calculated

by = r(SDy/SDx)

10

ay

Y-intercept: when predicting Y, it is the point where the regression line crosses the Y-axis (the predicted Y when X = 0)

11

How is ay calculated

ay = Ӯ - by(X̄) 
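
A minimal sketch of the slope and intercept formulas from the cards above, using only the standard library. The data and the helper names (`mean`, `pop_sd`, `pearson_r`, `regression_coefficients`) are made up for illustration, and the SD is assumed to divide by N.

```python
import math

def mean(values):
    return sum(values) / len(values)

def pop_sd(values):
    """Standard deviation with N in the denominator."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))

def pearson_r(xs, ys):
    """Pearson correlation between paired scores."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return cov / (pop_sd(xs) * pop_sd(ys))

def regression_coefficients(xs, ys):
    """Return (ay, by) for the line Y' = ay + by*X."""
    b_y = pearson_r(xs, ys) * (pop_sd(ys) / pop_sd(xs))  # by = r(SDy/SDx)
    a_y = mean(ys) - b_y * mean(xs)                      # ay = mean(Y) - by*mean(X)
    return a_y, b_y

# Made-up illustrative data (not from the course).
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]

a_y, b_y = regression_coefficients(X, Y)
predicted = a_y + b_y * 3  # predicted Y for X = 3, which is inside the data range
print(round(a_y, 4), round(b_y, 4), round(predicted, 4))
```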

12

T or F: If X = 0, then Y’ = ay

T

13

What leads to a flatter slope

Higher variability or heteroscedasticity

14

What 2 values are used to plot a regression line

  1. Y intercept (0, ay)

  2. The mean of X and mean of Y (X̄, Ӯ)

(Values need to be in range of data)

15

T or F: The regression line represents the mean for bivariate data

T, the regression lines for Y’ and X’ intersect at the mean of X and the mean of Y, the point (X̄, Ӯ)

16

What three datapoints are needed to plot Y’ and X’

Y’ - (0, ay) and (X̄, Ӯ)

X’ - (ax, 0) and (X̄, Ӯ)
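
The sketch below checks, on made-up data, that both regression lines pass through (X̄, Ӯ). It assumes the X’ line uses the mirror-image formulas bx = r(SDx/SDy) and ax = X̄ - bx(Ӯ), which these cards do not state explicitly; all variable names are illustrative.

```python
import math

# Made-up illustrative data (not from the course).
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
n = len(X)

mean_x = sum(X) / n
mean_y = sum(Y) / n
sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in X) / n)
sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in Y) / n)
r = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y)) / (n * sd_x * sd_y)

# Line for predicting Y from X: Y' = ay + by*X
b_y = r * (sd_y / sd_x)
a_y = mean_y - b_y * mean_x

# Line for predicting X from Y: X' = ax + bx*Y (assumed mirror-image formulas)
b_x = r * (sd_x / sd_y)
a_x = mean_x - b_x * mean_y

y_pred_at_mean_x = a_y + b_y * mean_x  # should equal the mean of Y
x_pred_at_mean_y = a_x + b_x * mean_y  # should equal the mean of X
print(round(y_pred_at_mean_x, 4), round(x_pred_at_mean_y, 4))
```

The remaining plotting points follow from the equations: Y’ = ay when X = 0, and X’ = ax when Y = 0.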

17

What happens to the angle of the two regression lines if the correlation is very high

The angle will be very small. If r = ±1.00 the lines overlap; if r = 0 they are at a 90-degree angle (the angle increases as r approaches zero)

18

What happens to Y’ and X’ respectively if r = 0

Y’ → by = 0, ay = Ӯ and the line is flat

X’ → bx = 0, ax = X̄ and the line is vertical
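
A small sketch of the r = 0 case, using a made-up data set chosen so that the correlation is zero. The function name `fit_both_lines` is hypothetical, and the X’ coefficients assume the mirror-image formulas bx = r(SDx/SDy) and ax = X̄ - bx(Ӯ).

```python
import math

def fit_both_lines(xs, ys):
    """Return (r, ay, by, ax, bx) for Y' = ay + by*X and X' = ax + bx*Y."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    r = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n * sd_x * sd_y)
    b_y = r * (sd_y / sd_x)
    a_y = mean_y - b_y * mean_x
    b_x = r * (sd_x / sd_y)
    a_x = mean_x - b_x * mean_y
    return r, a_y, b_y, a_x, b_x

# Made-up data with a symmetric, inverted-V pattern, so the linear correlation is 0.
X = [1, 2, 3, 4, 5]
Y = [1, 2, 3, 2, 1]

r, a_y, b_y, a_x, b_x = fit_both_lines(X, Y)
mean_x = sum(X) / len(X)
mean_y = sum(Y) / len(Y)

# With r = 0 both slopes vanish: Y' is always the mean of Y, X' is always the mean of X.
print(round(r, 4), round(a_y, 4), round(a_x, 4))
```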

19

What are the equations if the relationship is curvilinear (quadratic, cubic or quartic)

Quadratic: Y′ = a + bX + cX²

Cubic: Y′ = a + bX + cX² + dX³

Quartic: Y′ = a + bX + cX² + dX³ + eX⁴

20

What is Y - Y’

The residual (error): the difference between an observed Y and the Y predicted by the regression line. It is a measure of variability, namely the error of prediction around the regression line

21

Standard error of estimate

A measure of the average deviation of the errors, i.e., the differences between the Y-values predicted by the regression model and the Y-values in the sample

22

How is the deviation score calculated for SD and SDy-y’

For SD: (x - x̄)

For SDy-y’: (Y - Y’)

23

How are squared deviations calculated for SD and SDy-y’

For SD: (x - x̄)²

For SDy-y’: (Y - Y’)²

24

How is the sum of squared deviations calculated for SD and SDy-y’

SS = ∑(x - x̄)²

SSy-y’ = ∑(Y - Y’)²

25

How is the standard deviation calculated for SD and SDy-y’

SD = √(∑(x - x̄)²/N)

SDy-y’ = √(∑(Y - Y’)²/N)

26

T or F: There is an alternate way to calculate SDy-y’

T, SDy-y’ = SDy√(1 - r²)
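
To see that the two routes to the standard error of estimate agree, the sketch below computes it directly from the residuals and then from SDy and r. The data and variable names are made up for illustration; both SDs divide by N, as on the cards above.

```python
import math

# Made-up illustrative data (not from the course).
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
n = len(X)

mean_x = sum(X) / n
mean_y = sum(Y) / n
sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in X) / n)
sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in Y) / n)
r = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y)) / (n * sd_x * sd_y)

# Regression line Y' = ay + by*X
b_y = r * (sd_y / sd_x)
a_y = mean_y - b_y * mean_x
predicted = [a_y + b_y * x for x in X]

# Route 1: square root of the mean squared residual.
se_direct = math.sqrt(sum((y - yp) ** 2 for y, yp in zip(Y, predicted)) / n)

# Route 2: SDy * sqrt(1 - r^2).
se_shortcut = sd_y * math.sqrt(1 - r ** 2)

print(round(se_direct, 4), round(se_shortcut, 4))
```

Because r is not 0 for this data, both values should also come out smaller than SDy.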

27

T or F: When r is not equal to 0, SDy-y’ will be smaller than SDy

T, because SDy-y’ reduces the potential error by using information from 2 sources (X and Y) instead of 1

28

What would happen to SDy-y’ if r = 0

SDy-y’ = SDy (because SDy-y’ = SDy√(1 - r²), and √(1 - 0) = 1)

29

What would happen to SDy-y’ if there is a perfect positive or negative correlation (r = ±1)

SDy-y’ = 0 (because SDy-y’ = SDy√(1 - r²), and √(1 - 1) = 0)

30

What do we need to know in order to understand explained variability

Total and unexplained variability

31

Total variability

Denoted with XO, Y - ȳ

32

Unexplained variability

Denoted with Xt, Y - Y’

33

Explained variability

Denoted with Xe, Y’ - ȳ

34

How can one conceptually explain explained variability

Total variability = explained variability (prediction) + unexplained variability (residuals)

35

In what 2 ways can total variability be expressed

SST

∑(Y - ȳ)²

36

In what 2 ways can explained variability be expressed

SSR

∑(Y’ - ȳ)²

37

In what 2 ways can unexplained variability be expressed

SSE

∑(Y - Y’)²

38

Why must the values be squared to calculate total variability (e.g., why can we not just use ∑(Y - ȳ) = ∑(Y’ - ȳ) + ∑(Y - Y’))

Because the sum of deviations is always 0; this is why we must use SS (sums of squares) in our regression equation
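
A quick numerical check of this point, on made-up data with illustrative variable names: each sum of unsquared deviations comes out as zero (up to floating-point rounding), so only the squared versions carry information about variability.

```python
# Made-up illustrative data (not from the course).
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
n = len(X)

mean_x = sum(X) / n
mean_y = sum(Y) / n

# Least-squares slope and intercept (algebraically the same as by = r(SDy/SDx)).
b_y = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y)) / sum((x - mean_x) ** 2 for x in X)
a_y = mean_y - b_y * mean_x
predicted = [a_y + b_y * x for x in X]

# Sums of raw (unsquared) deviations.
sum_total = sum(y - mean_y for y in Y)                     # sum of (Y - mean of Y)
sum_explained = sum(yp - mean_y for yp in predicted)       # sum of (Y' - mean of Y)
sum_unexplained = sum(y - yp for y, yp in zip(Y, predicted))  # sum of (Y - Y')

# Sum of squared deviations for comparison.
ss_total = sum((y - mean_y) ** 2 for y in Y)

print(round(sum_total, 10), round(sum_explained, 10), round(sum_unexplained, 10))
print(round(ss_total, 4))
```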

39

What is the problem regarding SSR and SSE

We have trouble interpreting SSR and SSE because they are in squared units

40

What is the solution to the problem regarding SSR and SSE

We calculate the proportion of variability with regard to explained and unexplained variability

41

How is proportion of variability used to calculate total variability

Conceptual: Proportion of total variability (1) = proportion of explained variability + proportion of unexplained variability

Equation: SST/SST = SSR/SST + SSE/SST, so 1 = SSR/SST + SSE/SST

42

What is the important implication regarding proportion of explained variability and proportion of unexplained variability and their relationship with r

Proportion of explained variability (SSR/SST) = r² and proportion of unexplained variability (SSE/SST) = 1 - r²
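
The partition can be checked numerically. The sketch below, on made-up data with illustrative variable names, computes SST, SSR and SSE and compares the proportions with r² and 1 - r².

```python
import math

# Made-up illustrative data (not from the course).
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
n = len(X)

mean_x = sum(X) / n
mean_y = sum(Y) / n
ss_x = sum((x - mean_x) ** 2 for x in X)
sp_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y))

# Least-squares line Y' = ay + by*X
b_y = sp_xy / ss_x
a_y = mean_y - b_y * mean_x
predicted = [a_y + b_y * x for x in X]

sst = sum((y - mean_y) ** 2 for y in Y)                 # total variability
ssr = sum((yp - mean_y) ** 2 for yp in predicted)       # explained variability
sse = sum((y - yp) ** 2 for y, yp in zip(Y, predicted))  # unexplained variability

r = sp_xy / math.sqrt(ss_x * sst)

print(round(sst, 4), round(ssr, 4), round(sse, 4))
print(round(ssr / sst, 4), round(sse / sst, 4), round(r ** 2, 4))
```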

43

What are the 9 attributes of the regression line

  1. Represents bivariate data in a linear relationship and predicts scores based on observed data

  2. Is defined by the linear equation for a straight line

  3. Two regression lines can represent bivariate data (Y’ and X’)

  4. Does not predict values outside of the range of the data

  5. Is the best descriptor of bivariate data

  6. Reflects the method of least squares (i.e., minimizes Σ(Y - Y’)²)

  7. Always has some error of prediction present, measured as the standard error of estimate SDy-y’ (unless r = ±1)

  8. Is a traveling normal distribution with a moving mean

  9. Allows separate measures of SST, SSR and SSE

44

What are the two linear equations for Y’ and X’

Y’ = ay + byX

X’ = ax + bxY