TOPIC #5: Linear model

Module #2 Data1001

19 Terms

1

When can you use a linear regression?

When you have bivariate data (x, y), meaning each observation is a pair of variables, and the research question is: is y linearly related to x?

2

What are the steps to a linear regression?

  1. Produce a scatterplot

  2. Calculate the correlation coefficient

  3. Produce a regression line

  4. Produce a residual plot

  5. Check assumptions

  6. Perform predictions
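
The numeric steps above (2, 3, 4 and 6) can be sketched in a few lines of Python. This is only an illustrative sketch: the data values and the `predict` helper are made up for the example, the plots are left out, and the standard deviations are assumed to be population SDs.

```python
from statistics import fmean, pstdev

# Hypothetical bivariate data, invented purely for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

# Steps 1 and 4 involve plots, which are omitted here; any plotting
# library could draw the (x, y) pairs and the residuals.

# Step 2: correlation coefficient as the mean product of standard units.
mean_x, mean_y = fmean(x), fmean(y)
sd_x, sd_y = pstdev(x), pstdev(y)
r = fmean([((xi - mean_x) / sd_x) * ((yi - mean_y) / sd_y)
           for xi, yi in zip(x, y)])

# Step 3: regression line y = intercept + slope * x.
slope = r * sd_y / sd_x
intercept = mean_y - slope * mean_x

# Step 4 (numbers only): residual = actual y - fitted y.
fitted = [intercept + slope * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

# Step 6: predict y for a new x (hypothetical helper).
def predict(new_x):
    return intercept + slope * new_x

print(f"r = {r:.4f}, slope = {slope:.4f}, intercept = {intercept:.4f}")
print(f"prediction at x = 7: {predict(7.0):.2f}")
```

Step 5 (checking assumptions) still needs a look at the scatterplot and residual plot; the numbers alone do not confirm that a line is appropriate.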

3

What does linear correlation mean?

It describes the association between two variables: how tightly the cloud of points clusters around a line through the centre.

4

What does it mean if there is a strong linear correlation?

The cloud of points is tightly clustered around a line, which allows good predictions of one variable (y) from the other (x).

5

What does it mean when one variable tends to increase with the other?

We have a positive association.

6

What does it mean when one variable tends to decrease as the other increases?

We have a negative association.

7

What is the correlation coefficient?

The correlation coefficient (r) is a numerical summary that measures the clustering of points around a line. It indicates both the sign and the strength of the linear association, and it always lies between -1 and 1.

8

What does it mean if r is positive?

The cloud of points slopes up.

9

What does it mean if r is negative?

The cloud of points slopes down.

10

What does it mean if r gets closer to ±1?

The points cluster more tightly around the line.

11

Describe each graph.

  1. Strong negative

  2. Moderate negative

  3. Weak negative

  4. Weak positive

  5. Moderate positive

  6. Strong positive

12

What is the population correlation coefficient?

The population correlation coefficient (r_pop) is the mean of the products of the two variables in standard units.
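
As a rough illustration of this definition, the sketch below converts each variable to standard units (subtract the mean, divide by the population SD) and averages the products. The function names `standard_units` and `correlation` and the data are assumptions made for the example.

```python
from statistics import fmean, pstdev

def standard_units(values):
    """Convert values to standard units using the population SD."""
    m, sd = fmean(values), pstdev(values)
    return [(v - m) / sd for v in values]

def correlation(x, y):
    """Mean of the products of x and y in standard units."""
    zx, zy = standard_units(x), standard_units(y)
    return fmean([a * b for a, b in zip(zx, zy)])

# Hypothetical data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = correlation(x, y)
print(f"r = {r:.4f}")
```

Points lying exactly on an upward-sloping line, such as `correlation([1, 2, 3], [2, 4, 6])`, give a value of 1 up to floating-point rounding.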

13

What are the properties of the correlation coefficient?

  1. Value (it lies between -1 and 1)

  2. Symmetry (it is not affected by interchanging the variables)

  3. Scaling (it is shift and scale invariant)
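
The three properties can be checked numerically. The sketch below is one way to do it, assuming made-up data and a hypothetical `correlation` helper based on standard units. It also shows one caveat worth keeping in mind: multiplying a variable by a negative constant flips the sign of r, so scale invariance applies to positive scale factors.

```python
from statistics import fmean, pstdev

def correlation(x, y):
    """Mean of the products of x and y in standard units (population SD)."""
    mx, my = fmean(x), fmean(y)
    sx, sy = pstdev(x), pstdev(y)
    return fmean([((a - mx) / sx) * ((b - my) / sy) for a, b in zip(x, y)])

# Hypothetical data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r_xy = correlation(x, y)

# Symmetry: swapping the variables leaves r unchanged.
r_yx = correlation(y, x)

# Scaling: shifting x by 10 and multiplying by 3 leaves r unchanged.
r_scaled = correlation([3 * xi + 10 for xi in x], y)

# Caveat: a negative scale factor flips the sign of r.
r_flipped = correlation([-xi for xi in x], y)

print(r_xy, r_yx, r_scaled, r_flipped)
```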

14

How do you find the optimal line for predicting values in a linear model?

By using a regression line.

15

What is a regression line?

A straight line that best fits a set of data points and is used to predict one variable from another.

The regression line connects the point of averages (x̄, ȳ) to (x̄ + SD_x, ȳ + r·SD_y).
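
The two points that define the regression line give its slope and intercept directly. The following sketch works this through on made-up data, assuming population SDs; all variable names are chosen for the example.

```python
from statistics import fmean, pstdev

# Hypothetical data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mean_x, mean_y = fmean(x), fmean(y)
sd_x, sd_y = pstdev(x), pstdev(y)
r = fmean([((a - mean_x) / sd_x) * ((b - mean_y) / sd_y)
           for a, b in zip(x, y)])

# The two points the regression line passes through.
point_a = (mean_x, mean_y)
point_b = (mean_x + sd_x, mean_y + r * sd_y)

# Rise over run between the two points simplifies to r * SD_y / SD_x.
slope = (point_b[1] - point_a[1]) / (point_b[0] - point_a[0])
intercept = mean_y - slope * mean_x

print(f"y = {intercept:.2f} + {slope:.2f} x")
```

For this data the line works out to y = 2.20 + 0.60 x, which matches the usual least-squares fit.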

16

What is a residual?

The vertical distance (gap) of a point above or below the regression line. A residual is the error between the actual value and the predicted value: residual = actual y - predicted y.
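
A small worked example may help here. The sketch below fits a least-squares line to made-up data and computes each residual as actual y minus fitted y; the data and names are assumptions for illustration.

```python
from statistics import fmean

# Hypothetical data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mean_x, mean_y = fmean(x), fmean(y)

# Least-squares slope and intercept.
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
sxx = sum((a - mean_x) ** 2 for a in x)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Residual = actual y - fitted y; positive means the point is above the line.
fitted = [intercept + slope * a for a in x]
residuals = [b - f for b, f in zip(y, fitted)]

for a, b, res in zip(x, y, residuals):
    print(f"x = {a}, y = {b}, residual = {res:+.2f}")
```

The residuals from a least-squares line always sum to zero (up to rounding), so points above the line are balanced by points below it.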

17

What is a residual plot?

A residual plot graphs the residuals (on the vertical axis) against x or the fitted values ŷ (on the horizontal axis).

18

If a linear regression is appropriate for the data, what should the residual plot show?

No pattern: the residuals should be scattered randomly about a horizontal line at 0 and show homoscedasticity (constant variance within vertical strips along the x axis).
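
The "vertical strips" idea can be approximated numerically. The sketch below is a rough stand-in for eyeballing the plot, not a formal test: it splits made-up data into three strips along x and compares the spread of the residuals in each. The data, the number of strips and the `spread_ratio` name are all assumptions for the example.

```python
from statistics import fmean, pstdev

# Hypothetical data: a line plus a small, fixed wobble (illustration only).
wobble = [0.5, -0.4, 0.3, -0.5, 0.4, -0.3] * 2
x = list(range(1, 13))
y = [1 + 2 * xi + w for xi, w in zip(x, wobble)]

# Fit the least-squares line and compute the residuals.
mean_x, mean_y = fmean(x), fmean(y)
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
sxx = sum((a - mean_x) ** 2 for a in x)
slope = sxy / sxx
intercept = mean_y - slope * mean_x
residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]

# Split the points into vertical strips along x and measure the spread in each.
pairs = sorted(zip(x, residuals))
n_strips = 3
size = len(pairs) // n_strips
strip_sds = [pstdev([res for _, res in pairs[i * size:(i + 1) * size]])
             for i in range(n_strips)]

# Similar spreads across strips are consistent with constant variance.
spread_ratio = max(strip_sds) / min(strip_sds)
print(f"mean residual = {fmean(residuals):.3f}")
print(f"strip SDs = {[round(s, 3) for s in strip_sds]}, ratio = {spread_ratio:.2f}")
```

A ratio near 1 suggests a similar spread in every strip. A ratio well above 1, or strip means that drift away from 0, would hint at a fan shape or curve that is easier to confirm on the plot itself.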

19

Does correlation measure causation?

No. It measures association, and association does not imply causation.