Module 9 Correlation

32 Terms

1

Correlated variables

Variables that “move” together: as one variable increases, the other tends to either increase or decrease.

2

X increases as Y increases

 the variables have a positive correlation

3

Y decreases as X increases

the variables have a negative correlation

4

“r”

  • The strength and direction of the linear correlation between two variables.

  • Called the linear correlation coefficient.
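
As a rough illustration of what r measures, the sketch below computes the sample linear correlation coefficient by hand. The function name `pearson_r` and the `hours`/`scores` data are made up for this example and are not part of the cards.

```python
import math

def pearson_r(xs, ys):
    """Sample linear correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: hours studied vs. test score
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 70]
r = pearson_r(hours, scores)
print(round(r, 3))  # close to 1, so a strong positive correlation
```

The sign of r gives the direction (positive or negative) and its absolute value gives the strength.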

5

r = 1

perfect positive correlation

6

r= -1

perfect negative correlation

7

r= 0

no linear correlation

8

|r| > 0.9

correlation is strong 

9

0.6 < |r| < 0.9

correlation is moderate

10

|r| < 0.6

the correlation is weak
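
The three strength cards can be sketched as one small function. The name `describe_strength` is hypothetical, and because the cards do not say how to label |r| exactly equal to 0.6 or 0.9, the sketch reports those values as not covered rather than guessing.

```python
def describe_strength(r):
    """Label a correlation using the rule-of-thumb cutoffs from the cards."""
    size = abs(r)  # strength ignores the sign; the sign only gives direction
    if size > 0.9:
        return "strong"
    if 0.6 < size < 0.9:
        return "moderate"
    if size < 0.6:
        return "weak"
    # Exactly 0.6 or 0.9: the cards leave these boundary values unassigned
    return "boundary value (not covered by the cards)"

print(describe_strength(-0.95))  # strong
print(describe_strength(0.75))   # moderate
print(describe_strength(0.2))    # weak
```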

11

Null hypothesis

States that the variables are not correlated in the population (ρ = 0).

  • Used in inference for correlation

12

Reject  ρ = 0

The results are large enough that they would be unlikely to occur by chance if the null hypothesis were true.

13

Statistically significant

The term often used to describe a result when a null hypothesis is rejected

  • Does not indicate the strength of the correlation
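
To see why significance and strength are different things, here is a minimal sketch using the usual t statistic for testing ρ = 0, t = r·sqrt(n − 2) / sqrt(1 − r²). The cards do not state this formula, so treat it as an outside assumption; the function name and the two scenarios are invented for illustration.

```python
import math

def t_statistic(r, n):
    """t statistic for testing rho = 0 from a sample correlation r and sample size n."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# A weak correlation measured on a very large sample
weak_large_sample = t_statistic(0.1, 10_000)

# A moderate correlation measured on a tiny sample
moderate_tiny_sample = t_statistic(0.7, 5)

print(weak_large_sample > moderate_tiny_sample)  # True
```

The weak correlation produces the larger test statistic simply because the sample is large, so rejecting the null says nothing about how strong the correlation is.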

14

 ρ = 0 is true

there is no tendency for Y to change as X changes

15

Theoretical regression model

yi = 𝛽0 + 𝛽1xi + εi

  • Describes where the data come from

  • Contains unknown parameters (𝛽0, 𝛽1) and a random variation term (εi)
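
One way to picture "where the data come from" is to simulate it. The sketch below generates y values from a line plus random variation. The parameter values, the function name `simulate`, and the choice of normally distributed variation are all assumptions made for this example; the cards do not specify a distribution.

```python
import random

def simulate(beta0, beta1, xs, sigma, seed=0):
    """Generate y values as beta0 + beta1 * x plus random variation.

    Assumes the random variation is normal with mean 0 and
    standard deviation sigma (an illustrative choice).
    """
    rng = random.Random(seed)
    return [beta0 + beta1 * x + rng.gauss(0, sigma) for x in xs]

xs = [0, 1, 2, 3, 4]
# In practice beta0 and beta1 are unknown; here we pick them to see the model at work
ys = simulate(beta0=2.0, beta1=0.5, xs=xs, sigma=1.0)
print(ys)  # points scattered around the line y = 2.0 + 0.5x
```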

16

yi

The response or dependent variable 

17

𝛽0

Population intercept

18

𝛽1

population slope

19

xi

Predictor or independent variable

20

εi

Random variation around the line

21

Estimated model

The equation for the line of best fit:

ŷi = b0 + b1xi

  • Is the line itself, drawn through the data

  • Contains statistical estimates (b0, b1) with no random variation term
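
A minimal sketch of how b0 and b1 might be computed, assuming the standard least-squares formulas (slope = sum of cross-deviations over sum of squared x-deviations, intercept = mean of y minus slope times mean of x). The function name `fit_line` and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Return (b0, b1) for the least-squares line of best fit."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx               # estimated slope
    b0 = mean_y - b1 * mean_x    # estimated intercept
    return b0, b1

# Hypothetical data: hours studied vs. test score
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 70]
b0, b1 = fit_line(hours, scores)

# The estimated model has no random term: each x maps to one predicted y
predicted = [b0 + b1 * x for x in hours]
print(round(b0, 2), round(b1, 2))  # about 46.9 and 4.5
```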

22

Interpreting b0

predicted value of y when x=0

23

Interpreting b1

The slope: the predicted change in y for a one-unit increase in x

24

Residuals or errors

  • The difference between an observed value of y and the predicted value of y at the same observed value of x

  • residual = yi - ŷi = observed value - predicted value
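
The residual formula can be sketched in a few lines. The data and the fitted values b0 = 1 and b1 = 2 are made up for illustration (they happen to be the least-squares estimates for these three points).

```python
# Hypothetical observations and fitted line
xs = [1, 2, 3]
ys = [3.5, 4.0, 7.5]
b0, b1 = 1.0, 2.0

# Predicted value of y at each observed x
predicted = [b0 + b1 * x for x in xs]

# residual = observed value - predicted value
residuals = [y - y_hat for y, y_hat in zip(ys, predicted)]
print(residuals)  # [0.5, -1.0, 0.5]
```

A positive residual means the point sits above the line; a negative residual means it sits below.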

25

Sum of squared errors/residuals (SSE)

  • The sum of the squared deviations between each data point and the line of best fit

26

Sum of squares for residuals (SSE)

  • Tells us how spread out the data are around the line of best fit

  • The greater the SSE, the more spread out the data are around the line of best fit

  • The line of best fit minimizes the sum of the squared residuals

    • The best-fitting line has the smallest possible SSE for the data
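
The idea that the line of best fit minimizes SSE can be checked numerically. In this sketch the function name `sse` and the data are hypothetical; for these three points the least-squares line is ŷ = 1 + 2x, and nudging the slope or intercept away from it makes the SSE larger.

```python
def sse(b0, b1, xs, ys):
    """Sum of squared residuals for the line y_hat = b0 + b1 * x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

xs = [1, 2, 3]
ys = [3.5, 4.0, 7.5]

best = sse(1.0, 2.0, xs, ys)             # least-squares line for this data
steeper = sse(1.0, 2.2, xs, ys)          # slope nudged upward
shifted = sse(1.5, 2.0, xs, ys)          # intercept nudged upward

print(best)                               # 1.5
print(steeper > best, shifted > best)     # True True
```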

27

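When sample sizes are small

  • Each sample line of best fit (LOBF) can land far from the population LOBF. The sampling distribution of the LOBF is very “spread out” around the population LOBF.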

28

When sample sizes are large

  • Each sample LOBF will tend to be closer to the population LOBF, so the sampling distribution of the LOBF will be less spread out.
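
The effect of sample size can be seen in a small simulation. The sketch below draws many samples from an invented population line (intercept 2.0, slope 0.5, normal variation with standard deviation 3.0), fits a slope to each sample, and measures how much the fitted slopes vary. All names and parameter values are assumptions chosen for illustration.

```python
import random
import statistics

def fitted_slope(xs, ys):
    """Least-squares slope b1 for one sample."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def slope_spread(n, reps=300, seed=1):
    """Standard deviation of fitted slopes across many samples of size n."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(reps):
        xs = [rng.uniform(0, 10) for _ in range(n)]
        ys = [2.0 + 0.5 * x + rng.gauss(0, 3.0) for x in xs]
        slopes.append(fitted_slope(xs, ys))
    return statistics.stdev(slopes)

spread_small = slope_spread(n=10)
spread_large = slope_spread(n=200)
print(spread_large < spread_small)  # True: larger samples give less spread
```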

29

Confidence interval

Approximate 95% CI for 𝛽1: b1 ± 2(se_b1), where se_b1 is the standard error of b1
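
A worked sketch of the interval, assuming the usual standard error formula for the slope in simple linear regression, se_b1 = sqrt(SSE / (n − 2)) / sqrt(Σ(xi − x̄)²), which the cards do not state. The function name and data are hypothetical, and the multiplier 2 is the cards' rule of thumb; for a sample as small as this one, an exact t-based multiplier would give a wider interval.

```python
import math

def slope_confidence_interval(xs, ys, multiplier=2.0):
    """Approximate CI for the population slope: b1 +/- multiplier * se_b1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    # Sum of squared residuals around the fitted line
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    se_b1 = math.sqrt(sse / (n - 2)) / math.sqrt(sxx)
    return b1 - multiplier * se_b1, b1 + multiplier * se_b1

# Hypothetical data: hours studied vs. test score
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 70]
low, high = slope_confidence_interval(hours, scores)
print(round(low, 2), round(high, 2))  # about 3.9 and 5.1
```

Because this interval does not contain 0, it is consistent with rejecting the null hypothesis of no linear relationship.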

30

Reject the null

We are saying that we have strong enough evidence of a linear relationship between x and y at the population level

31

Larger R²

  • A stronger linear relationship: the points tend to lie closer to the line of best fit.
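
To make the link between R² and closeness to the line concrete, the sketch below computes R² as 1 − SSE/SST, where SST is the sum of squared deviations of y from its mean. This definition is standard but not stated in the cards, and the function name and both data sets are made up. The two data sets use the same y values; only their pairing with x differs.

```python
def r_squared(xs, ys):
    """R^2 = 1 - SSE/SST for the least-squares line through the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))  # spread around the line
    sst = sum((y - mean_y) ** 2 for y in ys)                      # total spread in y
    return 1 - sse / sst

xs = [1, 2, 3, 4, 5]
tight = r_squared(xs, [52, 55, 61, 64, 70])      # points hug the line
scattered = r_squared(xs, [52, 70, 55, 64, 61])  # same y values, shuffled

print(tight > scattered)  # True
```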

32