Outliers, Correlation, and Regression

Flashcards covering outliers, correlation, regression, and linear models.

27 Terms

1

Outliers

Points that lie far from the typical variation in the data. They can be identified with the 1.5 × IQR rule (values below Q1 - 1.5 × IQR or above Q3 + 1.5 × IQR), a z-score cutoff (|z| greater than 2.5, or more strictly 3), or by checking for impossible values.
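
The identification rules on this card can be sketched in Python. This is a minimal illustration, not a prescribed method: the function names `iqr_outliers` and `zscore_outliers` and the sample data are made up for the example, and the quartiles come from `statistics.quantiles` with its default method, which is only one of several quartile conventions.

```python
import statistics

def iqr_outliers(values):
    """Return values outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # default quartile method
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]

def zscore_outliers(values, cutoff=2.5):
    """Return values whose absolute z-score exceeds the cutoff."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    return [v for v in values if abs((v - mean) / sd) > cutoff]

# Hypothetical sample with one suspicious value.
data = [12, 13, 14, 15, 15, 16, 17, 18, 19, 95]

print(iqr_outliers(data))
print(zscore_outliers(data, cutoff=2.5))
print(zscore_outliers(data, cutoff=3))
```

On this sample both rules flag 95 at the 2.5 cutoff, but the stricter cutoff of 3 flags nothing, because the extreme value inflates the standard deviation used to compute its own z-score.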

2

Correlation

A standardized measure of covariation that removes the dependence on units, ranging from -1 to 1; the sample statistic r estimates the population parameter ρ (rho).
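
A short Python sketch may help show what "standardized" means here: covariance changes when a variable is rescaled, while r does not. The helper names `covariance` and `pearson_r` and the study-hours data are assumptions made up for the illustration.

```python
import math

def covariance(xs, ys):
    """Sample covariance of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def pearson_r(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

# Hypothetical data: the same scores in points and as fractions of 100.
hours = [1, 2, 3, 4, 5]
score_points = [52, 55, 61, 64, 70]
score_fraction = [s / 100 for s in score_points]

print(covariance(hours, score_points), covariance(hours, score_fraction))
print(pearson_r(hours, score_points), pearson_r(hours, score_fraction))
```

The two covariances differ by a factor of 100, while the two values of r agree up to floating-point rounding.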

3

Correlation

Tells us how closely our data lie to a straight trendline.

4

P-value

Tells us how likely it is that data would lie at least this close to a trendline by random chance alone, that is, if there were no true relationship between the variables.
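
One way to make "by random chance" concrete is a permutation test, sketched below. This is an assumption chosen for illustration, not necessarily how the p-value is computed in any given course or software package (a t-test on r is the more common textbook route). The function names, the data, and the choice of 5000 shuffles are all hypothetical.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def permutation_p_value(xs, ys, n_perm=5000, seed=0):
    """Share of random pairings whose |r| is at least the observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any real X-Y link
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    # Add 1 to both counts so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical data lying very close to a straight line.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 7.2, 7.8, 9.1]

p = permutation_p_value(x, y)
print(p)
```

A small result means that random pairings of the same values almost never line up as tightly as the observed data.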

5

Correlation - When to Use

Summarizes the direct relationship between two variables.

6

Regression - When to Use

Predicts or explains a numeric response variable from one or more predictor variables.

7

Correlation - Able to quantify the direction of the relationship?

Yes

8

Regression - Able to quantify the direction of the relationship?

Yes

9

Correlation - Able to quantify the strength of the relationship?

Yes

10

Regression - Able to quantify the strength of the relationship?

Yes

11

Correlation - Able to show cause and effect?

No

12

Regression - Able to show cause and effect?

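Yes, in the sense that the model treats X as driving Y; a fitted regression alone does not prove causation, which also requires an appropriate study design (for example, a controlled experiment).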

13

Correlation - Able to predict and optimize?

No

14

Regression - Able to predict and optimize?

Yes

15

Correlation - X and Y are interchangeable?

Yes

16

Regression - X and Y are interchangeable?

No
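
The contrast between cards 15 and 16 can be checked numerically. The sketch below, using made-up data and hypothetical helper names, computes r and the regression slope in both directions: swapping X and Y leaves r unchanged but gives a different slope.

```python
import math

def cov(xs, ys):
    """Sample covariance of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def pearson_r(xs, ys):
    return cov(xs, ys) / math.sqrt(cov(xs, xs) * cov(ys, ys))

def slope(predictor, response):
    """Least-squares slope for regressing response on predictor."""
    return cov(predictor, response) / cov(predictor, predictor)

# Hypothetical data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

print(pearson_r(x, y), pearson_r(y, x))  # same value both ways
print(slope(x, y), slope(y, x))          # different slopes
```

For these numbers the slope of Y on X is 0.6 and the slope of X on Y is 1.0, so the choice of response variable matters in regression even though r is the same in both directions.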

17

Correlation - Uses a mathematical equation?

No

18

Regression - Uses a mathematical equation?

Yes: y = a + b(x), where a is the intercept and b is the slope.

19

General Linear Models

Models how a dependent variable (Y) changes over an independent variable (X).

20

Slope

The “m” in the equation Y = mX + b; the change in Y for a one-unit increase in X.

21

Y intercept

The “b” in the equation Y = mX + b; the value of Y when X = 0.

22

Ordinary Least Squares

Estimates β1 (slope) and β0 (y-intercept) by choosing the line that minimizes the residual sum of squares. The estimates are b1 = cov(X,Y) / var(X) for β1 and b0 = Ȳ - b1 · X̄ for β0.
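
A minimal Python sketch of the slope formula on this card follows. The intercept is taken as b0 = Ȳ - b1 · X̄, the standard ordinary least squares result; the function name `ols_fit` and the data are hypothetical.

```python
def ols_fit(xs, ys):
    """Return (b0, b1) for the least-squares line through (xs, ys)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    cov_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    var_x = sum((x - x_bar) ** 2 for x in xs) / (n - 1)
    b1 = cov_xy / var_x        # slope: cov(X, Y) / var(X)
    b0 = y_bar - b1 * x_bar    # intercept: line passes through (x_bar, y_bar)
    return b0, b1

# Hypothetical data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = ols_fit(x, y)
print(b0, b1)
```

For these five points the covariance is 1.5 and the variance of X is 2.5, giving a slope of 0.6 and an intercept of 4 - 0.6 · 3 = 2.2.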

23

Fitted values

Predicted values; calculated as Ŷi = b1 · Xi + b0

24

Residual sum of squares (SS Error)

The sum of squared deviations of the observed values from the fitted values; calculated as ∑(Yi - Ŷi)^2

25

Model sum of squares (SS Regression)

The sum of squared deviations of the fitted values from the mean of Y; calculated as ∑(Ŷi - Ȳ)^2

26

Total sum of squares (SS Total)

The sum of squared deviations of the observed values from the mean of Y; calculated as ∑(Yi - Ȳ)^2

27

Coefficient of determination (R^2)

Measures the proportion of variance in the dependent variable that can be predicted from the independent variable(s); calculated as R^2 = ∑(Ŷi - Ȳ)^2 / ∑(Yi - Ȳ)^2 or, equivalently, R^2 = 1 - ∑(Yi - Ŷi)^2 / ∑(Yi - Ȳ)^2
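
The fitted values, the three sums of squares, and R^2 can be tied together in one short Python sketch. The data and variable names are made up for the illustration; the fit uses the ordinary least squares slope cov(X,Y) / var(X) and the standard intercept Ȳ - b1 · X̄.

```python
# Hypothetical data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

# Least-squares estimates (the n - 1 divisors cancel in the ratio).
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
b0 = y_bar - b1 * x_bar

# Fitted values: Y-hat_i = b1 * X_i + b0
y_hat = [b1 * xi + b0 for xi in x]

ss_error = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # residual
ss_regression = sum((fi - y_bar) ** 2 for fi in y_hat)      # model
ss_total = sum((yi - y_bar) ** 2 for yi in y)               # total

r2_from_model = ss_regression / ss_total
r2_from_error = 1 - ss_error / ss_total

print(ss_error, ss_regression, ss_total)
print(r2_from_model, r2_from_error)
```

For these points SS Error is 2.4, SS Regression is 3.6, and SS Total is 6, so the two sums add up to the total and both formulas give R^2 = 0.6, up to floating-point rounding.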