Research methods 3 sem 2 - correlation and regression

Last updated 6:32 PM on 5/20/26

44 Terms

1

Correlation

Strength and direction of the relationship between two variables, e.g. +1 is a perfect positive relationship and -1 a perfect negative one.

2

What is a scatterplot used for?

A plot of paired observations, one point per case, used to visualize the correlation between two variables.

3

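What do different correlation values mean?

0 when there is no linear relationship (as when the variables are independent), +1 for a perfect positive linear relationship and -1 for a perfect negative (inverse) one.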

4

What is the Pearson coefficient used for?

Measuring linear correlation between variables on an interval or ratio scale, e.g. temperature or height.

5

What does the correlation coefficient show?

The strength (and direction) of the relationship; it does not represent the slope.

6

What does the regression coefficient represent?

The slope of the regression line (change in y per unit change in x); it does not indicate how strong or meaningful the effect is.

7

Arithmetic mean

Add up all values and divide by the number of values.

8

Variance

Measure of how much values differ from the mean: the average squared deviation from the mean.

9

Standard deviation

Square root of the variance; measures spread in the original units, e.g. the width of the peak in a distribution.

10

Covariance

How two variables fluctuate together: if X and Y are both high or both low together, covariance is positive; if one is high when the other is low, covariance is negative.
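
The mean, variance, standard deviation and covariance cards build on one another, and a short Python sketch can make that concrete. It assumes the sample (n - 1) versions of variance and covariance; the function names and the data are made up for illustration.

```python
import math

def mean(values):
    # Add all values up and divide by how many there are.
    return sum(values) / len(values)

def variance(values):
    # Average squared deviation from the mean (sample version, n - 1).
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

def std_dev(values):
    # Square root of the variance, back in the original units.
    return math.sqrt(variance(values))

def covariance(xs, ys):
    # Positive when x and y deviate from their means in the same direction.
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def pearson_r(xs, ys):
    # Covariance rescaled by both standard deviations, so it lies in [-1, 1].
    return covariance(xs, ys) / (std_dev(xs) * std_dev(ys))

# Made-up illustrative data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

print(mean(x), variance(x), covariance(x, y))  # 3.0 2.5 1.5
print(round(pearson_r(x, y), 4))               # 0.7746
```

Dividing the covariance by both standard deviations is what turns it into the unit-free Pearson coefficient.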

11

Linear regression: what does it show when plotted?

A straight line of best fit, y = ax + b, where a is the slope and b is the intercept.
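
As a rough sketch of where a and b come from, the least-squares estimates can be computed by hand: the slope is the co-variation of x and y divided by the variation of x, and the intercept makes the line pass through both means. The function name `fit_line` and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares estimates of slope a and intercept b in y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx              # slope
    b = mean_y - a * mean_x    # intercept: the line passes through the means
    return a, b

# Made-up illustrative data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = fit_line(x, y)
print(round(a, 3), round(b, 3))   # 0.6 2.2
print(round(a * 6 + b, 3))        # predicted y for x = 6: 5.8
```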

12

What is residual error?

The vertical distance of each data point from the regression line; smaller residual error means a tighter correlation and a better-fitting regression.

13

What is error variance?

The accumulated squared differences between the empirical (actual) values and the predicted values.

14

What is regression variance?

The variance explained by the model: the accumulated squared differences between the predicted values and the mean of y.

15

What is the residual (prediction) error?

The difference between the actual value (Y) and the predicted value.
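
The residual, error-variance and regression-variance cards fit together as a decomposition, sketched below in Python under the assumption of a simple one-predictor least-squares fit. The sketch works with sums of squares (dividing each by its degrees of freedom would give the corresponding variances); all names and data are illustrative.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx
    return a, mean_y - a * mean_x

# Made-up illustrative data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = fit_line(x, y)
mean_y = sum(y) / len(y)
predicted = [a * xi + b for xi in x]

# Residual (prediction) error: actual minus predicted, one per data point.
residuals = [yi - pi for yi, pi in zip(y, predicted)]

ss_error = sum(e ** 2 for e in residuals)                  # unexplained
ss_regression = sum((p - mean_y) ** 2 for p in predicted)  # explained by model
ss_total = sum((yi - mean_y) ** 2 for yi in y)             # total variation

print(round(ss_error, 3), round(ss_regression, 3), round(ss_total, 3))  # 2.4 3.6 6.0
print(round(ss_regression / ss_total, 3))  # proportion explained: 0.6
```

The explained and unexplained parts add up to the total, which is why the proportion explained can also be written as 1 minus the residual proportion.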

16

Correlation vs Regression

Correlation expresses the strength (reliability) of a relationship; regression allows prediction of one variable from another.

17

When are regression and correlation the same?

When x and y have both been z-normalized: the regression slope then equals the correlation coefficient, so it is informative about relationship strength.
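
A small numerical check of this equivalence, as a sketch: fit the slope on the raw data and again on z-scores, and compare both with Pearson's r. The helper names and data are hypothetical.

```python
import math

def z_scores(values):
    # Subtract the mean and divide by the sample standard deviation.
    n = len(values)
    m = sum(values) / n
    sd = math.sqrt(sum((v - m) ** 2 for v in values) / (n - 1))
    return [(v - m) / sd for v in values]

def slope(xs, ys):
    # Least-squares regression slope of y on x.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

raw_slope = slope(x, y)
z_slope = slope(z_scores(x), z_scores(y))
r = pearson_r(x, y)

print(round(raw_slope, 4))          # 0.6 (depends on the units of x and y)
print(round(z_slope, 4), round(r, 4))  # 0.7746 0.7746
```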

18

Statistical inference in regression: what does the null hypothesis state?

The null hypothesis states that y cannot be predicted from x, i.e. the slope is zero.

19

Standard error of slope

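The estimated sampling variability of the slope; it is the error against which the regression slope is tested (slope divided by its standard error gives a t statistic).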
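
A hedged sketch of that test for the one-predictor case: the standard error of the slope is the square root of the residual mean square divided by the variation in x, and the resulting t statistic has n - 2 degrees of freedom. The function name `slope_test` and the data are assumptions for illustration; looking up the p-value in a t table is left out.

```python
import math

def slope_test(xs, ys):
    """Return slope, its standard error, the t statistic and its df."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    ss_error = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
    mse = ss_error / (n - 2)      # residual mean square
    se = math.sqrt(mse / sxx)     # standard error of the slope
    t = a / se                    # tests H0: slope = 0
    return a, se, t, n - 2

# Made-up illustrative data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, se, t, df = slope_test(x, y)
print(round(a, 4), round(se, 4), round(t, 4), df)  # 0.6 0.2828 2.1213 3
```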

20

Partial correlation

Assesses the relationship between one pair of variables after accounting for a third variable.
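
For three variables, the first-order partial correlation can be computed directly from the three pairwise correlations. The sketch below uses that standard formula; the function name and the correlation values are invented for illustration.

```python
import math

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation of x and y after removing what each shares with z."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator

# z correlates with both x and y, so part of r_xy is accounted for by z.
print(round(partial_correlation(0.6, 0.5, 0.4), 3))  # 0.504

# If z is unrelated to both, the partial correlation equals r_xy.
print(round(partial_correlation(0.6, 0.0, 0.0), 3))  # 0.6
```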

21

Multiple regression

Generalization of bivariate regression; describes the relationship between an outcome and multiple predictors.

22

Coefficient of determination (R²)

Proportion of total variance explained by the predictors/model, or equivalently 1 minus the proportion of total variance left in the residuals.

23

Goodness of fit of a regression model: what are the coefficient of determination, the multiple correlation coefficient and the F-ratio?

Coefficient of determination: proportion of variance explained by the regression model. Multiple correlation coefficient: correlation between predicted and observed values. F-ratio: derived by contrasting the explained variance with the residual variance.

24

F-ratio for multiple linear regression

A higher F-ratio indicates a better model: more variance explained relative to the residual variance.
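
One common way to write the F-ratio is in terms of R², the number of observations n and the number of predictors k: explained variance per predictor over residual variance per residual degree of freedom. The sketch below assumes that form; the function name and inputs are illustrative.

```python
def f_ratio(r_squared, n_observations, n_predictors):
    """F = (R^2 / k) / ((1 - R^2) / (n - k - 1))."""
    df_model = n_predictors
    df_error = n_observations - n_predictors - 1
    explained = r_squared / df_model
    unexplained = (1 - r_squared) / df_error
    return explained / unexplained

# Same R^2, different sample sizes: more observations give a larger F.
print(round(f_ratio(0.6, 5, 1), 3))    # 4.5
print(round(f_ratio(0.6, 50, 1), 3))   # 72.0
```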

25

Testing significance of individual predictors

Only test the significance of individual predictor variables when the entire regression model has been found to be significant.

26

Multicollinearity

High correlation (similarity) between two or more predictor variables.

27

Effects of multicollinearity

Adding predictor variables that are correlated with existing predictors changes the coefficients of those predictors, making it difficult to estimate each predictor's individual predictive value.

28

Singularity

An entirely redundant variable that is an exact combination of two or more other variables.

29

Problems with singularity

Logical: you do not want to measure the same thing twice. Statistical: the regression cannot be solved because the system becomes ill-conditioned.

30

Example of singularity

The total score on the WAIS intelligence scale is fully determined by its subscales, so it contains no additional independent information.

31

How to detect multicollinearity

Look for high bivariate correlations between predictors. Also look at tolerance, a measure of how unique a predictor is relative to the other predictors (1 minus the R² from regressing it on them); a low value indicates a problem.
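
A minimal sketch of tolerance, restricted to the two-predictor case, where regressing one predictor on the other gives an R² equal to their squared correlation. With more predictors a full multiple regression would be needed; the function names and data here are hypothetical.

```python
def r_squared_between(xs, ys):
    # Squared Pearson correlation between two variables.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)

def tolerance_two_predictors(x1, x2):
    # Share of a predictor's variance NOT explained by the other predictor.
    return 1 - r_squared_between(x1, x2)

# Made-up predictors.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 4, 5, 4, 5]          # correlated with x1
x3 = [2 * v for v in x1]      # exact multiple of x1: fully redundant

print(round(tolerance_two_predictors(x1, x2), 3))  # 0.4
print(round(tolerance_two_predictors(x1, x3), 3))  # 0.0
```

A tolerance of zero, as for the exact multiple, is the limiting case in which the predictor carries no unique information at all.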

32

How to detect singularity

Look for high multivariate correlations (a predictor almost perfectly predicted by the others) and very low tolerance.

33

Multiple regression approaches: Simultaneous

No a priori model is assumed; all predictor variables are fitted together.

34

Multiple regression approaches: Stepwise

No a priori model; predictor variables are added or removed one at a time to maximize fit.

35

Multiple regression approaches: Hierarchical

Uses a priori knowledge of the variables to decide the order of entry; assesses the added explanatory power of each new variable.

36

Factors affecting multiple linear regression: Outliers

Points that deviate from the rest and have a disproportionate effect on the linear regression fit.

37

What is Cook's distance used for?

Measuring how extreme an outlier is in terms of its influence on the fitted regression; a value greater than 1 indicates cause for concern.
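
To show how the "greater than 1" rule plays out, here is a sketch of Cook's distance for a one-predictor regression, using the usual formula that combines each point's squared residual with its leverage. The function name and the data (four well-behaved points and one extreme point) are made up.

```python
def cooks_distances(xs, ys):
    """Cook's distance for each observation in a simple linear regression."""
    n = len(xs)
    p = 2  # coefficients estimated: slope and intercept
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    residuals = [y - (a * x + b) for x, y in zip(xs, ys)]
    mse = sum(e ** 2 for e in residuals) / (n - p)
    distances = []
    for x, e in zip(xs, residuals):
        # Leverage: how far this x lies from the mean of x.
        leverage = 1 / n + (x - mean_x) ** 2 / sxx
        distances.append((e ** 2 / (p * mse)) * leverage / (1 - leverage) ** 2)
    return distances

# Made-up data: the last point is far from the others.
x = [1, 2, 3, 4, 10]
y = [1, 2, 3, 4, 20]

d = cooks_distances(x, y)
flagged = [i for i, di in enumerate(d) if di > 1]
print(flagged)  # [4]
```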

38

What is scedasticity?

The distribution (spread) of the residual error across the range of the predictor variable.

39

What is homoscedasticity?

The spread of the residuals is relatively constant over the range of the predictor variable (constant variance).

40

What is heteroscedasticity?

The spread of the residuals varies systematically across the range of the predictor variable (non-constant variance).

41

Factors affecting multiple linear regression: Number of predictors

The number of observations should be high compared with the number of predictor variables; results become meaningless as the number of observations drops towards the number of predictors.

42

What does adjusted R² do?

Corrects R² for the number of predictor variables; it is the value reported in the results section.
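
The correction can be sketched with the standard adjusted R² formula, which shrinks R² more as the number of predictors k grows relative to the number of observations n. The function name and input values are illustrative.

```python
def adjusted_r_squared(r_squared, n_observations, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    n, k = n_observations, n_predictors
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# Same raw R^2 of 0.6 in each case.
print(round(adjusted_r_squared(0.6, 50, 2), 3))   # 0.583 (few predictors)
print(round(adjusted_r_squared(0.6, 50, 20), 3))  # 0.324 (many predictors)
print(round(adjusted_r_squared(0.6, 5, 1), 3))    # 0.467 (few observations)
```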

43

Range of predictor variable

A small (restricted) range of the predictor variable reduces statistical power.

44

Variable distribution

Variables should ideally be normally or uniformly distributed; strictly, only the residuals need to be normal in multiple regression.