Statistics!!!

52 Terms

1

Predictor variable

A factor used to predict changes in a dependent variable

2

Outcome variable

The variable whose changes researchers aim to study or predict (the dependent variable)

3

Prediction residual (error)

actual value - predicted value

4

Prediction

the value your model thinks y will have for a given x

5

Random process

a collection of variables that change over time, but in a way you can’t predict exactly, just probabilistically

6

Random variable

A variable that assigns a numerical value to each outcome of a random process (for example, heads = 1, tails = 0)

7

Random event

Something that might happen, but you can’t know for sure until it actually happens

8

The goal of prediction

To find a rule (model) that makes the smallest possible prediction errors overall

9

Choosing predictor and outcome

Outcome = Y, Predictor = X

10

Basic linear regression equation

Y(hat) = a + bX. a = intercept = predicted Y when X equals 0. b = slope = change in predicted Y for each 1 unit increase in X

11

Least squares

Chooses a and b so that the sum of squared residuals is as small as possible
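
A minimal sketch of what least squares does for one predictor, using the standard closed-form formulas for the slope and intercept. The data values and the function names `fit_least_squares` and `sum_squared_residuals` are made up for illustration.

```python
def fit_least_squares(xs, ys):
    """Return (a, b) for Y(hat) = a + bX that minimize the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: cross-deviations of X and Y divided by squared deviations of X
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    # Intercept: the fitted line passes through (mean_x, mean_y)
    a = mean_y - b * mean_x
    return a, b


def sum_squared_residuals(xs, ys, a, b):
    """Sum of (actual - predicted)^2 for the line Y(hat) = a + bX."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, invented for this example
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit_least_squares(xs, ys)
best = sum_squared_residuals(xs, ys, a, b)
# Nudging the slope away from the least squares value makes the sum larger
worse = sum_squared_residuals(xs, ys, a, b + 0.1)
print(a, b, best, worse)
```

Comparing `best` with `worse` is only a spot check with one alternative line, not a proof that the fitted line is optimal.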

12

General linear model

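The broad family of models that write Y as a linear function of one or more predictors plus error. Least squares is the usual way to fit it, but the model itself does not require that fitting method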

13

R gives:

Intercept = 2.5

Slope = 0.8

Y(hat) = 2.5 + 0.8x

14

Y(hat)

Predicted value

15

Interpret slope b (numerical X)

For each 1 unit increase in X, predicted Y changes by b units (up if b is positive, down if b is negative)

16

Interpret intercept a (numerical X)

Predicted value of Y when X = 0

17

Interpret slope b (categorical X with 2 levels)

If X = 0 (group A) or 1 (group B), then:

b = difference in group means. (How much higher/lower B is compared to A)

18

Interpret a (Categorical X)

a = predicted outcome (mean of Y) for the baseline group (group coded 0)
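
A small check of the two categorical-X interpretations: when X is coded 0/1, the least squares intercept equals the mean of the group coded 0 and the slope equals the difference in group means. The scores and the helper name `fit_least_squares` are hypothetical.

```python
def fit_least_squares(xs, ys):
    """Return (a, b) for the least squares line Y(hat) = a + bX."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Hypothetical scores; X is coded 0 for group A and 1 for group B
x = [0, 0, 0, 1, 1, 1]
y = [4, 5, 6, 7, 8, 9]

a, b = fit_least_squares(x, y)

group_a = [yi for xi, yi in zip(x, y) if xi == 0]
group_b = [yi for xi, yi in zip(x, y) if xi == 1]
mean_a = sum(group_a) / len(group_a)
mean_b = sum(group_b) / len(group_b)

# a matches the baseline mean; b matches the difference in group means
print(a, mean_a)
print(b, mean_b - mean_a)
```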

19

What is R^2?

The proportion of variation in Y explained by the model. Example: R^2 = 0.40 means 40% of the variation in Y is explained by X
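
One common way to compute this proportion is 1 minus (sum of squared residuals / total sum of squares around the mean of Y). The sketch below assumes that formula and uses invented data; the variable names are illustrative only.

```python
# Hypothetical data, invented for this example
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Least squares slope and intercept for Y(hat) = a + bX
b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
    (x - mean_x) ** 2 for x in xs
)
a = mean_y - b * mean_x

# Variation left unexplained by the line
ss_residual = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
# Total variation in Y around its mean
ss_total = sum((y - mean_y) ** 2 for y in ys)

r_squared = 1 - ss_residual / ss_total
print(r_squared)
```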

20

Finding R^2 from r

R^2 = r^2 (for a model with one predictor, R^2 is the squared correlation)

21

How to find r from R^2

r = plus or minus the square root of R^2, with the same sign as the slope b
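
A short sketch of this conversion for a one-predictor model. The square root alone only gives the size of r, so the sign is taken from the slope. The function name `r_from_r_squared` is hypothetical.

```python
import math


def r_from_r_squared(r_squared, slope):
    """Return r with magnitude sqrt(R^2) and the sign of the regression slope."""
    return math.copysign(math.sqrt(r_squared), slope)


# Hypothetical values: R^2 = 0.64 with a negative slope
r_negative = r_from_r_squared(0.64, -1.5)
# Same R^2 with a positive slope
r_positive = r_from_r_squared(0.64, 2.0)
print(r_negative, r_positive)
```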

22

Predicting using a linear model

Just plug X into Y(hat) = a + bX (also written Y(hat) = b0 + b1X)

23

Calculating residual

Residual = Y - Y(hat)
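
A minimal sketch of predicting and then computing a residual, using the example line Y(hat) = 2.5 + 0.8x from this set as the default. The observation (x = 10, y = 11) and the function names are made up.

```python
def predict(x, a=2.5, b=0.8):
    """Predicted value Y(hat) = a + b*x."""
    return a + b * x


def residual(y_actual, x, a=2.5, b=0.8):
    """Residual = actual Y minus predicted Y."""
    return y_actual - predict(x, a, b)


# Hypothetical observation: x = 10 with an observed y of 11
y_hat = predict(10)
res = residual(11, 10)
# A positive residual means the model predicted too low for this point
print(y_hat, res)
```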

24

4 conditions for least squares regression

Linear relationship. Independent observations. Normal residuals. Constant variance (spread of residuals stays roughly the same across X). Mnemonic: LINC

25

Normal residual

The model's errors are roughly bell-shaped, centered at zero, and not skewed

26

Checking for the 4 conditions in a residual plot

Look for linearity (no curved pattern in the residuals) and constant variance (no fanning out or narrowing across X)

27

Checking for the 4 conditions in a histogram/QQ plot of residuals

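Look for symmetry, unimodality, most residuals close to 0, and few large residuals (light tails). In short, the histogram should be roughly bell-shaped and the QQ plot points should fall close to a straight line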

28

Checking study design for the 4 conditions (check independence)

Each data point comes from a different individual or independent event. One person’s measurement doesn’t affect another’s. No pairing, no matching, no repeated measures. Data weren’t collected in a way that clusters people

29

What is unreasonable extrapolation?

Predicting for X-values far outside the range of observed data, which may give nonsense results

30

High leverage

A point with an unusual X-value (far from the other X-values)

31

Influential

Point that changes the regression line a lot

32

Random process

A repeatable process with uncertain outcomes

33

Outcome

One result of the random process

34

Event

A collection of outcomes

35

Disjoint (mutually exclusive) events

A and B cannot both happen: P(A and B) = 0

36

A and B are independent if

P(A|B) = P(A) or P(A and B) = P(A)P(B)
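
A quick sketch of the second check, P(A and B) = P(A)P(B), with a small tolerance for rounding. The probabilities and the function name `are_independent` are hypothetical.

```python
import math


def are_independent(p_a, p_b, p_a_and_b, tol=1e-9):
    """True when P(A and B) equals P(A) * P(B), up to a small tolerance."""
    return math.isclose(p_a_and_b, p_a * p_b, abs_tol=tol)


# Hypothetical probabilities
independent_case = are_independent(0.5, 0.4, 0.2)   # 0.5 * 0.4 = 0.2
dependent_case = are_independent(0.5, 0.4, 0.3)     # 0.3 differs from 0.2
print(independent_case, dependent_case)
```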

37

Addition rule (disjoint A, B)

P(A or B) = P(A) + P(B)

38

Multiplication rule (general)

P(A and B) = P(A|B)P(B)

39

Multiplication rule (independent A and B)

P(A and B) = P(A)P(B)

40

Using tree diagrams (joint probabilities)

Multiply along branches

41

Using tree diagrams (total probabilities)

Add the joint probabilities of every branch that ends in the event
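
A sketch of both tree-diagram rules on a two-level tree: multiply along each branch to get a joint probability, then add the branches that end in the same event to get its total probability. The screening-test numbers are invented for illustration.

```python
# Hypothetical first-level and second-level branch probabilities
p_disease = 0.01
p_pos_given_disease = 0.90
p_pos_given_healthy = 0.05

p_healthy = 1 - p_disease

# Joint probabilities: multiply along each branch
joint_disease_pos = p_disease * p_pos_given_disease
joint_healthy_pos = p_healthy * p_pos_given_healthy

# Total probability of a positive test: add the branches that end in "positive"
p_positive = joint_disease_pos + joint_healthy_pos
print(joint_disease_pos, joint_healthy_pos, p_positive)
```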

42

Using probability tables (marginal probability)

Row/column totals

43

Using probability tables (conditional)

Cell value divided by the row or column total for the given condition

44

Using probability tables (Joint)

Individual cell value
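
The three table lookups (joint, marginal, conditional) can be sketched with a small dictionary keyed by (row, column). The 2x2 table below is hypothetical.

```python
# Hypothetical 2x2 probability table; the four cells sum to 1
table = {
    ("A", "B"): 0.20,
    ("A", "not B"): 0.30,
    ("not A", "B"): 0.10,
    ("not A", "not B"): 0.40,
}

# Joint: an individual cell
p_a_and_b = table[("A", "B")]

# Marginal: a row total or a column total
p_a = sum(p for (row, _), p in table.items() if row == "A")
p_b = sum(p for (_, col), p in table.items() if col == "B")

# Conditional: the cell divided by the total for the given condition
p_a_given_b = p_a_and_b / p_b
print(p_a_and_b, p_a, p_b, p_a_given_b)
```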

45

Probability distribution

A list of all possible values of a random variable and their probabilities

46

Expected value (mean)

E(X) = sum of value * probability over all possible values. For a simple game: E(X) = P(win)(amount won) - P(lose)(amount lost)

Interpretation: long-run average value of the random variable.
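
A minimal sketch of the expected value as a probability-weighted sum. The game (win 10 with probability 0.2, lose 2 with probability 0.8) and the function name `expected_value` are made up.

```python
def expected_value(distribution):
    """distribution: list of (value, probability) pairs whose probabilities sum to 1."""
    return sum(value * prob for value, prob in distribution)


# Hypothetical game: a loss is entered as a negative value
game = [(10, 0.2), (-2, 0.8)]
ev = expected_value(game)
# Long-run average winnings per play: 10(0.2) - 2(0.8)
print(ev)
```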

47

Variance

Step 1. Find the mean. Step 2. Subtract the mean from each individual value. Step 3. Square each of these deviations. Step 4. Sum the squared deviations and divide by n - 1 for a sample, or by n for a whole population.
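
The four steps can be sketched as a short function. The data values and the function name `variance` are hypothetical, and the `sample` flag is just one way to switch between the two divisors.

```python
import math


def variance(values, sample=True):
    """Variance by the four steps: mean, deviations, squares, then divide."""
    n = len(values)
    mean = sum(values) / n                          # Step 1
    deviations = [v - mean for v in values]         # Step 2
    squared = [d ** 2 for d in deviations]          # Step 3
    divisor = n - 1 if sample else n                # Step 4
    return sum(squared) / divisor


# Hypothetical data
data = [2, 4, 4, 4, 5, 5, 7, 9]
sample_var = variance(data)
population_var = variance(data, sample=False)
# Standard deviation is the square root of the variance
population_sd = math.sqrt(population_var)
print(sample_var, population_var, population_sd)
```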

48

Standard deviation

Square root of variance

Interpretation: typical distance of values from the mean.

49

Linear combinations (aX + b)

  • Expectation:
    E(aX+b) = aE(X)+b

  • Variance:
    Var(aX+b) = a^2 Var(X)
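
A numerical check of both rules on a small discrete distribution. The distribution, the constants a = 3 and b = 2, and the helper name `mean_and_variance` are invented for illustration.

```python
def mean_and_variance(distribution):
    """Mean and variance of a discrete distribution given as (value, probability) pairs."""
    mean = sum(x * p for x, p in distribution)
    var = sum((x - mean) ** 2 * p for x, p in distribution)
    return mean, var


# Hypothetical distribution of X
dist_x = [(0, 0.25), (1, 0.5), (2, 0.25)]
a, b = 3, 2

# Distribution of aX + b: transform the values, keep the probabilities
dist_t = [(a * x + b, p) for x, p in dist_x]

mean_x, var_x = mean_and_variance(dist_x)
mean_t, var_t = mean_and_variance(dist_t)

# Compare the direct results with the shortcut rules
print(mean_t, a * mean_x + b)
print(var_t, a ** 2 * var_x)
```

The added constant b shifts the mean but drops out of the variance, because shifting every value leaves the spread unchanged.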

50

Sums of variables (X + Y) if independent

Mean: add the means

Variance: add the variances
(Never add SDs)
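
A short sketch of why SDs are never added directly: for independent X and Y the variances add, and the SD of the sum is the square root of that total. The means and SDs below are hypothetical.

```python
import math

# Hypothetical means and standard deviations of independent X and Y
mean_x, sd_x = 10.0, 3.0
mean_y, sd_y = 20.0, 4.0

# Means add
mean_sum = mean_x + mean_y

# Variances add (independence assumed), then take the square root
var_sum = sd_x ** 2 + sd_y ** 2
sd_sum = math.sqrt(var_sum)

# Adding the SDs directly gives a different, incorrect answer
wrong_sd = sd_x + sd_y
print(mean_sum, sd_sum, wrong_sd)
```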

51

Var(X) =

E[(X - E(X))^2], or

Var(X) = E(X^2) - (E(X))^2
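
A quick check that the definition and the shortcut formula agree, using a fair six-sided die as a hypothetical example. Exact fractions are used so the two results can be compared without rounding.

```python
from fractions import Fraction

# Hypothetical random variable: one roll of a fair six-sided die
faces = range(1, 7)
p = Fraction(1, 6)

mean = sum(x * p for x in faces)

# Definition: expected squared distance from the mean
var_definition = sum((x - mean) ** 2 * p for x in faces)

# Shortcut: mean of the squares minus the square of the mean
mean_of_squares = sum(x ** 2 * p for x in faces)
var_shortcut = mean_of_squares - mean ** 2

print(mean, var_definition, var_shortcut)
```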

52

E(aX + bY)

aE(X) + bE(Y)