Stats 2 Final

Last updated 11:13 PM on 4/14/26

38 Terms

1
New cards

General linear model

Generalized linear model

Generalized linear mixed model

General linear model: methods for a normally distributed DV. Assumes X and Y have a linear relationship. Examples: t-test, ANOVA, simple/multiple regression

Generalized linear model: methods for a DV that is not normally distributed (e.g., a binary DV). Example: logistic regression

Generalized linear mixed model: methods for nested data (hierarchical levels of grouped data). Example: hierarchical linear regression (multilevel modeling)

2
New cards

Nested Data & Example

Hierarchical levels of grouped data

Two or more levels

Each lower-level case belongs to only one higher-level group (the 1st-level variable is affected by the 2nd-level variable)

Example:

1st level: Employee Turnover Intentions

2nd level: Economic Uncertainty

3
New cards

Levels

Individual, department, organization, society, etc

4
New cards

Mediation analysis

Tests a hypothesized causal chain in which one variable (X) affects a second variable (M), which in turn affects a third variable (Y)

5
New cards

Mediators & Example

Explain how and why a relationship between two other variables exists

Example: Training (X) changes Self-Efficacy (M), which in turn impacts Performance (Y)

6
New cards

Baron and Kenny’s 4-step indirect effect method

Step 1: The relationship between the IV (X) and the DV (Y), path c, must be significant, and the effect size must not be 0

Step 2: X must affect M (the mediator), path a, and the effect must be significant and not 0

Step 3: M must affect Y (controlling for X), path b, and the path must be significant and not 0

Step 4: If c' (the effect of X on Y controlling for M) is nonsignificant, M is a full mediator

If c' is still significant but smaller than c, M is a partial mediator
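
The four paths above can be sketched with ordinary least squares. This is a minimal illustration on made-up data (the values of X, M, and Y are hypothetical, not from the course), and it only estimates the path coefficients; the significance tests that each step requires are not shown.

```python
# Hypothetical toy data: X = training, M = self-efficacy, Y = performance
X = [1, 2, 3, 4, 5, 6, 7, 8]
M = [2, 1, 4, 3, 6, 5, 8, 7]
Y = [3, 2, 6, 4, 8, 9, 11, 10]

def mean(values):
    return sum(values) / len(values)

def cross(u, v):
    """Sum of cross-products of deviations from the means."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))

def simple_slope(x, y):
    """OLS slope for a regression of y on a single predictor x."""
    return cross(x, y) / cross(x, x)

def two_predictor_slopes(x1, x2, y):
    """OLS slopes for y regressed on x1 and x2 (normal equations)."""
    s11, s22, s12 = cross(x1, x1), cross(x2, x2), cross(x1, x2)
    s1y, s2y = cross(x1, y), cross(x2, y)
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

c = simple_slope(X, Y)                      # Step 1: total effect of X on Y
a = simple_slope(X, M)                      # Step 2: X -> M
c_prime, b = two_predictor_slopes(X, M, Y)  # Steps 3-4: M -> Y and direct effect
indirect = a * b                            # indirect effect (X -> M -> Y)

print(f"c={c:.3f} a={a:.3f} b={b:.3f} c'={c_prime:.3f} a*b={indirect:.3f}")
```

With OLS, the total effect equals the direct effect plus the indirect effect (c = c' + a*b), which is why c' shrinking relative to c is read as evidence of mediation.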

7
New cards

ACME

ADE

Total Effect

ACME: Sig / ADE: Non-sig / Total Effect: Non-sig

ACME: Average Causal Mediation Effect (indirect effect, path a*b: X to M to Y). If nonsignificant, there is no mediation

ADE: Average Direct Effect (path c': X to Y). If significant and not 0, M could be a partial mediator

Total Effect: sum of the indirect and direct effects (a significant total effect is not required for mediation to exist)

ACME significant, ADE nonsignificant, Total Effect nonsignificant: indirect-only mediation / suppression

8
New cards

Bootstrapping

A method to estimate the variability of a statistic by repeatedly resampling the observed data

Simulation method, more suitable for small sample sizes

Does not assume a specific distribution

Conventional p-values assume a normal sampling distribution, and the indirect effect (a*b) is usually not normally distributed. Bootstrapping is therefore used to build confidence intervals by resampling the original data set with replacement.
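
A minimal sketch of a percentile bootstrap, assuming the statistic of interest is a simple mean (in mediation analysis it would be the indirect effect instead). The data values, the number of resamples, and the function name `bootstrap_ci` are all illustrative choices.

```python
import random

def bootstrap_ci(data, stat, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for stat(data)."""
    rng = random.Random(seed)
    n = len(data)
    # Resample with replacement, recompute the statistic each time, then sort
    estimates = sorted(
        stat([rng.choice(data) for _ in range(n)]) for _ in range(n_boot)
    )
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

def mean(values):
    return sum(values) / len(values)

sample = [4, 7, 5, 9, 6, 8, 5, 7, 6, 10]  # hypothetical scores
ci_low, ci_high = bootstrap_ci(sample, mean)
excludes_zero = not (ci_low <= 0 <= ci_high)
print(ci_low, ci_high, excludes_zero)
```

The sorted list of resampled estimates plays the role of the sampling distribution, and the interval is read off its 2.5th and 97.5th percentiles.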

9
New cards

Sampling Distribution

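The distribution of a statistic across repeated samples. Bootstrapping approximates it by recomputing the statistic on each artificial (resampled) data set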

10
New cards

95% confidence interval for the indirect effect in mediation analysis after bootstrapping

If the interval does not include 0, the indirect effect is significant

11
New cards

Moderation Analysis

Moderator

Name a moderator

Tests whether a variable affects the direction and/or strength of the relationship between the IV and the DV

A moderator affects when (under what conditions) a relationship occurs

Example: perceived social support moderates the relationship between workload (IV) and burnout (DV)

12
New cards

Moderated mediation (more common)

When there is a moderator that affects a mediator's relationship with Y

Starting point is the moderator and the IV

Example: Ability influences performance through job knowledge (the mediator), and this effect is stronger when supervisor support (the moderator) is high

13
New cards

Mediated moderation

Take the L

14
New cards

Centering

Center the IV and Moderator W before estimating the model

Transform a variable so that its mean becomes 0

Done by subtracting the mean of the variable from each value in that variable

Reduces multicollinearity between the predictors and their interaction term, and makes interpretation easier

Makes main effects interpretable
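
A small sketch of mean-centering before building an interaction term. The workload and support values are hypothetical, and `pearson_r` is a helper written here only to show how centering changes the correlation between a predictor and the product term.

```python
def center(values):
    """Subtract the mean from every value so the new mean is 0."""
    m = sum(values) / len(values)
    return [v - m for v in values]

def pearson_r(u, v):
    uc, vc = center(u), center(v)
    num = sum(a * b for a, b in zip(uc, vc))
    den = (sum(a * a for a in uc) * sum(b * b for b in vc)) ** 0.5
    return num / den

workload = [30, 35, 40, 45, 50]  # IV (hypothetical)
support = [3, 1, 4, 2, 5]        # moderator W (hypothetical)

workload_c = center(workload)
support_c = center(support)

raw_product = [x * w for x, w in zip(workload, support)]
centered_product = [x * w for x, w in zip(workload_c, support_c)]

r_raw = pearson_r(workload, raw_product)
r_centered = pearson_r(workload_c, centered_product)
print(round(r_raw, 3), round(r_centered, 3))
```

In this toy data the centered product term is less correlated with the IV than the raw product term, which is the multicollinearity reduction the card refers to.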

15
New cards

Multiple Regression vs Logistic Regression

MR: DV is a quantity and ranges to infinity (continuous or interval data) and has a linear relationship with IV

LR: DV is 0 or 1, and the model calculates the probability of the event. Uses a logit (log odds) transformation of that probability to linearize the relationship between the IV and the DV
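
The logit transformation can be sketched as follows. The coefficients `b0` and `b1` are hypothetical values chosen for illustration, not estimates from any data set.

```python
import math

def logit(p):
    """Log odds of a probability p (0 < p < 1)."""
    return math.log(p / (1 - p))

def inv_logit(z):
    """Convert log odds back to a probability."""
    return 1 / (1 + math.exp(-z))

b0, b1 = -4.0, 0.8  # hypothetical intercept and slope on the log-odds scale

probabilities = []
for x in range(0, 11, 2):
    log_odds = b0 + b1 * x  # linear in X
    p = inv_logit(log_odds)  # S-shaped in X, always between 0 and 1
    probabilities.append(p)
    print(x, round(log_odds, 2), round(p, 3))
```

The model is linear on the log-odds scale, while the predicted probability stays between 0 and 1.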

16
New cards

Why not use MR when you have binary DVs?

Using MR with a dichotomous DV violates homogeneity of variance (homoscedasticity). MR assumes the variance of the errors remains constant across all values of X, but a binary DV violates this assumption: error variance is large when the probability is close to .5 and small when it is close to 0 or 1. A one-unit increase in X also does not lead to a fixed increase in Y.

MR can produce DV estimates greater than 1 or less than 0

Linear models cannot express that the slope of the probability varies depending on the value of the IV
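
Two of these problems can be shown with a few lines of arithmetic. The linear equation below (intercept 0.1, slope 0.15) is a made-up example of a linear probability model, not a fitted result.

```python
def bernoulli_variance(p):
    """Variance of a 0/1 outcome whose probability of 1 is p."""
    return p * (1 - p)

def linear_prediction(x, intercept=0.1, slope=0.15):
    """Hypothetical linear (MR-style) prediction of a probability."""
    return intercept + slope * x

# Error variance depends on the probability, so it cannot be constant
for p in (0.1, 0.5, 0.9):
    print(p, round(bernoulli_variance(p), 2))

# A straight line eventually leaves the 0-1 range
for x in (0, 3, 7):
    print(x, round(linear_prediction(x), 2))
```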

17
New cards

Turnover analysis Pros/Cons of

Group Comparison (T-test)

Correlation

Regression

T-test: Tests one variable at a time, so it is hard to discern the relative importance of variables

Correlation: Which variables are related to turnover? Outcome is binary, so correlation is unclear

Regression: Can predictors explain turnover? MR assumes continuous variables, but outcome is binary

18
New cards

Odds

The likelihood of an event occurring compared to the likelihood of it not occurring: odds = p / (1 - p)

19
New cards

Odds ratio & Interpretation

Comparing the odds between two groups, with the reference/baseline group as the denominator.

OR>1 Event is more likely in the numerator group than the reference group

OR=1 Event is equally likely in both groups

OR<1 Event is less likely in the numerator group than the reference group
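
A short sketch of computing odds and an odds ratio from group counts. The turnover counts (30 of 100 versus 10 of 100) are invented for illustration.

```python
def odds(p):
    """Odds of an event with probability p."""
    return p / (1 - p)

def odds_ratio(events_group, n_group, events_ref, n_ref):
    """Odds in the comparison group divided by odds in the reference group."""
    return odds(events_group / n_group) / odds(events_ref / n_ref)

# Hypothetical: 30 of 100 night-shift employees left vs. 10 of 100 day-shift (reference)
or_turnover = odds_ratio(30, 100, 10, 100)
print(round(or_turnover, 2))
```

An odds ratio above 1 here would be read as turnover being more likely in the night-shift group than in the reference group.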

20
New cards

OR with continuous predictor

Interpret as, “what happens when X increases by 1 unit.”

OR = Odds at X+1 / Odds at X
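
For a logistic model, this ratio equals exp(b) no matter where X starts. The sketch below checks that with hypothetical coefficients `b0` and `b1`.

```python
import math

b0, b1 = -2.0, 0.5  # hypothetical logistic regression coefficients

def odds_at(x):
    """Odds implied by the model at a given X (exp of the log odds)."""
    return math.exp(b0 + b1 * x)

# Odds at X+1 divided by odds at X, from three different starting points
ratios = [odds_at(x + 1) / odds_at(x) for x in (0, 3, 10)]
print([round(r, 4) for r in ratios], round(math.exp(b1), 4))
```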

21
New cards

Marginal Standardization Approach (MSA)

In Logistic Regression, probability changes are not constant. They depend on the starting value of X.

Two methods:

1: Specify the starting point (from a given value of X, a 1-unit increase in X changes the probability of Y to _%)

2: Use the average effect, the average marginal effect (AME): on average, a 1-unit increase in X changes the probability by _ percentage points
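
Both methods can be sketched with a hypothetical model. The coefficients and the observed X values are made up, and the average here is taken over discrete 1-unit changes, which is a simplification of how statistical software usually computes the AME (from the derivative of the probability).

```python
import math

def inv_logit(z):
    return 1 / (1 + math.exp(-z))

b0, b1 = -3.0, 0.6  # hypothetical logistic regression coefficients

def prob(x):
    return inv_logit(b0 + b1 * x)

def change_from(x):
    """Method 1: change in probability for a 1-unit increase starting at x."""
    return prob(x + 1) - prob(x)

observed_x = [1, 3, 5, 7, 9]  # hypothetical observed values of X

# Method 2: average the 1-unit change over the observed values
average_change = sum(change_from(x) for x in observed_x) / len(observed_x)

for x in observed_x:
    print(x, round(change_from(x), 3))
print("average:", round(average_change, 3))
```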

22
New cards

Adjusted R Squared

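In logistic regression these are pseudo R-squared values. They should not be interpreted as the proportion of variance explained, but they can be compared across models. Nagelkerke's and McFadden's are common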

23
New cards

Logistic Regression Assumptions & Pairwise Deletion

No outliers, no multicollinearity, binary DV, appropriate sample size, no perfect separation, independence of observations, linear relationship between predictors and the log odds

Pairwise deletion should not be used in LR: it violates maximum likelihood estimation assumptions and produces inconsistent sample sizes and unreliable standard errors. Use listwise deletion or multiple imputation.

24
New cards

LRT: Likelihood Ratio Test

Analogous to the overall F test in linear regression

Compares the full model with a null (intercept-only) model

Evaluates overall model significance

Reported as Chi Square statistic

Can also assess the contribution/significance of individual predictors by comparing models with and without the predictor
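
A rough sketch of the test statistic. The outcomes and the "full model" probabilities are hypothetical stand-ins for fitted values, and the p-value shortcut with `math.erfc` is only valid for 1 degree of freedom (one added predictor).

```python
import math

def log_likelihood(y, p):
    """Binomial log-likelihood of 0/1 outcomes y given predicted probabilities p."""
    return sum(math.log(pi) if yi == 1 else math.log(1 - pi) for yi, pi in zip(y, p))

y = [0, 0, 0, 1, 0, 1, 1, 1]                       # hypothetical outcomes
p_null = [sum(y) / len(y)] * len(y)                # intercept-only model: base rate
p_full = [0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.8, 0.9]  # assumed fitted probabilities

ll_null = log_likelihood(y, p_null)
ll_full = log_likelihood(y, p_full)

chi_square = -2 * (ll_null - ll_full)            # likelihood ratio statistic
p_value = math.erfc(math.sqrt(chi_square / 2))   # chi-square tail area, df = 1 only
print(round(chi_square, 3), round(p_value, 4))
```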

25
New cards

Wald Test

Focus on Statistical significance of predictors

Beta coefficients indicate direction and can be compared across predictors (OR > 1 increases the likelihood of the outcome, OR < 1 decreases it)

CI for the OR does not include 1: statistically significant
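
A minimal sketch of the Wald test for one predictor, assuming a hypothetical coefficient and standard error (0.693 and 0.25 are invented values) and a normal critical value of 1.96 for a 95% interval.

```python
import math

def wald_summary(b, se, z_crit=1.96):
    """Wald z statistic, odds ratio, and 95% CI for the odds ratio."""
    z = b / se
    ci_low = math.exp(b - z_crit * se)
    ci_high = math.exp(b + z_crit * se)
    return {
        "z": z,
        "odds_ratio": math.exp(b),
        "ci": (ci_low, ci_high),
        "significant": not (ci_low <= 1 <= ci_high),  # CI excludes 1
    }

result = wald_summary(b=0.693, se=0.25)  # hypothetical estimate
print(result)
```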

26
New cards

Hosmer-Lemeshow Test

Sensitive to large sample size

p is nonsignificant (> .05): model fit is acceptable

Compares predicted probabilities with observed outcomes.

27
New cards

Calibration plot

Above and below the line indicates…

Evaluates how well predicted probabilities match observed outcome frequencies. Predicted is X, Observed is Y

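Above the line: the model underestimates (observed frequency is higher than predicted)

Below the line: the model overestimates (observed frequency is lower than predicted)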
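
The numbers behind a calibration plot can be sketched by grouping cases by predicted probability and comparing the average prediction with the observed event rate in each group. The data and the choice of two equal-sized groups are hypothetical.

```python
def calibration_table(y_true, y_prob, n_bins=2):
    """Return (mean predicted probability, observed event rate) per bin."""
    pairs = sorted(zip(y_prob, y_true))
    size = len(pairs) // n_bins
    rows = []
    for i in range(n_bins):
        # The last bin takes any leftover cases
        chunk = pairs[i * size:(i + 1) * size] if i < n_bins - 1 else pairs[i * size:]
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        observed = sum(y for _, y in chunk) / len(chunk)
        rows.append((mean_pred, observed))
    return rows

y_true = [0, 0, 1, 0, 1, 1, 1, 1]                  # hypothetical outcomes
y_prob = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]  # hypothetical predictions

rows = calibration_table(y_true, y_prob)
for mean_pred, observed in rows:
    print(round(mean_pred, 2), round(observed, 2))
```

A bin whose observed rate is higher than its mean prediction would plot above the diagonal, meaning the model underestimates there.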

28
New cards

ROC Curve and AUC (Area Under the Curve)

Evaluates predictive performance

AUC ranges 0-1. 1 is perfect prediction.

.7-.8 is acceptable

.5-.7 is poor.

less than .5 is worse than random prediction
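
AUC can be read as the probability that a randomly chosen event case receives a higher predicted probability than a randomly chosen non-event case. A minimal sketch of that pairwise definition, on hypothetical data:

```python
def auc(y_true, y_score):
    """AUC as the share of (event, non-event) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(y_score, y_true) if y == 1]
    neg = [s for s, y in zip(y_score, y_true) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

y_true = [0, 0, 1, 0, 1, 1, 1, 1]                  # hypothetical outcomes
y_prob = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]  # hypothetical predictions

model_auc = auc(y_true, y_prob)
print(round(model_auc, 3))
```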

29
New cards

Accuracy

How often is the model correct overall?

Proportion of correctly classified cases

30
New cards

Sensitivity (True positive rate)

How well does the model detect the event?

31
New cards

Specificity (True negative rate)

How well does the model detect non-events?

32
New cards

Sensitivity and specificity

Are inversely related. The cutoff point leads to a trade-off
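
The trade-off can be seen by computing accuracy, sensitivity, and specificity at two cutoffs. The outcomes, predicted probabilities, and cutoffs below are hypothetical.

```python
def classification_metrics(y_true, y_prob, cutoff):
    """Classify as an event when the predicted probability is >= cutoff."""
    tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= cutoff)
    fn = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p < cutoff)
    tn = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p < cutoff)
    fp = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p >= cutoff)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }

y_true = [0, 0, 1, 0, 1, 1, 1, 1]                  # hypothetical outcomes
y_prob = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]  # hypothetical predictions

at_half = classification_metrics(y_true, y_prob, cutoff=0.5)
at_quarter = classification_metrics(y_true, y_prob, cutoff=0.25)
print(at_half)
print(at_quarter)
```

Lowering the cutoff from .5 to .25 catches every event in this toy data (higher sensitivity) at the cost of misclassifying a non-event (lower specificity).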

33
New cards

Determining cutoff point

Can be determined using Youden’s Index

Youden's J = Sensitivity + Specificity - 1: choose the cutoff that maximizes J, which balances sensitivity and specificity

Or ROC Curve (AUC):

Point closest to the top left corner
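
A small sketch of picking a cutoff with Youden's Index. The data and the list of candidate cutoffs are hypothetical; in practice the candidates would typically be every distinct predicted probability.

```python
def sensitivity_specificity(y_true, y_prob, cutoff):
    tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= cutoff)
    fn = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p < cutoff)
    tn = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p < cutoff)
    fp = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)

def youden_j(y_true, y_prob, cutoff):
    sens, spec = sensitivity_specificity(y_true, y_prob, cutoff)
    return sens + spec - 1

y_true = [0, 0, 1, 0, 1, 1, 1, 1]                  # hypothetical outcomes
y_prob = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]  # hypothetical predictions
candidates = [0.25, 0.35, 0.5, 0.65]               # hypothetical cutoffs to compare

best_cutoff = max(candidates, key=lambda c: youden_j(y_true, y_prob, c))
best_j = youden_j(y_true, y_prob, best_cutoff)
print(best_cutoff, round(best_j, 3))
```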

34
New cards

Pulse Survey & Benefits

5-15 items

Weekly/quarterly

2-5 minutes

Specific focus

Benefits: Low burden/high response rate, Real time insights/detect issues early, quicker improvements, Monitor change

35
New cards

Survey Limitations

Survey fatigue

Limited opportunities

Poor design (not actionable, poorly formatted for analysis)

36
New cards

Good surveys are designed

Backward from action

If your survey doesn’t lead to action, it has little value.

With validity, reliability, and practicality in mind

To average and compare across time/depts/benchmarks/tenure.

37
New cards

Surveys can include

Interviews, focus groups, and open-ended feedback.

Engagement surveys, pulse surveys, exit surveys/interviews, onboarding surveys

Methods vary.

38
New cards

Surveys communicate

Organizational values and priorities

Shape norms, culture, and expectations