General Linear Model (GLM)
statistical framework that describes the relationship between a DV and one or more IVs
when to use a GLM
to test a hypothesis with a numeric outcome
e.g. regression, correlation, t test, anova
linear regression
tests whether there is a linear association between numeric (often continuous) variables
also works with categorical predictors as an alternative to anova (dummy-code the predictor)
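A minimal sketch (hypothetical data, variable names made up): regressing a numeric outcome on a dummy-coded two-level factor with statsmodels gives the same test as a one-way anova / t-test, and the coefficients are the reference-group mean and the group difference.

```python
# Minimal sketch (hypothetical data): linear regression with a dummy-coded
# categorical predictor is equivalent to a one-way ANOVA / t-test.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "group": np.repeat(["control", "treatment"], 30),
    "score": np.concatenate([rng.normal(10, 2, 30), rng.normal(12, 2, 30)]),
})

# C(group) dummy-codes the factor: control is the reference level (0),
# treatment is coded 1, so the slope is the treatment - control difference.
fit = smf.ols("score ~ C(group)", data=df).fit()
print(fit.params)     # Intercept ~ control mean; slope ~ mean difference
print(fit.f_pvalue)   # same p-value as a one-way ANOVA on these groups
```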
What does a significant association in linear regression indicate?
the slope is significantly different from 0 (i.e. the DV changes reliably with the IV); the intercept is tested separately
What are the assumptions of multiple regression?
No multicollinearity, homoscedasticity, linear relationship between DV and IVs, normally distributed residuals.
centering values
the mean is subtracted from each data point in the variable
the intercept then equals the mean DV value (the predicted DV at the mean of the predictor)
= makes the intercept more interpretable
df in multiple regression
dfm = number of predictors in the model
dfe = number of participants − number of predictors − 1
What does the F statistic in ANOVA represent?
The ratio of variance due to differences between groups to variance within groups. (F = MSbetween/MSwithin)
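A minimal sketch (made-up groups): computing F = MSbetween/MSwithin by hand and checking it against scipy's one-way anova.

```python
# Minimal sketch (hypothetical data): compute F = MS_between / MS_within by hand
# and check it against scipy's one-way ANOVA.
import numpy as np
from scipy import stats

groups = [np.array([4., 5., 6., 5.]),
          np.array([7., 8., 6., 7.]),
          np.array([9., 10., 11., 10.])]

grand_mean = np.mean(np.concatenate(groups))
k = len(groups)
n_total = sum(len(g) for g in groups)

ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

ms_between = ss_between / (k - 1)        # df_between = k - 1
ms_within = ss_within / (n_total - k)    # df_within = N - k
F = ms_between / ms_within

print(F, stats.f_oneway(*groups).statistic)  # the two values should match
```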
group effects
the deviation of each group mean from the grand mean
What is the significance of Mauchly's test in repeated measures ANOVA?
It tests the assumption of sphericity, which is the equality of variances of the differences between treatment levels.
What is a two way anova used for
when there are two categorical IVs
What is the null hypothesis for testing the interaction between two independent variables in a two way anova
the interaction has no effect on the DV (the effect of one IV does not depend on the level of the other)
What is the formula for the individual score in the context of ANOVA?
Individual score (DV) = grand mean + main effect A + main effect B + interaction + error.
why use a Bonferroni correction in post hoc tests
reduce chance of type 1 error from multiple comparisons
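A minimal sketch (made-up p-values): applying a Bonferroni correction with statsmodels; the adjusted p-values are just the raw p-values multiplied by the number of comparisons.

```python
# Minimal sketch (made-up p-values): Bonferroni correction for multiple comparisons.
from statsmodels.stats.multitest import multipletests

pvals = [0.012, 0.030, 0.001, 0.200]            # hypothetical post hoc p-values
reject, p_adj, _, _ = multipletests(pvals, alpha=0.05, method="bonferroni")
print(p_adj)    # each p multiplied by the number of comparisons (capped at 1)
print(reject)   # equivalent to comparing the raw p-values against 0.05 / 4
```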
What does the term 'homogeneity of variances' refer to in ANOVA?
The assumption that different groups have similar variances.
assumptions of anovas
observations are independent events and identically distributed
homogeneity of variances
normality of residuals
if assumptions for anova not met
do the anova anyway (less power and increased risk of type 2 error)
transform the DV
use the Kruskal-Wallis non-parametric test
What is the difference between fixed effects and random effects in ANOVA?
Fixed effects are chosen levels of factors
random effects are factors selected randomly from a population.
What is the significance of the residuals in regression analysis?
Residuals are the differences between observed values and predicted values, indicating the model's accuracy.
What does 'dummy coding' refer to in regression analysis?
Transforming categorical predictors into numeric 0/1 variables for analysis; each dummy-coded contrast uses one degree of freedom
What is the main effect in the context of ANOVA?
The individual effect of one independent variable on the dependent variable.
grand mean
average score across all subjects no matter the condition
df in two way anova
DFmain effect = k − 1 for each factor
DFinteraction = DFa × DFb
sphericity
The assumption that the variances of the differences between treatment levels are equal, in repeated-measures designs with 3+ levels
mauchlys test
tests whether the variances of the differences between conditions are equal
if p < .05, sphericity is violated and the F test produces too many false positives
correct the df using the Greenhouse-Geisser or Huynh-Feldt correction
What is ANCOVA?
An extension of ANOVA that includes a numeric covariate that may explain additional variance in DV
strengths of ancova
reduces within-group (error) variance, as the covariate explains some of the error
controls for confounds by including them in the model
contrast coding in linear models
A method to transform categorical predictors into numeric values for analysis, allowing for comparisons against a reference level.
dummy contrast coding
one level of a categorical variable is defined as the reference (0) and other levels are compared to it (1)
intercept shows mean of reference level and slope shows the difference between each other level and the reference level
successive differences coding
tests whether there are differences between successive (adjacent) levels, with the intercept representing the grand mean and each slope showing the mean difference between adjacent levels
deviation coding
A method used for factors with two levels, assigning one level −0.5 and the other +0.5
the intercept becomes the grand mean and the slope shows the mean difference between groups
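A minimal sketch (hypothetical two-group data, made-up variable names) of how dummy coding versus deviation coding changes what the intercept and slope mean.

```python
# Minimal sketch (hypothetical two-group data): dummy coding vs deviation coding
# and how each changes the meaning of the intercept and slope.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

a = np.array([4., 5., 6., 5.])   # group A scores
b = np.array([8., 9., 7., 8.])   # group B scores
df = pd.DataFrame({"score": np.concatenate([a, b]),
                   "grp": ["A"] * 4 + ["B"] * 4})

# Dummy coding: A = 0 (reference), B = 1
df["dummy"] = (df["grp"] == "B").astype(float)
# Deviation coding: A = -0.5, B = +0.5
df["dev"] = np.where(df["grp"] == "B", 0.5, -0.5)

print(smf.ols("score ~ dummy", df).fit().params)
# Intercept = mean of A (reference level), slope = mean(B) - mean(A)

print(smf.ols("score ~ dev", df).fit().params)
# Intercept = grand mean (average of the group means), slope = mean(B) - mean(A)
```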
What are the five important assumptions in order in statistical modeling according to Gelman and Hill (2007)?
Validity, Linearity and additivity, Independence of errors, Homogeneity of variances, Normality of residuals.
multicollinearity
when predictors are strongly correlated with each other
= can inflate standard errors and make regression coefficient estimates unstable; predictor correlations above 0.7 are a concern
What does the Variance Inflation Factor (VIF) measure?
How much a coefficient's variance is inflated by collinearity with the other predictors; values greater than 5 indicate high collinearity, which is a concern in regression analysis.
Cook's distance
detects influential outliers in regression by investigating how much predicted values change if an observation is removed
a higher value means the observation has more influence on the fitted model
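A minimal sketch (simulated data with one planted outlier): getting Cook's distance for every observation from a fitted OLS model via statsmodels' influence diagnostics.

```python
# Minimal sketch (simulated data): Cook's distance for each observation
# from a fitted OLS model, via statsmodels' influence diagnostics.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 2 * x + rng.normal(size=50)
y[0] += 10                                  # plant one influential outlier

fit = sm.OLS(y, sm.add_constant(x)).fit()
cooks_d, _ = fit.get_influence().cooks_distance
print(cooks_d[:5])                          # observation 0 should stand out
```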
centering
Shifting a variable so its mean becomes 0, keeping the same shape, spread, and relationships
z score transformation
rescaling a variable so its mean is 0 and its SD is 1
What is log transformation used for?
To compress the long tail of a skewed distribution and make the data more normally distributed.
Akaike Information Criterion (AIC)?
A measure used to compare different models, where a smaller AIC indicates a better fit.
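A minimal sketch (simulated data, made-up predictors): comparing the AIC of two candidate models; x2 is pure noise here, so the simpler model should win.

```python
# Minimal sketch (simulated data): comparing two models by AIC; the model
# with the smaller AIC is preferred.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
df = pd.DataFrame({"x1": rng.normal(size=100), "x2": rng.normal(size=100)})
df["y"] = 1.5 * df["x1"] + rng.normal(size=100)     # x2 is pure noise here

m1 = smf.ols("y ~ x1", df).fit()
m2 = smf.ols("y ~ x1 + x2", df).fit()
print(m1.aic, m2.aic)   # the simpler model usually wins when x2 adds nothing
```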
Linear Mixed Models (LMMs)?
Models that account for random effects in data with nested or hierarchical structures, allowing for varying intercepts or slopes.
fixed effects in LMMs?
Explanatory variables hypothesized to affect the DV.
random effects in LMMs?
Categorical grouping variables considered random samples from a larger population, like participants or schools.
What is REML?
Restricted Maximum Likelihood, the default parameter estimation criterion for linear mixed models.
random effect variables
categorical, ideally 5+ levels, represent a sample from a broader population
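A minimal sketch (simulated data, made-up variable names) of a linear mixed model with one fixed effect and a by-participant random intercept, fitted with statsmodels; in R/lme4 the equivalent formula would be y ~ x + (1 | participant).

```python
# Minimal sketch (simulated data): a linear mixed model with one fixed effect
# and a by-participant random intercept.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n_subj, n_trials = 20, 10
subj = np.repeat(np.arange(n_subj), n_trials)
x = rng.normal(size=n_subj * n_trials)
subj_intercepts = rng.normal(0, 1, n_subj)            # true random intercepts
y = 2 + 0.5 * x + subj_intercepts[subj] + rng.normal(0, 1, len(x))
df = pd.DataFrame({"y": y, "x": x, "participant": subj})

# groups= defines the random-effects grouping factor; by default this fits
# a random intercept per participant, estimated with REML.
m = smf.mixedlm("y ~ x", df, groups=df["participant"]).fit()
print(m.summary())
```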
What is the purpose of centering variables in LMMs?
To improve interpretability of the intercept and reduce multicollinearity issues.
What is the significance of including random slopes in a model?
To examine how predictors interact with random effects, though it can complicate the model and lead to overfitting.
nested random effects
a lower level grouping factor exists only within one specific level of a higher level factor
crossed random effects
the levels of one factor appear across multiple levels of another factor (e.g. every participant responds to every item)
What is the purpose of transformations in statistical analysis?
To meet assumptions of normality, linearity, and homogeneity of variance, improving model accuracy.
What does it mean if a model fails to converge?
It indicates that the model is too complex or improperly specified, often due to overfitting or insufficient data.
p-value
probability of observing a test statistic that is at least as extreme or more extreme than the one we observed if the null hypothesis is true and we repeat our experiment many times
C: doesn't tell you that the null hypothesis is false (or how likely it is to be false)
alpha level
threshold for declaring significance (5%)
C: levels should be different for different contexts (Fisher, 1935)
effect size
tells us how practical the result is in the real world
‘small’: Cohen's d / Hedges' g ≈ 0.2; correlation 0.1–0.3; eta² (anova) 0.01; Cohen's f 0.1
‘medium’: Cohen's d / Hedges' g ≈ 0.5; correlation 0.3–0.5; eta² (anova) 0.06; Cohen's f 0.25
‘large’: Cohen's d / Hedges' g ≈ 0.8; correlation > 0.5; eta² (anova) 0.14; Cohen's f 0.4
eta squared (η²)
effect size for main effects in anova that tells us the proportion of variance in the DV explained by the predictor
partial eta squared (η²p)
the proportion of variance in the DV explained by the predictor, after accounting for the variance explained by the other predictors in the model
generalised eta squared (η²G)
estimates the effect size in a design where only the term of interest was manipulated, accounting for the fact that some terms cannot be manipulated
-formula depends on design
cohens f (partial)
a transformation of partial eta squared: it equals 0 when the population means are all equal and grows indefinitely large as the means move further and further apart
factors that affect power
sample size- larger gives larger power
expected effect size- larger gives larger power
type 1 error rate- as tolerance for type 1 errors increases (larger alpha), power increases
reliability of measures- more reliable larger power
power
probability of correctly rejecting the null, if type 2 error rate is .2, power is .8
a priori power analysis
what sample size do we need to have 80% power in detecting an effect size
sensitivity power analysis
what is the smallest effect size we can detect with the power, sample size, and alpha we have
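A minimal sketch of both analyses for an independent-samples t-test, using statsmodels' power module; the effect size d = 0.5 and n = 30 per group are made-up inputs.

```python
# Minimal sketch: a priori and sensitivity power analyses for an
# independent-samples t-test.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# A priori: what n per group is needed for 80% power to detect d = 0.5?
n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.8)
print(n_per_group)          # roughly 64 per group

# Sensitivity: smallest effect detectable with n = 30 per group at 80% power?
min_d = analysis.solve_power(nobs1=30, alpha=0.05, power=0.8)
print(min_d)
```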
meta analysis strengths
+higher power than individual studies
+overall effect across studies
+can identify potential publication bias
+can explore impact of design and analysis decisions
+more accurate estimation of effect size for future power analysis
fixed effect model for meta analyses
assumes all studies are estimating the same population effect size. any error is due to sampling
random effects model for meta analyses
allows the population effect to differ between studies. allows for differences in design, population, dosage
heterogeneity of meta analysis
the extent to which effect sizes vary between studies
heterogeneity measures
Cochran's Q statistic: based on the differences between the observed effect sizes and the overall effect size
I²: the percentage of variability in the effect sizes not caused by sampling error
tau-squared (τ²): an alternative measure of between-study heterogeneity
funnel plot
plots effect size against precision of the study
studies with high precision should cluster at the top
studies with low precision should scatter widely at the bottom
an asymmetrical funnel can be a sign of publication bias
error/residuals (anova)
the difference between an individual data point and its group mean
sum of squares
summarises the total variance associated with each component of the model
SStotal = SSwithin + SSbetween
mean squares
we can't compare SS directly (they grow with the number of observations), so we use mean squares
MSwithin = SSwithin / DFwithin
MSbetween = SSbetween / DFbetween
homoscedasticity
similar amount of variation in y at each value of x
VIF
VIF = 1 / (1 − R²), where R² comes from regressing that predictor on the other predictors
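A minimal sketch (simulated predictors, x2 deliberately collinear with x1): computing a VIF per predictor with statsmodels.

```python
# Minimal sketch (simulated data): VIF for each predictor, i.e. 1 / (1 - R^2)
# from regressing that predictor on the others.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(4)
x1 = rng.normal(size=200)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=200)   # strongly correlated with x1
x3 = rng.normal(size=200)
X = sm.add_constant(pd.DataFrame({"x1": x1, "x2": x2, "x3": x3}))

for i, name in enumerate(X.columns):
    if name != "const":
        print(name, variance_inflation_factor(X.values, i))
# x1 and x2 should show large VIFs (> 5); x3 should be near 1
```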
calculate effect size (η²) for anova
η² = SSeffect / SStotal
mu with a hat
sample estimate of the population mean value
if linear mixed effects model doesnt converge
increase the number of iterations of the optimiser, or change the optimiser
simplify the random-effects structure, e.g. remove correlations between random effects
dummy coding intercept and slope
intercept is mean of reference level
each slope is the mean difference between that level and the reference level
calculate each group's mean in deviation coding
group coded +0.5: intercept + 0.5 × slope; group coded −0.5: intercept − 0.5 × slope
anova equation
Individual score (DV) = grand mean + group effect (group mean − grand mean) + error (individual score − group mean)
error df for ancova or multiple regression
number of observations − number of predictors (including covariates) − 1
multiple regression formula
DV = intercept + slope1 × IV1 + slope2 × IV2 + error
degrees of freedom
simple linear regression: error df = N − 2
multiple regression: error df = N − K − 1 (K predictors)
anova: error df = N − K (K groups); between-groups df = K − 1
ancova: error df = N − K − (number of covariates)
fixed effects lmm: depends on the estimation method
random effects lmm: number of random-effect variables
meta analysis: K − 1 (K studies)
poisson distribution
count variable
variance = mean
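A minimal sketch: simulated Poisson counts with mean 4 should show a variance close to 4 as well.

```python
# Minimal sketch: for a Poisson-distributed count variable, the variance
# equals the mean (checked here on simulated counts).
import numpy as np

rng = np.random.default_rng(5)
counts = rng.poisson(lam=4.0, size=100_000)
print(counts.mean(), counts.var())   # both close to 4
```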
when to use z score
when you want to compare distributions and coefficients across predictors measured on different scales