data analysis l7/8

Last updated 1:00 PM on 4/19/26

68 Terms

New cards

What are the assumptions of a general linear model (in order of importance)?

Random sampling; independence of data; homogeneity of variances; normality.

Which assumptions relate to study design?

Random sampling and independence.

Which assumptions relate to the data?

Homogeneity of variances and normality.

What is pseudo-replication?

Non-independent data treated as independent.

Why is pseudo-replication a problem?

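It inflates the apparent sample size, so standard errors are underestimated and p-values come out too small, which makes the statistical results invalid.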
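
A minimal sketch of how pseudo-replication inflates the sample size, and one common remedy (not stated in these cards): averaging sub-samples so each independent unit contributes one value. The plants, leaves and measurements are hypothetical.

```python
from statistics import mean

# Hypothetical data: 3 plants per treatment, 4 leaves measured on each plant.
# Leaves from the same plant are not independent of each other.
leaf_lengths = {
    ("control", "plant1"): [5.1, 5.3, 5.0, 5.2],
    ("control", "plant2"): [4.8, 4.9, 5.0, 4.7],
    ("control", "plant3"): [5.5, 5.4, 5.6, 5.5],
    ("fertilised", "plant4"): [6.1, 6.0, 6.2, 6.3],
    ("fertilised", "plant5"): [5.9, 6.0, 5.8, 6.1],
    ("fertilised", "plant6"): [6.4, 6.5, 6.3, 6.6],
}

# Pseudo-replicated sample size: every leaf counted as a replicate.
pseudo_n = {}
for (treatment, _plant), leaves in leaf_lengths.items():
    pseudo_n[treatment] = pseudo_n.get(treatment, 0) + len(leaves)

# Collapse to one mean per plant, the independent unit in this example.
plant_means = {}
for (treatment, _plant), leaves in leaf_lengths.items():
    plant_means.setdefault(treatment, []).append(mean(leaves))

true_n = {treatment: len(values) for treatment, values in plant_means.items()}

print("n if leaves are treated as independent:", pseudo_n)
print("n with plants as the unit:", true_n)
```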

What should you do if assumptions are violated?

1) Assess if the violation is serious, 2) Try transforming the data, 3) Consider a different test.

What is homogeneity of variance?

Variances are equal across groups or along a covariate.

What happens if variances are unequal?

Increases risk of Type I errors (false positives).

How does heterogeneity of variance affect p-values?

It often decreases p-values, increasing false positives.

When is heterogeneity of variance most serious?

When one variance is much larger, when variance changes systematically with fitted values, and when sample sizes differ between groups.
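
The effect of unequal variances on false positives can be illustrated with a small simulation. This is only a sketch under stated assumptions: the group sizes (10 and 50), the standard deviations, and the critical value for t with 58 degrees of freedom (taken from standard tables) are all choices made for the example, not values from the course material.

```python
import random
from statistics import mean, variance


def pooled_t(a, b):
    """Classic two-sample t statistic, which assumes equal variances."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / (pooled_var * (1 / na + 1 / nb)) ** 0.5


def false_positive_rate(sd_small_group, n_sims=2000, seed=1):
    """Share of simulated data sets declared significant although both
    population means are equal (so every rejection is a Type I error)."""
    rng = random.Random(seed)
    t_crit = 2.0017  # assumed two-sided 5% critical value of t with 58 df
    hits = 0
    for _ in range(n_sims):
        small = [rng.gauss(0, sd_small_group) for _ in range(10)]
        large = [rng.gauss(0, 1) for _ in range(50)]
        if abs(pooled_t(small, large)) > t_crit:
            hits += 1
    return hits / n_sims


equal_rate = false_positive_rate(sd_small_group=1)    # variances equal
unequal_rate = false_positive_rate(sd_small_group=5)  # small group far more variable

print("False positive rate, equal variances:  ", equal_rate)
print("False positive rate, unequal variances:", unequal_rate)
```

In this setup the smaller group has the larger variance, which is the combination that pushes the false positive rate furthest above the nominal 5%.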

When is heterogeneity of variance less serious?

When differences are random or when results are clearly non-significant.

What is normality in GLMs?

Residuals are normally distributed.

Are violations of normality usually serious?

No, unless the violation is very large.

When is lack of normality most serious?

When residuals are clearly non-normal (e.g. bimodal).

When is lack of normality least concerning?

When p-values are very small or very large.

What are residuals used for?

To assess model assumptions (variance and normality).
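
As a reminder of what a residual is, here is a minimal sketch for a one-factor model, where the fitted value of each observation is its group mean. The group names and values are hypothetical.

```python
from statistics import mean

# Hypothetical response values for two groups
groups = {"A": [4.0, 5.0, 6.0], "B": [9.0, 10.0, 14.0]}

# In a one-factor model, the fitted value for an observation is its group mean.
fitted = {name: mean(values) for name, values in groups.items()}

# Residual = observed value minus fitted value.
residuals = {
    name: [y - fitted[name] for y in values] for name, values in groups.items()
}

print(fitted)
print(residuals)
```

These residuals are what the diagnostic plots are drawn from; here group B is visibly more spread out than group A.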

What is a log transformation used for?

Right-skewed data or when variance increases with the mean.

What type of data suits log transformation?

Large counts, ratios, skewed data.

What issue occurs with log transformation and zero values?

log(0) is undefined.

How is the log transformation adjusted for zeros?

Use log(x + 1).
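
A short sketch of the zero problem and the log(x + 1) adjustment, using hypothetical counts and base-10 logs:

```python
import math

counts = [0, 1, 9, 99]  # hypothetical counts, including a zero

# log(0) is undefined; Python's math module raises ValueError for it.
try:
    math.log10(0)
    zero_is_defined = True
except ValueError:
    zero_is_defined = False

# Adding 1 before taking the log keeps zeros in the data set.
log_counts = [math.log10(x + 1) for x in counts]

print("log10(0) defined?", zero_is_defined)
print(log_counts)
```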

Do log10 and natural log differ in effect?

No. They differ only by a constant factor, so they have identical effects on the analysis (log10 is often easier to interpret).
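
The reason the base does not matter can be checked directly: the natural log of any value is its log10 multiplied by the same constant, ln(10). The values below are hypothetical.

```python
import math

values = [2.0, 20.0, 200.0]  # hypothetical measurements

ln_values = [math.log(v) for v in values]
log10_values = [math.log10(v) for v in values]

# Every ratio equals ln(10), about 2.3026, so the two transformations
# are just rescaled versions of each other.
ratios = [a / b for a, b in zip(ln_values, log10_values)]

print(ratios)
```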

What is the arcsin transformation used for?

Proportion or percentage data.

When is arcsin transformation especially important?

When values are near 0 or 1.

What is the arcsin transformation formula?

arcsin(√p), where p is a proportion between 0 and 1 (divide percentages by 100 first).
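
A minimal sketch of the formula in code. The helper name `arcsin_sqrt` and the example percentages are hypothetical; the result is in radians.

```python
import math


def arcsin_sqrt(p):
    """Arcsine square-root transformation of a proportion p (0 to 1)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    return math.asin(math.sqrt(p))


percentages = [0, 25, 50, 100]                # hypothetical percentage data
proportions = [x / 100 for x in percentages]  # convert to proportions first
transformed = [arcsin_sqrt(p) for p in proportions]

print(transformed)
```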

What is the square root transformation used for?

Count data, especially small counts.

Give examples of square root transformations

√x, √(x + 0.5), √(x + 3/8).
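
The three variants written out for a few hypothetical counts:

```python
import math

counts = [0, 1, 4, 9]  # hypothetical small counts

plain = [math.sqrt(x) for x in counts]
plus_half = [math.sqrt(x + 0.5) for x in counts]
plus_three_eighths = [math.sqrt(x + 3 / 8) for x in counts]

print(plain)
print(plus_half)
print(plus_three_eighths)
```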

How do you know if a transformation worked?

Check residual plots to see if assumptions improve.
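
A rough numeric stand-in for that check, assuming a one-factor design: compare the spread of each group before and after transforming. The ratio of largest to smallest group standard deviation is used here only as a crude summary of what a residual plot would show; the data are hypothetical and built so that spread grows with the mean.

```python
import math
from statistics import stdev

# Hypothetical groups in which the spread increases with the mean
groups = [[8, 10, 12], [80, 100, 120], [800, 1000, 1200]]


def sd_ratio(data):
    """Largest group standard deviation divided by the smallest."""
    sds = [stdev(group) for group in data]
    return max(sds) / min(sds)


raw_ratio = sd_ratio(groups)
log_ratio = sd_ratio([[math.log10(y) for y in group] for group in groups])

print("SD ratio on the raw scale:", raw_ratio)
print("SD ratio after log10:     ", log_ratio)
```

A ratio near 1 after transformation suggests the variances have become similar; the residual plots should still be inspected.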

When should you check assumptions?

Before interpreting results (p-values and effect sizes).

Data are right-skewed or variance increases with mean. What transformation?

Log transformation.

Data are proportions (0-1), especially near 0 or 1. What transformation?

Arcsin transformation.

Data are counts (especially small values). What transformation?

Square root transformation.

What must you always check after transforming data?

Check residuals to see if assumptions improved.

What is the goal of transformation?

To meet model assumptions (not to improve p-values).

What does a residuals vs fitted plot show?

Whether variance is constant (homogeneity of variance).

What does a Q-Q plot show?

Whether residuals are normally distributed.
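
A sketch of how the points of a Q-Q plot are built: the sorted residuals are paired with the quantiles expected from a standard normal distribution. The residuals are hypothetical, and the plotting positions (i - 0.5)/n are one common convention among several.

```python
from statistics import NormalDist

residuals = [0.3, -2.1, 0.8, -0.4, 2.3, 0.0, -0.9]  # hypothetical residuals


def qq_points(values):
    """Pair each sorted residual with its theoretical normal quantile."""
    n = len(values)
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    theoretical = [standard_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, sorted(values)))


points = qq_points(residuals)
for theoretical, observed in points:
    print(round(theoretical, 3), observed)
```

If the residuals are roughly normal, these points fall close to a straight line when plotted.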

What does a histogram of residuals show?

The distribution of residuals (normality).

What does a funnel shape in a residuals vs fitted plot indicate?

Heterogeneity of variance (unequal variance).

What is a factor in a GLM?

A categorical explanatory variable, such as a type of manipulation (e.g. drug) or a grouping (e.g. gender).

What are levels of a factor?

The different categories within a factor (e.g. Drug A, B, C).

What is a 2-factor GLM used for?

To analyse the effects of two factors and their interaction on a response variable.

What is a fully crossed design?

All combinations of levels from both factors are present.
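
One way to check whether a data set is fully crossed is to list every possible combination of levels and look for any that never occur. The helper `missing_cells` and the observations are hypothetical.

```python
from itertools import product

# Hypothetical observations, each recorded as (treatment, sex)
observations = [
    ("placebo", "female"),
    ("placebo", "male"),
    ("drug", "female"),
    ("drug", "male"),
    ("drug", "male"),
]


def missing_cells(obs):
    """Return the factor-level combinations that have no observations."""
    levels_a = sorted({a for a, _ in obs})
    levels_b = sorted({b for _, b in obs})
    present = set(obs)
    return [cell for cell in product(levels_a, levels_b) if cell not in present]


print(missing_cells(observations))      # empty list: fully crossed
print(missing_cells(observations[:3]))  # one combination is absent
```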

What is a main effect?

The effect of one factor averaged across levels of the other factor.

What is an interaction?

When the effect of one factor depends on the level of another factor.
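
For a 2 x 2 design, main effects and the interaction can be read straight off the cell means. The sketch below uses hypothetical blood pressure means; the interaction is expressed as a difference of differences, which is zero when the lines in an interaction plot are parallel.

```python
from statistics import mean

# Hypothetical cell means, keyed by (treatment, sex)
cell_means = {
    ("placebo", "female"): 152.0,
    ("drug", "female"): 132.0,
    ("placebo", "male"): 142.0,
    ("drug", "male"): 138.0,
}


def marginal_mean(factor_index, level):
    """Mean for one level of a factor, averaged across the other factor."""
    return mean([m for key, m in cell_means.items() if key[factor_index] == level])


# Main effects: differences between marginal means.
treatment_effect = marginal_mean(0, "drug") - marginal_mean(0, "placebo")
sex_effect = marginal_mean(1, "female") - marginal_mean(1, "male")

# Interaction: does the drug effect differ between the sexes?
drug_effect_female = cell_means[("drug", "female")] - cell_means[("placebo", "female")]
drug_effect_male = cell_means[("drug", "male")] - cell_means[("placebo", "male")]
interaction = drug_effect_female - drug_effect_male

print("Treatment main effect:", treatment_effect)
print("Sex main effect:", sex_effect)
print("Interaction (difference of differences):", interaction)
```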

What does a significant interaction mean?

The effect of one factor changes depending on the other factor.

If there is a significant interaction, can main effects be interpreted separately?

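No. The effect of each factor depends on the level of the other, so a main effect on its own can be misleading.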

What visual pattern indicates no interaction?

Parallel lines.

What visual pattern indicates an interaction?

Non-parallel lines.

What visual pattern indicates a strong interaction?

Crossing lines (crossover).

What are the three questions a 2-factor ANOVA can answer?

Does factor 1 matter? Does factor 2 matter? Is there an interaction?

Why is a 2-factor GLM better than multiple 1-factor tests?

It avoids the inflated Type I error rate that comes from running several separate tests, and it allows interactions to be tested.
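
The Type I error argument can be made concrete with a small calculation. It assumes the separate tests are independent, which is a simplification made for this example.

```python
def familywise_error(alpha, n_tests):
    """Chance of at least one false positive across n independent tests."""
    return 1 - (1 - alpha) ** n_tests


one_test = familywise_error(0.05, 1)
three_tests = familywise_error(0.05, 3)

print("One test:   ", one_test)
print("Three tests:", three_tests)
```

With three separate tests at the 5% level, the chance of at least one false positive rises to roughly 14%.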

What assumption becomes more difficult in 2-factor GLMs?

Independence of measurements.

Why is pseudo-replication especially tricky in 2-factor designs?

Because non-independence can occur across factors.

What must be true to model an interaction?

There must be replication for each combination of factor levels.

What is replication in a 2-factor design?

More than one observation per combination of factor levels.

What happens if there is no replication?

You cannot estimate an interaction.
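
A quick way to check replication is to count the observations in each combination of factor levels. The observations below are hypothetical, and this simple count only covers combinations that appear at least once.

```python
from collections import Counter

# Hypothetical observations, each recorded as (treatment, sex)
observations = [
    ("placebo", "female"), ("placebo", "female"),
    ("placebo", "male"), ("placebo", "male"),
    ("drug", "female"), ("drug", "female"),
    ("drug", "male"),
]

cell_counts = Counter(observations)

# Cells with fewer than two observations are not replicated.
unreplicated = [cell for cell, n in cell_counts.items() if n < 2]

print(dict(cell_counts))
print("Cells without replication:", unreplicated)
```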

When should you include an interaction in a model?

When the research question involves interaction or data suggest it.

What should you do if an interaction is significant?

Interpret the interaction, not the main effects.

What do parallel lines in an interaction plot mean?

No interaction.

What do non-parallel lines mean?

Interaction present.

What do crossing lines mean?

Strong (crossover) interaction.

Scenario 1: Does the effect of a blood pressure drug depend on gender?

Placebo mean > Drug mean → treatment effect; Female = Male averages → no gender effect; Drug effect same for both genders → no interaction.

Scenario 2: Does the effect of a blood pressure drug depend on gender?

No difference between placebo and drug → no treatment effect; Females higher than males → gender effect; Treatment effect same across genders → no interaction.

Scenario 3

Placebo > Drug → treatment effect; Females > Males → gender effect; Size of treatment effect same in both → no interaction; both factors have independent effects.

Scenario 4

Treatment lowers value overall → treatment effect; Females > Males → gender effect; Drug effect much larger in females → interaction.

Scenario 5

Treatment affects both sexes but unequally → interaction; both main effects present.

Scenario 6

Average values identical → no main effects; effect reverses between sexes → crossover interaction.
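
A numeric version of this crossover pattern, with hypothetical cell means chosen so that the averages for each factor are identical while the drug effect reverses between the sexes:

```python
# Hypothetical cell means, keyed by (treatment, sex)
crossover = {
    ("placebo", "female"): 150.0,
    ("drug", "female"): 130.0,
    ("placebo", "male"): 130.0,
    ("drug", "male"): 150.0,
}

# Marginal means: average over the other factor.
placebo_mean = (crossover[("placebo", "female")] + crossover[("placebo", "male")]) / 2
drug_mean = (crossover[("drug", "female")] + crossover[("drug", "male")]) / 2
female_mean = (crossover[("placebo", "female")] + crossover[("drug", "female")]) / 2
male_mean = (crossover[("placebo", "male")] + crossover[("drug", "male")]) / 2

# Drug effect within each sex.
effect_in_females = crossover[("drug", "female")] - crossover[("placebo", "female")]
effect_in_males = crossover[("drug", "male")] - crossover[("placebo", "male")]

print("Treatment averages:", placebo_mean, drug_mean)
print("Sex averages:", female_mean, male_mean)
print("Drug effect in females and males:", effect_in_females, effect_in_males)
```

Both main effects are zero here, yet the drug clearly does something in each sex, which is why the interaction has to be interpreted rather than the main effects.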

Why can visual inspection of interactions be misleading?

Because of uncertainty in the estimated means (error bars): lines that look non-parallel may differ only through sampling variation.

How should interactions be confirmed?

Using statistical tests (p-values and effect sizes).
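
As an illustration of what such a test computes, the sketch below works out the F statistic for the interaction in a balanced 2 x 2 design by hand. The data are hypothetical, and the 5% critical value for F with 1 and 8 degrees of freedom (about 5.32) is taken from standard tables; in practice statistical software would report the p-value directly.

```python
from statistics import mean

# Hypothetical blood pressure readings, 3 per cell (balanced 2 x 2 design)
data = {
    ("placebo", "female"): [150.0, 152.0, 154.0],
    ("drug", "female"): [130.0, 132.0, 134.0],
    ("placebo", "male"): [140.0, 142.0, 144.0],
    ("drug", "male"): [136.0, 138.0, 140.0],
}
treatments = ["placebo", "drug"]
sexes = ["female", "male"]
n_per_cell = 3

cell = {key: mean(values) for key, values in data.items()}
grand = mean(list(cell.values()))
treat_mean = {t: mean([cell[(t, s)] for s in sexes]) for t in treatments}
sex_mean = {s: mean([cell[(t, s)] for t in treatments]) for s in sexes}

# Interaction sum of squares: what is left of each cell mean after
# removing the two main effects.
ss_interaction = n_per_cell * sum(
    (cell[(t, s)] - treat_mean[t] - sex_mean[s] + grand) ** 2
    for t in treatments
    for s in sexes
)
df_interaction = (len(treatments) - 1) * (len(sexes) - 1)

# Error sum of squares: variation of observations around their cell mean.
ss_error = sum((y - cell[key]) ** 2 for key, values in data.items() for y in values)
df_error = len(data) * (n_per_cell - 1)

f_interaction = (ss_interaction / df_interaction) / (ss_error / df_error)

print("F for interaction:", f_interaction, "on", df_interaction, "and", df_error, "df")
print("Exceeds assumed 5% critical value of 5.32:", f_interaction > 5.32)
```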