Statistical Tests

What assumptions need to be met for chi-squared tests?

• Fewer than 20% of cells have an expected frequency below 5.
• The minimum expected frequency is at least 1.
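
A minimal sketch of how these two rules of thumb could be checked by hand. The function names and the example table are hypothetical; the expected count of a cell is taken as row total × column total / grand total, which is the usual definition for a test of independence.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_squared_assumptions_met(table):
    """True if fewer than 20% of expected counts are below 5 and none is below 1."""
    cells = [e for row in expected_counts(table) for e in row]
    share_below_5 = sum(e < 5 for e in cells) / len(cells)
    return share_below_5 < 0.20 and min(cells) >= 1


observed = [[20, 30], [25, 25]]  # hypothetical 2x2 table of observed counts
print(expected_counts(observed))
print(chi_squared_assumptions_met(observed))
```

If the check fails, the other cards in this deck point to exact tests instead.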

How can we use a regression model to predict values we don't have for our outcome variable?

In SPSS, run the linear regression model, open Save, and tick Individual under Prediction Intervals.

What are the three assumptions for Multiple Regression Inference?

1. The relationship between the dependent variable (Y) and each continuous independent variable (x) is linear.
2. The residuals (error terms, ε) are approximately normally distributed; we can check this by plotting a histogram of the error terms.
3. Homoscedasticity (the variance of the residuals is stable).

What are the steps to test a mediation effect (Baron and Kenny)?

1. Test that the IDV (X) is associated with the DV (Y) (path c).
2. Test the association between the IDV and the mediator (M) (path a).
3. Test the association between the mediator and the DV (path b).
4. Test the association between the IDV and the DV when controlling for the mediator (path c').

What are the two methods to test indirect effects

- Sobel test (normal theory approach)
- Bootstrapping, a nonparametric alternative to the Sobel test (e.g. with PROCESS)

What is the Sobel test for?

Testing whether the indirect effect through the mediator (i.e. ab) is significantly different from zero.

How do you calculate the estimated linear effect of a categorical variable in the interaction

B1 + B3 × Z, where Z is the level of the modifier (0 or 1)
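
A small worked sketch of this calculation. It assumes the usual interaction model Y = B0 + B1·x1 + B2·Z + B3·x1·Z + ε (the B2 term is an assumption, since the card only names B1 and B3), and the coefficient values are made up for illustration.

```python
def effect_of_x1(b1, b3, z):
    """Estimated effect of x1 on Y at modifier level z (0 or 1): B1 + B3 * z."""
    return b1 + b3 * z


b1, b3 = 2.0, -0.5  # hypothetical regression coefficients
for z in (0, 1):
    print(f"Z = {z}: effect of x1 on Y = {effect_of_x1(b1, b3, z)}")
```

At Z = 0 the effect is just B1; at Z = 1 it is B1 + B3, so B3 is the difference between the two.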

What is the confidence interval for?

A range of plausible values for the population parameter, showing how precise the sample estimate is. A higher confidence level (e.g. 99% rather than 95%) gives a wider, more cautious interval.

When do we use an independent samples t test?

For normally distributed continuous data, to test the difference in means between two independent groups.

When do we use a Pearson's chi-squared test?

To test whether two categorical variables are associated, provided the assumptions are met.
It tests whether, according to the current data, the proportions in two groups are significantly different from each other.

What is the McNemar test used for?

A test for paired categorical data that meet the assumptions, to see whether proportions changed between the paired measurements (e.g. over time).

What do we need to look at before reporting a Pearson's chi-squared test?

Assumptions:
• Fewer than 20% of cells have an expected frequency below 5.
• The minimum expected frequency is at least 1.

What do we need to look at before reporting an independent samples t test?

Levene's test. If Levene's test is significant (p < .05), use the second row, "Equal variances not assumed".

What does X² (χ²) stand for?

Chi square

What test do we use for non-parametric data to compare one group to another group

Mann-Whitney U test

What test is used for comparing independent categorical groups that don't meet assumptions

Fisher's exact test

What are the assumptions of the McNemar test

McNemar's test needs paired 2×2 data with at least 25 discordant observations. With fewer discordant observations we use the paired binomial exact test; with 3×3 or larger tables we use the McNemar-Bowker test.
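
A hedged sketch of the McNemar calculation for a paired 2×2 table. It assumes the common chi-squared form without continuity correction, (b − c)² / (b + c), where b and c are the two discordant cell counts; the function name and the example counts are hypothetical. The 25-discordant-pairs check mirrors the rule on this card.

```python
import math


def mcnemar(b, c):
    """McNemar chi-squared statistic and p value from the two discordant counts."""
    discordant = b + c
    if discordant < 25:
        raise ValueError("fewer than 25 discordant pairs: use an exact test instead")
    statistic = (b - c) ** 2 / discordant
    # With 1 degree of freedom, the chi-squared upper tail equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


stat, p = mcnemar(b=30, c=12)  # hypothetical discordant counts
print(f"chi-squared = {stat:.2f}, p = {p:.4f}")
```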

How do we interpret a scatterplot

See if there is a linear relationship (either positive or negative) and if we can add a linear fit line

How do we test the 3 assumptions for Multiple Regression Inference?

1. Use scatterplots: the residuals of the dependent variable (Y) plotted against the residuals of each independent variable (x).
2. Plot a histogram of the error terms to see whether they roughly follow a normal distribution, and check a P-P plot.
3. Check that a scatterplot of the standardised residuals (ZRESID) against the standardised predicted values (ZPRED) shows no pattern.

How do we test distributions

For continuous variables we use the histogram plot

How do we test our one-sample data against a prediction or a known population value?

Use a one-sample t test (continuous data) or a one-sample chi-square test (categorical data).
Before using a test, make sure to analyse the descriptives and look at the distribution of the data.

What do we use the One sample chi square test for

Used to test the proportions of categories in the data; by default the expected proportions are equal (50/50 for two categories).
It tests whether, according to the current data, the proportion in the population equals a certain pre-specified value.

How do we calculate the 95% confidence interval?

x̄ ± (1.96 × SE)

How do we calculate the standard error

SE = sample SD / √n, where n is the sample size
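
The standard error and the 95% confidence interval can be worked through together. This is a minimal sketch with a made-up sample; the function name is hypothetical, and 1.96 is the normal-approximation critical value from the card (for small samples a t critical value is often used instead).

```python
import math
import statistics


def mean_confidence_interval(sample, critical_value=1.96):
    """Return (lower, upper, se) for the mean, using mean ± critical_value × SE."""
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))  # sample SD / sqrt(n)
    margin = critical_value * se
    return mean - margin, mean + margin, se


sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical measurements
lower, upper, se = mean_confidence_interval(sample)
print(f"SE = {se:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

Passing a different `critical_value` gives intervals at other confidence levels.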

How do we calculate a confidence interval

x̄ ± (critical value × SE), where the critical value depends on the confidence level (1.96 for 95%). Use + for the upper bound and − for the lower bound.

What type of test do we use for a binary categorical Independent variable of interest and a continuous Dependent variable?

T test (if parametric)

What type of test do we use for a categorical Independent variable of interest and a categorical Dependent variable?

chi squared test (if assumptions met)

What type of test do we use for a continuous Independent variable of interest and a continuous Dependent variable?

Simple linear regression

What type of test do we use for multiple continuous and categorical Independent variables of interest and a continuous Dependent variable?

Multiple Linear Regression

When do we use a paired sample t test

When comparing two means from the same group of respondents, with both continuous variables being normally distributed

What test do we use for non-parametric data to compare one group to a pre-defined value

Wilcoxon signed-rank test (one sample)

What test do we use for non-parametric data to compare two related groups

Wilcoxon matched-pairs signed-rank test

When do we split a variable?

To check the distribution of levels of a categorical variable separately (eg gender or ethnicity) against a continuous outcome

When do we merge two variables?

Only merge paired variables, and only when we want to check the distribution/suitability: we compute the differences between the paired data and check whether those differences are normally distributed.

What test do we use for categorical data where more than 20% of cells have an expected count below 5, to compare one group to a pre-defined value?

One sample Binomial Exact test

What test is used for comparing paired categorical groups that don't meet assumptions

McNemar Bowker test/ paired Binomial exact test

What do you report for non-parametric tests?

medians, with min-max, p value and df

What visualisation is used to see if two continuous variables are associated

Scatterplots are used to see the association between two continuous variables

Why do we use a correlation

When we need an objective measure of strength of a linear relationship.
Correlation 'r' is a method to quantify the Direction and Magnitude, of linear association between two continuous variables.

When do we use Pearson's correlation

When we have two continuous variables with a linear relationship and the data look approximately normally distributed on a histogram

When do we use Spearman's correlation

When we have two continuous variables with a monotonic relationship and the data are not normally distributed (e.g. skewed on a histogram)

How do we interpret the results of a correlation?

Coefficient r: ranges from -1 to 1. The closer to 1 or -1, the stronger the linear correlation (magnitude); the sign gives the direction.
p value: the probability of seeing a correlation at least this strong if there were no true association.
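
To make the range of r concrete, here is a minimal sketch that computes Pearson's r from its definition (the covariance of the two variables scaled by their spreads). The function name and data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


hours = [1, 2, 3, 4]   # hypothetical data
score = [2, 4, 6, 8]   # rises perfectly with hours, so r is 1
print(pearson_r(hours, score))
print(pearson_r(hours, score[::-1]))  # perfectly decreasing, so r is -1
```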

What is a simple linear regression for

In statistical modelling, a regression model is a set of statistical processes for estimating the relationships among variables. These models describe the relationship between variables by fitting a line to the observed data.

What is the equation for a simple linear regression

Y=B0+B1 𝒙 + 𝜺
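
As an illustration of the equation, this sketch estimates B0 and B1 by ordinary least squares using the closed-form formulas for a single predictor. The function name and data are hypothetical, and the data were chosen to lie exactly on a line so the error term is zero.

```python
def fit_simple_regression(x, y):
    """Least-squares estimates (b0, b1) for Y = B0 + B1*x + error."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    b1 = sxy / sxx               # slope: change in Y per 1-unit change in x
    b0 = mean_y - b1 * mean_x    # intercept: predicted Y when x is 0
    return b0, b1


x = [1, 2, 3, 4, 5]    # hypothetical predictor values
y = [3, 5, 7, 9, 11]   # exactly y = 1 + 2x
b0, b1 = fit_simple_regression(x, y)
print(f"Y = {b0} + {b1} * x")
```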

What is the difference between R2 and R2 adjusted

R2 indicates how much of the total variation in the dependent variable is explained by the model; adjusted R2 is the same measure after adjusting for the number of predictors.
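
A short sketch of the two quantities. It uses the standard definitions R2 = 1 − SS_residual / SS_total and adjusted R2 = 1 − (1 − R2)(n − 1) / (n − k − 1), where n is the sample size and k the number of predictors; the function names and numbers are hypothetical.

```python
def r_squared(y, y_hat):
    """Proportion of the variation in y explained by the predictions y_hat."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n, k):
    """R2 penalised for the number of predictors k, given sample size n."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


print(r_squared([1, 2, 3], [1, 2, 3]))      # perfect predictions
print(adjusted_r_squared(0.5, n=11, k=2))   # lower than the unadjusted 0.5
```

Adding a predictor can only keep R2 the same or raise it, while adjusted R2 can fall if the new predictor adds little.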

What do we do when we have a categorical predictor that we want to test

We code it through dummy variables, with one level acting as a reference category (0)

What are dummy variables represented by

0 and 1

How many dummy variables are needed to represent a variable with 3 levels?

2, labelled d1 and d2; the remaining level is the reference category

How do you use dummy variables in a linear regression with a categorical variable with 3 levels

Input coded dummy variables as usual in regression
If you input all dummy variables (including coded reference category) SPSS will choose a reference category and exclude it
The results table shows the dummy variables COMPARED to the reference category, and if there is a significant difference in the outcome
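
Outside SPSS, the same coding can be sketched by hand. This is an illustrative helper with hypothetical names and data: each non-reference level gets its own 0/1 column, and the reference level is the one coded 0 in every column.

```python
def dummy_code(values, reference):
    """Return one 0/1 column per non-reference level of a categorical variable."""
    levels = sorted(set(values) - {reference})
    return {
        f"d_{level}": [1 if v == level else 0 for v in values]
        for level in levels
    }


group = ["a", "b", "c", "a"]              # hypothetical variable with 3 levels
dummies = dummy_code(group, reference="a")
for name, column in dummies.items():
    print(name, column)
```

A variable with 3 levels yields 2 columns here, matching the d1 and d2 coding described in this deck.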

What is the equation for a multiple linear regression

Y=B0+B1 𝒙1 + B2 𝒙2 + 𝜺

What does x1 or x2 represent in the MLR model

x1 and x2 represent the independent variables of interest

What is the multiple linear regression equation that has a categorical predictor with 3 levels

Y= B0+B1 d1+ B2 d2+ 𝜀

What does R2 represent in a MLR model

R2 is often interpreted as the proportion of the variance in the dependent variable that is "explained" by the independent variables in the model.

What does a higher R2 mean

Higher values of R2 indicate that the model explains more of the variance in the outcome, i.e. better prediction.

What does a B1 of 2 tell us?

That for a 1-unit increase in IDV 1 (x1), the DV increases by 2 on average, holding the other predictors constant

What does M stand for

M = the mediator

What does a mediator do

A mediator (M) of the causal effect of independent variable (x1) on dependent variable (Y) is a variable x2 on the causal pathway from x1 to Y.

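What is the symbol for the pathway of the total effect?

c (the association between the IDV and the DV before controlling for the mediator; c' is the direct effect)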

What is the difference between complete and partial mediation

Partial mediation: there is still a significant effect of the IDV on the DV when controlling for the mediator, but the effect is reduced.
Complete mediation: the mediator eliminates the significant effect of the IDV on the DV.

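What is the equation to compute the indirect effect and its standard error?

Indirect effect = a × b
SE(ab) = √(a² Sb² + b² Sa²), where Sa and Sb are the standard errors of a and b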
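
The standard error on this card is the one the Sobel test uses: the indirect effect ab is divided by SE(ab) to give a z statistic, which is compared with the normal distribution. A minimal sketch, with a hypothetical function name and made-up path estimates:

```python
import math


def sobel_test(a, b, se_a, se_b):
    """Return (indirect effect, SE(ab), z, two-sided p) for paths a and b."""
    indirect = a * b
    se_ab = math.sqrt(a ** 2 * se_b ** 2 + b ** 2 * se_a ** 2)
    z = indirect / se_ab
    # Two-sided p value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return indirect, se_ab, z, p_value


# Hypothetical estimates: a (IDV to mediator) and b (mediator to DV).
indirect, se_ab, z, p = sobel_test(a=0.5, b=0.4, se_a=0.1, se_b=0.1)
print(f"ab = {indirect:.3f}, SE = {se_ab:.4f}, z = {z:.2f}, p = {p:.4f}")
```

Because the sampling distribution of ab is often not normal, the bootstrap approach in the other cards is the usual alternative.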

What is PROCESS used for

To obtain the 95% bias-corrected bootstrap confidence intervals for the direct and indirect effects in the model

What is needed for the indirect effect to be significant

the confidence interval does not contain 0

Which steps in the Baron Kenny method are essential to establish mediation

Steps 2 and 3 are essential for establishing mediation

What do you do if there might be a modifier

Create a new variable for the interaction between the IDV and the modifier (IDV × modifier).
Then run a multiple linear regression that includes the new interaction term alongside the IDV(s), the modifier and the DV.

What does Z represent

the modifier

What is B3

𝛃𝟑 is interpreted as the difference of the effect of 𝐱𝟏 on Y by levels of 𝐙 variable.

What are the DFBETA and DFFIT used for?

To identify outliers and cases that strongly influence the model.
Sort the values both ascending and descending and look for values below -1 or above +1.
In the SDF1 column, a case with a value beyond ±1 has a strong influence on the model and may need to be removed.
