50% exam
What is a (bivariate) correlation?
This examines the strength of the relationship between 2 variables, e.g. ice cream sales and temperature
What is a linear regression?
Allows us to predict one variable (DV/criterion) from a series of other related variables (IVs/predictors), e.g. ice cream sales can be predicted by temperature. NOT CAUSAL
What is regression?
Statistical technique that allows us to predict someone’s score on one variable from their score on:
one variable (bivariate regression)
more than one variable (multiple regression)
What is another name for the DV? And for the IV?
DV= criterion variable
IV= predictor variable
What is multiple regression?
A statistical technique that builds a hypothetical model of a relationship between a single criterion variable (DV) and multiple predictor variables (IVs).
– A predictive model that best predicts an outcome/ criterion variable.
– Produces a regression equation that can be used to make predictions
• e.g. to predict someone’s exam score, we assess their performance in a previous exam, how many hours they revised, their attendance at lectures, and their motivation.
What are some Data Requirements for Multiple Regression
Sample size; normality; linearity; multicollinearity; homoscedasticity
(5 ASSUMPTIONS: S N L M H)
Hypotheses for Multiple regression?
H0: There is no linear relationship between the criterion variable and the predictor variables.
• H1: There is a linear relationship between the criterion variable and at least one of the predictor variables
Formula for sample size?
N > 50 + 8M
M is the number of predictor variables
(Tabachnick & Fidell, 2007)
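As a quick sketch, the rule can be checked in Python (the function name is just for illustration):

```python
def min_sample_size(m: int) -> int:
    """Tabachnick & Fidell's (2007) rule of thumb: you need N > 50 + 8M,
    where M is the number of predictor variables."""
    return 50 + 8 * m

# e.g. four predictors (age, depression, social support, sleep duration):
print(min_sample_size(4))  # -> 82, so N must exceed 82
```

This matches the worked report later in these notes, where a sample of 292 with four predictors comfortably exceeds the required 82.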
What is the assumption ‘Multicollinearity’?
And what is singularity?
A high inter-correlation (r > ±.90) between the predictors, e.g. 3 predictors are highly correlated. Creates issues for regression models!
Singularity: a perfect linear relationship between variables (more intense form of multicollinearity)
What are the guidelines for suggesting no multicollinearity?
Tolerance > 0.50
VIF < 10.00
Correlations between predictors < 0.9
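These figures come from regressing each predictor on the others. A rough Python sketch of how tolerance and VIF are computed (made-up simulated data, not SPSS output):

```python
import numpy as np

def tolerance_and_vif(X):
    """For each predictor (column of X), regress it on the remaining
    predictors: tolerance = 1 - R^2 of that regression, VIF = 1 / tolerance."""
    n, p = X.shape
    out = []
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])   # add an intercept
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1 - resid.var() / y.var()
        out.append((1 - r2, 1 / (1 - r2)))
    return out

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = 0.95 * x1 + rng.normal(scale=0.3, size=200)  # deliberately collinear with x1
x3 = rng.normal(size=200)                          # unrelated predictor
X = np.column_stack([x1, x2, x3])
for tol, vif in tolerance_and_vif(X):
    print(f"tolerance = {tol:.2f}, VIF = {vif:.2f}")
```

The collinear pair produces low tolerance and high VIF, while the unrelated predictor sits near tolerance = 1, VIF = 1.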
What is a residual?
The difference between the value a model predicts and the value observed in the data on which the model is based. On a linear plot, it is the gap between the line of best fit and each plotted data point.
e.g. a negative residual = the predicted value is too high; a positive residual = the predicted value is too low.
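A small Python illustration with made-up ice cream data, showing observed minus predicted values:

```python
import numpy as np

# Hypothetical data: temperature (predictor) and ice cream sales (criterion).
temp  = np.array([16, 18, 20, 23, 25, 28, 30], dtype=float)
sales = np.array([40, 52, 55, 70, 73, 90, 95], dtype=float)

b1, b0 = np.polyfit(temp, sales, 1)   # line of best fit (slope, intercept)
predicted = b0 + b1 * temp
residuals = sales - predicted          # observed minus predicted

for t, obs, res in zip(temp, sales, residuals):
    kind = "positive (predicted too low)" if res > 0 else "negative (predicted too high)"
    print(f"temp={t:.0f}  observed={obs:.0f}  residual={res:+.2f}  {kind}")

# With least squares and an intercept, the residuals sum to (essentially) zero.
print(round(residuals.sum(), 6))
```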
How do we check for normality?
. On histogram (line of best fit)
. On P-plot
The line of best fit attempts to minimise the residuals
What is the assumption Homoscedasticity
. This is the SPREAD of scores/variance.
. Check using a scatterplot
. The assumption that the residuals at each level of the predictor variable have similar variances/ evenly distributed.
What things would you see on a scatterplot for homoscedasticity vs heteroscedasticity?
HOMO- random spread of scores, equal number below and above the line
HETERO- fan shape score spread or bow tie shape
How do we check for the assumption linearity?
The predictor variables should be linearly related to the criterion variable.
What are the 3 types of multiple regression?
– Standard / Simultaneous
– Hierarchical
– Stepwise
Describe standard multiple regression:
All the predictor variables entered into the model simultaneously.
. Use this approach if you have a set of variables and want to know how much variance in a criterion variable they explain as a group.
. Tells you how much unique variance each predictor variable explains.
How do we assess the model via SPSS
. See R, R Square and Adjusted R square (can convert to %)
. See significance level (should be less than 0.05) and standardised coefficients beta
. See df and F value
What is the structure to formally report results?
A standard multiple regression was used to assess the ability of four predictor variables (age, depression, social support and sleep duration) to predict PTSD. All predictor variables were entered simultaneously.
Preliminary analyses were conducted to ensure no violation of the assumptions of normality, linearity, multicollinearity and homoscedasticity. None of the assumptions were violated. Sample size of 292 exceeded the required amount.
61.7% (adjusted R2) of the variation in PTSD symptoms could be explained by the variation in age, sleep duration, depression scores, and social support, F (4, 287) = 118.44, p< .001.
Depression made a significant contribution to the model (p< .001). Depression made a positive contribution, suggesting that increases in depression are associated with an increase in PTSD symptoms.
What do you also say when talking about implications of research?
. How can our findings benefit people? Who does it affect and what interventions can be put in place?
What does a bigger R squared value mean
The more variance the combo of predictors can explain in the outcome variable
Lecture 2:
Multiple hierarchical regression
What is the regression equation and what does it show?
> Allows us to predict the value of the criterion variable (Y) from a set of predictor variables (X1, X2, X3)
> Allows us to predict how Y will change as a result of changes in X
> Equation is Ŷ = b0 + b1x1 + b2x2… + error
. Ŷ = the predicted criterion/DV
. b0 = the intercept/constant (the value of the DV when all predictors are 0)
. b1 = regression coefficient of the predictor
. x1 = value of the predictor
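A minimal worked example of the equation in Python; the coefficients below are made up, for predicting an exam score from hours revised and lecture attendance:

```python
# Hypothetical coefficients (made up for illustration):
b0 = 20.0   # intercept: predicted score when every predictor is 0
b1 = 1.5    # each extra hour of revision adds 1.5 marks, holding x2 constant
b2 = 0.25   # each extra percent attendance adds 0.25 marks, holding x1 constant

def predict(x1, x2):
    """Y-hat = b0 + b1*x1 + b2*x2 (the error term is unknown for a new case)."""
    return b0 + b1 * x1 + b2 * x2

# 10 hours of revision, 80% attendance:
print(predict(10, 80))  # 20 + 15 + 20 -> 55.0
```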
How did we find the values of B0, B1 etc on SPSS? What do you need to report for this and what is the wording for this?
Unstandardised coefficients B column
. The value of the constant/intercept=
. The number of units the criterion (Y) changes by for each unit increase in the predictor (x), while controlling for the other predictors
What are the assumptions for multiple hierarchical regression?
. Linearity
. Normality
. Homoscedasticity
. Multicollinearity
. Sample size
What happens if you have too small of a sample size?
You will be underpowered to find a significant result
. The more predictors, the bigger the sample size generally needed
Multicollinearity for hierarchical multiple regression?
. Limits the size of R
. Makes determining the importance of the given predictor difficult
. VIF, tolerance, correlations values
How do we deal with multicollinearity?
check for errors in data entry/coding
If there’s a fairly large set of predictors- reduce this to a smaller set of predictors
Consider deleting/omitting a predictor that is highly correlated with another predictor
HOWEVER, it may just be that the predictors are truly highly correlated
How do we check normality for hierarchical multiple regression?
P-plot and histogram
. Look at the normal distribution line; if the points on the P-plot are not close to the predicted line, there is a lack of normality
How do we check for linearity in hierarchical multiple regression?
The predictor variables should be linearly related to the criterion variable (should be a straight line). Check on P-Plot
How do we check for homoscedasticity for hierarchical multiple regression?
. Check on scatterplot which plots the standardised residuals against the predicted values
. The residuals at each level of the predictors should have the same level of variance
. When the variance is unequal - Heteroscedastic
What is an outlier in regression?
An observation (case) that is substantially different from the others
. Can have large impacts on the results of regression analysis
How do we detect outliers in SPSS
. Scatterplots and residual plots (but this is subjective)
. Residuals statistics tables: Standardised residuals and Cook’s Distance
What are standardised residuals (for detecting outliers)?
Helps to identify anyone whose predicted score is quite different from their actual score
(Tabachnick and Fidell, 2001): values that fall outside the safe zone of ±3.3
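A minimal Python sketch of this check on simulated residuals (one extreme case is planted deliberately):

```python
import numpy as np

def flag_outliers(residuals, cutoff=3.3):
    """Standardise the residuals (z-scores) and return the indices of cases
    outside +/- cutoff, per Tabachnick and Fidell's guideline."""
    z = (residuals - residuals.mean()) / residuals.std()
    return np.where(np.abs(z) > cutoff)[0]

rng = np.random.default_rng(1)
resid = rng.normal(size=100)
resid[17] = 9.0   # plant one extreme residual
print(flag_outliers(resid))  # only index 17 should be flagged
```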
What is Cook’s distance for outliers?
. Cook’s distance measures the influence of deleting a case
. So any large value (> 1) indicates that the case considerably affects the estimated regression coefficients
Describe hierarchical multiple regression?
. AKA Sequential
. Predictor variables are entered into the equation, in the order specified by the researcher (generally known predictors from previous research are entered first, then new predictors in successive models)
. Variables or sets of variables are entered in steps (blocks)
. Each block of predictor variables are assessed in terms of what it adds to the criterion (DV) variable, after previous blocks of variables that have been controlled for.
What about categorical predictors in regression?
Predictors should be quantitative, or if they are categorical, they should be dichotomous (exhaustive and mutually exclusive)… so you fall into 1 or another group
Regression models should include mostly quantitative predictors (and not all categorical)
They are coded as dummy variables, e.g. Boys = 1, Girls = 0; Graduated = 1, Non-graduated = 0
Be careful about….
. Make sure you consider what the variables mean, e.g. low scores on a test may mean something positive (when you may expect low scores to be a bad thing)!!
When we are predicting the effect of age, education and physical memory tests and PRMQ questionnaire on dementia what are the criterion and predictor variables
Criterion: dementia
Predictors: age, education, physical memory tests and the PRMQ questionnaire
Write what you would say when assessing the model to say adjusted Squared for different models/blocks?
After the variables in Block 1 (Age, education, physical memory test score) have been entered, the overall model explains 51.7% of the variance in dementia symptomology (Adjusted R2=0.517)
After the Block 2 variable (PRMQ test) has been included, Model 2 explains 61.5% (Adjusted R2= 0.615).
What do you look at to assess significance?
Model Summary table: R Square Change and whether the change is significant (Sig. F Change < .05)
ANOVA table for : model 1 and 2 significance
What table do you look at when evaluating each of the predictors variables?
Coefficients to see each predictor variable’s individual contribution
How do we formally report the results of the Hierarchical Multiple Regression?
A hierarchical multiple regression was used to assess the ability of self-reported memory scores on the PRQM to predict dementia symptomology after controlling for the variation explained by age, education and physical memory test.
Preliminary analyses were conducted to ensure no violation of the assumptions of normality, linearity, multicollinearity and homoscedasticity. Assumptions were met, although homoscedasticity was questionable.
Age, education and scores on physical memory test were entered in Model 1, explaining 51.7% of the variation in dementia symptomology (Adjusted R2= .517), F(3,86)= 32.716, p< .001. Age and physical memory test were significant predictors of dementia symptomology (p<.001). Education was not a significant predictor.
After entry of the PRMQ score in Model 2, the model explained significantly more variance, F(4, 85) =36.564, p < .001; R2 change= 0.099, p< .001. The total variation in dementia symptomology explained by the model as a whole was 61.5%, (Adjusted R2= .615). The significant predictors in model 2 were age, physical memory score, PRMQ score (p< .001). Education was not a significant predictor.
In the final model, age and PRMQ score made positive contributions, suggesting that increases in age and increases in scores on the PRMQ are associated with an increase in dementia symptomology.
Physical memory test scores made a negative contribution to predicting dementia symptomology, suggesting that increases in scores on the physical memory tasks are associated with a decrease in dementia symptomology.
Explain the regression line of best fit?
. the model’s predicted values are plotted on a regression line, which passes through a scatterplot. The regression line (of best fit) attempts to minimise the residuals.
LECTURE 3
Stepwise multiple regression
What 5 things can we do with outliers?
Transform variable
Replace value
Delete value
Delete participant
Non-parametric test
What is stepwise multiple regression?
• Method of regression that adds multiple predictors while simultaneously removing those that don’t improve the R2 value
• SPSS selects ONLY the predictors which provide the strongest prediction of variance in the outcome variable.
• The aim is to create the best model fit and achieve the highest R2 value with the fewest predictors.
Researcher provides a list of predictor variables and then allows the software to select which predictors to enter, and which order they go in.
• Variables are added to the regression equation one at a time, with an attempt to maximise the R².
– Default criteria in SPSS, p< .05.
• After each variable is entered, each of the included variables are tested to see if the model would be better if it were excluded.
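The entry loop can be sketched roughly in Python. This toy version adds predictors by R² gain rather than SPSS's p-value entry/removal criteria, and all data (anxiety, checking, hoarding, misophonia) are simulated:

```python
import numpy as np

def r_squared(cols, y):
    """R^2 from an OLS fit of y on the given predictor columns (plus intercept)."""
    A = np.column_stack([np.ones(len(y))] + cols)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1 - resid.var() / y.var()

def forward_stepwise(predictors, y, min_gain=0.01):
    """Greedy forward selection: repeatedly add the predictor that most
    improves R^2, stopping when no candidate adds at least min_gain."""
    selected, remaining, best_r2 = [], dict(predictors), 0.0
    while remaining:
        gains = {name: r_squared([predictors[s] for s in selected] + [col], y) - best_r2
                 for name, col in remaining.items()}
        name, gain = max(gains.items(), key=lambda kv: kv[1])
        if gain < min_gain:
            break
        selected.append(name)
        best_r2 += gain
        del remaining[name]
    return selected, best_r2

rng = np.random.default_rng(2)
n = 300
anxiety, checking, hoarding = rng.normal(size=(3, n))
misophonia = 0.6 * anxiety + 0.3 * checking + rng.normal(size=n)  # hoarding unrelated

selected, r2 = forward_stepwise(
    {"anxiety": anxiety, "checking": checking, "hoarding": hoarding}, misophonia)
print(selected, round(r2, 3))
```

The strongest predictor (anxiety) enters first, checking enters second, and the unrelated predictor adds too little R² to be kept, mirroring how SPSS retains only the strongest predictors.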
What are the advantages of stepwise multiple regression?
. Saves time: Stepwise regression can help identify the most important predictors for a given outcome variable in a relatively short amount of time.
• Useful for exploratory data analysis: an exploratory tool to identify potentially important predictors that can be further investigated and refined using other statistical techniques.
What are the disadvantages of stepwise multiple regression?
. Overfitting: When selecting variables based on their statistical significance/predictive power, the resulting model may perform well on that sample but generalise poorly to new data.
• Biased estimates: can produce biased estimates and incorrect conclusions when there are correlations between the predictor variables.
How do we formally report stepwise regression?
Stepwise multiple regression was used to identify the best predictive model of misophonia from the predictors: OCD (washing, checking, ordering, obsessing, hoarding, neutralising) and anxiety.
Preliminary analyses were conducted to ensure no violation of the assumptions of normality, linearity, multicollinearity and homoscedasticity. An outlier was identified but was (/was not) omitted from the final analyses.
A final model was identified where anxiety (p<.001) and OCD checking (p= .033) explained 26.5% (Adjusted R2= 0.265) of the variance in misophonia, F (2, 149)= 28.17, p< .001. OCD washing, ordering, obsessing, hoarding and neutralising were not significant predictors in the final model.
Anxiety and OCD checking made positive contributions, suggesting that increases in these predictors are associated with an increase in misophonia levels. However, these two variables only explained 26.5% of the variability in misophonia, suggesting there may be other factors that contribute to predicting the variability in misophonia levels.
What are Cohen’s (1988) guidelines for interpreting R squared?
These show whether the model constitutes a good fit.
Cohen (1988), interpreting R squared:
<0.02 Very weak
0.021- 0.13 Weak
0.131- 0.26 Moderate
> 0.261 Substantial
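These cut-offs can be wrapped in a small helper (boundaries approximated from the table above):

```python
def interpret_r_squared(r2: float) -> str:
    """Label an R squared value using Cohen's (1988) guidelines as given
    in the lecture: <.02 very weak, to .13 weak, to .26 moderate, above that substantial."""
    if r2 < 0.02:
        return "very weak"
    elif r2 <= 0.13:
        return "weak"
    elif r2 <= 0.26:
        return "moderate"
    else:
        return "substantial"

# e.g. the misophonia model's Adjusted R2 of .265:
print(interpret_r_squared(0.265))  # -> substantial
```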
How may we interpret results for stepwise multiple regression….
The results suggest increases in anxiety and OCD checking are predictive of increases in misophonia levels.
• Misophonia may be linked to anxiety, and the checking component of OCD, but the results from the present sample suggest that the other components of OCD, namely: washing, ordering, obsessing, hoarding and neutralizing did not contribute to explaining the variance in misophonia.
What are 3 problems of stepwise entry?
When there are small differences between variables the computer will unquestioningly choose the largest for addition at each step.
Danger that none of the variables are included in the equation, as the variables fail to meet the rules of the stepwise method.
There is a lack of researcher control.
LECTURE 4
ANOVA
What are 3 types of t-test
Independent samples t-test: compares means from two independent groups.
Paired samples t-test: compares means from two related sets of scores (the same or matched individuals).
• Repeated measures design
• Matched-subjects design
One-sample t-test: compares an observed mean to a population mean.
What test do we use when we want to compare more than 2 conditions, or more than 2 means, e.g. group 1 no music, group 2 constant music, group 3 intermittent music?
ANOVA (Analysis of Variance)
Why would we not just use several T-tests though?
e.g. carry out 3 separate t-tests to see the differences between the 3 groups
. The Experimentwise Error Rate (EER) / The Familywise Error Rate (FWER)
. The probability of making at least one Type I error (false positive) across a set of multiple hypothesis tests in a single experiment/ study. (say something is significant, when it actually isn’t)
• When you perform just one statistical test at a significance level of α = 0.05, there’s a 5% chance of incorrectly rejecting a true null hypothesis.
• But when you perform many tests within the same experiment, the chance of making at least one false positive increases dramatically.
. SO you increase the chance of making a Type 1 error
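The inflation is easy to compute: for k independent tests, FWER = 1 − (1 − α)^k. A quick Python check:

```python
def familywise_error_rate(n_tests: int, alpha: float = 0.05) -> float:
    """P(at least one Type I error) across n independent tests at level alpha:
    1 - (1 - alpha)^n."""
    return 1 - (1 - alpha) ** n_tests

for n in (1, 3, 10):
    print(f"{n} tests: FWER = {familywise_error_rate(n):.3f}")
```

With 3 t-tests the chance of at least one false positive is already about 14%, not 5%, which is why ANOVA is preferred.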
What is ANOVA?
• A parametric inferential test used to test for variability in scores
• Used when we have more than two groups/ conditions (levels) and/or more than one independent variable (factor)
– Statistically advantageous over performing multiple t-tests on the same data
• A major advantage= it allows you to investigate the effect of multiple factors on your dependent variable at the same time (in combination)
– Factorial ANOVA
How are t-tests and ANOVA similar, and when would you use each?
• Both compare means between-groups
• With 2 groups both work but:
. t-test more efficient
. ANOVA inefficient (not needed in terms of parsimony)
• With more than 2 groups:
. t-test not efficient
. ANOVA more efficient
What are some research questions ANOVA could address?
. Are there attitudinal and behavioural differences between different generations?
. Are there differences in reaction times for drivers:
• Hands-on phone drivers
• Hands-free phone drivers
• No-phone drivers
And does gender interact with this?
What are 4 assumptions of ANOVA
. Level of data
. Normality
. Homogeneity of variance
. Independent random samples
What is the levels of data assumption?
The dependent variable (DV) consists of data measured at interval or ratio level (quantifiable)
…. not nominal or ordinal (categorical)
Interval:
Puts scores in an order, with equal distances (intervals) between numbers. No true zero; zero doesn’t mean an absence of the variable, e.g. temperature, score on an intelligence test
Ratio:
Like interval BUT with a true zero. There can be a total absence of the variable, e.g. speed in miles per hour, number of children in a household
What is the assumption normality? What are the 2 ways we can check it?
. The data for the dependent variable(s) is normally distributed
. Check histogram of DV data
. Use skewness and kurtosis: convert these statistics into z scores (divide the statistic by its standard error). If z falls outside ±1.96, it is significant (p < .05) and suggests non-normal data
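A minimal sketch of the z-score check (the skewness/kurtosis values below are made up, standing in for SPSS output):

```python
def z_score(statistic, std_error):
    """Convert a skewness or kurtosis statistic (e.g. from SPSS output)
    into a z score; |z| > 1.96 suggests significant non-normality (p < .05)."""
    return statistic / std_error

# Hypothetical SPSS output values:
z_skew = z_score(0.412, 0.241)   # within +/-1.96, acceptable
z_kurt = z_score(-1.203, 0.478)  # outside +/-1.96, suggests non-normality
print(round(z_skew, 2), round(z_kurt, 2))  # -> 1.71 -2.52
```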
What is the assumption homogeneity of variance?
. The samples being compared are drawn from populations with the same/ similar variance
. Levene’s test of equality of error variances, or on histogram
What is the independent random samples assumption?
For independent groups designs, independent random samples must have been taken from each population. (Via random allocation)
How does ANOVA work?
• ANOVA analyses the different sources that cause variance in the DV
• It analyses the variability between conditions (between-groups variance) and within conditions (within-group variance)
• Basically, is the variance primarily due to differences between groups, or differences within groups?
–Is there a true effect of the IV (factor) on the DV?
What is between-group variance and what are 3 sources that impact this?
• Between-groups variance is the variance (difference) between group means, e.g. no music has a mean of 9, but intermittent music has a mean of 22
SO variance BETWEEN the groups!!!
Arises from…
Individual differences
Treatment effects
Random effects
What are treatment effects?
This is the effect of the IV(s)/ factors
Variance due to different groups of people, in different conditions, behaving differently from each other
We anticipate a difference between experimental conditions
What are individual differences?
• People naturally vary.
• We don’t want a high amount of individual differences as this might lead us to think our IV is having an effect when it isn’t
• For example: reaction time to identify famous paintings.
What are random effects?
–Errors of measurement can arise from a variety of sources such as:
• Varying external conditions (time of day, temperature)
• State of the participant (tired, motivated)
• Experimenter’s, or computer’s, ability to measure and score accurately (same instructions, demeanour)
What is within group variances and what are 2 factors which influence this?
. AKA error variance (as the difference is not due to the IV, but error)
. It’s the variation between people within the same group, e.g. there’s a difference in test scores within group 2 (one person scored 8, another scored 30)
Affected by
random effects
Individual differences
SO what is the logic of the between and within group variations?
–subjects in different groups should have different scores because they have been treated differently (i.e. given different experimental conditions)= Between-groups variance
–subjects within the same group should have the same/ similar score= Within-groups variance
When would we use the null hypothesis, and the alternative hypothesis? Give an example for both
NULL= The populations from which the samples have been drawn have equal means. e.g There will be no difference in test scores for students in the 3 conditions
ALTERNATIVE= the populations from which the samples have been drawn do not have equal means. e.g ‘There will be a difference in test scores for students in the no music, constant low music or intermittent music conditions.’ (non-directional) (NO PREVIOUS RESEARCH)
e.g ‘Students in the constant low music condition will perform better in the test, compared to students in the no music or the intermittent music condition.’ (directional) (PREVIOUS RESEARCH)
What does it mean if the between groups variance is larger than the within groups variance?
Statistically significant value for F, and can conclude there is a significant difference
How do we calculate the F value
We want to see if our manipulation of the IV/factor is responsible for the differences between scores
F = variance due to manipulation of IV/factor DIVIDED BY error variance
ANOVA calculates the ratio of the variance due to our manipulation of the IV (between-groups variance) and the error variance (within-groups variance)
F = between-groups variance DIVIDED BY within-groups variance
How do we find out if the F-ratio is statistically significant?
• If the F-ratio is larger than 1, we need to decide if the value is large enough to be statistically significant
• The p value needs to be equal to or less than 0.05 for the F ratio to be regarded as statistically significant
What’s the difference of F in ANOVA vs multiple regression?
• Multiple Regression is the statistical model that is used to predict a continuous outcome on the basis of one or more continuous predictor variables.
– The F statistic is the test of the fit of the linear model
• ANOVA is the statistical model that is used to predict a continuous outcome on the basis of one or more categorical predictor variables.
– The F statistics is the test of fit for the group means
• The F ratio in ANOVA is exactly the same as in regression, except that the regression model for ANOVA contains categorical predictors.
SO what are factors?
This is the IV, e.g music type
What are factor levels
. conditions of the IV, e.g no music, constant, intermittent
What is a mixed ANOVA design?
Mixed ANOVA is used when a study design includes one or more within-subjects factors and one or more between-subjects factors, E.g., looking at the effect of music type (between-subjects; 3 levels) at different times of the day (within-subjects; 3 levels) on test performance
When labelling ANOVA, what does a one-way or two-way or three-way mean?
• The number indicates the number of factors (IVs)
–One-way ANOVA (one factor, e.g. music type)
–Two-way ANOVA (two factors, e.g. music type and time of day)
–Three-way ANOVA (three factors, e.g. music type, time of day and gender)
What does a 3×3 ANOVA mean?
The numbers tell us how many levels each factor has: a 3×3 ANOVA has two factors with three levels each
e.g. music type has 3 levels (no music; constant; intermittent) and time of day has 3 levels (morning; afternoon; evening)
LECTURE 5
ONE-WAY ANOVA
What does a higher F score mean?
More likely statistically significant
ANOVA has 4 assumptions — which 2 do you check before running the ANOVA, and which do you check while running it?
BEFORE: interval/ratio data; samples have been independent and random
DURING: homogeneity of variance; normality
Where do we check for normality?
. Histogram
. calculate skewness and kurtosis
How do we check for outliers?
Boxplots
. A circle indicates an outlier; a star indicates an extreme outlier
How do you calculate the mean square for the between groups and within groups?
Between-groups sum of squares divided by between-groups df
Within-groups sum of squares divided by within-groups df
How do you calculate the f value?
Between groups mean square divided by within groups mean square
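Putting the last few cards together, a hand calculation of a one-way ANOVA F in Python (made-up scores for the three music conditions):

```python
import numpy as np

# Hypothetical test scores for three music conditions (between-subjects).
groups = [
    np.array([9.0, 11, 8, 10, 12]),    # no music
    np.array([14.0, 16, 15, 13, 17]),  # constant music
    np.array([22.0, 20, 24, 21, 23]),  # intermittent music
]

k = len(groups)                        # number of conditions
n_total = sum(len(g) for g in groups)
grand_mean = np.concatenate(groups).mean()

ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_within  = sum(((g - g.mean()) ** 2).sum() for g in groups)

df_between = k - 1
df_within  = n_total - k

ms_between = ss_between / df_between   # between-groups mean square
ms_within  = ss_within / df_within     # within-groups (error) mean square
f_value    = ms_between / ms_within

print(f"F({df_between}, {df_within}) = {f_value:.2f}")
```

Here the between-groups variance dwarfs the within-groups (error) variance, giving a large F that would be statistically significant.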
What do profile plots show?
A visual representation of the scores in each condition. When the error bars do not overlap, this suggests that the conditions are likely to be significantly different.
What is the effect size for ANOVA called?
Eta squared (η²)
What are planned vs unplanned comparisons??
1. Planned (a priori) comparisons
– Conducted when the researcher has hypothesised in advance which means will differ from each other, e.g. ‘this group will have a larger reaction time’
– The overall main effect does not need to be significant to run planned comparisons.
2. Unplanned (post hoc) comparisons
– Differences in means explored after data has been collected.
– Don’t use if overall main effect is not significant
. Normally you would only conduct either planned OR unplanned
What are Cohen’s d effect sizes (1988)
0.2= small
0.5= medium
>0.8= large
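Cohen's d itself is the mean difference divided by the pooled standard deviation; a small Python sketch with made-up recall scores:

```python
import math

def cohens_d(group1, group2):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    m1 = sum(group1) / n1
    m2 = sum(group2) / n2
    var1 = sum((x - m1) ** 2 for x in group1) / (n1 - 1)
    var2 = sum((x - m2) ** 2 for x in group2) / (n2 - 1)
    pooled_sd = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled_sd

# Hypothetical recall scores:
morning = [14, 15, 13, 16, 17]
evening = [10, 11, 9, 12, 10]
d = cohens_d(morning, evening)
print(round(d, 2))  # -> 3.34, well above 0.8, so a large effect
```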
How do you report a planned comparison?
Planned comparisons were performed to test the two hypotheses. See Table 1 for mean (and SD) recall of each group.
The mean recall scores of the morning group were significantly higher than the evening group, t (27) = 4.14, p<.001, with a large effect size (d= 1.80).
Furthermore, the mean recall scores of the afternoon group were significantly higher than the evening group, t (27) = 4.56, p<.001, with a large effect size (d=2.27).
Give an example of interpreting the results?
The results suggest that recall for the evening group was lower than both the morning and afternoon group. There was no significant difference between the morning and afternoon.
This suggests that to aid students’ recall of information, morning (9am) and afternoon (2pm) teaching sessions produce greater recall than evening (9pm) teaching sessions.
LECTURE 6
One way within-subjects ANOVA
What does a one-way within-subjects ANOVA involve and give an example
. one factor (IV)
. repeated measures so everyone takes part in ALL conditions
e.g is there a difference in the recall of information depending on the time of day it was presented (morning, afternoon, evening)
Advantages of repeated measures design?
. Increased statistical power
. Removes the effect of individual differences
. Fewer participants needed
. Less time and money
What are some disadvantages of repeated measures design?
. Practice effects
. Fatigue
. Contrast effects (order effect)
. Demand characteristics
How can we remove order effects?
Randomising the order of testing
• For two conditions (A and B) we could randomly determine whether A or B is experienced first
Counterbalancing order of testing
• For two conditions (A and B) half the participants would experience condition A followed by B and the remaining half would experience B followed by A
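A tiny Python sketch of counterbalancing two conditions across participants (participant names are arbitrary):

```python
import itertools

def counterbalance(participants, conditions=("A", "B")):
    """Assign each participant an order of conditions, cycling through all
    possible orders so each order is used equally often."""
    orders = list(itertools.permutations(conditions))
    return {p: orders[i % len(orders)] for i, p in enumerate(participants)}

assignment = counterbalance(["P1", "P2", "P3", "P4"])
for p, order in assignment.items():
    print(p, "->", " then ".join(order))
```

Half the participants get A then B, the other half B then A, so any order effect is spread evenly across conditions.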