Explain what is meant by replication and why it is important
Discuss why it might be appropriate to doubt some scientific findings
Discuss how a more ‘open’ science may help
function of replication
gives more confidence in findings → trust in scientific findings comes from reliable data
what is replication
repeatedly finding the same results
what does replication do (Schmidt, 2009)
protects against false positives, e.g. sampling error
controls for artifacts
addresses researcher fraud
test whether findings generalise to different populations (conceptual replication → testing different samples, e.g. different cultures, to check generalisability)
test the same hypothesis using a different procedure
types of replication
direct
conceptual
direct replication (Zwaan et al., 2017)
a scientific attempt to recreate the critical elements (e.g. samples, procedures, and measures) of an original study
direct replication results as indicator
same or similar results indicate findings are accurate and reproducible
conceptual replication
to test the same hypothesis using a different procedure (e.g. using different samples, research design, etc)
conceptual replication results as indicator
same/similar results indicate findings are robust to alternative research designs, operational definitions, and samples
the reproducibility of psychological science
issues- many findings not replicated
just 36% of studies replicated overall (cog + soc)
social psych- only 23-29% findings replicated
cognitive psych- 48-53% replicated
(Open Science Collaboration, 2015)
the reproducibility of social psychological science
the replication problem was greater in social psych than in cognitive psych, in a sample of 100 studies
although studies sample from one population, we are attempting to generalise to the whole population in order to learn about human nature → studies need to reflect objective truth
problem for psychology especially
Open Science Collaboration (2015)
Cristea et al. (2021): title
review article of effect sizes reported in highly cited emotion research compared with larger studies and meta-analyses addressing the same questions
Cristea et al. (2021): findings
highly cited observational studies:
had effects that were on average 1.42-fold larger (95% CI = 1.09, 1.87) than in meta-analyses
had effects that were on average 1.99-fold larger (95% CI = 1.33, 2.99) than in the largest studies on the same questions
highly cited experimental studies:
had effects that were on average 1.29-fold larger (95% CI = 1.01, 1.63) than in meta-analyses
had effects that were on average 2.02-fold larger (95% CI = 1.60, 2.57) than in the largest studies
more highly cited papers → typically reported much larger effect sizes than the better estimates of population effect sizes
substantial between-topic heterogeneity
key takeaway: more extreme findings more likely to be used and less likely to be replicated
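To make the fold ratios on this card concrete, here is a minimal Python sketch that converts each reported ratio into a percentage inflation and checks whether its 95% CI excludes 1.0 (a ratio of 1.0 meaning no difference). The ratios and intervals are the ones listed above; the function names `percent_inflation` and `ci_excludes_no_difference` are hypothetical helpers for illustration, not anything from the paper.

```python
def percent_inflation(fold_ratio: float) -> float:
    """Convert a fold ratio (e.g. 1.42) into a percentage inflation (e.g. 42.0)."""
    return round((fold_ratio - 1.0) * 100.0, 1)


def ci_excludes_no_difference(ci_low: float, ci_high: float) -> bool:
    """A ratio of 1.0 means no difference; the CI excludes it if both bounds sit on one side."""
    return ci_low > 1.0 or ci_high < 1.0


# (ratio, CI lower bound, CI upper bound) as listed on the card above
reported = {
    "observational vs meta-analyses": (1.42, 1.09, 1.87),
    "observational vs largest studies": (1.99, 1.33, 2.99),
    "experimental vs meta-analyses": (1.29, 1.01, 1.63),
    "experimental vs largest studies": (2.02, 1.60, 2.57),
}

for label, (ratio, low, high) in reported.items():
    inflation = percent_inflation(ratio)
    excludes = ci_excludes_no_difference(low, high)
    print(f"{label}: {inflation}% larger, CI excludes 1.0: {excludes}")
```

Read this way, a 2.02-fold ratio means the highly cited effect was roughly double the estimate from the largest study on the same question.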
Cristea et al. (2021): procedure
took the most-cited studies, adjusted by how influential they are in the field
did a systematic review of all studies
compared highly cited studies with meta-analyses and large studies addressing the same questions
what does Cristea et al. (2021) highlight
more extreme findings more likely to be used/cited and less likely to be replicated
reasons for non replication
fraud
‘sloppy’ science → 9 circles of scientific hell (flawed research practices)
outcome switching: ‘p value fishing’, ‘p hacking’
small samples/ lack of statistical power (also sloppy science)
moderators
scientist error/ poor replications themselves
publication bias (VII- non publication, VIII- partial publication)
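Two of the reasons above, small samples and publication bias, can combine to inflate published effects. The following is a minimal simulation sketch, not a model of any real literature: it assumes each study's estimate is drawn from a normal distribution around a true effect, and that only positive, significant estimates (z > 1.96) get published. The true effect (0.2), standard error (0.3), and number of studies are made-up illustration values.

```python
import random
import statistics


def simulate_published_effects(true_effect, standard_error, n_studies, seed=0):
    """Return (all estimates, published estimates) under a significance filter."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    all_estimates = [rng.gauss(true_effect, standard_error) for _ in range(n_studies)]
    # Assumed publication rule: only positive estimates with z > 1.96 are published
    published = [e for e in all_estimates if e / standard_error > 1.96]
    return all_estimates, published


all_est, published = simulate_published_effects(
    true_effect=0.2, standard_error=0.3, n_studies=5000
)

print("mean of all estimates:      ", round(statistics.mean(all_est), 2))
print("mean of published estimates:", round(statistics.mean(published), 2))
print("share of studies published: ", round(len(published) / len(all_est), 2))
```

Because an estimate must exceed 1.96 × 0.3 ≈ 0.59 to be published under this rule, every published estimate is well above the true effect of 0.2, which is one way later replications can appear to "fail".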
Diederik Stapel
was an influential social psychologist working on impression formation + stereotypes
50 papers retracted → in his early years of research, when he collected ‘real’ data, he laid out complex and messy relationships between variables (as psychology often is)
editors preferred simplicity even though this does not reflect reality → findings were cut down to main effects to tell a coherent, story-like narrative about psychological phenomena
still collected data to test his hypotheses, but redid the experiments + created datasets to fit the narratives he had set out
the whistleblower was one of his PhD students; Stapel had offered to collect data for his students
enduring influence of flawed science + Diederik Stapel
if 2/3 of findings cannot be replicated, and many papers have been retracted due to fraud, what can we believe?
cases of fraudulent data are rare
‘sloppy’ science
dubious research practice, poor science
9 ‘circles’ of scientific hell (Neuroskeptic, 2012)
I - limbo
II - overselling
III - post-hoc storytelling
IV - p-value fishing
V - creative outliers
VI - plagiarism
VII - non-publication
VIII - partial-publication
IX - inventing data
sloppy science + 9 circles of scientific hell
examples of poor science, to inform better practice
as problems become less serious → often more common
overselling the outcomes of research → findings need to be published, which requires explaining why they matter/are meaningful (see also: creeping significance)
creeping significance
the issue of treating p values just above .05 as non-significant when the cutoff is fairly arbitrary
outcome switching pertains to…
IV: p value fishing
outcome switching
changing the outcomes of interest in the study depending on the observed results
an example of p hacking: making decisions to maximise the likelihood of a statistically significant effect, rather than on objective or scientific grounds
ANOVAs + outcome switching
running several separate ANOVAs → substantially increases the likelihood of making a Type 1 error
if only one ANOVA is significant and only that ANOVA is reported, this is bad science → a convenient outcome is selected based on the findings, when the result actually capitalised on chance
needs to be a reliable test of the hypothesis → null findings are just as important as significant findings, as evidence that factors do not affect an outcome
one issue: there are multiple possible reasons for null findings → they are less likely to be published
e.g. the method may have been poor vs there actually being no effect
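The claim that running several separate ANOVAs raises the chance of a Type 1 error can be illustrated with the standard familywise error rate formula, 1 - (1 - α)^k. This is a minimal sketch that assumes the k tests are independent and every null hypothesis is true; real outcomes within one study are usually correlated, so treat the numbers as a rough guide. The function name `familywise_error_rate` is a hypothetical helper.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Probability of at least one false positive across n_tests independent tests.

    Assumes every null hypothesis is true and the tests are independent.
    """
    return 1.0 - (1.0 - alpha) ** n_tests


# How the risk grows as more separate tests are run at alpha = .05
for k in (1, 3, 5, 10):
    rate = familywise_error_rate(0.05, k)
    print(f"{k} test(s): chance of at least one false positive = {rate:.3f}")
```

With five separate tests at α = .05, the chance of at least one false positive is already about 23%, which is why reporting only the one significant ANOVA capitalises on chance.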