Notes on Methods of Explanation: Critical Thinking and Replication in Psychology
Replication and Context in Psychology
The Reproducibility Project at the Center for Open Science (Charlottesville, Virginia, USA) reran 100 psychology experiments.
Found that over 60% failed to replicate (i.e., their findings did not hold up on second testing).
Results published in August 2015 in Science.
Public reaction ranged from alarm to cautious validation of concerns about psychology’s reliability.
Important clarification: a failure to replicate is not necessarily a sign that the original finding is false.
If two well-designed studies A and B investigating the same phenomenon reach opposite conclusions, the situation is not a simple failure but a signal to investigate contextual conditions under which the phenomenon occurs.
The scientist’s task after a failure to replicate is to identify the conditions that make the phenomenon true, leading to new hypotheses and better tests.
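A minimal sketch of this idea, assuming a toy model in which the same tone produces a heart-rate change whose direction depends on context. The context names, effect sizes, sample size, and the `run_study` helper are all hypothetical choices for illustration, not values or methods from any study in these notes:

```python
import random
import statistics

# Hypothetical true mean heart-rate change (beats per minute) to the tone.
# These numbers are illustrative assumptions, not values from any study.
TRUE_CHANGE = {"free": 8.0, "restrained": -8.0}

def run_study(context, n_rats=50, noise_sd=10.0, seed=0):
    """Simulate one study and return the observed mean heart-rate change."""
    rng = random.Random(seed)
    changes = [rng.gauss(TRUE_CHANGE[context], noise_sd) for _ in range(n_rats)]
    return statistics.mean(changes)

# Study A and Study B use the same design but differ in one contextual detail.
study_a = run_study("free", seed=1)
study_b = run_study("restrained", seed=2)

print(f"Study A (free):       {study_a:+.1f} bpm")
print(f"Study B (restrained): {study_b:+.1f} bpm")
print("Same direction?", (study_a > 0) == (study_b > 0))
```

In this toy model both studies are "correct": they reach opposite conclusions only because the context variable differs, which is the condition a follow-up hypothesis would need to identify.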
Context can dramatically affect outcomes:
Example: when a rat is restrained during a tone, its heart rate can go down rather than up, and if the cage design allows, the rat may run away instead of freezing.
This shows how seemingly minor experimental details and environmental factors shape results.
Contextual effects help explain why some phenomena fail to replicate across settings.
Analogy: even simple statements like "the sky is blue" depend on time of day, air composition, scattering of light, and the observer’s color perception.
Broader perspective: many phenomena fail to replicate when context changes, but this does not invalidate the underlying effect in the right context.
Notable counterpoint: Newton’s laws are not universal across all contexts (e.g., subatomic or extreme regimes); quantum mechanics emerges when classical laws fail in certain contexts.
The fear-learning paradigm: a classic context-sensitive example in psychology:
Rats in a small box with an electrical grid; tone followed by shock; freezing, increased heart rate and blood pressure.
Repetition strengthens tone–shock association; eventually, the tone alone elicits the response (freezing) even when shock is omitted.
Early interpretation treated fear learning as universal; later work showed that context (tone, cage, shock, timing) influences outcomes.
The rhetoric around a “replication crisis” is often overstated; many so-called crises reflect a misunderstanding of what science is and how it progresses.
Quote-style synthesis: Henry Gee (Nature) described science as a method to quantify doubt about a hypothesis and to locate the contexts in which a phenomenon is likely; failure to replicate is a feature, not a bug.
The takeaway: science advances through context-sensitive testing and refinement, not through chasing universal, one-size-fits-all laws.
Source context: from The New York Times (2015) and related discussions on scientific practice.
Critical Thinking and the Origins of Bias in Evidence
Francis Bacon (Novum Organum, 1620) introduced the modern scientific method, emphasizing empirical evidence.
Critical thinking involves asking tough questions about interpretation, bias, and completeness of the evidence; many have trouble doing this effectively (Willingham, 2007).
Two natural human tendencies undermine critical thinking:
1) We see what we expect or want to see (confirmation bias).
2) We ignore what we cannot see (missing evidence bias).
Bacon’s core insight: human understanding tends to adopt opinions and then retroactively shape evidence to support them.
Armadillo example: threats in the wild differ from threats on a highway; natural defensive tendencies may not adapt well to modern contexts.
In research, these tendencies lead to selective interpretation and overgeneralization when context is not carefully considered.
Quote paraphrase: the human tendency to see what we expect or want to see is powerful and can skew interpretation of data.
The role of context in interpreting evidence becomes a central theme in critical thinking.
These two natural tendencies are major obstacles to objective reasoning.
The chapter argues that good critical thinking requires recognizing these biases and compensating for them rather than denying their existence.
Evidence, Perception, and the Valuation of Data
A well-documented pattern: people interpret the same evidence differently depending on prior beliefs.
Darley & Gross (1983): participants watched the same video of a girl, Hannah, taking a reading test but were told she came from different socioeconomic backgrounds.
Those told Hannah was from an affluent family rated her abilities higher than those told she came from a poor family, despite identical video evidence.
Both groups invoked video evidence to support their preconceptions, illustrating how beliefs color evidence interpretation.
This illustrates how context and prior beliefs shape what we deem as supporting or refuting evidence.
The broader lesson: evidence does not speak for itself; interpretation is theory-laden and biased by expectations.
The broader literature shows that bias can operate at multiple levels, including what counts as quality evidence (Koehler, 1993).
Missing Evidence and the Power of What We Don’t See
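The old Roman temple story (told by Bacon): a priest shows a visitor a portrait of sailors who took religious vows and survived a shipwreck; the question is not only what the priest shows but what is missing (pictures of those who took vows and perished).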
The failure to consider missing evidence is a common cognitive pitfall.
Newman et al. (1980): trigram task
Participants were told one trigram was special; for half the participants, the special trigram contained the letter T.
Those participants needed about 34 trials to detect this feature; for the other half, the special trigram lacked T, and they never figured it out.
This demonstrates that it is much easier to notice what is present than to notice what is absent.
The tendency to ignore missing evidence can lead to erroneous conclusions (Wainer & Zwerling, 2006).
The bar graph (Figure 2.12) on hours spent partying vs. studying at Canadian universities: a simple interpretation may mislead if missing evidence (e.g., a scatterplot showing the correlation) is ignored.
The World War II armor problem (Wald): analysts focused on where planes that returned had bullet holes (fuselage) and ignored the missing evidence (planes hit in the engines, which did not return to be counted).
Wald argued that the absence of engine holes on returning planes showed that engine damage was the critical vulnerability; armor should be placed on engines, not fuselage.
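Wald's reasoning can be illustrated with a small simulation. This is only a sketch under stated assumptions: the loss probabilities, the number of hits per plane, and the `simulate` helper are hypothetical, and hits are assumed to land on engine and fuselage equally often. It is not a reconstruction of Wald's actual analysis:

```python
import random
from collections import Counter

# Hypothetical probability that a single hit in each section downs the plane.
LOSS_PROB = {"engine": 0.6, "fuselage": 0.05}

def simulate(n_planes=10000, hits_per_plane=3, seed=42):
    """Return (holes counted on returning planes, holes on all planes)."""
    rng = random.Random(seed)
    sections = list(LOSS_PROB)
    returned_holes = Counter()
    all_holes = Counter()
    for _ in range(n_planes):
        # Each hit lands on a section chosen uniformly at random.
        hits = [rng.choice(sections) for _ in range(hits_per_plane)]
        # The plane returns only if it survives every one of its hits.
        survived = all(rng.random() >= LOSS_PROB[s] for s in hits)
        all_holes.update(hits)
        if survived:
            returned_holes.update(hits)
    return returned_holes, all_holes

returned, total = simulate()
print("Holes on ALL planes:      ", dict(total))
print("Holes on RETURNING planes:", dict(returned))
```

Although hits are spread evenly across both sections, the returning planes show mostly fuselage holes, because planes hit in the engine tend not to come back. An analyst who looks only at returning planes sees the opposite of where the danger lies.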
The overall rule set: first, doubt what you can see; second, consider what you don’t see.
Beliefs, Desires, and the Interpretation of Evidence
Bacon’s assertion that belief and emotion color interpretation: evidence is not neutral; people bring desires to the table (Hornsey, 2020).
Lord et al. (1979): death penalty and deterrence
When presented with mixed evidence, participants who supported the death penalty became more supportive; those who opposed became more opposed.
Confirms that people interpret evidence to reinforce preexisting beliefs rather than to test them impartially.
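One way to see how identical mixed evidence can push two people apart is a toy model of biased weighting. This sketch is an illustrative assumption, not the procedure or analysis used by Lord et al.: beliefs are represented as a signed number, each study shifts belief by a fixed `step`, and evidence that contradicts the current belief is down-weighted by a hypothetical `discount` factor:

```python
def update(belief, evidence, discount, step=0.5):
    """Update a signed belief with a sequence of +1/-1 evidence items.

    Evidence whose sign disagrees with the current belief is multiplied
    by `discount` (1.0 means no bias; values below 1.0 mean the reader
    takes disconfirming studies less seriously).
    """
    for item in evidence:
        agrees = (item > 0) == (belief > 0)
        weight = 1.0 if agrees else discount
        belief += step * weight * item
    return belief

# Mixed evidence: two studies for deterrence (+1), two against (-1).
mixed = [+1, -1, +1, -1]

supporter = update(1.0, mixed, discount=0.3)       # starts in favor
opponent = update(-1.0, mixed, discount=0.3)       # starts opposed
fair_supporter = update(1.0, mixed, discount=1.0)  # weighs all studies equally

print(f"Biased supporter:   1.0 -> {supporter:.2f}")
print(f"Biased opponent:   -1.0 -> {opponent:.2f}")
print(f"Unbiased supporter: 1.0 -> {fair_supporter:.2f}")
```

With the discount applied, both readers end up more extreme than they started even though they saw the same balanced evidence; with equal weighting, the balanced evidence leaves the belief where it began.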
Koehler (1993): scientists’ ratings of the quality of studies are biased by whether the results confirm or disconfirm their beliefs.
Echo chambers in modern media: Del Vicario et al. (2016) showed that people tend to surround themselves with like-minded others, reinforcing beliefs.
First rule of critical thinking: doubt your own conclusions; seek out dissenting viewpoints; share results with colleagues likely to disagree (balance and self-critique).
Second rule of critical thinking: consider missing evidence; actively seek information that could challenge your beliefs.
Practical guidance: to be right, engage with critics (even enemies), not only with friends who agree.
The Skeptical Stance: Rules for Good Thinking in Action
Science is a human enterprise; errors occur because humans are fallible; people see what they expect.
The goal of critical thinking is not to be right by default but to minimize bias and improve judgment through critical examination of beliefs and evidence.
First rule: doubt your own conclusions and invite critique; this helps produce a more balanced view of evidence.
Second rule: consider what you don’t see; missing data can change conclusions and point to new hypotheses.
Social dynamics matter: modern science and criticism rely on peer feedback and external critique to maintain objectivity.
The overarching message: treating replication results as evidence about the strength of a theory rather than as a simple verdict of truth or falsity keeps scientific inquiry productive.
Practical Implications, Analogies, and Real-World Relevance
The “fear learning” paradigm shows how context-specific results can generalize only under certain conditions, highlighting the need to map contextual boundaries.
The armadillo and highway examples illustrate that intuitive, evolved strategies may fail in novel environments.
The engine-versus-fuselage armor story demonstrates how missing evidence can mislead strategic decisions in real-world settings, such as risk assessment and resource allocation.
In education and public discourse, acknowledging missing evidence and seeking dissenting perspectives can prevent overconfidence and promote more robust conclusions.
The argument that there is no replication crisis in psychology emphasizes a shift from crisis-talk to a more mature understanding of context, measurement, and the iterative nature of science.
Key takeaways for exam preparation:
Distinguish between replication failure due to context vs. a true absence of effect.
Recognize the role of biases in interpreting evidence and the importance of seeking dissenting viewpoints.
Apply the two critical-thinking rules to evaluate claims and to identify missing evidence.
Remember famous illustrative cases (Darley & Gross; Newman et al.; Wald; Lord et al.) as concrete examples of how context and beliefs shape interpretation.
Notable Researchers and References Mentioned
Lisa Feldman Barrett – University Distinguished Professor of Psychology; Director of the Interdisciplinary Affective Science Laboratory, Northeastern University; author of "Seven and a Half Lessons About the Brain".
Francis Bacon – Novum Organum; origin of the scientific method and emphasis on critical thinking.
Willingham (2007) – critique of educational interventions in teaching critical thinking.
Darley & Gross (1983) – belief-driven interpretation of evidence based on presumed background.
Newman et al. (1980) – the missing-evidence trigram experiment demonstrating ease of noticing presence and difficulty of noticing absence.
Lord et al. (1979) – death penalty evidence and belief-driven interpretation.
Koehler (1993) – bias in evaluating scientific evidence.
Hart et al. (2009); Gesiarz et al. (2019); Kunda (1990) – work on confirmation bias and motivated reasoning.
Del Vicario et al. (2016) – echo chambers in online networks.
Wald (WWII) – armor allocation based on missing evidence.
Henry Gee (Nature) – philosophy of science: doubt as a core methodological tool.
Key figures: Reproducibility Project, 100 experiments, replication failure rate > 0.60; Newman et al. trigram task, about 34 trials when the special trigram contained T, never figured out when it lacked T.
Bar graphs: figure interpretations (Figure 2.12) illustrate misinterpretation when missing data is ignored.
Practical implication: when designing studies, consider context, missing evidence, and potential biases to avoid overgeneralization.
Summary Takeaways
Replication failures can illuminate the role of context, not simply invalidate findings.
Critical thinking requires doubting our own conclusions and actively seeking dissenting perspectives.
Humans are prone to biases that color how we interpret evidence, especially when beliefs or desires are at stake.
Considering missing evidence is essential to avoid erroneous conclusions and to identify new research directions.
Science progresses as a context-sensitive, iterative process rather than as a linear march toward universal, context-free truths.