what is the definition of psychology
the scientific study of mind and behaviour, carried out systematically.
why does psychology exist as a scientific discipline
because human intuition is powerful, persuasive and predictably biased
common sense is very good at explaining events after they happen but unreliable at predicting them in advance. What is this an example of?
post-hoc reasoning that feels explanatory but lacks predictive validity
confirmation bias refers to
the tendency to seek out information that confirms what we already believe, and overlook information that challenges it
what does it mean when we say that psychological conclusions are ‘probabilistic’?
psychological findings state what tends to happen under particular conditions, with important limitations
science is best characterised as
risking being wrong in a controlled and informative way
what is one reason psychology is described as a difficult discipline
people sometimes change their behaviour simply because they know they are being studied
what is the structure for scientific reasoning
claim → evidence → inference
what does the Haidt and Orben example illustrate
the same evidence, filtered through different assumptions, can support different conclusions.
where do Haidt and Orben primarily differ?
in how they interpret what the evidence justifies us in concluding → that is, the inference
why is one study never enough to establish a scientific claim?
because every study uses a particular sample, context, and measures that constrain what the evidence can justify
operationism in psychology refers to
the requirement that abstract psychological concepts be tied to specific, measurable operations
the role of statistics is best described as
making judgement more disciplined by helping evaluate uncertainty
according to Nickerson, confirmation bias is
a ubiquitous phenomenon that occurs across many contexts and domains
The Nickerson reading on confirmation bias is most relevant to which idea from the Week 1A lecture?
the tendency to seek confirming evidence and overlook disconfirming evidence
According to Nosek and Bar-Anan, what is the primary objective of public science?
to build a shared body of knowledge about nature through open communication
Nosek and Bar-Anan argue that truth in science emerges as a consequence of:
public scrutiny and open communication among the scientific community
According to Nosek and Bar-Anan, the main barriers to reforming scientific communication are
social → existing incentives and norms resist change even when better systems are available
Nosek and Bar-Anan identify several problems with the current scientific communication system. Name one.
the system is rooted in 17th century technologies that limit communication speed, completeness and access.
According to Nosek and Bar-Anan, what is an example of a problem caused by selective reporting in the current scientific communication system?
studies with null or negative results are less likely to be published, distorting the scientific record.
why does Auguste Comte argue that psychology is not a science
because it is rooted in deep explanation, which is not the basis of science; science should focus on observation and measurable phenomena.
what did Karl Popper argue
that what matters is not whether a theory can explain many things, but whether it can be proven wrong.
when a theory explains everything it ends up explaining nothing
why can psychology not rely on intuition alone
because of confirmation bias
What does Shadish et al.'s counterfactual framework require in order to claim that X caused Y?
Y would have been different if X had not occurred, all else being equal
what does it mean to say the phrase "correlation does not imply causation" is "dangerously incomplete"?
an observed association between X and Y is compatible with multiple competing causal stories.
what threat to causal inference does the ice cream and sunburn example illustrate?
confounding, because hot weather predicts both ice cream consumption and sun exposure.
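The logic of this card can be made concrete with a small simulation (illustrative only; all variable names and numbers are invented): ice cream consumption and sunburn are both driven by temperature, so they correlate strongly even though neither causes the other, and the association largely disappears once the confounder is adjusted for.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical data: hot weather drives BOTH ice cream consumption and sunburn.
temperature = rng.normal(25, 5, n)
ice_cream = temperature + rng.normal(0, 2, n)   # no causal link to sunburn
sunburn   = temperature + rng.normal(0, 2, n)   # no causal link to ice cream

# Raw association is strong even though neither variable causes the other.
r_raw = np.corrcoef(ice_cream, sunburn)[0, 1]
print(f"raw correlation: {r_raw:.2f}")          # clearly positive

# "Controlling" for the confounder: correlate the residuals left over
# after removing temperature's linear influence from each variable.
def residuals(y, x):
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept)

r_adj = np.corrcoef(residuals(ice_cream, temperature),
                    residuals(sunburn, temperature))[0, 1]
print(f"correlation after adjusting for temperature: {r_adj:.2f}")  # near zero
```

Note the adjustment only works here because the confounder was measured; as the later cards stress, statistics cannot adjust for confounders you never recorded.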
Students who attend optional exam preparation workshops perform worse on the final exam than non-attendees. Which threat to causal inference most likely explains this pattern?
selection effects, because academically struggling students are more likely to self-select into the workshops.
A university introduces a wellbeing program for first-year students, and reported stress levels decline over the semester. Which two alternative explanations make it difficult to attribute this decline to the program?
Maturation effects (students naturally adjusting to university life) and history effects (other events also occurring during the semester)
why is it that "statistics cannot rescue a design that does not support causal inference"?
statistics can adjust only for variables that were measured; they cannot create the missing counterfactual or remove unmeasured confounding
what is the most important guarantee provided by random assignment in an experiment?
differences between groups are not systematic → they are not driven by participants’ pre-existing characteristics.
a randomized experiment uses a self-report outcome measure completed by participants who know which condition they were in. What threat does random assignment NOT protect against here?
measurement bias, because participants’ self-reports may be influenced by their awareness of their assigned condition
what specific threat to causal inference is most directly addressed by the fact that a researcher assigns the treatment before measuring the outcome?
reverse causation → because the outcome cannot have caused the manipulation
how does the basis for causal inference in a quasi-experiment differ from that in a randomized experiment?
the researcher’s argument → that the comparison group plausibly approximates what random assignment would have produced → substitutes for the design feature of random assignment
according to Shadish et al., what is an INUS condition
an insufficient but non-redundant part of an unnecessary but sufficient condition
According to Shadish et al.'s counterfactual model, how is an "effect" defined?
the difference between what did happen when people received the treatment and what would have happened had they not received it
According to Mill's analysis of causal relationships, as described by Shadish et al., what are Mill's three required conditions for a causal relationship?
temporal precedence → the cause must occur before the effect in time
covariation → the presumed cause and effect must be related (i.e., when the cause changes, the effect changes)
no plausible alternative explanation for the effect, other than the presumed cause, can be found
According to Table 1.1 in Shadish et al., what is the defining feature that distinguishes a randomized experiment from other experimental designs?
units are assigned to receive the treatment or a comparison condition by a random process such as a coin toss or table of random numbers
According to Shadish et al., what is the key distinction between causal description and causal explanation?
causal description identifies the consequences attributable to a treatment
causal explanation clarifies the mechanisms and conditions through which that causal relationship holds.
what is a counterfactual claim
a statement about what would have happened if some condition had been different, even though that condition did not actually occur → we never directly observe the counterfactual: the outcome that would have occurred had the cause been absent.
all causal inference is an attempt to approximate the counterfactual
what are the four threats to causal inference
confounding
selection effects
reverse causation
history/maturation effects
what is confounding
the effect of a presumed cause cannot be separated from the effect of another variable that is related to both the cause and the outcome. There is a third variable explanation. Here, causal attributions become ambiguous.
Confounding is not something that appears after data are collected; it is baked into the design of the study. Statistics can sometimes adjust for confounders, but it cannot guarantee that all relevant confounders have been measured or controlled.
what are selection effects
when the groups differ before the cause. Differences in outcomes between groups reflect pre-existing differences in who ends up in each group rather than the effect of the treatment or exposure itself.
Selection effects focus on how people enter conditions in the first place, who is being compared to whom → often co-exists with confounding effects.
To estimate the causal effect of X, the group exposed to X must be comparable to the group not exposed to X → selection effects break this estimation.
what is reverse causation
we observe a relationship between X and Y and assume X causes Y, but in fact Y may be causing X. This is most common in observational research, and it is easy to miss because the assumed direction often feels intuitive.
If temporal order is unclear, causal direction is unclear. To say X caused Y, Y must depend on X occurring first.
Reverse causation undermines causal inference because the observed association is compatible with two opposing causal stories (the design does not account for which variable came first and the counterfactual comparison becomes ambiguous)
what are history and maturation effects
when we have a change over time that is not due to X. Something happens in the world, it coincides with the treatment, it plausibly affects the outcome, and it is not part of the treatment itself. When this happens, causality becomes ambiguous.
These are easy to miss because they often feel like the background or they’re contextual and they occur outside the researcher’s control. If treatment and history move together, we cannot separate their effects. To estimate the effect of X, we need to know what would have happened at the same time without X.
what can statistics do
estimate an association
quantify uncertainty
adjust for variables you measured
what are the three levels of experimental design
randomised experiments
quasi-experiments
correlational designs
Experiments buy you leverage. Quasi experiments buy you realism if the comparison is credible. Correlations give you patterns worth investigating. None give you everything.
experiments help because they change the causal logic.
what are the three components of experiments that help eliminate alternative explanations
manipulation
comparison groups
random assignment
what does manipulation ensure
manipulation creates causal asymmetry. The researcher sets the value of something (i.e. the intervention or treatment), which builds in temporal direction. The outcome cannot cause the manipulation. This is a causal upgrade.
what do comparison groups ensure
comparison groups approximate the counterfactual. The comparison group represents what would have happened otherwise. The closer the groups are at baseline, the better the counterfactual approximation.
what does random assignment ensure
random assignment reduces systematic alternatives. Random assignment is powerful because it breaks the link between who you are and what condition you end up in. It guarantees that differences between groups are not systematic.
They’re not driven by the group formation process. Random assignment makes confounding and selection explanations less plausible.
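The guarantee in this card can be sketched with a simulation (hedged, invented numbers; "motivation" stands in for any pre-existing characteristic): under random assignment the groups are balanced at baseline, whereas when people self-select into conditions they are not.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

# Hypothetical pre-existing characteristic (e.g. baseline motivation).
motivation = rng.normal(0, 1, n)

# Random assignment: condition is unrelated to who the participant is.
random_group = rng.integers(0, 2, n)

# Self-selection: highly motivated people are more likely to opt in.
opt_in = rng.random(n) < 1 / (1 + np.exp(-motivation))

# Baseline difference between groups under each group-formation process.
gap_random = abs(motivation[random_group == 1].mean()
                 - motivation[random_group == 0].mean())
gap_selected = abs(motivation[opt_in].mean() - motivation[~opt_in].mean())

print(f"baseline gap under random assignment: {gap_random:.3f}")   # ~0
print(f"baseline gap under self-selection:    {gap_selected:.3f}") # clearly > 0
```

The random-assignment gap shrinks toward zero as samples grow; the self-selection gap does not, because it is systematic rather than chance variation.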
what are quasi-experiments
designs that exploit real world variation to approximate causal logic without random assignment → you need to find a comparison that makes the counterfactual more plausible than just correlation (sometimes it's a policy change, timing, or a natural event).
in a quasi-experiment your strongest argument is your reasoning. You have to make the case that your comparison group approximates what random assignment would have produced.
Why can we never directly observe the counterfactual for the same individual at the same time?
We cannot directly observe a counterfactual because an individual can generally receive only one treatment at one specific, fixed point in time. There could be multiple causes and subsequent effects we are trying to investigate, each of which implies a different counterfactual. Shadish et al. illustrate this with the example of investigating PKU.
why are positive findings more publishable than negative findings
positive findings can create a stronger causal claim
it is also due to publication bias → journals and readers are drawn to studies that report positive results.
how might selective publication distort the literature on causal claims?
researchers often don't write up null results (method, process etc.) → this creates the file drawer effect, whereby negative results are less likely to be reported, resulting in an inflated false positive rate in the published literature (limiting legitimate claims)
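A minimal sketch of the file drawer effect (simulated data, hypothetical study sizes): every study below tests a true-null effect at α = .05, so roughly 5% come out "positive" by chance alone; if only those are published, the published record consists entirely of false positives.

```python
import numpy as np

rng = np.random.default_rng(7)
n_studies, n_per_group = 10_000, 30

# Hypothetical scenario: the true effect is ZERO in every study.
a = rng.normal(0, 1, (n_studies, n_per_group))
b = rng.normal(0, 1, (n_studies, n_per_group))

# Approximate two-sample z test for the group difference in each study.
se = np.sqrt(a.var(axis=1, ddof=1) / n_per_group
             + b.var(axis=1, ddof=1) / n_per_group)
z = (a.mean(axis=1) - b.mean(axis=1)) / se
significant = np.abs(z) > 1.96

print(f"studies run:        {n_studies}")
print(f"'positive' studies: {significant.sum()}")  # ~5% by chance alone

# If only the 'positive' studies are written up and published, every
# published finding here is a false positive, even though each individual
# study used a conventional alpha of .05.
```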
how is a construct described in psychological research?
an abstract theoretical property that is not directly observable, such as anxiety or resilience
A researcher decides to measure "academic motivation" by recording the number of hours each student spends in the library per week. This decision is an example of
operationalisation
A depression questionnaire consistently produces similar scores for stable patients over time, but an expert review concludes the items reflect general distress rather than depression specifically. In terms of reliability and validity, this scale has
high reliability and low validity
Cronbach's alpha (α) is primarily a measure of
the internal consistency of items within a scale
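Since several cards lean on Cronbach's alpha, here is its formula in code (a sketch on simulated data; the item counts and noise levels are invented). A high alpha only shows the items covary, which echoes the later point that alpha cannot demonstrate unidimensionality or validity.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Internal consistency for a respondents x items score matrix.

    alpha = k/(k-1) * (1 - sum of item variances / variance of total score)
    """
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

rng = np.random.default_rng(1)

# Hypothetical 4-item scale: each item = shared trait + item-specific noise,
# so the items hang together and alpha should be high.
trait = rng.normal(0, 1, (500, 1))
consistent = trait + rng.normal(0, 0.5, (500, 4))
print(f"alpha (related items):   {cronbach_alpha(consistent):.2f}")  # high

# Unrelated items share nothing, so alpha should be near zero.
unrelated = rng.normal(0, 1, (500, 4))
print(f"alpha (unrelated items): {cronbach_alpha(unrelated):.2f}")   # near 0
```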
Studies A and B both claim to measure "happiness" but reach opposite conclusions about whether parents are happier than non-parents. Study A uses momentary mood sampling; Study B uses global life-satisfaction ratings. The most likely explanation for the different conclusions is
the two operationalisations capture different aspects of happiness
what could be a possible finding that would best demonstrate convergent validity for a new loneliness scale?
a strong positive correlation with an established social isolation measure
A researcher wants to show discriminant validity for a new academic self-efficacy scale. What outcome would provide the best evidence?
the scale correlates only weakly with a broad measure of general self-esteem
Sensitivity to change (responsiveness) refers to a scale's ability to
detect genuine change in the underlying construct over time
The jingle fallacy occurs when (and give an example)
the same label is used to describe constructs that are empirically distinct
For example, anxiety → we could be referring to state anxiety, trait anxiety, social anxiety, or physiological arousal. These are not interchangeable: they have different causes, different consequences, and different relationships with other variables. Yet two studies may both use the word anxiety in the title and abstract.
the jangle fallacy occurs when (and give an example)
we use different labels for things that are essentially the same, assuming they must be different because they sound different.
For example, grit was described as its own construct. However, when researchers looked at the measures, grit correlated so strongly with conscientiousness that it was difficult to argue the two were meaningfully distinct.
A newly developed "perseverance" scale correlates r = .91 with the established Grit Scale. This is an example of
the jangle fallacy, because two different labels appear to refer to the same construct
According to Flake et al. (2017), "arbitrary operationalism" refers to
making measurement choices without principled, theory-based justification
What was the central finding of Flake et al.'s (2017) analysis of scales published in a leading personality and social psychology journal?
for the majority of scales, Cronbach’s alpha was the only psychometric evidence reported.
what is Flake et al.’s three phase framework
the substantive phase, which evaluates whether items reflect the construct’s theoretical definition and whether a scale's items adequately represent the full theoretical breadth of the construct
the structural phase, which uses factor analysis to confirm item grouping and examine the psychometric properties of the measure
the external phase, which tests the scale’s relationships with other constructs and outcomes.
Flake et al. found that nearly half of the scales in their sample had no citation. What is the significance of this finding?
many published scales were used with no prior validation evidence provided
Flake et al. argue that arbitrary operationalism contributes to the replication crisis in psychology. Which mechanism do they describe?
studies using the same construct label but different measures may fail to replicate due to measurement inconsistency, not because the original finding was false.
what were the four main findings of the Flake et al. paper
nearly half the scales were developed on the fly. They appeared to have been created by the authors for that particular study, with little or no evidence that the scores actually measured what they claimed to measure → no track record of validity
Cronbach’s coefficient alpha was often offered as the sole evidence of validity. This treats alpha as evidence of unidimensionality, which it isn't: unidimensionality is a prerequisite for computing alpha, not something alpha can demonstrate.
many researchers adapted or modified previous scales without new validation.
many scales were ‘short’ scales (three items or fewer), which cannot capture the full breadth of the construct they are meant to measure
what is reliability
reliability means consistency → if the underlying thing hasn’t changed, your measure should give similar results. If your measure is unreliable, you have a basic problem → reliability is a minimum requirement. Reliability is necessary but it is never sufficient.
what is validity
validity is whether you’re measuring what you claim → The evidence based claim that a measure actually captures the construct it’s intended to capture rather than something else. This can be difficult because constructs are not directly observable → there is no ground truth to compare against.
validity is built through argument and evidence. It is provisional
what is discriminant validity
ensures that tests or measures that are not supposed to be related are, in fact, not related, confirming that a construct is distinct from others
what is convergent validity
the degree to which two or more measures (e.g., tests, items, scales) that theoretically should be related are, in fact, strongly related
where does the causal inference (measurement) problem exist?
the gap between the invisible things you care about and the data you collect is where measurement problems exist.
What defines an observational study in research design?
the researcher records what occurs in the world without intervening to change it
In a survey, participants are asked how often they exercise per week. They systematically overestimate. This is best described as an example of
social desirability bias, because exercise is socially valued and people tend to present themselves favourably.
what specific threat to causal inference does random assignment address?
confounding, by distributing pre-existing differences across conditions rather than concentrating them in one group
what is one strength of observational research that controlled laboratory experiments often lack?
the capacity to capture behaviour as it actually occurs in real world contexts, without artificial constraints
A laboratory experiment uses psychology undergraduates completing an artificial word-list memorisation task. The study has high internal validity but raises concern about a different form of validity. Which one, and why?
external validity, because the restricted sample and artificial task may not generalise to other populations and real world memory
A large cross-sectional survey finds that people who exercise regularly report lower depression scores. Why can this finding alone not support the causal claim that exercise reduces depression?
regular exercisers may differ from non-exercisers on many unmeasured variables → income, sleep, or personality → that could independently explain lower depression
A lab experiment finds that a mindfulness exercise reduces self-reported anxiety. A longitudinal survey separately finds that regular mindfulness practitioners report less anxiety over time. What does convergence across these two designs provide?
a stronger basis for the causal claim, because each design’s weaknesses are offset by the other’s strengths
A researcher surveys 2,000 Australian adults and reports that 42% experience moderate or higher work-related stress. According to the inference ladder this is best classified as
a descriptive claim, because it states the prevalence of a characteristic in a sample without examining relationships between variables.
A cross-sectional survey finds that people who use social media more frequently report lower life satisfaction. The headline reads: "Social media lowers life satisfaction.” What is the key problem with this framing?
the word ‘lowers’ implies causation, but the observational design only supports an associative claim.
A longitudinal study measures screen time and depression in adolescents at two time points a year apart. Higher screen time at Time 1 predicts higher depression at Time 2. A researcher concludes: "Screen time contributes to adolescent depression." What inferential limitation remains?
temporal order shows screen time preceded depression, but confounding variables could still explain the association.
Applying the three diagnostic questions, what does the absence of manipulation tell you about a study's causal inference?
reverse causation cannot be ruled out, because the researcher did not control which variable came first
A randomised experiment assigns university students to a two-week social media break or normal continued use. The break group reports lower anxiety afterward. The paper concludes: "Reducing social media use improves mental health." What type of claim inflation is present?
scope inflation, because ‘mental health’ is a far broader claim than the single self-reported anxiety measure used in the study
A cross-sectional survey of 800 employees finds that high-mindfulness employees report lower burnout. What is the most accurate characterisation of the inference that mindfulness "reduces" burnout?
not justified, because the cross-sectional design cannot rule out confounders or establish which variable preceded the other
what does calibrated confidence require when evaluating a research finding?
matching the degree of belief in a claim to what the study’s design actually supports → neither dismissing all evidence nor accepting it uncritically.
what are some vulnerabilities of self-report
social desirability → people present themselves favourably.
memory distortion → memory is a reconstruction (memories are shaped by attitude and mood)
the gap between perception and behaviour
question design effects → how you word a question shapes what you measure.
what are some advantages of observational studies
ecological validity → the extent to which the study represents what happens in the real world (more likely when you capture behaviour as it actually happens, not under artificial conditions or contrived instructions)
ethics and feasibility
discovery → how science generates questions (often where science begins)
what is internal validity
confidence that the manipulation caused the outcome
what is external validity
confidence that the findings generalise beyond the study
what is triangulation
when multiple designs converge on the same conclusion, confidence increases → the strengths of each design compensate for the weakness of the others.
what is scientific knowledge
an accumulation of imperfect studies whose weaknesses don’t overlap.
what is language inflation
when the words used to describe a finding imply a stronger claim than the design supports → this is the most common critical thinking failure
how do the three diagnostic questions help determine where you are on the inference ladder?
all three present → causal claims
two present → associative claims (quasi-experiment)
none present → descriptive claims (observational)
why were scientific ethical rules developed
Written in response to specific harms → during WW2, doctors working in Nazi concentration camps conducted experiments on prisoners without consent, or regard for suffering.
They justified the work as medical science → at the Nuremberg trials the tribunal introduced a set of principles (the Nuremberg Code) for what makes human research ethically permissible.
what was the Helsinki Declaration
the Nuremberg Code was not enough on its own: ethically problematic research was still being conducted.
The next set of principles was issued by the World Medical Association (WMA Declaration of Helsinki) → this addressed components left out of the Nuremberg Code (i.e. the difference between therapeutic and non-therapeutic research, and researcher responsibilities)