Psychology Research Methods cumulative final exam study set
Empiricism
Using evidence from the senses, or from instruments that assist the senses, as the basis for conclusions; the foundation of the scientific method
Quantitative methods
Research methods that turn empirical observations into numbers (e.g., survey scores, reaction times); contrast with qualitative methods
Qualitative methods
Research methods that create rich descriptions not simplified into numbers (e.g., focus group themes, interview transcripts); contrast with quantitative methods
Falsifiability
A property of a good theory: it must lead to hypotheses that, when tested, could actually fail to support the theory; central to the Theory-Data Cycle
Theory-Data Cycle
The scientific process in which theories lead to research questions → research design → data collection → back to refining or supporting the theory; supporting data strengthens the theory, non-supporting data leads to revised theories or improved research design
Basic research
Research conducted to increase the general body of knowledge on a topic, without immediate practical application; contrast with applied research
Applied research
Research conducted to solve practical problems; findings are directly applied to a real-world solution; contrast with basic research
Peer review
The process by which submitted journal articles are evaluated for quality by other experts before publication; a key part of the scientific community's self-correction process
Empirical journal article
Reports the methods and results of a research study for the first time; follows Intro/Methods/Results/Discussion format; primary source for new evidence
Review journal article
Summarizes all published studies from a particular research area; useful for seeing the weight of evidence across many studies
Frequency claim
A claim describing how often a behavior or characteristic occurs; involves ONE variable; external validity is especially important; e.g., "13.2% of non-college young adults use marijuana daily"
Association claim
A claim that two variables are related to each other; involves TWO OR MORE variables; construct and statistical validity are most important; does NOT establish causation
Causal claim
A claim that one variable directly causes changes in another; requires all three causal criteria (covariance, temporal precedence, internal validity); can only be established with true experiments
Variable
Any measured or manipulated characteristic that can take on different values or levels
Levels
The specific values or categories that a variable can take; a variable must have at least 2 levels
Conceptual variable
The theoretical idea or construct being studied (the abstract definition); paired with an operational definition that specifies how it is measured
Operational definition
The researcher's specific decision about how to measure or manipulate the conceptual variable for the study; determines construct validity
Measured variable
A variable that is observed and recorded but not controlled by the researcher; used in correlational studies and as the DV in experiments
Manipulated variable
A variable whose levels are controlled and assigned by the researcher; the IV in an experiment; allows causal claims
Construct validity
How well a conceptual variable is operationalized; does the measure accurately reflect the intended variable? Relevant to ALL THREE types of claims; assessed through reliability and empirical validity evidence (criterion, convergent, discriminant)
External validity
How well the results of a study generalize to people and contexts beyond the study's participants and settings; especially critical for frequency claims; often sacrificed in experiments to gain internal validity; threatened by WEIRD samples and convenience sampling
Statistical validity
The extent to which a study's numerical estimates are reasonable, precise, and replicable; relevant to ALL THREE claim types; strengthened by replication and by larger samples, which yield narrower confidence intervals (an interval that does not contain zero indicates a statistically significant result)
Internal validity
A study's ability to eliminate alternative explanations for an observed association; relevant ONLY to causal claims and experiments; threatened by confounds, selection effects, and order effects; increased by random assignment
Covariance (causal criterion)
The cause and effect variables must be observed to go together; the first of three criteria for causation; established by showing a significant correlation or group difference
Temporal precedence (causal criterion)
The cause variable must clearly come before the effect variable in time; the second of three criteria for causation; established by experimental manipulation or longitudinal design; threatened by the directionality problem in correlational research
Internal validity (causal criterion)
Must rule out all alternative explanations (third-variable problem) for the relationship; the third criterion for causation; best established by random assignment in a true experiment
Self-report measure
Operationalizes a variable by recording people's answers to questions about themselves in a questionnaire or interview; most important reliability type = internal reliability (Cronbach's alpha); also needs criterion validity to confirm self-reports predict actual behavior
Observational measure
Operationalizes a variable by recording observable behaviors or physical traces of behaviors (aka behavioral measure); most important reliability type = interrater reliability
Physiological measure
Operationalizes a variable by recording biological data (e.g., brain activity, salivary cortisol); requires special equipment; often used alongside self-report and observational measures to triangulate findings
Categorical variable
A variable whose levels are categories with no numerical meaning (aka nominal variable); e.g., first language, experimental condition; numbers assigned are labels only
Ordinal scale
A quantitative scale where numerals represent a ranked order, but intervals between ranks may be unequal; e.g., bestseller rankings — we know #1 outsold #2, but not by how much
Interval scale
A quantitative scale with equal intervals between levels but NO true zero; cannot make ratio statements; e.g., temperature in Celsius — 0° doesn't mean "no temperature"; most questionnaire scales (e.g., Diener's well-being scale) are treated as interval
Ratio scale
A quantitative scale with equal intervals AND a true zero; allows ratio statements like "twice as much"; e.g., number of correct answers on a test, reaction time in ms
Reliability
How consistent the results of a measure are; a measure cannot be more valid than it is reliable (reliability is necessary but not sufficient for validity); three types: test-retest, interrater, and internal reliability
Test-retest reliability
Participants receive similar scores when measured at two different time points; important for theoretically STABLE constructs (e.g., personality, IQ); low test-retest is expected for constructs that should change over time; visualized with a scatterplot of Time 1 vs. Time 2 scores
Interrater reliability
Consistent scores are obtained regardless of who is doing the rating; critical for OBSERVATIONAL measures; measured with Pearson r for continuous variables or Cohen's kappa for categorical variables
Internal reliability
Participants give a consistent pattern of answers across multiple items measuring the same construct (aka internal consistency); relevant for MULTI-ITEM SCALES (e.g., Diener's 5-item well-being scale); measured by AIC and Cronbach's alpha
Cohen's kappa
A statistic used when two observers are rating a CATEGORICAL variable; measures the extent to which raters place participants in the same categories; used to establish interrater reliability
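Worked example (a minimal sketch, not part of the original study set): kappa compares the observed proportion of agreement with the agreement expected by chance from each rater's category frequencies. The behavior codes and the function name `cohens_kappa` are hypothetical.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters on categorical codes, corrected for chance."""
    n = len(rater_a)
    # Proportion of cases where the two raters chose the same category
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's category frequencies
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical codes from two observers watching the same ten children
rater_a = ["play", "play", "idle", "play", "idle", "idle", "play", "play", "idle", "play"]
rater_b = ["play", "play", "idle", "idle", "idle", "idle", "play", "play", "play", "play"]
kappa = cohens_kappa(rater_a, rater_b)
print(round(kappa, 2))
```

Here the raters agree on 8 of 10 cases, but kappa is lower than .80 because some of that agreement would be expected by chance alone.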
Average inter-item correlation (AIC)
The average of all correlations between items in a multi-item scale; values between .15 and .50 indicate items go reasonably well together; used to assess internal reliability
Cronbach's alpha
Combines the AIC and number of items to measure internal reliability; closer to 1.0 = better; ≥.80 desired for self-report measures; the most commonly reported reliability statistic
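Worked example (a minimal sketch under stated assumptions): the code below computes the AIC for a small multi-item scale and then the standardized form of alpha, k * AIC / (1 + (k - 1) * AIC), which combines the AIC with the number of items k. Statistical packages often report a variance-based alpha that can differ slightly. The item responses and function names are hypothetical.

```python
from itertools import combinations
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def average_inter_item_correlation(items):
    """Mean of the correlations between every pair of items."""
    pairs = list(combinations(items, 2))
    return sum(pearson_r(a, b) for a, b in pairs) / len(pairs)

def standardized_alpha(items):
    """Standardized alpha from the AIC and the number of items."""
    k = len(items)
    aic = average_inter_item_correlation(items)
    return k * aic / (1 + (k - 1) * aic)

# Hypothetical responses: three items (rows) answered by five participants
items = [
    [1, 2, 3, 4, 5],
    [2, 1, 4, 3, 5],
    [1, 3, 2, 5, 4],
]
aic = average_inter_item_correlation(items)
alpha = standardized_alpha(items)
print(round(aic, 2), round(alpha, 2))
```

Adding more items with the same AIC raises alpha, which is why long scales can reach high alpha values even when individual items correlate only moderately.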
Correlation coefficient (r)
A number from -1 to +1 indicating the slope (direction) and spread (strength) of a scatterplot relationship; used to measure reliability (test-retest, interrater) and effect size for association claims; near 0 = weak, near ±1 = strong
Face validity
A SUBJECTIVE judgment that a measure looks like it measures what it intends to measure; weakest form of validity evidence; does not require empirical data
Content validity
A SUBJECTIVE judgment that a measure contains all components the theoretical construct should include; requires knowledge of the conceptual definition; e.g., a good IQ test covers multiple categories of intelligence
Criterion validity
An EMPIRICAL validity type; the measure correlates with a relevant behavioral outcome; two types of evidence: correlational evidence and the known-groups paradigm
Convergent validity
An EMPIRICAL validity type; the measure is more strongly correlated with measures of similar constructs; e.g., a depression scale should correlate strongly (and negatively) with a well-being scale; usually evaluated alongside discriminant validity as a pattern of correlations
Discriminant validity
An EMPIRICAL validity type; the measure is less strongly correlated with measures of dissimilar constructs (aka divergent validity); e.g., a depression scale should NOT strongly correlate with introversion or phobia scales; evaluated alongside convergent validity
Known-groups paradigm
A method of establishing CRITERION validity by testing whether scores on a measure discriminate between groups whose behavior is already confirmed; e.g., comparing salivary cortisol in people about to give a speech vs. audience members
Bivariate correlation
An association involving exactly TWO variables; visualized with a scatterplot; strength and direction described by r; the foundation of association claims
Effect size
The strength of a relationship between two or more variables; for correlations, described by r; for group differences, described by Cohen's d; context-dependent — even small effect sizes can matter if they compound; average r in psychology is ~.15–.20
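Worked example (a minimal sketch): Cohen's d expresses a difference between two group means in standard deviation units, using the pooled standard deviation of the groups. The scores and the function name `cohens_d` are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group_1, group_2):
    """Standardized mean difference using the pooled standard deviation."""
    n1, n2 = len(group_1), len(group_2)
    pooled_var = ((n1 - 1) * variance(group_1) + (n2 - 1) * variance(group_2)) / (n1 + n2 - 2)
    return (mean(group_1) - mean(group_2)) / sqrt(pooled_var)

# Hypothetical DV scores for two conditions
treatment = [6, 7, 8, 9, 10]
control = [4, 5, 6, 7, 8]
d = cohens_d(treatment, control)
print(round(d, 2))
```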
Statistically significant correlation
A correlation unlikely to have come from a population where the true association is zero; indicated by confidence intervals that do NOT contain zero; does not automatically mean the effect is large or important
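Worked example (a minimal sketch): one common way to approximate a 95% confidence interval for r is the Fisher z transformation. The code shows that the same r = .20 gives an interval that contains zero with 50 participants but excludes zero with 100. The values of r and n and the function name are hypothetical.

```python
from math import atanh, tanh, sqrt

def r_confidence_interval(r, n, z_crit=1.96):
    """Approximate 95% CI for a correlation via the Fisher z transformation."""
    z = atanh(r)                      # transform r to the z scale
    margin = z_crit / sqrt(n - 3)     # standard error of z is 1 / sqrt(n - 3)
    return tanh(z - margin), tanh(z + margin)

low_small, high_small = r_confidence_interval(0.20, 50)
low_large, high_large = r_confidence_interval(0.20, 100)

# "Significant" here means the interval does not contain zero
significant_small = not (low_small <= 0 <= high_small)
significant_large = not (low_large <= 0 <= high_large)
print(significant_small, significant_large)
```

The effect size is identical in both cases; only the precision of the estimate changes with sample size.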
Outlier
An extreme score that stands out from the data and can disproportionately influence r; most problematic when extreme on BOTH variables and when the sample is small; inspect scatterplots to detect
Restriction of range
When there is not a full range of scores on one variable, making the correlation appear smaller than it really is; can be corrected statistically; a threat to statistical validity
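Worked example (a minimal sketch with made-up numbers): the same data yield a smaller r when only the middle of the range on x is sampled. The variables and the cutoffs of 7 and 14 are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores: y rises with x, plus a fixed back-and-forth "noise" of 3 points
x = list(range(1, 21))
noise = [3 if i % 2 == 0 else -3 for i in range(20)]
y = [xi + e for xi, e in zip(x, noise)]

# Full range of x versus only the cases with x between 7 and 14
r_full = pearson_r(x, y)
kept = [(xi, yi) for xi, yi in zip(x, y) if 7 <= xi <= 14]
r_restricted = pearson_r([p[0] for p in kept], [p[1] for p in kept])
print(round(r_full, 2), round(r_restricted, 2))
```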
Curvilinear association
A relationship between two variables that is not a straight line (e.g., positive then negative); r may be near 0 even when a real relationship exists; always inspect the scatterplot, not just r
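Worked example (a minimal sketch with made-up numbers): for a perfectly symmetric inverted-U relationship, r across the whole range is zero even though each half of the range shows a strong association.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical inverted-U: y is highest in the middle of the x range
x = [-3, -2, -1, 0, 1, 2, 3]
y = [9 - xi ** 2 for xi in x]

r_overall = pearson_r(x, y)        # whole range
r_left = pearson_r(x[:4], y[:4])   # rising half
r_right = pearson_r(x[3:], y[3:])  # falling half
print(round(r_overall, 2), round(r_left, 2), round(r_right, 2))
```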
Directionality problem
In correlational research, the inability to determine which variable came first (aka reverse causation); threatens temporal precedence — one of the three causal criteria; solved by experimental manipulation or longitudinal cross-lag design
Third-variable problem
A threat to internal validity where an unmeasured variable explains the relationship between two studied variables; one reason correlational studies cannot establish causation; addressed by multiple regression or experimental random assignment
Spurious association
A correlation that exists only because of a third variable; disappears when the third variable is controlled for or when groups are separated; e.g., height and hair length (actually due to gender)
Moderator
A variable whose level affects the relationship between two other variables; relevant to external validity — tells us for WHOM or in WHAT CONTEXT an association holds; contrast with mediator, which explains WHY
Longitudinal design
Measures the same variables in the same people at several points in time; used to establish temporal precedence; produces cross-sectional correlations, autocorrelations, and cross-lag correlations
Cross-sectional correlations
Correlations between two variables measured at the SAME point in time; does not establish temporal precedence
Autocorrelations
Correlations of each variable with ITSELF at two different time points; shows how stable a variable is over time
Cross-lag correlations
Correlations between the EARLIER measure of one variable and the LATER measure of the other; most important for establishing temporal precedence; helps address the directionality problem
Multiple regression
A statistical technique evaluating whether a relationship between two key variables holds when controlling for other variables; helps address the third-variable problem; does NOT establish causation; cannot control for unmeasured variables
Criterion variable
The variable researchers are most interested in predicting or understanding in regression (aka dependent variable); the outcome being explained
Predictor variables
Variables used to explain or predict the criterion variable in regression (aka independent variables); beta values show their relative contributions
Beta (β)
A standardized regression coefficient; indicates direction and strength of a relationship between a predictor and the criterion variable; values within the same table can be directly compared to each other; contrast with unstandardized "b" coefficients
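Worked example (a minimal sketch): with exactly two predictors, the standardized betas can be computed directly from the three bivariate correlations. The correlation values and the function name `standardized_betas` are hypothetical. The second call illustrates a relationship that drops to zero once a third variable is controlled for.

```python
def standardized_betas(r_y1, r_y2, r_12):
    """Betas for two predictors, from their correlations with the criterion
    (r_y1, r_y2) and with each other (r_12)."""
    denom = 1 - r_12 ** 2
    beta_1 = (r_y1 - r_y2 * r_12) / denom
    beta_2 = (r_y2 - r_y1 * r_12) / denom
    return beta_1, beta_2

# Predictor 1 correlates .50 with the criterion, but its beta is smaller
# once the overlapping predictor 2 is controlled for
beta_1, beta_2 = standardized_betas(r_y1=0.50, r_y2=0.40, r_12=0.60)

# A bivariate r of .30 that is fully accounted for by the third variable
spurious_beta, third_var_beta = standardized_betas(r_y1=0.30, r_y2=0.60, r_12=0.50)
print(round(beta_1, 3), round(beta_2, 3), round(spurious_beta, 3))
```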
Parsimony
A good theory explains a phenomenon with the fewest exceptions or qualifications; in causal research, the simplest explanation for a pattern of data; e.g., cigarette chemicals cause cancer explains many patterns more parsimoniously than any third-variable alternative
Mediator
A variable that explains the MECHANISM (the "why") through which one variable affects another (aka mediating variable); theoretically meaningful — it is the causal story (A → mediator → B); contrast with moderator (changes the STRENGTH of a relationship) and third variable (an accidental nuisance)
Independent variable (IV)
The manipulated (causal) variable in an experiment; plotted on the x-axis; must have at least two levels; establishes temporal precedence over the DV
Dependent variable (DV)
The measured outcome variable in an experiment; plotted on the y-axis; what changes as a result of the IV
Control variable
A variable held constant by the experimenter to eliminate alternative explanations; increases internal validity; technically not a "variable" since it does not vary
Comparison group
A group in an experiment whose IV level differs from the treatment group in a meaningful way; necessary for establishing covariance (causal criterion 1)
Control group
A level of the IV representing "no treatment" or a neutral/baseline condition (aka control condition)
Treatment group
Participants exposed to the level of the IV involving a medication, therapy, or intervention
Placebo group
A control group exposed to an inert treatment (e.g., a sugar pill); used to rule out the placebo effect as an alternative explanation
Confound
A potential alternative explanation for a research finding; a general threat to internal validity; includes design confounds, selection effects, and order effects
Design confound
A second variable that varies SYSTEMATICALLY with the IV, providing an alternative explanation for results; a threat to internal validity in experiments; eliminated by careful experimental design
Selection effect
A threat to internal validity in INDEPENDENT-GROUPS designs when participants at one level of the IV are systematically different from those at another level; eliminated by random assignment or matched groups
Matched groups
Participants similar on a measured variable are grouped, then randomly assigned to different conditions; controls for selection effects while also reducing noise
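Worked example (a minimal sketch of one way to carry out the procedure): participants are ranked on the matching variable, grouped into sets the size of the number of conditions, and then randomly assigned within each set. The IQ scores, participant IDs, and function name are hypothetical.

```python
import random

def matched_groups(scores, conditions, seed=0):
    """Rank by the matching variable, then randomly assign within each matched set."""
    rng = random.Random(seed)
    ranked = sorted(scores, key=scores.get)   # participant IDs, lowest to highest score
    assignment = {}
    for start in range(0, len(ranked), len(conditions)):
        matched_set = ranked[start:start + len(conditions)]
        shuffled = list(conditions)
        rng.shuffle(shuffled)                 # random order of conditions for this set
        for participant, condition in zip(matched_set, shuffled):
            assignment[participant] = condition
    return assignment

# Hypothetical matching variable: IQ
iq = {"P1": 98, "P2": 121, "P3": 105, "P4": 117, "P5": 93, "P6": 110}
groups = matched_groups(iq, ["treatment", "control"])
print(groups)
```

Each matched pair (the two lowest scorers, the two middle scorers, the two highest scorers) contributes one participant to each condition, so the groups are equated on IQ while assignment stays random.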
Independent-groups design
Each participant experiences ONLY ONE level of the IV (aka between-groups design); threatened by selection effects; avoids order effects because no participant experiences more than one condition
Within-groups design
Each participant is presented with ALL levels of the IV; more powerful (controls individual differences) but threatened by order effects; requires counterbalancing
Posttest-only design
An independent-groups experiment where participants are tested on the DV only once, after the manipulation; simpler but cannot show change over time
Pretest/posttest design
An independent-groups experiment where participants are tested on the DV BOTH before and after the manipulation; allows measurement of change but adds a testing threat
Repeated-measures design
A within-groups experiment where participants respond to the DV after each level of the IV; powerful but susceptible to order effects; requires counterbalancing
Concurrent-measures design
A within-groups experiment where participants experience all IV levels at roughly the same time; a single attitudinal or behavioral preference is the DV
Order effect
In within-groups designs, exposure to one condition changes responses to a later condition; threatens internal validity; includes practice effects and carryover effects; eliminated by counterbalancing
Practice effect
A type of order effect where performance changes over time because of experience with the task, NOT the manipulation; participants may improve with practice or decline as they get tired or bored (the latter is also called a fatigue effect); eliminated by counterbalancing
Carryover effect
A type of order effect where contamination from one condition carries over to the next; e.g., a drug's residual effects; eliminated by counterbalancing or sufficient washout periods
Counterbalancing
Presenting the levels of the IV in different sequences across participants to control for order effects in within-groups designs; can be full or partial (Latin square)
Full counterbalancing
ALL possible condition orders are represented; only feasible when there are few conditions (e.g., 2 conditions = 2 orders; 3 conditions = 6 orders)
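Worked example (a minimal sketch): the number of possible orders is the factorial of the number of conditions, and the orders can be enumerated as permutations. The condition labels, participant IDs, and the round-robin assignment are hypothetical.

```python
from itertools import permutations

def full_counterbalance(conditions):
    """Every possible order of the conditions."""
    return list(permutations(conditions))

orders = full_counterbalance(["A", "B", "C"])

# Cycle through the orders so each one is used equally often
participants = [f"P{i}" for i in range(1, 13)]
schedule = {p: orders[i % len(orders)] for i, p in enumerate(participants)}
print(len(orders), schedule["P1"])
```

With four conditions the same function returns 24 orders, which shows why full counterbalancing quickly becomes impractical.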
Partial counterbalancing
SOME, but not all, possible condition orders are represented; used when full counterbalancing is impractical; e.g., Latin square
Latin square
A formal partial counterbalancing system ensuring every condition appears in each position at least once; efficient for larger numbers of conditions
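Worked example (a minimal sketch of one simple construction): rotating the list of conditions by one position per row gives a square in which every condition appears once in each position. This cyclic version does not balance which condition immediately precedes which; other Latin square constructions address that. The condition labels are hypothetical.

```python
def latin_square(conditions):
    """Cyclic Latin square: row i is the condition list rotated by i positions."""
    k = len(conditions)
    return [[conditions[(row + pos) % k] for pos in range(k)] for row in range(k)]

square = latin_square(["A", "B", "C", "D"])
for order in square:
    print(" ".join(order))
```

Four conditions need only 4 orders here, compared with 24 under full counterbalancing.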
Demand characteristic
A cue that leads participants to guess a study's hypotheses or goals and change their behavior accordingly (aka experimental demand); a threat to internal validity; reduced by masked/double-blind designs
Manipulation check
An extra DV included to verify that the IV manipulation actually worked as intended; if the manipulation check fails, the experiment's results are hard to interpret
Pilot study
A study completed before the main study to test and refine the effectiveness of manipulations; improves construct validity of the IV
One-group pretest/posttest design
A researcher tests one group before and after a treatment with NO comparison group; vulnerable to maturation, history, regression, testing, and instrumentation threats
Maturation threat
An observed change could have emerged spontaneously over time regardless of treatment; relevant to pretest/posttest and quasi-experimental designs; e.g., children improving in reading naturally over a school year
History threat
An external or historical event (not the treatment) explains a change in the treatment group; most relevant to one-group designs and interrupted time-series designs; e.g., a public health campaign running at the same time as an intervention
Regression to the mean
Extreme findings tend to be closer to the mean on retesting because the same chance factors are unlikely to repeat; a natural statistical phenomenon, not caused by treatment
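Worked example (a minimal simulation sketch): each simulated person has a stable true score plus independent chance error on two tests, and no treatment is given. People selected for extreme scores on the first test score closer to the mean on the second. The distributions, sample size, and seed are arbitrary assumptions.

```python
import random
from statistics import mean

rng = random.Random(42)
n = 2000

# Stable true score plus independent chance error on each testing occasion
true_ability = [rng.gauss(100, 10) for _ in range(n)]
test_1 = [t + rng.gauss(0, 10) for t in true_ability]
test_2 = [t + rng.gauss(0, 10) for t in true_ability]

# Select the top 10% of scorers on the first test
cutoff = sorted(test_1)[int(0.9 * n)]
top = [i for i in range(n) if test_1[i] >= cutoff]

first_mean = mean(test_1[i] for i in top)
retest_mean = mean(test_2[i] for i in top)
print(round(first_mean, 1), round(retest_mean, 1))
```

The selected group still scores above average on retest because their true scores are high, but their retest mean is lower than their first-test mean because the favorable chance error does not repeat.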
Regression threat
A threat to internal validity in pretest/posttest designs where extreme pretest scores naturally move toward the mean at posttest, making it look like the treatment worked; related to regression to the mean
Attrition threat
Participants drop out before the study ends; threatens internal validity when the dropout is SYSTEMATIC, that is, when a particular type of participant leaves and dropout is related to the IV or DV; e.g., the sickest patients dropping out of a drug trial