Notes on Hypothesis Testing and Variables (PSYC 2101)
Hypothesis Testing and Variables: Comprehensive Study Notes
Core idea: In behavioral sciences, we study relationships between variables to determine whether one variable predicts or causes changes in another. This unfolds through concepts of variable types, research design, hypothesis testing, and data ethics.
The Two Branches of Statistics
Descriptive statistics: organize, summarize, and communicate a group of numerical observations.
Examples: computing a mean length, summarizing a dataset with a few numbers.
Inferential statistics: use sample data to make estimates or inferences about a larger population.
Examples: estimating population parameters from a sample, drawing conclusions beyond the observed data.
Further detail: Inferential statistics involve probability theory to assess the likelihood that observed sample differences or relationships exist in the broader population, often using concepts like p-values and confidence intervals.
Important framing: Most real-world data come from samples, not entire populations; the goal is to generalize with appropriate uncertainty.
Observations and Variables: Four Types of Variables (Table 1-1 concept)
Variables are observations of characteristics that can take on different values. They can be discrete or continuous.
Discrete observations can take only certain values (often whole numbers); these are typically counted.
Continuous observations can take on an infinite set of values within a given range (subject to measurement precision); these are typically measured.
Four main types of variables used to quantify observations:
Nominal: categories or names; discrete; examples include nationality, gender categories (when coded).
Analytical implication: Only frequency counts and modes are meaningful. No inherent order or numerical value.
Ordinal: rankings; discrete; examples include finishing place in a race (1st, 2nd, 3rd, …).
Analytical implication: Provides order but not equal intervals between ranks (e.g., the difference between 1st and 2nd might not be the same as 2nd and 3rd). Medians are often appropriate.
Interval: numeric values with equal intervals between adjacent values; can be discrete or continuous; no true zero point (e.g., temperature in Celsius/Fahrenheit).
Analytical implication: Allows for addition and subtraction, and calculation of means, but ratios are not meaningful due to the lack of a true zero.
Ratio: numeric values with equal intervals and a meaningful zero (e.g., distance, weight, reaction time).
Analytical implication: Supports all arithmetic operations (addition, subtraction, multiplication, division), allowing for meaningful ratios (e.g., 10 kg is twice as heavy as 5 kg).
In statistics, scale variables often refer to interval or ratio variables (continuous).
Quick-reference (typical coding):
Nominal: Discrete — Always; Continuous — Never
Ordinal: Discrete — Always; Continuous — Never
Interval: Discrete — Sometimes; Continuous — Sometimes
Ratio: Discrete — Seldom; Continuous — Almost always
Practical takeaway: Nominal and ordinal variables are inherently discrete; interval and ratio variables are (potentially) continuous and are treated as scale variables in many analyses.
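The quick-reference above can be sketched as a lookup table. This is an illustrative helper, not part of the notes; the "summaries" column is an added reminder of which statistics are typically meaningful at each level of measurement.

```python
# Illustrative lookup: the four measurement levels, their typical
# discrete/continuous coding, and the summary statistics usually
# considered meaningful at each level.
MEASUREMENT_LEVELS = {
    "nominal":  {"discrete": "always",    "continuous": "never",
                 "summaries": ["mode", "frequency counts"]},
    "ordinal":  {"discrete": "always",    "continuous": "never",
                 "summaries": ["mode", "median"]},
    "interval": {"discrete": "sometimes", "continuous": "sometimes",
                 "summaries": ["mode", "median", "mean"]},
    "ratio":    {"discrete": "seldom",    "continuous": "almost always",
                 "summaries": ["mode", "median", "mean", "ratios"]},
}

def meaningful_summaries(level: str) -> list[str]:
    """Return the summary statistics usually meaningful for a level."""
    return MEASUREMENT_LEVELS[level.lower()]["summaries"]

print(meaningful_summaries("ordinal"))   # ['mode', 'median'] -- no mean
```

The point of the table: as you move from nominal to ratio, more arithmetic becomes meaningful, which in turn widens the set of analyses you can justify.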
Transforming Observations into Variables (Key Concepts)
Predictor (independent) variable: the variable that is manipulated or observed to predict or cause changes in the outcome. In experiments, this is the variable the researcher controls.
Outcome (dependent) variable: the response measured to assess the effect of the predictor. It is expected to change in response to the independent variable.
Confounding variable: a variable that systematically varies with the predictor and can obscure which variable actually affects the outcome. It provides an alternative explanation for an observed relationship.
Impact: Confounding variables make it difficult to establish a clear cause-and-effect relationship because the observed effect could be due to the confound rather than the intended predictor.
Reliability vs validity:
Reliability: consistency of a measure across time or raters (e.g., bathroom scale yielding the same weight on repeated trials).
Types of reliability: Includes test-retest reliability (consistency over time), inter-rater reliability (consistency across different observers), and internal consistency (consistency among items within a scale).
Validity: whether the measure actually assesses what it intends to assess (e.g., whether a taste measure truly reflects taste ability or a construct).
Types of validity: Includes content validity (does it cover all aspects of the construct?), criterion validity (does it correlate with other relevant measures?), and construct validity (does it measure the theoretical construct it's supposed to?).
An operational definition specifies the exact procedures used to measure or manipulate a variable, making abstract concepts concrete and measurable.
A good measure is typically both reliable and valid; a measure can be reliable but not valid (consistent but measuring the wrong thing).
Examples:
The “What Dog Breed Are You?” online quiz illustrates reliability/validity concerns when results vary across attempts.
Rap Genius’s rhyme density as a metric for rapper quality:
Rhyme density defined as: RD = \frac{N_{\text{rhymes}}}{N_{\text{syllables}}}
Eminem’s Without Me: RD = 0.49; Notorious B.I.G.’s Juicy: RD = 0.23
Ranking of rappers is ordinal; rhyme density itself is a ratio variable (continuous).
Practical note: In many social science contexts, researchers operationalize abstract concepts (e.g., “loneliness,” “ability”) into concrete measures (survey items, test scores, behavioral counts).
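A minimal sketch of the rhyme-density computation. The syllable and rhyme counts below are hypothetical, chosen only to reproduce the densities quoted above, not taken from Rap Genius's actual data.

```python
def rhyme_density(n_rhymes: int, n_syllables: int) -> float:
    """Rhyme density RD = N_rhymes / N_syllables."""
    if n_syllables <= 0:
        raise ValueError("syllable count must be positive")
    return n_rhymes / n_syllables

# Hypothetical counts that reproduce the densities in the notes:
print(rhyme_density(49, 100))   # 0.49 (Without Me)
print(rhyme_density(23, 100))   # 0.23 (Juicy)
```

Because RD is a ratio of two counts with a true zero, statements like "Without Me is roughly twice as rhyme-dense as Juicy" are meaningful, which is exactly what the ratio level of measurement buys you.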
Predictor, Outcome, and Confounding Variables: Examples
Wellness programs example (illustrative):
Predictor: access to a wellness program (yes vs no) or the level of access.
Outcome: health outcomes (e.g., exercise frequency, health costs, days absent).
Confounds: preexisting healthy behaviors that correlate with program use; cannot justify causality without random assignment. Random assignment is crucial because it ensures that, on average, groups are similar on all other variables (known and unknown) at the start of the experiment, thus isolating the effect of the predictor.
Pet ownership example (1.29):
Predictor: pet ownership (none vs at least one pet).
Outcome: loneliness (operationalized via questionnaire).
Levels: pet ownership (2 levels); social activity (2 levels).
John Snow's cholera data highlight the classic confound structure: proximity to a particular well, with water contamination as the upstream predictor of cholera deaths.
Hypothesis Testing: Core Process
Hypothesis testing aims to determine whether the observed relation between variables is supported by the data, given sampling variability.
Null Hypothesis (H_0): States there is no effect or no relationship between variables in the population (e.g., the mean of group A is equal to the mean of group B, or correlation is zero).
Alternative Hypothesis (H_1 or H_a): States there is an effect or relationship (e.g., the mean of group A is not equal to the mean of group B, or the correlation is not zero). This is often the researcher's prediction.
Statistical Significance: A result is statistically significant if the p-value (probability of observing the data, or more extreme data, if the null hypothesis were true) is below a predetermined threshold (e.g., alpha = 0.05). This suggests the observed effect is unlikely due to random chance alone.
Key concepts:
Operational definitions of the independent (predictor) and dependent (outcome) variables are necessary to test hypotheses.
Correlational studies examine relations between naturally occurring variables but cannot establish causality; they identify associations.
Experiments test causal relations by manipulating an independent variable and randomly assigning participants to conditions, which helps control confounds.
Correlational vs experimental goals:
Correlational studies describe associations and identify potential relationships.
Experiments allow causal inference when random assignment ensures equivalent groups on average.
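Why random assignment works can be shown in a few lines. The "baseline fitness" scores below are simulated, illustrative numbers: randomly splitting participants into two groups tends to equalize the groups on every preexisting variable, measured or not, which is what licenses causal inference.

```python
import random

random.seed(42)
# Hypothetical preexisting variable for 1,000 participants.
baseline = [random.gauss(50, 10) for _ in range(1000)]

# Random assignment: shuffle, then split into two equal groups.
random.shuffle(baseline)
treatment, control = baseline[:500], baseline[500:]

def mean(xs):
    return sum(xs) / len(xs)

diff = mean(treatment) - mean(control)
print(f"baseline difference after random assignment: {diff:.2f}")
```

With 500 participants per group, the expected baseline difference is zero and its typical size is well under one point; any later outcome difference can therefore be attributed to the manipulation rather than to preexisting differences.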
Between-Groups vs Within-Groups Designs
Between-groups design:
Participants experience one and only one level of the independent variable (e.g., wellness access vs no access).
Example: random assignment of employees to have access to a wellness program vs not.
Advantage: Avoids carryover effects from experiencing multiple conditions.
Within-groups design (repeated-measures):
The same participants experience multiple levels of the independent variable (e.g., pre/post measures, or exposure to different video speeds).
Advantage: controls for between-subject variability because each participant serves as their own control, increasing statistical power; disadvantage: potential carryover effects (e.g., practice, fatigue) and confounds if not properly counterbalanced (ordering of conditions).
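Full counterbalancing, mentioned above as the remedy for carryover effects, can be sketched as follows. The condition names are illustrative (borrowing the "video speeds" example): every possible ordering of the conditions is used, so practice and fatigue effects are spread evenly across orders.

```python
from itertools import permutations

# Conditions of a hypothetical within-groups study.
conditions = ["slow video", "normal video", "fast video"]

# Full counterbalancing: one participant group per possible order.
orders = list(permutations(conditions))  # 3 conditions -> 3! = 6 orders

for i, order in enumerate(orders, start=1):
    print(f"group {i}: {' -> '.join(order)}")
```

Note how quickly this grows: four conditions already require 4! = 24 orders, which is why partial schemes such as Latin squares are often used instead.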
Practical notes:
Longitudinal designs are often within-groups (participants followed over time).
Practical constraints may force correlational designs when random assignment or manipulation is unethical or impractical (e.g., hurricane exposure, smoking, or demographic variables like gender).
Ethics in data collection and randomization are central to designing valid studies.
Data Ethics, Open Science, and Preregistration
Data ethics: principles guiding data collection, analysis, interpretation, and reporting to reduce bias and errors and promote transparency.
Open science: a broader movement toward sharing materials, data, and analysis plans to allow replication and critique.
Benefits: Enhances the credibility and reproducibility of research findings, fosters collaboration, and accelerates scientific discovery.
Preregistration: committing to a study design and analysis plan before collecting data; time-stamped record helps prevent HARKing (Hypothesizing After the Results are Known).
Mechanism: By specifying hypotheses, sample size, measures, and analysis plan beforehand, preregistration distinguishes confirmatory research (testing pre-specified hypotheses) from exploratory research (generating new hypotheses).
HARKing analogy: the Texas sharpshooter parable describes how retargeting hypotheses after seeing results can inflate false-positive rates; preregistration mitigates this issue.
Severe testing: a concept from philosophy of science (Karl Popper; Deborah Mayo) describing exposing hypotheses to rigorous tests to reveal weaknesses; helps improve robustness and credibility of findings.
Data ethics in practice:
Many educational examples discuss how studies may be preregistered, and how to avoid questionable practices like selective reporting or flexible analysis paths.
Open data and transparent reporting reduce replication failures and questionable research practices.
Reliability, Validity, and Measurement Quality
Reliability: consistency of a measurement over time or across raters.
Validity: whether a measurement actually measures what it claims to measure.
Helpful analogies:
A bathroom scale that shows the same weight over time is reliable but not necessarily valid if it’s systematically off from true weight.
A “What Dog Breed Are You?” quiz could be reliable (similar results across trials) but not valid as a personality measure.
Practical takeaway: A good instrument should be both reliable and valid to support trustworthy conclusions.
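The bathroom-scale analogy can be made concrete with a small simulation. All numbers here are illustrative: the hypothetical scale reads 4 kg light every time (invalid) but with almost no random error (highly reliable).

```python
import random

random.seed(0)
TRUE_WEIGHT = 70.0   # kg, the quantity we want to measure
BIAS = -4.0          # systematic error: the scale is consistently off
NOISE_SD = 0.1       # tiny random error: readings barely vary

# Ten repeated weighings on the same (biased but consistent) scale.
readings = [TRUE_WEIGHT + BIAS + random.gauss(0, NOISE_SD)
            for _ in range(10)]

spread = max(readings) - min(readings)                     # small => reliable
mean_error = sum(readings) / len(readings) - TRUE_WEIGHT   # large => invalid
print(f"spread = {spread:.2f} kg, mean error = {mean_error:.2f} kg")
```

The spread (consistency) stays well under a kilogram while the mean error sits near -4 kg: reliability without validity, exactly the dissociation the analogy is making.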
Operationalizing Variables: Examples from the Transcript
Operationalizing investor/artist popularity (Table examples):
“Artist popularity” could be operationalized via Billboard album sales, concert attendance, TikTok followers, etc.
Operationalizing earnings for comedians (Forbes example):
Operational definition used: pretax gross income; primary income source must come from concert ticket sales.
Critics argue the caveat is narrow; broader definitions could change rankings (e.g., Ellen DeGeneres, Mindy Kaling examples).
Operationalizing the predictor/outcome in various contexts (examples provided in the transcript):
The predictor variables for loneliness could include pet ownership (none vs at least one) and social activity (never vs at least once).
For the wine-rating study, predictor: weather (temperature, rainfall); outcome: expert ratings (discrete scale).
Reports of charities and donation metrics:
Charity Navigator vs GiveWell: Charity Navigator uses financial metrics and accountability; GiveWell uses broader criteria including impact and cost-effectiveness.
Tier vs score: tier is ordinal; score (57.11/70) is a scale measure.
Real-World Examples and Case Studies (From Transcript)
Kentucky Derby variable types (1.34):
Finishing position: ordinal (1st, 2nd, 3rd)
Finishing time: ratio (seconds with decimals) — a scale variable
Derby attendance: a count; ratio and discrete, typically treated as a scale variable
Payoffs per $2 bet: numerical; scale (continuous)
Jockey demographics (race/gender): nominal categories; discrete
Jockey demographics across venues (PSYC 2101 context): potential socio-demographic research
1.26 A study of average distance walked by 130 urban Indians per week:
a) Sample size: 130
b) Population: all urban residents in India
c) Is the average descriptive or inferential? Inferential if used to generalize beyond the 130.
d) Operationalizing average distance weekly as ordinal: e.g., categorize into bins (0–1 km, 1–3 km, etc.).
e) Operationalizing as scale: measure exact distance in kilometers.
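Parts (d) and (e) can be sketched in code: the same walking-distance observations operationalized as a scale (ratio) variable in exact kilometers, or as an ordinal variable by binning. The bin edges follow the example above; the open-ended third bin is an illustrative catch-all.

```python
def distance_to_ordinal(km: float) -> str:
    """Collapse an exact weekly distance (scale) into an ordered category."""
    if km < 1:
        return "0-1 km"
    elif km < 3:
        return "1-3 km"
    else:
        return "3+ km"

exact = [0.5, 2.2, 7.8]                            # scale operationalization
binned = [distance_to_ordinal(d) for d in exact]   # ordinal operationalization
print(binned)  # ['0-1 km', '1-3 km', '3+ km']
```

Notice what binning costs: the ordinal version can still be ordered, but the information that 7.8 km is more than twice 2.2 km is gone, which constrains the analyses available later.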
HARKing and preregistration (1.45, 1.19, 1.18):
HARKing explained as modifying hypotheses after results are known to fit results; preregistration helps prevent this.
Open science and data mishaps (1.44):
Cases like Wansink scandal illustrate the importance of transparency, data sharing, and robust data practices.
RapMetrics and rhyme density (1.38):
Operationalizing rapper quality via rhyme density: RD = N_rhymes / N_syllables; examples for Eminem, Notorious B.I.G., MF Doom, Cam’ron.
Ranking is ordinal; rhyme density is a continuous variable.
Experimental vs correlational research with vaping and health (1.43):
Correlational: observing vaping and health outcomes without random assignment; confounds like other health behaviors.
Experimental: possible to assign vaping exposure to test causal effects (ethics aside, often not feasible).
Data ethics and operational definitions (1.9, 1.10, 1.12):
Operational definitions are central to testing hypotheses; predictor vs outcome definitions clarify what is being tested.
Typical Questions and Concepts (Selected Q&A from the Transcript)
Difference between descriptive vs inferential statistics:
Descriptive: summarize data from a sample.
Inferential: use sample data to infer about a population.
Difference between a sample and a population:
Sample: the observed subset; population: the entire group of interest.
What is preregistration?
A time-stamped plan detailing research design and analysis before data collection.
What is HARKing?
Hypothesizing after the results are known; preregistration helps prevent it.
Reliability and validity distinctions:
Reliability: consistency; validity: accuracy to what is intended.
Predictor vs confounding variable decisions:
Predictor is what you test to see if it predicts the outcome; confounds threaten causal interpretation unless controlled.
Connections to Foundational Principles and Real-World Relevance
Causality vs correlation remains a central concern in policy-making and workplace health programs.
Open science and preregistration strengthen credibility of findings used to inform practice and policy.
Operationalization matters: the way a variable is defined directly affects what relationships we observe and how we interpret them.
Ethical considerations in data collection, reporting, and replication shape the integrity of psychological science.
Real-world applications include evaluating wellness programs, charitable give/get metrics, and consumer behavior analytics; these rely on rigorous design to distinguish causation from association.
Quick References to Key Terms (Glossary Snippets)
Predictor variable (independent variable): manipulated or observed to predict an outcome.
Outcome variable (dependent variable): the response measured in relation to the predictor.
Confounding variable: varies with the predictor and may influence the outcome, threatening causal conclusions.
Descriptive statistics: summarize sample data.
Inferential statistics: generalize from sample to population.
Reliability: consistency of a measurement.
Validity: accuracy of a measurement in measuring what it intends to measure.
Operational definition: concrete procedures to measure or manipulate a variable.
Between-groups design: different participants at each level of the IV.
Within-groups design: same participants experience multiple levels of the IV.
Hypothesis testing: evaluating whether observed data support a hypothesis.
Preregistration: pre-specifying hypotheses and analysis plans before data collection.
HARKing: hypothesizing after results are known.
Severe testing: rigorous evaluation of a hypothesis to reveal weaknesses.
Open science: sharing data, methods, and analyses to enable replication.
Scale variable: another term for interval/ratio variables; typically continuous.
Ordinal variable: ranked, discrete values.
Nominal variable: categorized names without implied order.
Interval vs ratio nuance: interval has equal distances but no true zero; ratio has a meaningful zero.
Summary Takeaways
Understanding variable types (nominal, ordinal, interval, ratio) and their discrete/continuous nature is foundational for choosing analyses.
Distinguish predictor (independent) vs outcome (dependent) variables; guard against confounding variables, ideally via random assignment.
Experimental designs (between- and within-groups) are preferred for causal claims; correlational designs reveal relationships but not causality.
Reliability and validity are essential for trustworthy measurement; a reliable measure can be invalid, but valid measures must be reliable.
Data ethics and open science practices, including preregistration and severe testing, improve replicability and credibility of findings.
Operational definitions are central to hypothesis testing and interpreting results in real-world research.