Descriptive Statistics & Research Methods – Lecture 1 Vocabulary

Introduction & Rationale for Statistics in Psychology

Lecturer: Dr William Coventry (first of two statistics lectures in PSYC 01/2002)
Core message
- Psychology prides itself on being a “hard science” because of its rigorous statistical methodology.
- Mastery of statistics is indispensable for honours projects, theses, and any research career.
- Although stats may feel intimidating, psychological statistics are usually “not too mathematical.”
Practical advice
- “Own it and get on top of it.” Early competence pays dividends later.
- Textbook study is still required; lecture covers only selected chapters.

Lecture Roadmap (Descriptive Statistics Only)

Distinction: Descriptive vs Inferential Statistics (inferential to be covered next lecture)
Topics today
1. Quantitative vs Qualitative methods
2. Experimental research design basics
3. Statistical tools: t-tests and correlations
4. Fundamental descriptive measures (central tendency, variability, graphs, effect sizes)

Quantitative vs Qualitative Methods

Quantitative
- Numerical data, statistical analysis
- Primary focus in undergraduate psychology, which sets it apart from many other social-science disciplines
Qualitative
- Narrative, thematic, non-numerical
- Valuable but less emphasised in core psych stats units
Real-world note: Multidisciplinary labs often hire psych graduates for their superior statistical training.

Experimental Research Design (Randomised Controlled Trials)

Purpose: Establish cause → effect
Structure
- Multiple conditions: ≥1 experimental vs ≥1 control group
- Random assignment to minimise pre-existing group differences
- Manipulation applied only to experimental group
- Pre- and post-testing of dependent variables
Terminology
- Independent Variable (IV) = manipulated or grouping factor
- Dependent Variable (DV) = outcome that “depends on” IV
Ethical / practical note: Sometimes RCTs are impossible (e.g., withholding a life-saving drug); alternatives include observational or correlational designs.
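
The random-assignment step in the structure above can be sketched in a few lines of Python. This is a minimal illustration, not a procedure taken from the lecture: the function name `randomly_assign`, the participant IDs, and the seed are all hypothetical.

```python
import random


def randomly_assign(participants, seed=None):
    """Shuffle participants and split them into two near-equal groups."""
    rng = random.Random(seed)      # seeded generator so the split can be reproduced
    shuffled = list(participants)  # copy so the caller's sequence is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"experimental": shuffled[:half], "control": shuffled[half:]}


# Hypothetical participant IDs 1..20, for illustration only
groups = randomly_assign(range(1, 21), seed=42)
print(groups)
```

Because chance alone decides who lands in which group, pre-existing differences (ability, motivation, and so on) should balance out on average rather than pile up in one condition.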

Independent vs Dependent Variables (Metaphor)

Fertiliser & Plants example
- IV: Application of fertiliser (yes/no)
- DV: Plant growth (height, biomass)
- DV value “depends on” the IV manipulation

Example Study: “IQ Is a Muscle” Intervention

Research Question: Does telling children that intelligence is malleable improve mathematics performance?
Groups
- Experimental: Received sessions framing IQ as a “muscle that gets stronger with use.”
- Control: Received sessions on memory/academic topics (equal contact time; no growth-mindset content).
DV: Math grades (continuous)
IV: Group membership (categorical: experimental vs control)
Result (single-time-point view)
- Experimental group’s mean math score > Control’s mean math score
- “Nothing more to it”: this illustrates the simplest use of a t-test

Importance of Multiple Time-Points & Baseline Equivalence

Actual study recorded 3 time points (pre-intervention, immediate post, later post)
- Baseline math scores were nearly equal (the experimental group was slightly higher, which slightly weakens the causal claim)
- Post-intervention divergence supports effect of manipulation
Take-home: Pre-test measures allow stronger inference of causality.
Additional follow-up (Time 4) would reveal whether effects persist.
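
To make the baseline logic concrete, the sketch below compares the groups at pre-test and then looks at each group's change from pre to post. The scores are made up for illustration and are NOT data from the study; the helper name `mean_change` is also hypothetical.

```python
from statistics import mean

# Made-up math scores for illustration only (not data from the study)
scores = {
    "experimental": {"pre": [62, 70, 65, 68], "post": [70, 78, 72, 76]},
    "control":      {"pre": [61, 69, 66, 67], "post": [62, 70, 66, 69]},
}


def mean_change(group):
    """Mean post-test score minus mean pre-test score for one group."""
    return mean(group["post"]) - mean(group["pre"])


# Baseline equivalence: how far apart were the groups before the intervention?
baseline_gap = mean(scores["experimental"]["pre"]) - mean(scores["control"]["pre"])

# Divergence: how much did each group move from pre to post?
change = {name: mean_change(group) for name, group in scores.items()}

print(f"Baseline gap (experimental - control): {baseline_gap:.2f}")
print(f"Mean change by group: {change}")
```

A small baseline gap combined with a clearly larger change in the experimental group is the pattern that supports attributing the difference to the manipulation.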

Variable Types & Their Statistical Consequences

Categorical (Discrete) Variables
- Distinct groups/levels (e.g., male/female/non-binary; fail/pass/credit/distinction/HD)
- For t-tests, ideal when variable has exactly 2 levels
Continuous Variables
- Numeric continuum (e.g., 0–100 exam mark, height in cm)
Golden rule for beginners
- Two continuous vars → Correlation
- One categorical (2-level) + one continuous → t-test
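
The golden rule is effectively a small lookup, which can be written out as below. The function name `choose_test` and its string labels are hypothetical, and the sketch deliberately covers only the two cases the rule names.

```python
def choose_test(type_a, type_b, n_levels=None):
    """Return the beginner-level tool suggested by the golden rule.

    type_a, type_b: "continuous" or "categorical"
    n_levels: number of levels of the categorical variable, if there is one
    """
    types = sorted([type_a, type_b])  # order of the two variables does not matter
    if types == ["continuous", "continuous"]:
        return "correlation"
    if types == ["categorical", "continuous"] and n_levels == 2:
        return "t-test"
    return "not covered by the golden rule"


print(choose_test("continuous", "continuous"))               # drinks vs hangover severity
print(choose_test("categorical", "continuous", n_levels=2))  # group vs math grade
print(choose_test("categorical", "continuous", n_levels=5))  # five grade bands
```

The last call returns the fallback string because a categorical variable with more than two levels falls outside the two-case rule.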

Statistical Tool 1: t-Tests

Purpose: Test whether the means of two independent groups differ
Inputs
- IV: Categorical (2 groups)
- DV: Continuous
Calculation considers
1. Difference of group means (\bar X_1 - \bar X_2)
2. Spread around those means (pooled variance s_p^2)
3. Formula (simplified): t = \dfrac{\bar X_1 - \bar X_2}{SE}, where SE = standard error of the difference between the means
Interpretation
- Larger mean gap + smaller within-group variance ⇒ larger |t| (more “impressive”)
- Visual cue: Less overlap of score distributions strengthens result
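
The interpretation above (same mean gap, less spread, larger |t|) can be checked numerically. The sketch below computes the pooled-variance t statistic with Python's standard library. The scores are made up, and the function name `pooled_t` is hypothetical. It assumes the usual pooled form, s_p^2 = ((n_1 - 1) s_1^2 + (n_2 - 1) s_2^2) / (n_1 + n_2 - 2) and SE = \sqrt{s_p^2 (1/n_1 + 1/n_2)}.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(group1, group2):
    """Independent-samples t statistic using the pooled variance s_p^2."""
    n1, n2 = len(group1), len(group2)
    # statistics.variance is the sample variance (divides by n - 1)
    s2_pooled = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    se = sqrt(s2_pooled * (1 / n1 + 1 / n2))  # standard error of the mean difference
    return (mean(group1) - mean(group2)) / se


# Made-up scores: both pairs have the same means (76 vs 71) but different spread
tight_exp, tight_ctl = [74, 78, 71, 80, 77], [70, 72, 68, 75, 70]
wide_exp, wide_ctl = [66, 86, 61, 90, 77], [60, 82, 58, 85, 70]

t_tight = pooled_t(tight_exp, tight_ctl)
t_wide = pooled_t(wide_exp, wide_ctl)
print(f"tight groups: t = {t_tight:.2f}")
print(f"wide groups:  t = {t_wide:.2f}")
```

Both comparisons have a 5-point mean gap, yet the tightly clustered groups give the larger t because their distributions overlap less.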

Statistical Tool 2: Correlation (r)

Definition: Single number summarising linear association between two continuous variables
Range: -1 \le r \le 1
- r = 0 → no linear relation
- r = +1 → perfect positive; r = -1 → perfect negative
Strength (|r|)
- Rough guidelines: |r| ≈ 0.1 weak, 0.3 moderate, ≥ 0.5 strong
Direction
- Positive: Variables move together (↑ drinks → ↑ hangover severity)
- Negative: Variables move in opposite directions (↑ drinks → ↓ driving ability)
Each scatter-plot dot = one participant (paired scores on X & Y axes)
Effect size role: Correlation itself is an effect-size metric.
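
As a rough illustration, r can be computed straight from its definition (sum of cross-products over the square root of the product of the sums of squares) and then labelled with the guidelines above. The data are made up, and the function names are hypothetical. The "negligible" label for |r| below 0.1 is an assumption added here, since the guidelines do not name that band.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation computed directly from the definition."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))  # cross-products
    sxx = sum((x - mx) ** 2 for x in xs)                    # sum of squares for X
    syy = sum((y - my) ** 2 for y in ys)                    # sum of squares for Y
    return sxy / sqrt(sxx * syy)


def label_r(r):
    """Attach the rough strength and direction labels to a correlation."""
    size = abs(r)
    if size >= 0.5:
        strength = "strong"
    elif size >= 0.3:
        strength = "moderate"
    elif size >= 0.1:
        strength = "weak"
    else:
        strength = "negligible"  # assumed label; not part of the guidelines
    direction = "positive" if r > 0 else "negative" if r < 0 else "no direction"
    return f"{strength}, {direction}"


# Made-up paired scores: one entry per participant
drinks = [0, 1, 2, 3, 4, 5]
hangover = [0, 1, 1, 3, 4, 5]   # severity rating
driving = [9, 8, 8, 6, 4, 3]    # skill rating

for name, ys in [("hangover", hangover), ("driving", driving)]:
    r = pearson_r(drinks, ys)
    print(f"drinks vs {name}: r = {r:.2f} ({label_r(r)})")
```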

Worked Examples for Correlation

Drinks vs Hangover Severity
- Positive correlation; more drinks, worse hangover.
Drinks vs Driving Skill
- Negative correlation; more drinks, poorer driving.
Vitamin D vs COVID-19 Severity
- Hypothesis predicted negative correlation (high vitamin D → mild COVID).
- Empirical finding: r \approx -0.06 (negligible in size, “miserable”).
- RCT meta-analysis likewise shows no meaningful protective effect of vitamin D supplementation.

Descriptive Statistics: The Big Picture

Aim: Summarise data before any inferential claims
Core measures
- Central tendency: Mean, median, mode
- Variability: Range, variance (s^2), standard deviation (s)
- Effect sizes: Correlation (r), Cohen’s d, etc.
- Graphs: Histograms, scatter-plots, box-plots
Philosophical note
- Over-complex statistics can obscure insights; clear graphics often suffice (Gerd Gigerenzer).
- Psychology remains a “hard science” even when using simple descriptive visuals.
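
The core measures listed above are all available in Python's standard `statistics` module. The sketch below is a minimal example with made-up exam marks. The helper names `summarise` and `cohens_d` are hypothetical, and Cohen's d is computed in its common pooled-standard-deviation form, which is an assumption since the notes do not give a formula for it.

```python
from math import sqrt
from statistics import mean, median, multimode, stdev, variance


def summarise(scores):
    """Central tendency and variability for one set of scores."""
    return {
        "mean": mean(scores),
        "median": median(scores),
        "mode": multimode(scores),            # list, because ties are possible
        "range": max(scores) - min(scores),
        "variance": variance(scores),         # sample variance s^2 (n - 1)
        "sd": stdev(scores),                  # sample standard deviation s
    }


def cohens_d(group1, group2):
    """Mean difference divided by the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    return (mean(group1) - mean(group2)) / sqrt(pooled_var)


# Made-up data for illustration only
marks = [55, 62, 62, 70, 71, 78, 85]
summary = summarise(marks)
d = cohens_d([74, 78, 71, 80, 77], [70, 72, 68, 75, 70])

print(summary)
print(f"Cohen's d = {d:.2f}")
```

Unlike t, Cohen's d does not grow with sample size, which is why it is reported as an effect size rather than a test statistic.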

Ethical & Philosophical Implications Discussed

RCT feasibility & ethics (e.g., denying a potentially life-saving treatment is unethical; parallels with smoking & lung cancer research)
Proper use of p-values; misuse leads to misunderstandings (topic for next lecture)
Encouragement to balance statistical sophistication with transparent data storytelling

Connections to Future Content

Next lecture = Inferential Statistics (p-values, significance tests, deeper use of t-tests & correlations)
Understanding today’s foundations makes later concepts (e.g., ANOVA, regression) much easier.

Key Takeaways & Study Tips

Master t-tests & correlations first; they are “everywhere” in psychological science.
Always identify variable type (categorical vs continuous) before choosing a test.
Use multiple pre/post measurements to bolster causal inference in experiments.
Remember that effect size and practical significance matter as much as (or more than) p-values.
Reinforce learning via external tutorials and practise interpreting scatter-plots & group means.

Quick Formula Reference

t-test (independent; SE written with separate group variances, which equals the pooled form when n_1 = n_2): t = \dfrac{\bar X_1 - \bar X_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}
Pearson correlation: r = \dfrac{\sum (X_i - \bar X)(Y_i - \bar Y)}{\sqrt{\sum (X_i - \bar X)^2 \; \sum (Y_i - \bar Y)^2}}

Closing Remarks

Descriptive stats are the foundation; inferential stats build on them.
Upcoming session will “trample” misconceptions about p-values and significance.
Until then, focus on understanding basic group comparisons, variable types, and interpreting scatter-plots.