Descriptive Statistics & Research Methods – Lecture 1 Vocabulary
Introduction & Rationale for Statistics in Psychology
- Lecturer: Dr William Coventry (first of two statistics lectures in PSYC 01/2002)
- Core message
- Psychology prides itself on being a “hard science” because of its rigorous statistical methodology.
- Mastery of statistics is indispensable for honours projects, theses, and any research career.
- Although stats may feel intimidating, psychological statistics are usually “not too mathematical.”
- Practical advice
- “Own it and get on top of it.” Early competence pays dividends later.
- Textbook study is still required; lecture covers only selected chapters.
Lecture Roadmap (Descriptive Statistics Only)
- Distinction: Descriptive vs Inferential Statistics (inferential to be covered next lecture)
- Topics today
- Quantitative vs Qualitative methods
- Experimental research design basics
- Statistical tools: t-tests and correlations
- Fundamental descriptive measures (central tendency, variability, graphs, effect sizes)
Quantitative vs Qualitative Methods
- Quantitative
- Numerical data, statistical analysis
- Primary focus in undergraduate psychology, which sets it apart from many other social-science disciplines
- Qualitative
- Narrative, thematic, non-numerical
- Valuable but less emphasised in core psych stats units
- Real-world note: Multidisciplinary labs often hire psych graduates for their superior statistical training.
Experimental Research Design (Randomised Controlled Trials)
- Purpose: Establish cause → effect
- Structure
- Multiple conditions: ≥1 experimental vs ≥1 control group
- Random assignment to minimise pre-existing group differences
- Manipulation applied only to experimental group
- Pre- and post-testing of dependent variables
- Terminology
- Independent Variable (IV) = manipulated or grouping factor
- Dependent Variable (DV) = outcome that “depends on” IV
- Ethical / practical note: Sometimes RCTs are impossible (e.g., withholding a life-saving drug); alternatives include observational or correlational designs.
- Fertiliser & Plants example
- IV: Application of fertiliser (yes/no)
- DV: Plant growth (height, biomass)
- DV value “depends on” the IV manipulation
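Random assignment, as described in the structure above, can be sketched in a few lines. This is a minimal illustration only: the function name `randomly_assign`, the plant labels, and the seed are hypothetical and do not come from the lecture.

```python
import random

def randomly_assign(participants, seed=None):
    """Shuffle the units and split them into two (near-)equal groups."""
    rng = random.Random(seed)      # a seeded generator makes the split reproducible
    shuffled = list(participants)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"experimental": shuffled[:half], "control": shuffled[half:]}

# Hypothetical units for the fertiliser example: 20 plants.
plants = [f"plant_{i}" for i in range(1, 21)]
groups = randomly_assign(plants, seed=42)

# IV = group membership (fertiliser yes/no); DV = growth measured afterwards.
print(len(groups["experimental"]), len(groups["control"]))
```

Because chance alone decides who lands in which group, pre-existing differences (soil quality, initial height) should be spread roughly evenly across the two conditions.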
Example Study: “IQ Is a Muscle” Intervention
- Research Question: Does telling children that intelligence is malleable improve mathematics performance?
- Groups
- Experimental: Received sessions framing IQ as a “muscle that gets stronger with use.”
- Control: Received sessions on memory/academic topics (equal contact time; no growth-mindset content).
- DV: Math grades (continuous)
- IV: Group membership (categorical: experimental vs control)
- Result (single-time-point view)
- Experimental group’s mean math score > Control’s mean math score
- “Nothing more to it”: this illustrates the simplest use of a t-test
Importance of Multiple Time-Points & Baseline Equivalence
- Actual study recorded 3 time points (pre-intervention, immediate post, later post)
- Baseline showed near-equal math scores (the experimental group was slightly higher, which slightly weakens the causal claim)
- Post-intervention divergence supports effect of manipulation
- Take-home: Pre-test measures allow stronger inference of causality.
- Additional follow-up (Time 4) would reveal whether effects persist.
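One way to see why a pre-test helps is to compare each group's change from baseline rather than only the post-test means. The sketch below uses invented scores purely for illustration; they are not data from the study, and `mean_gain` is a hypothetical helper.

```python
from statistics import mean

# Invented scores for illustration only (NOT from the actual study).
pre = {"experimental": [62, 58, 71, 65], "control": [61, 57, 70, 64]}
post = {"experimental": [70, 66, 78, 73], "control": [62, 59, 70, 65]}

def mean_gain(pre_scores, post_scores):
    """Average within-person change from pre-test to post-test."""
    return mean(after - before for before, after in zip(pre_scores, post_scores))

# How far apart were the groups before anything was manipulated?
baseline_gap = mean(pre["experimental"]) - mean(pre["control"])

# How much did each group change after the intervention?
gains = {group: mean_gain(pre[group], post[group]) for group in pre}

print("baseline gap:", baseline_gap)
print("mean gains:", gains)
```

A small baseline gap combined with a clearly larger gain in the experimental group is the pattern that supports an effect of the manipulation.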
Variable Types & Their Statistical Consequences
- Categorical (Discrete) Variables
- Distinct groups/levels (e.g., male/female/non-binary; pass/fail/credit/distinction/HD)
- For t-tests, ideal when variable has exactly 2 levels
- Continuous Variables
- Numeric continuum (e.g., 0–100 exam mark, height in cm)
- Golden rule for beginners
- Two continuous vars → Correlation
- One categorical (2-level) + one continuous → t-test
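The golden rule can be written as a small lookup. This is a sketch under stated assumptions: the function name `choose_test` and the string labels are hypothetical, and anything outside the two cases in the rule is simply reported as not covered.

```python
def choose_test(type_a, type_b, n_levels=None):
    """Return the beginner's test for a pair of variable types.

    type_a, type_b: "continuous" or "categorical".
    n_levels: number of levels of the categorical variable, if there is one.
    """
    types = sorted([type_a, type_b])
    if types == ["continuous", "continuous"]:
        return "correlation"
    if types == ["categorical", "continuous"] and n_levels == 2:
        return "t-test"
    return "not covered by the golden rule"

# Drinks vs hangover severity: both continuous.
print(choose_test("continuous", "continuous"))
# Experimental vs control (2 levels) and math grade.
print(choose_test("categorical", "continuous", n_levels=2))
```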
Independent-Samples t-test
- Purpose: Test whether the means of two independent groups differ
- Inputs
- IV: Categorical (2 groups)
- DV: Continuous
- Calculation considers
- Difference of group means (\bar X_1 - \bar X_2)
- Spread around those means (pooled variance s_p^2)
- Formula (simplified): t = \dfrac{\bar X_1 - \bar X_2}{SE}, where SE = standard error of the difference between the means
- Interpretation
- Larger mean gap + smaller within-group variance ⇒ larger |t| (more “impressive”)
- Visual cue: Less overlap of score distributions strengthens result
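The interpretation above can be checked numerically. The sketch below computes t with the standard error written out as \sqrt{s_1^2/n_1 + s_2^2/n_2}, the form given in the formula list at the end of these notes; with equal group sizes this gives the same value as the pooled-variance version. The scores are invented for illustration, and `independent_t` is a hypothetical helper name.

```python
from math import sqrt
from statistics import mean, variance

def independent_t(group1, group2):
    """t = (mean1 - mean2) / sqrt(s1^2/n1 + s2^2/n2), using sample variances."""
    se = sqrt(variance(group1) / len(group1) + variance(group2) / len(group2))
    return (mean(group1) - mean(group2)) / se

# Invented scores: both pairs have the same mean gap (5 vs 2.5) ...
wide_a, wide_b = [2, 4, 6, 8], [1, 2, 3, 4]          # ... with a lot of spread
tight_a, tight_b = [4, 5, 5, 6], [2, 2.5, 2.5, 3]    # ... with little spread

t_wide = independent_t(wide_a, wide_b)
t_tight = independent_t(tight_a, tight_b)

print(round(t_wide, 3))   # 1.732
print(t_tight > t_wide)   # True: same gap, less within-group spread, larger t
```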
Correlation (Pearson's r)
- Definition: Single number summarising linear association between two continuous variables
- Range: -1 \le r \le 1
- r = 0 → no linear relation
- r = +1 → perfect positive; r = -1 → perfect negative
- Strength (|r|)
- Rough guidelines: |r| ≈ 0.1 weak, 0.3 moderate, \ge 0.5 strong
- Direction
- Positive: Variables move together (↑ drinks → ↑ hangover severity)
- Negative: Variables move opposite (↑ drinks → ↓ driving ability)
- Each scatter-plot dot = one participant (paired scores on X & Y axes)
- Effect size role: Correlation itself is an effect-size metric.
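The definition, direction, and strength guidelines above can be sketched as follows. The data are invented for illustration. The cut-offs in `strength_label` treat the rough guideline values as lower bounds, and the "negligible" label for |r| below 0.1 is an assumption rather than something stated in the lecture.

```python
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation from paired scores (one pair per participant)."""
    mx, my = mean(xs), mean(ys)
    cross = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    return cross / sqrt(ss_x * ss_y)

def strength_label(r):
    """Rough verbal label for |r| (cut-offs are an assumed reading of the guidelines)."""
    size = abs(r)
    if size >= 0.5:
        return "strong"
    if size >= 0.3:
        return "moderate"
    if size >= 0.1:
        return "weak"
    return "negligible"

# Invented paired scores for five participants.
drinks = [0, 1, 2, 3, 4]
hangover = [1, 1, 3, 4, 6]   # tends to rise with drinks
driving = [9, 8, 6, 5, 2]    # tends to fall with drinks

r_hangover = pearson_r(drinks, hangover)
r_driving = pearson_r(drinks, driving)

print(r_hangover > 0, strength_label(r_hangover))
print(r_driving < 0, strength_label(r_driving))
```

The sign of r carries the direction and its absolute value carries the strength, which is why the same number doubles as an effect size.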
Worked Examples for Correlation
- Drinks vs Hangover Severity
- Positive correlation; more drinks, worse hangover.
- Drinks vs Driving Skill
- Negative correlation; more drinks, poorer driving.
- Vitamin D vs COVID-19 Severity
- Hypothesis predicted negative correlation (high vitamin D → mild COVID).
- Empirical finding: r \approx -0.06 (negligible in size; “miserable”).
- RCT meta-analysis likewise shows no meaningful protective effect of vitamin D supplementation.
Descriptive Statistics: The Big Picture
- Aim: Summarise data before any inferential claims
- Core measures
- Central tendency: Mean, median, mode
- Variability: Range, variance (s^2), standard deviation (s)
- Effect sizes: Correlation (r), Cohen’s d, etc.
- Graphs: Histograms, scatter-plots, box-plots
- Philosophical note
- Over-complex statistics can obscure insights; clear graphics often suffice (Gerd Gigerenzer).
- Psychology remains a “hard science” even when using simple descriptive visuals.
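The core measures listed above can all be computed with Python's standard library. This is a minimal sketch with invented scores; `describe` and `cohens_d` are hypothetical helper names, and Cohen's d is shown in its common pooled-standard-deviation form, which the lecture names but does not spell out.

```python
from math import sqrt
from statistics import mean, median, mode, stdev, variance

def describe(scores):
    """Central tendency and variability for one set of scores."""
    return {
        "mean": mean(scores),
        "median": median(scores),
        "mode": mode(scores),
        "range": max(scores) - min(scores),
        "variance": variance(scores),  # sample variance s^2 (divides by n - 1)
        "sd": stdev(scores),           # sample standard deviation s
    }

def cohens_d(group1, group2):
    """Standardised mean difference using the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    return (mean(group1) - mean(group2)) / sqrt(pooled_var)

# Invented scores for illustration only.
summary = describe([2, 4, 4, 5, 7, 8])
d = cohens_d([2, 4, 6, 8], [1, 2, 3, 4])

print(summary)
print(round(d, 3))
```

Cohen's d expresses the gap between two group means in standard-deviation units, so it can be compared across studies that use different measurement scales.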
Ethical & Philosophical Implications Discussed
- RCT feasibility & ethics (e.g., denying a potentially life-saving treatment is unethical; parallels with smoking & lung cancer research)
- Proper use of p-values; misuse leads to misunderstandings (topic for next lecture)
- Encouragement to balance statistical sophistication with transparent data storytelling
Connections to Future Content
- Next lecture = Inferential Statistics (p-values, significance tests, deeper use of t-tests & correlations)
- Understanding today’s foundations makes later concepts (e.g., ANOVA, regression) much easier.
Key Takeaways & Study Tips
- Master t-tests & correlations first; they are “everywhere” in psychological science.
- Always identify variable type (categorical vs continuous) before choosing a test.
- Use multiple pre/post measurements to bolster causal inference in experiments.
- Remember that effect size and practical significance matter as much as (or more than) p-values.
- Reinforce learning via external tutorials and practise interpreting scatter-plots & group means.
- t-test (independent): t = \dfrac{\bar X_1 - \bar X_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}
- Pearson correlation: r = \dfrac{\sum (X_i - \bar X)(Y_i - \bar Y)}{\sqrt{\sum (X_i - \bar X)^2 \; \sum (Y_i - \bar Y)^2}}
- Descriptive stats are the foundation; inferential stats build on them.
- Upcoming session will “trample” misconceptions about p-values and significance.
- Until then, focus on understanding basic group comparisons, variable types, and interpreting scatter-plots.