
PSY201: Introduction to Quantitative Research in Psychology 1

Instructor and Course Overview

  • Course: PSY201: Introduction to Quantitative Research in Psychology (Lecture: Mondays 9–11, Tutorials: Tue/Wed)

  • Instructor: Prof. Keisuke Fukuda (Office hour: Mondays 11–12 @ CCT4067)

  • Lecture times and locations: In person lectures in CC 1080; Tutorials in various rooms (CC 2160, CC 2140, CC 1080 occasionally)

  • Make sure to attend the practical you registered for!

Syllabus and Course Structure

  • Section codes and weekly schedules:

    • LEC0101: Monday, 9:00–11:00, In Person: CC 1080

    • PRA0101–PRA0112: various Tuesday/Wednesday slots in CC 2160 and CC 2140

  • Attendance and registration alignment:

    • Attend the practical you registered for; switching sections without updating your registration is not advised

Textbook Options and Purchase

  • Textbook options:

    • Print ($199.95): Introduction to Statistics and Data Analysis, 7th Edition, Roxy Peck/Chris Olsen, ISBN: 9798214000008

    • eBook ($76.95): Introduction to Statistics and Data Analysis, 7th Edition, Roxy Peck/Chris Olsen, ISBN: 9798214000152

  • Purchase through UTM bookstore strongly recommended; alternative methods not guaranteed to be supported

  • Link to adoption results: https://www.uoftbookstore.com/adoption-search-results?ccid=4863629&itemid=354166

Assessments and Grading (Course Evaluations)

  • Term test: 30% of final grade

    • 90 minutes, on Oct 20th during lecture

    • Multiple choice

    • Allowed: hand-held non-programmable calculator and a 1-page, double-sided, Letter-size test aid

  • Final Exam: 40% of final grade

    • 120 minutes during Exam period

    • Multiple choice

    • Allowed: same calculator and test aid as above

  • Written assignments: 10%

    • Due 11:59 PM on Dec 2nd via Quercus

    • Draft feedback by Nov 15th 11:59 PM via Quercus

  • Tutorial participation and completion: 18%

    • Attendance and successful completion of worksheet (submitted by Friday 11:59 PM after each tutorial through Quercus) required for full marks

    • You may miss 1 of 9 tutorials without losing a mark

  • SONA Experiment participation: 2% (for 3 credits = 3 hours)

    • 3 hours of psychology experiments via SONA; deadline June 17th

    • Create participant account and enroll in PSY201_2025F on SONA

    • Arrive no later than 10 minutes after the appointment start time; late arrival carries a penalty of -1 credit

    • Opt-out substitutes: 1 assignment = 1 credit; deadline June 2nd; link: https://www.utm.utoronto.ca/psychology/faculty-research/experiment-database-overview/substitute-assignments-experimental-credit
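
The weights listed above sum to 100%, so a final grade is a weighted average of the five components. A minimal sketch of that arithmetic follows; the component names and the example marks are hypothetical, and only the weights come from the list above.

```python
# Weights (percent of final grade) as listed in the assessment breakdown.
WEIGHTS = {
    "term_test": 30,
    "final_exam": 40,
    "written_assignment": 10,
    "tutorials": 18,
    "sona": 2,
}


def final_grade(scores):
    """Weighted course grade; `scores` maps each component to a 0-100 mark."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical marks, for illustration only.
example_scores = {
    "term_test": 70,
    "final_exam": 80,
    "written_assignment": 90,
    "tutorials": 100,
    "sona": 100,
}

print(sum(WEIGHTS.values()))        # total weight
print(final_grade(example_scores))  # weighted final grade
```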

Why Statistics? – Core Rationale

  • For informed consumers and producers of information:

    • Informed consumer capabilities:

      • Extract information accurately from visualized data (tables, graphs, etc.)

      • Evaluate numerical arguments

      • Decide whether to change behavior based on information

    • Informed producer capabilities:

      • Collect data appropriately

      • Summarize data informatively (Descriptive statistics)

      • Analyze data to draw fair conclusions (Inferential statistics)

      • Visualize data to communicate to audiences

Consuming Information Wisely – Key Examples

  • Example 1: What does this information tell us?

    • Emphasizes interpretation of summarized data rather than taking numbers at face value

  • Example 2: Spurious correlations (illustrative chart)

    • Chart: Bachelor’s degrees awarded in Library science plotted against Google searches for 'how to hide a body', 2012–2021; the two series correlate (illustrative; not causal)

    • Source: National Center for Education Statistics; Google Trends; Tyler Vigen spurious correlations

    • Takeaway: Correlation does not imply causation; beware misleading interpretations

  • Example 3: Dangerous DHMO (Demo of persuasive yet misleading claims)

    • Claims like erosion %, tumor presence, death risk, nuclear plants usage, animal experiments, processed foods

    • Emphasizes checking sources, context, and evidence before accepting claims
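
The point of Example 2 can be reproduced with a few lines of code: any two series that drift in the same direction over time will show a high correlation coefficient, whether or not they are related. The sketch below uses two made-up series (they are not the degree or search figures from the chart) and a hand-written Pearson correlation.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


years = list(range(2012, 2022))

# MADE-UP series: both simply decline over the years, with a small wobble.
series_a = [100 - 5 * i + (3 if i % 2 else -3) for i in range(len(years))]
series_b = [60 - 2 * i + (1 if i % 3 == 0 else -1) for i in range(len(years))]

r = pearson_r(series_a, series_b)
print(f"r = {r:.2f}")  # high, although the series share only a time trend
```

A large r here reflects only the shared downward trend, which is why a correlation on its own is not evidence of a causal link.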

Exercise 1 – Water Quality Example and Decision Making

  • Problem context: Five water specimens sampled daily; compute the average concentration for each day; a histogram summarizes the 200-day distribution of daily averages

  • After a spill: One month later, five specimens from the same well show an average of 14.5 ppm

  • Question: Considering pre-spill variation, is this convincing evidence that the well water was affected?

  • Implications: Interpretation depends on baseline variability, sampling variability, and threshold for detecting a spill

  • What does this information tell us? (Encourages critical thinking about evidence strength and uncertainty)
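
One way to make the question concrete is to ask how often the pre-spill daily averages were already as high as 14.5 ppm. The notes do not include the histogram values, so the sketch below generates a HYPOTHETICAL baseline of 200 daily averages (the centre of 12.0 ppm and spread of 1.0 ppm are assumptions chosen only for illustration) and computes that share.

```python
import random

rng = random.Random(0)  # fixed seed so the illustration is repeatable

# HYPOTHETICAL baseline: 200 pre-spill daily averages in ppm.
# The real values would be read off the histogram in the exercise.
baseline_daily_means = [rng.gauss(12.0, 1.0) for _ in range(200)]

post_spill_mean = 14.5  # value given in the exercise


def fraction_at_least(values, threshold):
    """Share of values that are greater than or equal to the threshold."""
    return sum(v >= threshold for v in values) / len(values)


share = fraction_at_least(baseline_daily_means, post_spill_mean)
print(f"Share of baseline days at or above {post_spill_mean} ppm: {share:.3f}")
```

If only a tiny share of ordinary days reached 14.5 ppm, the post-spill reading is hard to explain by normal day-to-day variation; if such days were common, the reading is weak evidence.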

Population vs Sample; Descriptive vs Inferential Statistics

  • Population: The entire collection of individuals or objects of interest

  • Sample: A subset of the population from which information is collected

  • Descriptive statistics: Methods to organize and summarize data

  • Inferential statistics: Methods to make inferences about the population from the sample

  • Data: A collection of observations on one or more variables

  • Variable: A characteristic that can change in value across observations

  • Example: Height of humans

Data Types and Datasets

  • Numerical data: Observations that are numerical (e.g., heights, test scores)

  • Categorical (qualitative) data: Observations that are categories (e.g., gender, color)

  • Univariate data set: Observations vary in one characteristic

  • Multivariate data set: Observations vary in multiple characteristics

  • Example: Height of humans

Types of Numerical Variables

  • Discrete numerical variable: Possible values are isolated and limited points on the number line

    • Example: Countable events (e.g., number of items)

  • Continuous numerical variable: Possible values form an entire interval on the number line (infinitely many possible values)

    • Example: Measurements like height, weight

Frequency Distributions for Categorical Data

  • Frequency: Number of times a category occurs in the data

  • Relative frequency: Proportion of observations in a category

  • Example uses: demonstration with shark attack data; context described for understanding distribution
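
As a minimal sketch of the two definitions above, the snippet below counts categories and converts the counts to proportions. The observations are made up for illustration and are not the shark attack data used in lecture.

```python
from collections import Counter

# MADE-UP categorical observations (one category label per observation).
observations = ["Florida", "Hawaii", "Florida", "California", "Florida", "Hawaii"]

# Frequency: number of times each category occurs.
frequency = Counter(observations)

# Relative frequency: frequency divided by the total number of observations.
n = len(observations)
relative_frequency = {category: count / n for category, count in frequency.items()}

for category, count in frequency.most_common():
    print(category, count, round(relative_frequency[category], 3))
```

The relative frequencies always sum to 1, which makes them easier to compare across data sets of different sizes.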

How Do We Collect Data? Observational Study

  • Observational study: Observe characteristics of a sample drawn from one or more existing populations

  • Purpose: Draw conclusions about the population or differences between populations regarding the characteristics

  • Example study question: “How common is daily internet use among young (21–40) adults?”

  • Methodology: 1000 individuals (gender-balanced) aged 21–40 answer: “Do you use the internet every day?”

    • Reported percentages by age group (Young 21–40, Middle 41–60, Old 61–80)

How to Collect Data Sensibly? Avoiding Bias

  • Bias types:

    • Selection (sampling) bias: Systematic exclusion of part of the population

    • Measurement (response) bias: Measurement methodology affects results

    • Nonresponse bias: Nonparticipation affects representativeness

  • Source: https://thedailyjaws.com/news/florida-is-the-shark-bite-capital-of-the-world

  • Image credit: https://www.flickr.com/photos/61056899@N06/

Random Sampling and Practicality

  • Recommended approach: Simple random sampling

  • Simple random sample of size n: Every possible sample of size n has the same chance of being selected

  • Sampling without replacement: Once selected, an individual cannot be selected again

  • Sampling with replacement: An individual can be selected multiple times

  • Practical note: When sample size n is less than 10% of the population, both sampling approaches are practically equivalent

  • Visual aid: Sample (n = 4) vs Population; X marks represent sample drawn from population
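
The difference between the two sampling schemes can be sketched with Python's standard `random` module: `sample` draws without replacement and `choices` draws with replacement. The population of ID numbers 1 to 100 is hypothetical; n = 4 matches the visual aid.

```python
import random

rng = random.Random(42)  # fixed seed so the illustration is repeatable

population = list(range(1, 101))  # HYPOTHETICAL population: ID numbers 1..100
n = 4

# Without replacement: every subset of size n is equally likely, no repeats.
without_replacement = rng.sample(population, n)

# With replacement: each draw is independent, so the same ID can recur.
with_replacement = rng.choices(population, k=n)

print(without_replacement)
print(with_replacement)
```

With n small relative to the population, repeats in the second scheme are rare, which is the sense in which the two approaches are practically equivalent.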

How Much Data is Enough? – Sample Size Guidance

  • Example: Population distribution of MATH SAT scores of all applicants (population size N = 5000)

  • Question: How big should the random sample be to know reliably about the population?

  • Answer: With random sampling, a sample of just 1% of the population (50 of 5000) can describe the population reliably

  • Caveat: Simple random sampling is often difficult and costly in practice
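
The claim that a 1% random sample can be informative can be checked by simulation. The sketch below builds a HYPOTHETICAL population of 5000 scores (the bell shape, centre of 500, and spread of 100 are assumptions, not the lecture's data), draws many random samples of 50, and looks at how far the sample means stray from the population mean.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the illustration is repeatable

# HYPOTHETICAL population of 5000 scores, clipped to the 200-800 range.
population = [min(800, max(200, round(rng.gauss(500, 100)))) for _ in range(5000)]
pop_mean = statistics.mean(population)

# Draw 1000 simple random samples of size 50 and record each sample mean.
sample_means = [statistics.mean(rng.sample(population, 50)) for _ in range(1000)]

# How much do sample means vary from one sample to the next?
spread = statistics.stdev(sample_means)

print(f"Population mean:        {pop_mean:.1f}")
print(f"Average of sample means: {statistics.mean(sample_means):.1f}")
print(f"Spread of sample means:  {spread:.1f}")
```

Under these assumptions the sample means cluster tightly around the population mean compared with the spread of individual scores, which is what makes a small random sample useful.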

Practical Sampling Alternatives

  • Alternative sampling methodologies:

    • Stratified Random Sampling: Divide population into non-overlapping strata, sample within each stratum proportionally to its size

    • Cluster Sampling: Randomly sample at cluster/group level rather than individuals

    • Systematic Sampling: Select a random first element, then sample every kth element

    • Convenience Sampling: Sample based on ease of access

  • Practical caution: Convenience sampling is common but generalization to the population must be done with extreme care due to potential bias
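
The three random alternatives above can be sketched in a few lines each. Everything about the population here is hypothetical (100 people, a `year` field used as the stratum, a `section` field used as the cluster), and the function names are illustrative rather than taken from any library.

```python
import random

# HYPOTHETICAL population: 100 people, 4 strata ("year"), 10 clusters ("section").
population = [{"id": i, "year": 1 + i % 4, "section": i // 10} for i in range(100)]


def stratified_sample(units, key, fraction, rng):
    """Simple random sample of the same fraction from every stratum."""
    strata = {}
    for unit in units:
        strata.setdefault(unit[key], []).append(unit)
    sample = []
    for members in strata.values():
        k = round(len(members) * fraction)  # proportional to stratum size
        sample.extend(rng.sample(members, k))
    return sample


def cluster_sample(units, key, n_clusters, rng):
    """Randomly pick whole clusters and keep every unit inside them."""
    clusters = sorted({unit[key] for unit in units})
    chosen = set(rng.sample(clusters, n_clusters))
    return [unit for unit in units if unit[key] in chosen]


def systematic_sample(units, k, rng):
    """Random start among the first k units, then every kth unit after it."""
    start = rng.randrange(k)
    return units[start::k]


rng = random.Random(7)
print(len(stratified_sample(population, "year", 0.2, rng)))
print(len(cluster_sample(population, "section", 2, rng)))
print(len(systematic_sample(population, 10, rng)))
```

Convenience sampling has no counterpart here because it involves no random step, which is the reason generalizing from it needs extra care.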

Experimental Study – Core Concepts

  • Experimental study: One or more explanatory variables are manipulated to observe effects on a response variable

  • Explanatory variables: Also called independent variables or factors; values controlled or manipulated by the researcher

  • Response variables: Also called dependent variables; measured but not controlled by the researcher; hypothesized to be affected by explanatory variables

  • Experimental conditions: Combinations of values for the explanatory variables (also called treatments)

  • Extraneous variables: Not explanatory but can affect the response variables

  • Good experimental design aims to ensure that only explanatory variables explain observed effects; extraneous variables are controlled or accounted for

Four Pillars of Good Experimental Design

  • Direct control: Hold extraneous variables constant across conditions

  • Random assignment: Randomly assign subjects to conditions to balance extraneous factors

  • Blocking: Use extraneous variables to create groups that are evenly assigned across conditions

  • Replication: Repeat the experiment to ensure results are reliable and not due to idiosyncrasies of a single data set

  • These four principles are the starting point for designing robust experiments

Experimental Study Example – Does a 'Thank you' on the check increase tips?

  • Explanatory variable: Presence of a 'Thank you' note on the check

  • Experimental conditions (treatments): 'Thank you' vs No 'Thank you'

  • Response variable: Tip left by each customer, as a percentage of the bill

  • Participants: 200 customers during shifts (Thursday and Friday)

  • Assignment options:

    • A: 'Thank you' on Thursday, No 'Thank you' on Friday

    • B: 'Thank you' on Friday, No 'Thank you' on Thursday

    • C: Half get 'Thank you' and half do not on each day

    • D: For each participant, flip a coin to assign treatment

  • Important design note: Day of the week could be confounded with the treatment; blocking by day helps mitigate this confounding

  • Real-world constraint: Random sampling is often hard; random assignment within practical constraints allows assessment of the treatment effect

  • Blocking reference: Conceptually similar to stratified sampling
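
Assignment options C and D can be sketched as follows. The sketch assumes 100 customers on each day and assumes that, under option C, which half receives the note is decided at random within each day; neither detail is stated in the notes, and the function names are illustrative.

```python
import random

TREATMENTS = ["thank_you", "no_thank_you"]

# ASSUMPTION: the 200 customers split evenly, 100 per day.
customers = [{"id": i, "day": "Thursday" if i < 100 else "Friday"} for i in range(200)]


def assign_within_blocks(units, block_key, treatments, rng):
    """Option C: within each block (day), randomly split units evenly across treatments."""
    blocks = {}
    for unit in units:
        blocks.setdefault(unit[block_key], []).append(unit)
    assignment = {}
    for members in blocks.values():
        shuffled = members[:]
        rng.shuffle(shuffled)  # random order within the block
        for position, unit in enumerate(shuffled):
            assignment[unit["id"]] = treatments[position % len(treatments)]
    return assignment


def assign_by_coin_flip(units, treatments, rng):
    """Option D: an independent random choice for each unit."""
    return {unit["id"]: rng.choice(treatments) for unit in units}


rng = random.Random(3)
blocked = assign_within_blocks(customers, "day", TREATMENTS, rng)
coin = assign_by_coin_flip(customers, TREATMENTS, rng)

thursday_ids = [c["id"] for c in customers if c["day"] == "Thursday"]
print(sum(blocked[i] == "thank_you" for i in thursday_ids))  # balanced by design
print(sum(coin[i] == "thank_you" for i in thursday_ids))     # balanced only on average
```

Blocking guarantees that each day contributes equally to both conditions, while the coin flip balances days only on average; options A and B tie the treatment to a single day, so day and treatment cannot be separated.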

Organizing Experimental Design – Flow and Practical Tips

  • Visualize design with a flow chart to organize steps and ensure clarity

  • Acknowledge real-world complexities in randomization; plan for blocking and randomization where possible

Quick References and Takeaways

  • Always consider threats to validity (bias, confounding variables, measurement error)

  • Understand the distinction between population and sample, descriptive vs inferential statistics

  • Recognize different data types and the appropriate analyses for each

  • Plan data collection with bias reduction and practical constraints in mind

  • Use experimental design principles (control, randomization, blocking, replication) to strengthen causal claims