PSY201: Introduction to Quantitative Research in Psychology 1
Instructor and Course Overview
Course: PSY201: Introduction to Quantitative Research in Psychology (Lecture: Mondays 9–11, Tutorials: Tue/Wed)
Instructor: Prof. Keisuke Fukuda (Office hour: Mondays 11–12 @ CCT4067)
Lecture times and locations: In-person lectures in CC 1080; Tutorials in various rooms (CC 2160, CC 2140, occasionally CC 1080)
Make sure to attend the practical you registered for!
Syllabus and Course Structure
Section codes and weekly schedules:
LEC0101: Monday, 9:00–11:00, In Person: CC 1080
PRA0101–PRA0112: various Tuesday/Wednesday slots in CC 2160 and CC 2140
Attendance and registration alignment:
Attend the practical you registered for; switching sections is not advised unless you also update your registration
Textbook Options and Purchase
Textbook options:
Print ($199.95): Introduction to Statistics and Data Analysis, 7th Edition, Roxy Peck/Chris Olsen, ISBN: 9798214000008
eBook ($76.95): Introduction to Statistics and Data Analysis, 7th Edition, Roxy Peck/Chris Olsen, ISBN: 9798214000152
Purchase through UTM bookstore strongly recommended; alternative methods not guaranteed to be supported
Link to adoption results: https://www.uoftbookstore.com/adoption-search-results?ccid=4863629&itemid=354166
Assessments and Grading (Course Evaluations)
Term test: 30% of final grade
90 minutes, on Oct 20th during lecture
Multiple choice
Allowed: hand-held non-programmable calculator and a 1-page, double-sided, Letter-size test aid
Final Exam: 40% of final grade
120 minutes during Exam period
Multiple choice
Allowed: same calculator and test aid as above
Written assignments: 10%
Due 11:59 PM on Dec 2nd via Quercus
Draft feedback by Nov 15th 11:59 PM via Quercus
Tutorial participation and completion: 18%
Attendance and successful completion of worksheet (submitted by Friday 11:59 PM after each tutorial through Quercus) required for full marks
You may miss 1 of 9 tutorials without losing a mark
SONA Experiment participation: 2% (for 3 credits = 3 hours)
3 hours of psychology experiments via SONA; deadline June 17th
Create participant account and enroll in PSY201_2025F on SONA
Arrive no later than 10 minutes after the appointment start time; arriving later incurs a penalty of 1 credit
Opt-out substitutes: 1 assignment = 1 credit; deadline June 2nd; link: https://www.utm.utoronto.ca/psychology/faculty-research/experiment-database-overview/substitute-assignments-experimental-credit
Why Statistics? – Core Rationale
For informed consumers and producers of information:
Informed consumer capabilities:
Extract information accurately from visualized data (tables, graphs, etc.)
Evaluate numerical arguments
Decide whether to change behavior based on information
Informed producer capabilities:
Collect data appropriately
Summarize data informatively (Descriptive statistics)
Analyze data to draw fair conclusions (Inferential statistics)
Visualize data to communicate to audiences
Consuming Information Wisely – Key Examples
Example 1: What does this information tell us?
Emphasizes interpretation of summarized data rather than taking numbers at face value
Example 2: Spurious correlations (illustrative chart)
Chart: Bachelor’s degrees in Library science plotted against Google searches for 'how to hide a body', 2012–2021 (illustrative; not causal)
Source: National Center for Education Statistics; Google Trends; Tyler Vigen spurious correlations
Takeaway: Correlation does not imply causation; beware misleading interpretations
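A minimal sketch of why the takeaway matters: two series that merely share a trend over time can produce a correlation close to 1. The values below are made up for illustration (they are not the chart's data), and `pearson_r` is a helper name introduced here.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical yearly values for two unrelated quantities that both decline.
series_a = [120, 112, 105, 95, 88, 80, 71, 65, 58, 50]
series_b = [90, 86, 79, 75, 66, 61, 57, 49, 45, 38]

r = pearson_r(series_a, series_b)
print(f"r = {r:.3f}")  # strongly positive, yet neither series causes the other
```

A high r here only says the two series move together; it says nothing about why.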
Example 3: Dangerous DHMO (Demo of persuasive yet misleading claims)
Claims like erosion %, tumor presence, death risk, nuclear plants usage, animal experiments, processed foods
Emphasizes checking sources, context, and evidence before accepting claims
Exercise 1 – Water Quality Example and Decision Making
Problem context: Five water specimens sampled daily; compute the average concentration for each day; a histogram summarizes the 200-day distribution of daily averages
After a spill: One month later, five specimens from the same well show an average of 14.5 ppm
Question: Considering pre-spill variation, is this convincing evidence that the well water was affected?
Implications: Interpretation depends on baseline variability, sampling variability, and threshold for detecting a spill
What does this information tell us? (Encourages critical thinking about evidence strength and uncertainty)
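One way to reason about the exercise is to ask how often a pre-spill daily average was at least as high as 14.5 ppm. The sketch below assumes a baseline, since the histogram's values are not reproduced in these notes: the 200 daily averages are simulated with an invented centre and spread, so the printed numbers illustrate the method only.

```python
import random

random.seed(1)

# Hypothetical pre-spill baseline: 200 daily averages (ppm).
# The centre (12) and spread (1) are invented for illustration only.
baseline = [random.gauss(12.0, 1.0) for _ in range(200)]

observed = 14.5  # post-spill average from the exercise

# How often did a pre-spill daily average reach the observed value?
at_least_as_high = sum(1 for x in baseline if x >= observed)
proportion = at_least_as_high / len(baseline)

print(f"{at_least_as_high} of {len(baseline)} baseline days >= {observed} ppm")
print(f"proportion = {proportion:.3f}")
```

A small proportion would suggest that 14.5 ppm is unusual under pre-spill conditions; a large one would suggest it is consistent with ordinary day-to-day variation. The actual answer depends on the real histogram.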
Population vs Sample; Descriptive vs Inferential Statistics
Population: The entire collection of individuals or objects of interest
Sample: A subset of the population from which information is collected
Descriptive statistics: Methods to organize and summarize data
Inferential statistics: Methods to make inferences about the population from the sample
Data: A collection of observations on one or more variables
Variable: A characteristic that can change in value across observations
Example: Height of humans
Data Types and Datasets
Numerical data: Observations that are numerical (e.g., heights, test scores)
Categorical (qualitative) data: Observations that are categories (e.g., gender, color)
Univariate data set: Observations vary in one characteristic
Multivariate data set: Observations vary in multiple characteristics
Example: Height of humans alone is a univariate data set; recording height and weight for each person makes it multivariate
Types of Numerical Variables
Discrete numerical variable: Possible values are isolated and limited points on the number line
Example: Countable events (e.g., number of items)
Continuous numerical variable: Possible values can be anywhere on the number line (theoretically infinite)
Example: Measurements like height, weight
Frequency Distributions for Categorical Data
Frequency: Number of times a category occurs in the data
Relative frequency: Proportion of observations in a category
Example uses: demonstration with shark attack data; context described for understanding distribution
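The two definitions above can be sketched in a few lines. The observations here are made-up category labels, not the lecture's shark attack data.

```python
from collections import Counter

# Made-up categorical observations (not the lecture's shark attack data).
observations = ["surfing", "swimming", "surfing", "fishing",
                "swimming", "surfing", "wading", "surfing"]

# Frequency: how many times each category occurs.
frequency = Counter(observations)

# Relative frequency: each count divided by the total number of observations.
total = len(observations)
relative_frequency = {cat: count / total for cat, count in frequency.items()}

for cat, count in frequency.most_common():
    print(f"{cat:10s} frequency={count}  relative frequency={relative_frequency[cat]:.3f}")
```

Relative frequencies always sum to 1, which makes distributions from samples of different sizes comparable.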
How Do We Collect Data? Observational Study
Observational study: Observe characteristics of a sample drawn from one or more existing populations
Purpose: Draw conclusions about the population or differences between populations regarding the characteristics
Example study question: “What is the internet usage among young adults (ages 21–40)?”
Methodology: 1000 individuals (gender-balanced) aged 21–40 answer: “Do you use the internet every day?”
Reported percentages by age group (Young 21–40, Middle 41–60, Old 61–80)
How to Collect Data Sensibly? Avoiding Bias
Bias types:
Selection (sampling) bias: Systematic exclusion of part of the population
Measurement (response) bias: Measurement methodology affects results
Nonresponse bias: Nonparticipation affects representativeness
Source: https://thedailyjaws.com/news/florida-is-the-shark-bite-capital-of-the-world
Image credit: https://www.flickr.com/photos/61056899@N06/
Random Sampling and Practicality
Recommended approach: Simple random sampling
Simple random sample of size n: Every possible sample of size n has the same chance of being selected
Sampling without replacement: Once selected, an individual cannot be selected again
Sampling with replacement: An individual can be selected multiple times
Practical note: When sample size n is less than 10% of the population, both sampling approaches are practically equivalent
Visual aid: Sample (n = 4) vs Population; X marks represent sample drawn from population
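The difference between the two sampling approaches can be sketched with Python's standard `random` module. The population of 100 numbered individuals is a stand-in chosen for illustration.

```python
import random

random.seed(42)

population = list(range(1, 101))  # hypothetical population of 100 labelled individuals
n = 4

# Without replacement: a simple random sample; no individual appears twice.
without_replacement = random.sample(population, n)

# With replacement: the same individual may be drawn more than once.
with_replacement = random.choices(population, k=n)

print(without_replacement)
print(with_replacement)
```

With n = 4 and a population of 100, a repeat in the with-replacement draw is possible but unlikely, which is the intuition behind treating the two approaches as practically equivalent for small sampling fractions.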
How Much Data is Enough? – Sample Size Guidance
Example: Population distribution of MATH SAT scores of all applicants (population size N = 5000)
Question: How big should the random sample be to know reliably about the population?
Answer: With random sampling, a sample of just 1% of the population (50 of 5000) can describe the population reliably
Caveat: Simple random sampling is often difficult and costly in practice
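A small simulation can illustrate the 50-of-5000 claim. The population below is an assumption: 5000 scores generated with an invented mean and spread, not the lecture's actual score distribution.

```python
import random
import statistics

random.seed(0)

# Hypothetical population of 5000 scores; the shape (mean 500, SD 100,
# clipped to 200-800) is an assumption, not the lecture's data.
population = [min(800, max(200, round(random.gauss(500, 100)))) for _ in range(5000)]
population_mean = statistics.mean(population)

# Draw many simple random samples of size 50 and record each sample mean.
sample_means = [statistics.mean(random.sample(population, 50)) for _ in range(1000)]

print(f"population mean: {population_mean:.1f}")
print(f"average of sample means: {statistics.mean(sample_means):.1f}")
print(f"SD of sample means: {statistics.stdev(sample_means):.1f}")
```

Under these assumptions, the sample means cluster around the population mean, and their spread is much smaller than the spread of individual scores. How tight that cluster is depends mainly on the sample size itself rather than on the fraction of the population sampled.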
Practical Sampling Alternatives
Alternative sampling methodologies:
Stratified Random Sampling: Divide population into non-overlapping strata, sample within each stratum proportionally to its size
Cluster Sampling: Randomly sample at cluster/group level rather than individuals
Systematic Sampling: Select a random first element, then sample every kth element
Convenience Sampling: Sample based on ease of access
Practical caution: Convenience sampling is common but generalization to the population must be done with extreme care due to potential bias
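Two of the alternatives above, stratified and systematic sampling, can be sketched as follows. The sampling frame, its group sizes, and the function names are all hypothetical, introduced only to illustrate the mechanics.

```python
import random

random.seed(7)

# Hypothetical sampling frame: 1000 people tagged with an age group.
# Group sizes (500 / 300 / 200) are invented for illustration.
frame = ([("young", i) for i in range(500)]
         + [("middle", i) for i in range(300)]
         + [("old", i) for i in range(200)])

def stratified_sample(frame, n):
    """Proportional allocation: each stratum contributes in proportion to its size.

    Rounding can make the total differ slightly from n for awkward sizes.
    """
    strata = {}
    for person in frame:
        strata.setdefault(person[0], []).append(person)
    sample = []
    for members in strata.values():
        k = round(n * len(members) / len(frame))
        sample.extend(random.sample(members, k))  # random sample within the stratum
    return sample

def systematic_sample(frame, k):
    """Random start among the first k elements, then every kth element."""
    start = random.randrange(k)
    return frame[start::k]

strat = stratified_sample(frame, 100)
syst = systematic_sample(frame, 10)

print(len(strat), sum(1 for group, _ in strat if group == "young"))
print(len(syst))
```

Note that the systematic sample inherits any pattern in how the frame is ordered, so it behaves like a random sample only when that ordering is unrelated to the variable being measured.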
Experimental Study – Core Concepts
Experimental study: One or more explanatory variables are manipulated to observe effects on a response variable
Explanatory variables: Also called independent variables or factors; values controlled or manipulated by the researcher
Response variables: Also called dependent variables; measured but not controlled by the researcher; hypothesized to be affected by explanatory variables
Experimental conditions: Combinations of values for the explanatory variables (also called treatments)
Extraneous variables: Not explanatory but can affect the response variables
Good experimental design aims to ensure that only explanatory variables explain observed effects; extraneous variables are controlled or accounted for
Four Pillars of Good Experimental Design
Direct control: Hold extraneous variables constant across conditions
Random assignment: Randomly assign subjects to conditions to balance extraneous factors
Blocking: Use extraneous variables to create groups that are evenly assigned across conditions
Replication: Repeat the experiment to ensure results are reliable and not due to idiosyncrasies of a single data set
These principles are the starting point for designing robust experiments
Experimental Study Example – Does a 'Thank you' on the check increase tips?
Explanatory variable: Presence of a 'Thank you' note on the check
Experimental conditions (treatments): 'Thank you' vs No 'Thank you'
Response variable: Percentage of tips left by customers
Participants: 200 customers during shifts (Thursday and Friday)
Assignment options:
A: 'Thank you' on Thursday, No 'Thank you' on Friday
B: 'Thank you' on Friday, No 'Thank you' on Thursday
C: Half get 'Thank you' and half do not on each day
D: For each participant, flip a coin to assign treatment
Important design note: Day of week could confound with the treatment effect; blocking by day could help mitigate confounding
Real-world constraint: Random sampling is often hard; random assignment within practical constraints allows assessment of the treatment effect
Blocking reference: Conceptually similar to stratified sampling
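Assignment options C and D can be sketched as below. Two assumptions are made: the 200 customers split evenly across the two days (the notes give only the total), and option C is read as a random half within each day. The function names are hypothetical.

```python
import random

random.seed(3)

# Hypothetical split: 100 customers per day (the notes give only the total of 200).
customers = [("Thursday", i) for i in range(100)] + [("Friday", i) for i in range(100)]

def assign_within_blocks(customers):
    """Blocking by day: within each day, a random half gets 'Thank you'."""
    assignment = {}
    for day in sorted({d for d, _ in customers}):
        block = [c for c in customers if c[0] == day]
        random.shuffle(block)
        half = len(block) // 2
        for c in block[:half]:
            assignment[c] = "Thank you"
        for c in block[half:]:
            assignment[c] = "No Thank you"
    return assignment

def assign_by_coin_flip(customers):
    """Random assignment without blocking: an independent coin flip per customer."""
    return {c: random.choice(["Thank you", "No Thank you"]) for c in customers}

blocked = assign_within_blocks(customers)
flipped = assign_by_coin_flip(customers)

for day in ("Thursday", "Friday"):
    n_blocked = sum(1 for c, t in blocked.items() if c[0] == day and t == "Thank you")
    n_flipped = sum(1 for c, t in flipped.items() if c[0] == day and t == "Thank you")
    print(day, "blocked:", n_blocked, "coin flip:", n_flipped)
```

Blocking guarantees the same number of 'Thank you' checks on each day, so day of week cannot line up with the treatment. The coin flip balances days only on average, and options A and B tie the treatment to a single day entirely.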
Organizing Experimental Design – Flow and Practical Tips
Visualize design with a flow chart to organize steps and ensure clarity
Acknowledge real-world complexities in randomization; plan for blocking and randomization where possible
Quick References and Takeaways
Always consider threats to validity (bias, confounding variables, measurement error)
Understand the distinction between population and sample, descriptive vs inferential statistics
Recognize different data types and the appropriate analyses for each
Plan data collection with bias reduction and practical constraints in mind
Use experimental design principles (control, randomization, blocking, replication) to strengthen causal claims