Association Without Causation Studies Example
Two things change or happen at the same time, but there is no clear way to establish which one led to the other, or whether one caused the other at all
Ex: studies have found that children who watch TV are more aggressive. Without more information we can’t claim causation: maybe aggressive children are more drawn to TV, or maybe parents find that more aggressive children can be kept calm/monitored by sitting them in front of a TV. We can’t conclude that TV makes children more aggressive
Ex: going to bed with shoes on is associated with waking up with a headache - both could be explained by going to bed drunk (and waking up hungover)
Ex: the more fire engines dispatched to a place, the more destructive the fire. We can’t conclude the engines cause the damage - we send them because there is a fire
Ex: a study suggests eating more chocolate produces more Nobel Prize winners; the number of Nobel laureates is associated with annual chocolate consumption in a country, but there is no particular reason the two should be linked (association, not causation)
Two things can occur at the same time without one causing the other
Important not to jump to conclusions
Ecological Validity (examples)
Whether what we are studying replicates the real world; the outcome of the study mimics what happens in the real world
Ex: a flight simulation is not going to be identical to real-life flying, so its ecological validity is not very high
Ex: role-play in police training (don’t assume the results translate into the real world); a police officer may perform differently in the real world
Ex: how people say they use language differs from how they actually use it in social settings. If you were collecting information on the language of request making and asked people how they would respond to a roommate leaving a mess in the kitchen, they would say they respond with “Can you clean up the kitchen?” In reality, people may hesitate, ask less directly given their relationship with the roommate/friend, or make the request by pointing or with body language
Experimental Design Examples
Ex: think about identical plants with identical sunlight; the only manipulated variable is the liquid used to water each plant (water, juice, study)
Dependent: height of the plant
Independent: type of the liquid
Experimental Design Across Groups Examples
Ex: Comparing coffee drinkers vs non-coffee drinkers
Experimental Designs Across Conditions
Ex: Grouping gender and math/reading to test gender stereotypes (ACROSS CONDITIONS)
Stereotype Congruent (easy/fast) = brain jumps to these assumptions, words sort faster (boy and math already paired)
Stereotype Incongruent (difficult/slow) = words sort slower (boy is paired with reading)
Participants have to choose left or right
Unit of analysis is an individual
Across conditions (the same people take part in both conditions)
Dependent variable: sorting speed, taken as proxy for stereotypes
Ex: punch with companion, punch without companion (measure happiness) = across conditions (same unit of analysis)
Ex: same plants one state (water) another state (orange juice)
Experimental Design
Strong in internal validity: control for cause and effect
Weak in external validity: one specific study
Weak in ecological validity: artificial set up (can be remedied by natural experiment)
STUDY BY AMODIO & DEVINE: asks whether implicit stereotyping or evaluative race bias is associated with how close participants sit to the belongings of someone they know is of another race
Ex: a bag is put down before people enter the room, so the experiment is already set up; the researchers want to understand whether people with more stereotypical views will sit further away; people don’t know they are being observed
Cross-Sectional Design
Does money buy happiness?
Population: all Americans
Compares how much money people make with how happy they are
Multiple groups of people were questioned at one point in time
Happiness: Dependent variable
Money: Independent variable
Internal Validity Weak: no control/ manipulation
External Validity Strong: surveyed a large group of people
Ecological Validity Weak(ISH): people can respond to surveys with some bias (humans bad at self-evaluation)
Longitudinal Ex:
Panel Study: Participants selected randomly (look at the same people at multiple points in time)
Cohort study: participants share a common characteristic (born in 2000, graduated in 2025, got married on a certain day)
Looked at 4 students from disadvantaged backgrounds who were voted most likely to succeed, filmed over a ten-year period, looked at their outcomes
Internal Validity: strong; over a long period of time you know the before and after, and the units are the same
External Validity: weak; generalization is not the priority, and sample attrition shrinks the sample as people lose motivation
Ecological Validity: weak(ish); depends on technique (observation: higher; surveys/questionnaires: weaker)
Case Study Ex: Genie
The case of Genie - a critical case study for theories of language learning
Linguists want to know how we learn language
Theory: To learn language, you have to be exposed to it by a certain age, past this point you can’t learn language anymore
Ex of Critical/ Extreme case - feral child: Genie discovered at 13 years old, kept in a basement in a cage, no interaction, tested theory of language, she could not produce speech
Tested whether exposure to language at a certain age is crucial for language development; answer was yes through this case study
Internal Validity: very high (clear cause and effect)
External Validity: very low (not generalizable beyond study)
Ecological Validity: very high (observed real life outcomes)
Comparative Ex:
The same design is applied to different units of analysis (everything is kept the same except the units of analysis, which establishes similarities/differences)
Typically the difference in units of analysis is cross-cultural, cross-national, etc.
Ex: study amongst working class in NY/LA
From concepts to variable ex.
Concept = Academic ambition
Indicators
▪ Taking more than 12 units per quarter
▪ Caring about grades
▪ Participating in research
▪ Double majoring
▪ Taking graduate level courses
Variable = Amount of academic ambitiousness
Moss-Racusin et al. Summary
This study experimentally tested whether science faculty show gender bias when evaluating equally qualified students. Faculty rated identical application materials assigned either a male or female name. Results showed systematic bias favoring male students in competence, hiring, salary, and mentoring.
Moss-Racusin et al. Hypothesis
Hypothesis 1: Faculty would rate male students as more competent, hireable, better paid, and more worthy of mentoring than identical female students.
Why: Prior research in social psychology shows persistent implicit gender bias and stereotypes portraying women as less competent in science, even among egalitarian (equitable) individuals. The authors extended this literature into academic science, where experimental evidence was lacking.
Hypothesis 2: Faculty gender would not influence the bias (i.e., both male and female faculty would show similar bias).
Why: Previous studies indicate that implicit biases are culturally learned and widely shared, meaning even women may internalize stereotypes about women’s competence in STEM fields.
Moss-Racusin et al. Type of Research Design
The research design used in this study is an experiment, in which faculty evaluated identical application materials randomly assigned male or female names, allowing researchers to test for faculty gender bias while holding everything except the applicant's gender constant
Manipulated Independent Variable (gender of the applicant)
Random Assignment
Control of all other variables (The application materials were identical in every way except for the name. That means any differences in evaluations can be attributed to gender, not other factors.)
Measurement of outcomes (dependent variables):
The researchers then measured things like competence ratings, hiring decisions, salary offers, and mentoring.
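Random assignment, as described in this card, can be sketched in a few lines. This is only an illustration and not the authors' actual procedure: the participant IDs, the condition labels, and the `assign_conditions` helper are all hypothetical.

```python
import random

def assign_conditions(participant_ids, conditions=("male_name", "female_name"), seed=0):
    """Randomly split participants into (near) equally sized condition groups."""
    rng = random.Random(seed)      # fixed seed only so the sketch is reproducible
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)          # chance alone decides the order
    # Deal the shuffled participants round-robin into the conditions.
    return {pid: conditions[i % len(conditions)] for i, pid in enumerate(shuffled)}

# Hypothetical participants: ten faculty members.
faculty = [f"faculty_{n}" for n in range(10)]
assignment = assign_conditions(faculty)
```

Because chance decides who sees which name, any systematic difference in ratings between the two groups can be attributed to the manipulated variable rather than to who the raters are.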
Moss-Racusin et al. Validity
Internal Validity (High)
Only one variable manipulated: applicant gender
Identical application materials across conditions
Random assignment of participants
→ Supports strong cause-and-effect conclusion (gender caused differences)
Ecological Validity (Moderate)
Real science faculty as participants
Task (evaluating applications) reflects real-world behavior
Some artificial elements (study setting, not actual hiring decision)
→ Fairly realistic, but not perfectly natural
External Validity (Moderate)
Sample includes multiple universities and STEM fields
Improves generalizability within academic science
Limited by non-random sampling and U.S.-only context
Moss-Racusin et al. 3 Key Results
Competence & Hireability Bias:
Faculty rated male applicants as significantly more competent and more hireable than identical female applicants.
Salary Difference:
Male applicants were offered higher starting salaries (about $30,238 vs. $26,508).
Mentoring Gap:
Faculty were more willing to provide career mentoring to male students than female students.
Moss-Racusin et al. Variables
Independent Variable: Gender of applicant (male vs. female name)
Dependent Variable: Faculty evaluations (competence, hireability, salary, mentoring)
Sampling Method
Non-probability sampling: Faculty recruited from selected research-intensive universities
Random assignment: Participants randomly assigned to applicant gender condition
Author’s Approach to Theory
Deductive approach
The authors began with existing theories of implicit bias and tested specific hypotheses through an experiment.
Ontological orientation:
The study leans more toward objectivism, not constructionism.
Why: It treats gender bias as something that exists independently and can be measured objectively (e.g., through ratings of competence, salary).
WHO (Five) Well-Being Index
Likert-scale questions on five statements that indicate well-being, asking which answer is closest to how you have been feeling over the last 2 weeks. Higher numbers mean better well-being
5 (all of the time/ strongly agree)
4 (most of the time/ somewhat agree)
etc.
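A minimal scoring sketch for a five-item index like this one. It assumes each item is answered on a 0 to 5 scale and that the raw total (0 to 25) is multiplied by 4 to give a 0 to 100 score, which is the usual WHO-5 convention but is not stated in these notes; the `who5_score` name is hypothetical.

```python
def who5_score(answers):
    """Return the 0-100 score for five item answers, each on a 0-5 scale (assumed)."""
    if len(answers) != 5:
        raise ValueError("expected exactly five answers")
    if any(a not in range(6) for a in answers):
        raise ValueError("each answer must be an integer from 0 to 5")
    raw = sum(answers)   # raw score, 0 (worst) to 25 (best)
    return raw * 4       # rescaled to 0-100 (assumed convention)

score = who5_score([5, 4, 3, 4, 5])
```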
Likert Scale
Strongly Agree (5)
Somewhat agree (4)
Uncertain (3)
Somewhat Disagree (2)
Strongly Disagree (1)
Kahneman and Deaton - High Income Improves Evaluation of life but not emotional well-being (Summary)
Summary
This study examines whether money increases happiness by distinguishing between emotional well-being (daily feelings) and life evaluation (overall life satisfaction). Using over 450,000 survey responses, the authors find that life evaluation keeps rising with income, but emotional well-being only improves up to about $75,000/year, after which it plateaus.
Kahneman and Deaton - High Income Improves Evaluation of life but not emotional well-being (Hypothesis)
Hypotheses (with reasoning)
Hypothesis 1: Income is more strongly related to life evaluation than to emotional well-being.
Why: Prior research suggested that income affects how people think about their lives more than how they feel day-to-day.
Hypothesis 2: Emotional well-being increases with income only up to a certain threshold, after which it levels off.
Why: Based on theories of adaptation and diminishing returns (e.g., Weber’s Law), increases in income have smaller psychological effects at higher levels.
Kahneman and Deaton - High Income Improves Evaluation of life but not emotional well-being (Research Design)
Cross sectional
Analysis of large-scale survey data (no manipulation of variables)
Kahneman and Deaton - High Income Improves Evaluation of life but not emotional well-being (Validity Assessment)
Internal Validity (Moderate–Low)
No experimental manipulation → cannot establish causation
Many confounding variables (health, relationships, etc.)
Uses statistical controls, but causal claims are limited
Ecological Validity (Low)
Surveys are subject to bias; survey answers may not reflect reality
External Validity (High)
Very large sample size (450,000+ participants)
Nationally representative survey methods
Findings generalize well to U.S. population (less certain globally)
Kahneman and Deaton - High Income Improves Evaluation of life but not emotional well-being (Main Findings)
Money increases life satisfaction, but not daily happiness beyond ~$75K
Emotional well-being depends more on factors like health, relationships, and loneliness
Poverty worsens emotional suffering, especially in difficult life circumstances
Supports the idea that “money buys life satisfaction, but not happiness”
Kahneman and Deaton - High Income Improves Evaluation of life but not emotional well-being (Variables)
Independent Variable: Income
Dependent Variables: emotional well-being and life evaluation
Kahneman and Deaton - High Income Improves Evaluation of life but not emotional well-being (Sampling Method)
Probability Sample: Selection is random, all members of population have an equal chance of being selected
Kahneman and Deaton - Author’s Approach to Theory
Deductive
Builds on existing theories and tests them with data
Univariate Analysis Ex
Frequency of political views
Histogram of arrival delays
Frequency of units taken per quarter
Bivariate Analysis Ex
Examining two variables
Is income related to happiness?
Gun ownership and gun deaths (scatterplot)
Miles per gallon and vehicle weight (scatterplot)
Hours of studying and grade point average
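A bivariate question such as "hours of studying and grade point average" is often summarized with a correlation coefficient. The sketch below computes Pearson's r by hand; the data are made up purely for illustration and `pearson_r` is a hypothetical helper name.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (covariance numerator).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Invented example data: weekly study hours and GPA for five students.
hours = [2, 4, 6, 8, 10]
gpa = [2.4, 2.9, 3.1, 3.5, 3.8]
r = pearson_r(hours, gpa)   # close to +1 for this invented data
```

A value near +1 or -1 indicates a strong association, but it says nothing about which variable, if either, causes the other.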
Multivariate Analysis Ex
Is alcohol consumption associated with GPA, independent of academic level?
Academic salary by years since degree
Academic salary by rank, sex, and years since degree
Telles & Lim (summary)
This study examines whether racial income inequality in Brazil changes depending on how race is measured: self-classification versus interviewer classification. Using a 1995 national survey, the authors find that interviewer classification produces larger white–nonwhite income gaps than self-classification. They argue that interviewer classification better captures racial discrimination because discrimination operates based on how others perceive race.
Telles & Lim (Hypothesis)
Hypothesis 1: Racial income inequality is greater when race is defined by interviewer classification than by self-classification.
Why: Discrimination depends on how others classify a person’s race, not how individuals identify themselves.
Hypothesis 2: Browns (mixed-race individuals) occupy an intermediate socioeconomic position between whites and blacks, but closer to blacks when using interviewer classification.
Why: Prior theory (e.g., Degler’s “mulatto escape hatch”) suggests browns may have intermediate status, but this depends on how race is socially perceived.
Hypothesis 3: Inconsistency between self- and interviewer classification systematically affects measured inequality.
Why: If individuals are classified differently depending on method, estimates of income gaps will shift.
Telles & Lim (Type of Research Design)
cross-sectional observational study
Uses national survey data (Brazil, 1995)
No experimental manipulation
Compares measurement systems (self vs interviewer classification)
Telles & Lim (Validity Assessment)
Internal Validity (Moderate–Low)
No random assignment or manipulation of race
Cannot definitively prove discrimination causes income differences
Possible omitted variables (social networks, discrimination history, regional effects)
Ecological Validity (High)
Uses real-world labor market data
Reflects actual social classification practices in Brazil
Captures lived racial ambiguity and interaction-based classification
External Validity (Moderate–High)
Large national sample (urban Brazil, ~4,000 cases)
Likely generalizable to urban Brazil
Less certain for rural Brazil or other countries, though authors suggest broader Latin American relevance
Telles & Lim Variables
Independent Variable:
Race (self-classified vs interviewer-classified)
Dependent Variable:
Income (ordinal categories transformed into log income via maximum likelihood estimation)
Control Variables:
Age, age squared
Sex
Education (primary, secondary, college)
Region (Northeast vs others)
Urban size (large vs small cities)
Telles & Lim Sampling Method
Sampling Method
National probability sample of urban Brazil (1995)
Multi-stage random sampling:
municipalities → neighborhoods → streets → individuals
Sample size: ~4,000 valid respondents (after exclusions)
Only urban population included (age 16+)
Stratified Random Sampling
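The multi-stage idea above (pick municipalities, then neighborhoods within them, then people) can be sketched as nested random draws. This is a simplified illustration, not the survey's actual procedure: the sampling frame is invented, the street stage is collapsed into the neighborhood stage, and `multistage_sample` is a hypothetical helper.

```python
import random

def multistage_sample(frame, n_municipalities, n_neighborhoods, n_people, seed=0):
    """Draw people by sampling municipalities, then neighborhoods, then individuals."""
    rng = random.Random(seed)   # fixed seed only so the sketch is reproducible
    sample = []
    for muni in rng.sample(sorted(frame), n_municipalities):
        neighborhoods = frame[muni]
        for hood in rng.sample(sorted(neighborhoods), n_neighborhoods):
            sample.extend(rng.sample(neighborhoods[hood], n_people))
    return sample

# Invented frame: 5 municipalities x 4 neighborhoods x 20 residents.
frame = {
    f"municipality_{m}": {
        f"neighborhood_{m}_{h}": [f"person_{m}_{h}_{p}" for p in range(20)]
        for h in range(4)
    }
    for m in range(5)
}
respondents = multistage_sample(frame, n_municipalities=2, n_neighborhoods=2, n_people=3)
```

Each stage narrows the frame, so a full list of individuals is only needed for the few neighborhoods that were actually drawn.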
Telles & Lim Main Findings
Income inequality depends significantly on how race is measured
Interviewer classification produces larger racial inequality estimates
Self-classification tends to underestimate discrimination-linked inequality
Racial categories in Brazil are fluid and socially constructed, not fixed
“Race” operates as a social perception variable influencing economic outcomes
Telles & Lim Ontological Approach?
Constructionist
Racial categories in Brazil are fluid and socially constructed, not fixed
Sampling Error Ex
Ex: 2016 presidential election (surveys did not ask enough about education)
Pollsters did not ask about people’s education, so less-educated people, who favored Trump, were under-represented in polls and voting patterns were not accurately reflected
There was also non-response from Trump voters; if education had been used as an instrument in sampling, the polls would have been more reflective of reality
Campus Dining Ex
They randomly select names from an email list provided by the student union: error (sample frame problem: the list is incomplete, so the tool used to gather data is flawed)
Not everyone is a part of the student union list; it only includes undergrads with union membership, so it is not reflective of the total population
They randomly hand out survey leaflets only in the dorm cafeteria: bias
This accounts only for on-campus students, not commuters, and is biased toward students who are already there and may prefer cafeteria food over students who bring food back home
They use the official university enrollment database, which includes all current undergrads, and randomly select students from that complete list to invite to the survey: none
Snowball Sample Ex
Participants refer you to more people who belong to the same social group you are interested in
(ex: people involved in illegal activity; there is no list online or documented, so to understand these experiences you need referrals)
(ex: anorexia; you can’t assume people are listed with medical professionals, since they may not be seeking help, so referrals can help here)
(ex: Disney World attendees; a list is not easily available, so it may be easier to find somebody through referrals)
Stratified sampling
You divide the population into groups
Then you randomly select people from each group
Trying to be somewhat representative
Quota sampling
You divide into groups
Then you non-randomly choose whoever is easiest to find until the quota is filled
You can oversample something, doesn’t have to be proportionate, doesn’t have to be representative
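The difference between stratified and quota sampling comes down to how people are picked inside each group. The sketch below is illustrative only: the population, the group labels, and both helper functions are hypothetical, and "easiest to find" is modeled simply as "first to show up".

```python
import random

def stratified_sample(population, stratum_of, per_stratum, seed=0):
    """Divide into groups, then RANDOMLY select people from each group."""
    rng = random.Random(seed)   # fixed seed only so the sketch is reproducible
    strata = {}
    for person in population:
        strata.setdefault(stratum_of(person), []).append(person)
    return {name: rng.sample(members, per_stratum) for name, members in strata.items()}

def quota_sample(arrival_order, stratum_of, quotas):
    """Divide into groups, then take whoever turns up first until each quota is full."""
    filled = {name: [] for name in quotas}
    for person in arrival_order:
        group = stratum_of(person)
        if group in filled and len(filled[group]) < quotas[group]:
            filled[group].append(person)
    return filled

# Invented population: 80 undergrads and 20 grad students.
population = [("undergrad", i) for i in range(80)] + [("grad", i) for i in range(20)]

def by_level(person):
    return person[0]

stratified = stratified_sample(population, by_level, per_stratum=5)
quota = quota_sample(population, by_level, {"undergrad": 5, "grad": 5})
```

Both calls return five people per group, but only the stratified draw gives every member of a group the same chance of selection; the quota draw always returns the first five people encountered.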
Problem with Survey Research Ex (Trump Election)
Most Republicans say they doubt the 2020 election, but how many really mean it?
In early surveys done immediately after the election, a lot of Republicans responded as if they believed Trump had lost
When it comes to state or local elections, Republicans felt fairly good about how their votes were handled in 2020
Regardless of their true beliefs, Republicans said what they thought they should say in certain contexts; a report not about what they thought happened, but about their beliefs about what should have happened (answers vs reality)
Reported Vs Actual Behavior
Participants reported they would be less likely to honk their horn at a higher status car (luxury car) that was blocking traffic compared to a lower-status car (an old, beat-up car)
Later, when placed in an actual driving situation with a confederate’s car blocking traffic, participants honked equally at both higher-status and lower-status cars
Reported Vs Actual Behavior (Gender Talkativeness)
A widely cited claim from 2006 suggested a woman uses about 20,000 words per day while a man uses about 7,000
But no systematic study supported this; the claim was false
Accurate study: EAR (electronically activated recorder)
396 participants (210 women/ 186 men) - not quota sampling here
Women and men both use about 16,000 words per day (high individual variation)
There is more variation within genders than between them
How you answer this question is based on the gender stereotypes you are familiar with (reported vs actual behavior)
Reported Vs Actual Behavior (Church Attendance)
When asked in 1991 (Gallup polls), about 42% of adult Americans said they went to church or synagogue in the last 7 days
But in 1990s major denominations were not thriving and church service attendance appeared to be down
36-37% of Protestant residents in Ohio and the Cincinnati area reported attending weekly
But 19.6% Protestant attendance based on estimates of actual attendance in that area
Actual Catholic attendance in the US was 25% (rather than the 51% reported)
Sampling Issue?
Shouldn’t be - we have a lot of data
Research Method Issue?
Shouldn’t be - we have good methods
This gap is between reported vs actual behavior
A structured observation of medication rounds ex.
Aim
▪ Describe current practice in administering medication in an acute psychiatric unit
Sample
▪ Convenience – 3 acute mental health wards
Method
▪ Structured observation of 20 medication rounds
Results
▪ 97% of nurses showed warmth with good eye contact
▪ 46% initiated provision of information
▪ 35% inquired about patient health
▪ 17% inquired about medication problems
▪ 42% of nurses responded to patient requests for info
Conversational Group Study
Research question
▪ Are there constraints on the size and structure of conversational groups?
Sample
▪ Convenience – 3 public settings
▪ 802 “cliques”
Method
▪ Structured observation; 15-minute intervals
Results
▪ 54% of cliques involved 2 people
▪ 27% involved 3 people
▪ 13% involved 4 people
▪ < 6% involved 5-7 people
▪ Typical conversational groups found using this method are small: most have 2-4 people and few have more than 4
Reactive Effects EX
Research question
▪ Do doctors change prescribing behavior during observational studies of medical activity?
Method
▪ Study of medical records at T1 + observation study
▪ Study of medical records at T2 (1 year later)
Results
▪ Inappropriate prescribing was significantly lower during the observational study (–29%!)
Stivers & Majid (2007) Pediatric Interaction Summary
This study analyzes how pediatricians decide whether to address questions to children or parents during medical visits, and how this depends on question type, child characteristics, and parent demographics. It argues that speaker selection in interaction reflects implicit judgments about competence and authority, and may contribute to healthcare inequality.
Stivers & Majid (2007) Pediatric Interaction Type of Study/ Research Design
Quantitative observational study
Structured naturalistic interaction analysis
No manipulation of variables (non-experimental)
Uses coded video recordings of real-world behavior
Stivers & Majid (2007) Pediatric Interaction Sampling Method
Non-experimental convenience/field-based sampling of clinics
Pediatricians and visits drawn from community practices in LA County
Not nationally representative
Clustered naturally around participating physicians and clinics
Stivers & Majid (2007) Pediatric Interaction Variables
Dependent Variable
Question addressee selection
Parent (mother/father)
Child
Independent Variables
Question content (medical vs social)
Child age
Parent race (e.g., Black, Latino)
Parent education level
Father present (yes/no)
Doctor characteristics (race, gender)
Child gender
Interaction terms (e.g., race × education)
Stivers & Majid (2007) Pediatric Interaction Key Findings
1. Addressee patterns
60% of questions → parent (mostly mother)
37% → child
2. Content effect
Social questions → heavily directed to parents (+500% likelihood)
Medical questions → more often directed to children
3. Child competence effect
Older children more likely to be addressed (+22% per year)
4. Social inequality effects
Parent identified as Black → significantly less child-directed questioning (–78%)
Higher parent education reduces this disparity (interaction effect +32–56%)
5. No significant effects
Doctor race
Doctor gender
Child gender
Parent education alone
Stivers & Majid (2007) Pediatric Interaction Validity
Internal Validity (Moderate–High)
Strength: real-time behavioral data (not self-report)
Controls for many confounders in regression model
Limitation: observational design → cannot prove causation (only association)
Ecological Validity (Very High)
Real pediatric consultations
Natural interaction setting (no lab simulation)
Captures authentic communication behavior
External Validity (Moderate)
Limited geographic scope (LA County only)
Small number of physicians (38)
Likely not fully generalizable across all healthcare systems or countries