Notes on Research Methods: Naturalistic Observation, Surveys, and Data Interpretation
Naturalistic observation (non-experimental): study design where researchers observe subjects in their natural environment without intervening or manipulating variables.
Purpose: understand behavior in real-world settings when manipulation is not feasible or ethical.
Example from transcript: observing vehicle speeds by color (red vs white) using a clipboard and speed guns to see if there’s a difference in speed without telling people what to do.
Key idea: you watch and record what participants do, not what you assign them to do.
Hawthorne effect (a major limitation of observation): the act of being observed changes participants’ behavior.
Classic finding: people alter their behavior because they know they are being watched.
Manifestation: in a lab/monitoring situation, participants may adjust behavior to appear as they think observers expect.
In naturalistic settings, the effect may be less pronounced but still present; depends on how overt the observation is and how variables are defined.
Example from transcript: observing drivers on a highway can cause them to speed up or act differently once they know they are under observation.
Operational definitions and observation: how you define and observe variables affects both the strength of the Hawthorne effect and the study's outcomes.
Two levels of definition:
Theoretical (vague) definitions of variables of interest (e.g., risk-taking, aggression).
Operational definitions (how you measure or observe those variables in a study).
Different methods can yield different observed relationships depending on how variables are operationalized.
Examples of naturalistic observation vs. intervention-based study:
Naturalistic example (non-manipulated): record which car color buyers choose (red vs white) and how fast cars of each color travel on highways, to check stereotypes or biases about color and speed.
Intervention example (manipulated): an experiment would require you to assign conditions (e.g., restrict color choices) and observe outcomes, which is often impractical for real-world purchase behavior.
Surveys as a non-experimental method:
Purpose: capture a wide range of behaviors without requiring controlled manipulation.
Benefits:
Can be conducted quickly to collect many data points.
Allows multiple tests over time to capture variability (not just a single test).
Self-report measures can assess attitudes, risk-taking, behaviors, etc.
Example from transcript: a risk-taking item where a participant chooses between a guaranteed payoff and a gamble (e.g., gamble: 50% chance of $20; sure payoff: $10).
This helps infer risk propensity without manipulating real-world financial decisions.
Caveat: surveys cannot establish causality; they reveal associations and correlations.
Strategy for robustness: use multiple tests across time to account for day-to-day variation (e.g., sickness, mood).
Risk-taking example in surveys:
Question: Would you prefer a guaranteed $10 payoff or a 50% chance of winning $20?
Interpretation: willingness to take the risk indicates higher risk-taking propensity.
Mathematical framing: the expected value of the gamble is \( EV = p \times \text{gain} + (1-p) \times \text{loss} = 0.5 \times 20 + 0.5 \times 0 = 10 \). This equals the sure payoff, illustrating how risk preference (not just EV) drives choice.
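The expected-value arithmetic for the survey item can be checked with a short Python sketch. The function name `expected_value` and its parameters are illustrative choices, not something from the transcript.

```python
def expected_value(p_win: float, gain: float, loss: float = 0.0) -> float:
    """Expected value of a two-outcome gamble: p * gain + (1 - p) * loss."""
    if not 0.0 <= p_win <= 1.0:
        raise ValueError("p_win must be between 0 and 1")
    return p_win * gain + (1.0 - p_win) * loss


sure_payoff = 10.0
# 50% chance of winning $20, otherwise $0 (the item from the notes).
gamble_ev = expected_value(p_win=0.5, gain=20.0, loss=0.0)

print(gamble_ev)                 # 10.0
print(gamble_ev == sure_payoff)  # True
```

Because both options have the same expected value, a participant's choice between them says something about risk preference rather than about which option pays more on average.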
Preference for multiple measurements over a single test:
Rationale: individuals’ performance can vary by day due to health, mood, etc.
Therefore, using a variety of tests and repeated measures yields a more reliable assessment of a trait or behavior.
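A small Python sketch of why repeated measures help. The five daily scores are made-up numbers for one hypothetical participant, chosen only to show how much a single test can differ from the average across days.

```python
import statistics

# Hypothetical scores for the same person on the same test across five days.
daily_scores = [62, 70, 55, 68, 65]

average_score = statistics.mean(daily_scores)
lowest, highest = min(daily_scores), max(daily_scores)

print(f"average over {len(daily_scores)} days: {average_score}")
print(f"a single test could have shown anything from {lowest} to {highest}")
```

With these numbers, a single test on the worst day would have understated the average by 9 points, and one on the best day would have overstated it by 6.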
Operationalization and variability in studying the same concept:
Concrete vs. abstract definitions: two researchers may describe the same variable differently, yet their operational definitions determine how the variable is measured.
Example: testing whether pets improve health via a naturalistic study (observe people with their pets) vs other observational approaches.
Data presentation and interpretation:
Data should be presented in two forms: numerically and visually.
Frequency distribution (visual): shows how many responses fall into each category or score on a scale.
Visual aids help those who respond better to pictures or graphs.
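A frequency distribution can be built and drawn as a simple text histogram in Python. The responses below are invented scores on a 0 to 4 scale, and `frequency_table` is a helper name introduced here for illustration.

```python
from collections import Counter


def frequency_table(responses, scale=range(5)):
    """Count how many responses fall on each point of the scale (0-4 by default)."""
    counts = Counter(responses)
    return {score: counts.get(score, 0) for score in scale}


# Hypothetical survey responses on a 0-4 scale.
responses = [3, 4, 2, 3, 3, 1, 4, 3, 2, 0, 3, 4]
table = frequency_table(responses)

# Numeric form and a crude visual form side by side.
for score, n in table.items():
    print(f"{score}: {n:2d} {'#' * n}")
```

Each row shows the same information twice: the count as a number and as a bar of `#` characters.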
Measures of central tendency: mean, median, and mode:
Mean (average):
Population mean: \( \mu = \frac{1}{N}\sum_{i=1}^{N} x_i \)
Sample mean: \( \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \)
Median: middle value of ordered data; if even number of observations, median is the average of the two middle values.
Mode: most frequent value in the data set.
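The three measures of central tendency can be computed with Python's standard `statistics` module. The scores are hypothetical, with an even number of observations so the median is the average of the two middle values.

```python
import statistics

# Hypothetical scores on a 0-4 scale (eight observations, so an even count).
scores = [0, 1, 2, 2, 3, 3, 3, 4]

mean_score = statistics.mean(scores)      # sum of scores / number of scores
median_score = statistics.median(scores)  # average of the two middle values (2 and 3)
mode_score = statistics.mode(scores)      # most frequent value

print(mean_score, median_score, mode_score)  # 2.25 2.5 3
```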
Understanding data shape and its impact on statistics:
Skewness describes asymmetry of the distribution:
Positive skew: tail on the right; typically mean > median > mode.
Negative skew: tail on the left; typically mean < median < mode.
Example discussed: in a sample where most scores sit near the higher end, a few extreme low scores can pull the mean downward, giving a misleading picture if only the mean is considered.
Visual example discussed: data that is not symmetric can appear negatively skewed when most responses cluster at higher end with a long tail to the left.
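A minimal Python sketch of the negatively skewed case described above. The data are invented: most scores sit at the top of a 0 to 4 scale, with two extreme low scores forming the left tail.

```python
import statistics

# Hypothetical negatively skewed data: mostly high scores, two extreme lows.
scores = [4, 4, 4, 3, 4, 3, 4, 4, 0, 0]

mean_score = statistics.mean(scores)
median_score = statistics.median(scores)
mode_score = statistics.mode(scores)

print(f"mean={mean_score}, median={median_score}, mode={mode_score}")
# The two zeros drag the mean below the median, which stays at the typical score.
print("mean below median:", mean_score < median_score)
```

Here the median and mode both stay at 4, the score most students gave, while the mean drops to 3. Reporting only the mean would understate the typical response.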
Variability and dispersion:
Standard deviation captures how spread out the data are around the mean.
Two comparative examples:
Class A: data more spread out (higher standard deviation).
Class B: data clustered around the mean (lower standard deviation).
Conceptual takeaway: even with the same mean, two datasets can differ substantially in dispersion.
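The Class A versus Class B comparison can be illustrated in Python. Both score lists are hypothetical and constructed to share a mean of 2, and `statistics.stdev` computes the sample standard deviation (dividing by n - 1).

```python
import statistics

# Hypothetical scores on a 0-4 scale; both classes have a mean of 2.
class_a = [0, 0, 2, 4, 4]  # spread out toward the extremes
class_b = [2, 2, 2, 1, 3]  # clustered around the mean

for name, scores in (("Class A", class_a), ("Class B", class_b)):
    mean_score = statistics.mean(scores)
    sd = statistics.stdev(scores)  # sample standard deviation (n - 1 denominator)
    print(f"{name}: mean={mean_score}, sd={sd:.2f}")
```

The means are identical, but Class A's standard deviation (2.00) is almost three times Class B's (about 0.71), which is the difference in spread the mean alone hides.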
Practical example with a scale (0 to 4):
Consider scores on a scale from 0 to 4.
If most students score around a single value but a few extreme values exist (e.g., 0s or 4s), the mean may not reflect the typical experience due to skew or outliers.
Implication for population inference: when making inferences about a population (e.g., psychology 101 students), you must decide how far you want to generalize:
Should you include results from different semesters (fall, winter, spring)?
Should you generalize to students at multiple universities or globally? These questions relate to external validity and generalizability and are constrained by practical sampling limits.
Population vs. sample and generalization:
Population: the entire group you want to understand (e.g., all psychology 101 students).
Population size can be enormous or even undefined (e.g., all psych 101 students across all universities).
In practice, you cannot access the entire population; you collect data from a subset called a sample.
Inferential statistics seek to make generalizations from the sample to the population, but only to the extent that the sample is representative and the sampling method is appropriate.
Question of scope in sampling: how far up the “chain” of population should you generalize (e.g., one course at one university vs. all universities, semesters, regions, etc.)?
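The population and sample distinction can be sketched with a simulation in Python. Everything here is an assumption for illustration: the "population" is 10,000 randomly generated scores on a 0 to 4 scale, and the sample is a simple random sample of 100 drawn from it.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 scores on a 0-4 scale.
population = [rng.randint(0, 4) for _ in range(10_000)]

# Simple random sample of 100 drawn without replacement.
sample = rng.sample(population, k=100)

mu = statistics.mean(population)  # population mean (normally unknown)
x_bar = statistics.mean(sample)   # sample mean (what a study actually observes)

print(f"population mean = {mu:.3f}")
print(f"sample mean     = {x_bar:.3f}")
```

The sample mean lands near the population mean only because the sample was drawn at random from that same population. If the sample came from one course at one university, it would support generalization only to whatever population that course fairly represents.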
Key takeaways for study design and interpretation:
When manipulation is not feasible, use naturalistic observations and/or surveys to gather data.
Be mindful of the Hawthorne effect and how observational context may alter behavior.
Clearly define variables both theoretically and operationally to ensure consistent measurement and interpretation.
Use multiple measures and repeated assessments to capture variability and reduce measurement error.
Present data both numerically and visually for accessibility and comprehension.
Recognize the limitations of the mean in skewed distributions and consider median and mode, as well as dispersion measures like standard deviation.
Understand and articulate the difference between population and sample, and the limits of generalization from a given study.
Quick recap of formulas and terms to remember:
Population mean: \( \boxed{\mu = \frac{1}{N}\sum_{i=1}^{N} x_i} \)
Sample mean: \( \boxed{\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i} \)
Standard deviation (sample): \( \boxed{s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2}} \)
Expected value of a gamble: \( EV = p \times \text{gain} + (1-p) \times \text{loss} \); e.g., for a 50% chance to win $20 and otherwise win $0: \( EV = 0.5 \times 20 + 0.5 \times 0 = 10 \).