Critical Thinking, Measurement, and Observational Methods in Psychology

Truth, Evidence, and How We Know Things

Opening idea: How do we know something is true? We can test it, but in everyday life people rely on more than testable evidence.
Sources people rely on:
- Testing and empirical evidence (scientific approach)
- Authority (someone told me it’s true)
- Intuition (felt right, gut feeling)
Practical exercise: evaluating truth with intuition
- You’ll write down whether each statement is true or false using gut judgment.
- Four phrases explored:
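- Expressing pent-up anger reduces our anger: recorded as True (the instructor signaled the answer with a facial cue). This conflicts with catharsis research, which generally finds that venting does not reduce anger, so the recorded answer should be verified.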
- Opposites attract: discussed and debated in class; the transcript does not record a definitive answer.
- We are not very good at predicting what will make us happy: True.
- Birds of a feather flock together: True (we tend to like people who are similar to us).
Real-life example illustrating limits of intuition:
- Electroconvulsive therapy (ECT) for treatment-resistant depression is presented as effective, with some risks (memory loss is a possible side effect). The speaker notes that, despite its effectiveness, intuitive judgments deserve caution in controversial or high-stakes medical decisions.
- The point: personal experience or surface-level conclusions can mislead us about what’s true; careful evidence is needed.
Core aim: become better critical thinkers by recognizing biases and limitations of intuition.

Cognitive biases introduced in the section

Hindsight bias: after learning an outcome, we believe we could have foreseen it; knowing how things ended changes how we remember and interpret the earlier steps.
Overconfidence bias: we often overestimate how much we understand; both people who know a lot and people who know little can show high confidence.
Confirmation bias: tendency to seek information that confirms current beliefs and ignore disconfirming evidence; amplified by social media algorithms.
Illusory correlation: perceiving relationships where none exist (e.g., superstition or rituals affecting performance).
Social media context: algorithms feed similar content, reinforcing beliefs and suggesting others share our views, which can inflate perceived consensus.
Critical thinking hack: seek information from multiple perspectives and make informed judgments rather than accepting a single viewpoint.
Relevance to decision making: biases affect educators, detectives, and others where biased interpretation can have serious consequences.

Empiricism, the Scientific Method, and Observation

Personal experience and observation are the roots of science; empiricism is the belief that accurate knowledge comes from observation and data.
The scientific method provides a systematic process to study ideas and gather evidence:
- Start with observations
- Review existing literature
- Consider overarching theories
- Formulate hypotheses to explain phenomena
Can humans be studied the way chemical compounds are? Not quite:
- Humans are variable; individual differences matter (e.g., color preferences, personality).
- This variability makes studying human behavior more complex than studying inanimate objects.
Reactivity: people alter their behavior when they know they’re being studied; self-awareness and observation can change results.
Demand characteristics: participants alter responses to fit perceived expectations of the study.
Naturalistic observation: observing behavior in natural environments to reduce reactivity and demand characteristics, at the cost of less experimental control.
Internal vs. external validity:
- Internal validity: how well a study isolates a cause-and-effect relationship among the variables it examines
- External validity: how well findings generalize to the real world
Blinding and double-blind procedures:
- Blinding reduces observer bias (expectations influencing observations)
- Double-blind: neither the participant nor the researcher knows who receives which condition (e.g., drug vs placebo) to minimize bias
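The double-blind idea can be sketched in a few lines of Python. This is a minimal illustration, not a clinical protocol: the function name `assign_double_blind`, the `KIT-` code format, and the participant IDs are all hypothetical. Participants and researchers see only the opaque kit codes, while the key linking codes to conditions is assumed to be held by a third party until the study ends.

```python
import random

def assign_double_blind(participant_ids, seed=None):
    """Randomly split participants into drug/placebo and hide the labels behind codes.

    Returns (blinded, key):
      blinded: participant id -> opaque kit code (all that participants and researchers see)
      key:     kit code -> actual condition (held by a third party until the study ends)
    """
    rng = random.Random(seed)
    ids = list(participant_ids)
    rng.shuffle(ids)  # random order, so the first half is a random half
    half = len(ids) // 2
    conditions = ["drug"] * half + ["placebo"] * (len(ids) - half)
    # Kit codes are drawn at random so the code itself reveals nothing.
    codes = [f"KIT-{n:03d}" for n in rng.sample(range(1000), len(ids))]
    blinded = dict(zip(ids, codes))
    key = dict(zip(codes, conditions))
    return blinded, key

# Hypothetical participant IDs
blinded, key = assign_double_blind(["P1", "P2", "P3", "P4", "P5", "P6"], seed=42)
print(blinded)  # participant -> kit code; no condition is visible here
```

Keeping `key` separate from `blinded` is the point of the design: anyone recording observations works only from `blinded`, so their expectations cannot be tied to a condition.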

Measurement and Operationalization

Abstract constructs must be defined in concrete, measurable terms (operational definitions) before they can be studied.
Examples of abstract terms and how to operationalize them:
- Happiness: describe through observable indicators such as daily mood tasks, frequency of positive events, or task-related energy. Each method yields a measurable proxy rather than an absolute, universal definition.
- Intelligence: difficult to define precisely; in practice, tests aim to measure related constructs (e.g., problem-solving ability, learning rate, processing speed) and require careful validation.
- Shyness or avoidance of social contact ("antisocial" in the everyday sense): can be operationalized via observable behaviors or social interaction metrics.
Why operationalization matters:
- For decisions in education, psychology, or research, tools must be reliable and valid.

Reliability and Validity in Measurement

Reliability: consistency of a measurement tool; it should yield the same result under consistent conditions.
- Analogy: a reliable friend is stable and predictable; you can depend on their behavior.
Validity: whether the tool measures what it is supposed to measure; the meaning and usefulness of the score.
- Example: standardized tests like the Indiana state assessment (iSTEP) are valid for measuring achievement in reading and math, but using their scores to judge school performance or teacher effectiveness may be an invalid application.
Relation between reliability and validity:
- A measure must be reliable to be valid, but reliability alone does not guarantee validity; it must also measure the intended construct.
Visual analogy: reliability = hitting the same spot consistently; validity = hitting the right spot for the intended measurement.
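The target analogy can be simulated with a short Python sketch. It simplifies heavily: validity is reduced here to "being on target" (no constant bias), whereas real validity concerns whether the right construct is measured at all. The scale, the true value of 70 kg, and the bias and noise settings are all invented for illustration.

```python
import random
import statistics

def simulate_scale(true_value, bias, noise_sd, n=200, seed=0):
    """Return n simulated readings: true_value + constant bias + random noise."""
    rng = random.Random(seed)
    return [true_value + bias + rng.gauss(0, noise_sd) for _ in range(n)]

def summarize(readings, true_value):
    """Spread (low = consistent, i.e. reliable) and average error (near zero = on target)."""
    spread = statistics.stdev(readings)
    avg_error = statistics.mean(readings) - true_value
    return spread, avg_error

TRUE_WEIGHT = 70.0  # hypothetical true value, in kg

tools = {
    "reliable but not valid": simulate_scale(TRUE_WEIGHT, bias=5.0, noise_sd=0.2),
    "reliable and valid": simulate_scale(TRUE_WEIGHT, bias=0.0, noise_sd=0.2),
    "unreliable": simulate_scale(TRUE_WEIGHT, bias=0.0, noise_sd=4.0),
}

for name, readings in tools.items():
    spread, avg_error = summarize(readings, TRUE_WEIGHT)
    print(f"{name:>24}: spread={spread:.2f}  average error={avg_error:+.2f}")
```

The first tool hits the same spot every time but the wrong spot; the third scatters so widely that no single reading can be trusted, which is why reliability is a precondition for validity.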

Observing, Measuring, and Describing Data

Data can be described through graphic representations that communicate information succinctly.
Frequency distribution example (illustrative): a distribution of fine motor skill scores by gender suggests differences that can be visualized and interpreted.
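A frequency distribution is just a tally of how often each score occurs. The sketch below shows one way to build and print one in Python; the scores and group names are made up for illustration and are not data from the lecture.

```python
from collections import Counter

# Hypothetical scores (0-10 scale) for two made-up groups; not data from the lecture.
scores = {
    "group_a": [4, 5, 5, 6, 6, 6, 7, 7, 8],
    "group_b": [5, 6, 6, 7, 7, 7, 8, 8, 9],
}

def frequency_distribution(values):
    """Count how many times each score occurs, ordered by score."""
    counts = Counter(values)
    return dict(sorted(counts.items()))

# Print a simple text histogram for each group
for group, values in scores.items():
    print(group)
    for score, count in frequency_distribution(values).items():
        print(f"  {score:>2} | {'#' * count}")
```

Printing the two tallies one after the other makes a shift between groups visible at a glance, which is the job a graphic representation does.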
Normal distribution (bell curve):
- Most scores cluster around the mean with fewer extremes
- Typical properties (no need to memorize them):
- About $68\%$ of scores fall within one standard deviation of the mean
- About $95\%$ fall within two standard deviations of the mean
- On a standardized (z-score) axis, the zero point represents the average score (the mean, $\mu$)
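The 68% and 95% figures are rounded values that follow from the shape of the normal curve. As a quick check, the sketch below uses the standard identity that, for a normal distribution, the probability of falling within k standard deviations of the mean equals erf(k / sqrt(2)); the helper name `prob_within` is just an illustrative choice.

```python
import math

def prob_within(k):
    """P(|X - mu| <= k * sigma) for a normally distributed X."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD: {prob_within(k):.4f}")
# within 1 SD: 0.6827
# within 2 SD: 0.9545
# within 3 SD: 0.9973
```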
Non-normal distributions:
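- Skewed distributions: most scores cluster toward one end, with a long tail toward the other. In a positively skewed distribution most scores are low and the tail extends toward high values (pulling the mean above the median); in a negatively skewed distribution most scores are high and the tail extends toward low values.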
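One way to see skew in numbers is to compare the mean with the median and to compute a skewness coefficient. The sketch below uses the average cubed z-score (one common definition, computed with the population standard deviation); the scores are invented for illustration.

```python
import statistics

def skewness(values):
    """Average cubed z-score; positive means a longer right tail, negative a longer left tail."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)

# Hypothetical scores, invented for illustration.
positively_skewed = [1, 1, 2, 2, 2, 3, 3, 4, 9, 12]      # cluster low, tail to the right
negatively_skewed = [13 - v for v in positively_skewed]  # mirror image: cluster high

for name, data in [("positive", positively_skewed), ("negative", negatively_skewed)]:
    print(name, "mean:", statistics.mean(data), "median:", statistics.median(data),
          "skewness:", round(skewness(data), 2))
```

In the positively skewed sample the few high scores pull the mean above the median; in the mirrored sample the relationship reverses.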
Note: The specific numeric breakdowns are meant to aid understanding of distribution shapes rather than require memorization for the exam.

Practical Considerations in Research Design

Trade-offs in observation:
- Naturalistic observation increases external validity but reduces internal control
- Controlled settings increase internal validity but may reduce external applicability
Observer bias: expectations can color observations; mitigated by blinding and standardized coding schemes
Reproducibility and transparency: important for scientific credibility

Connections to Core Goals and Real-World Relevance

The overarching aim is to cultivate critical thinking: recognizing biases, understanding how we know what we know, and applying rigorous methods to study human behavior.
Real-world relevance:
- Educational testing and decision-making rely on measurement properties (reliability, validity)
- Media literacy and critical consumption of information combat confirmation bias and illusory correlations amplified by social media
- Ethical implications arise when misusing tests or overinterpreting data in education, law enforcement, or medicine

Foundational Terms and Quick Reference

Empiricism: knowledge through observation and experience
Observation: using senses to detect and describe phenomena
Operational definition: concrete, measurable definition of an abstract concept
Reliability: consistency of a measurement
Validity: the degree to which a measurement captures what it purports to measure and supports the intended use of its scores
Internal validity: confidence that observed effects are due to the manipulated variables within the study
External validity: generalizability of findings to real-world contexts
Reactivity: participants alter behavior because they know they are being observed
Demand characteristics: cues in an experiment that influence participants to act in expected ways
Naturalistic observation: observing behavior in natural environments with minimal interference
Normal distribution: bell-shaped distribution where most observations cluster around the mean
Skewness: asymmetry in the distribution of data
iSTEP: Indiana’s standardized test designed to measure achievement (valid for its intended purpose but not for all evaluative conclusions about schools or teachers)

Key Formulas and Quantitative References

Normal distribution properties (illustrative; memorization not required), for a normally distributed $X$:
- $P(|X-\mu| \le \sigma) \approx 0.68$
- $P(|X-\mu| \le 2\sigma) \approx 0.95$
Notation:
- Mean: $\mu$
- Standard deviation: $\sigma$
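Worked example (hypothetical scale, chosen only for illustration): suppose scores on a test are normally distributed with $\mu = 100$ and $\sigma = 15$. The two properties then translate into score ranges:
- $P(|X-100| \le 15) = P(85 \le X \le 115) \approx 0.68$
- $P(|X-100| \le 30) = P(70 \le X \le 130) \approx 0.95$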

Note on the Lecture’s Structure and Activities

The speaker plans to cover methods of observation today and methods of explanation in a future session.
Group activity described: discussing abstract terms (e.g., happiness, intelligence, shyness) and articulating concrete, measurable indicators for each.
Ethical and epistemic caution: even widely used tools require careful consideration of what they actually measure and how results are used.