Psychological Research Methods

The Scientific Method

Psychology's Foundation: Relies on the scientific method to study human behavior and mental processes.
Types of Reasoning:
- Deductive Reasoning: Starts with general ideas or theories and moves to specific observations or conclusions.
- Inductive Reasoning: Starts with specific observations and builds up to broader generalizations or theories.
- Hypothetico-Deductive Reasoning: A blended model that begins with an educated guess or question (hypothesis) and tests it through small, controlled observations.

Theory and Hypothesis

Theory:
- An orderly, integrated set of statements that describes, explains, and predicts behavior.
- Helps understand development to improve the welfare and treatment of individuals (e.g., children).
- Depends on scientific verification.
- Characteristics of a Good Theory:
  - Parsimonious: Simple and concise, explaining phenomena with the fewest assumptions.
  - Falsifiable: Capable of being tested and proven wrong.
  - Heuristic Value: Suggests new research questions and leads to further discoveries.
Hypothesis: A specific prediction drawn directly from a theory, designed to be tested.
Scientific Method Flow:
- Theories (e.g., Sleep boosts memory) lead to Hypotheses (e.g., When sleep deprived, people remember less from the day before).
- These hypotheses are tested through Research and Observations (e.g., giving study material before ample sleep vs. shortened sleep and testing memory).
- The results of research and observations lead researchers to confirm, reject, or revise the initial theory.

Reliability and Validity

Reliability: The ability of a measure to consistently produce the same result under the same conditions.
- Types of Reliability:
  - Inter-rater Reliability: The degree of agreement between multiple observers or raters.
  - Internal Consistency: The degree of correlation between different items on a single survey or test, indicating that all items measure the same construct.
  - Test-retest Reliability: The consistency of a measure's results over multiple administrations to the same individuals at different times.
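
Internal consistency is often summarized with a single number. The sketch below uses Cronbach's alpha, a common index that these notes do not name, so the choice of statistic is an assumption. The function name `cronbach_alpha` and the questionnaire scores are made up for illustration.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a set of survey items.

    `items` is a list of lists: one inner list per item, holding that
    item's score for each respondent (same respondent order throughout).
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    # Each respondent's total score across all items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


# Hypothetical 3-item questionnaire answered by 5 respondents.
item_scores = [
    [4, 3, 5, 2, 4],
    [5, 3, 4, 2, 4],
    [4, 2, 5, 3, 5],
]
print(round(cronbach_alpha(item_scores), 2))
```

Values closer to 1 mean the items rise and fall together across respondents, which is what "measuring the same construct" looks like in the data.
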
Validity: The extent to which a tool or test instrument measures what it is intended to measure.
- Types of Validity:
  - Ecological Validity: The degree to which research findings can be generalized to real-world settings and situations.
  - Construct Validity: The degree to which a test measures the theoretical construct it claims to measure.
  - Face Validity: The degree to which a variable or measure appears, on the surface, to be a valid measure of the construct.

Research Methods vs. Research Designs

Research Methods: The specific activities participants engage in during a study.
- Examples: Taking tests, answering questionnaires, responding to interviews, being observed.
Research Designs: The overall plans for research studies that permit the best possible test of the hypothesis.

Common Research Methods

Systematic Observation:
- Naturalistic Observation: Observing behavior in its natural environment.
- Structured Observation: Observing behavior in a controlled laboratory setting where conditions are uniform for all participants.
- Limitations:
  - Observer Influence: The presence of an observer may alter participants' behavior.
  - Difficulty in determining the cause of the observed behavior.
Self-Reports:
- Clinical Interview: A flexible interviewing procedure where the investigator obtains a complete account of the participant's thoughts.
- Structured Questionnaires: Written questions with a defined set of answers.
- Structured Interview: A self-report instrument where each participant is asked the same questions in the same way.
- Limitations:
  - Not useful with very young children.
  - Concerns about honesty and accuracy of responses.
  - Potential misinterpretation of questions by participants.
Clinical, or Case Study, Method:
- Provides a full, in-depth picture of a single individual's psychological functioning.
- Includes interviews, observations, and sometimes test scores.
- Limitations:
  - Difficult to make comparisons across individuals.
  - Lack of generalizability to other populations due to the unique nature of the individual studied.
Ethnography:
- A descriptive, qualitative technique aimed at understanding a culture or a distinct social group.
- Conducted through participant observation, involving months or years of participation in the daily life of the cultural community.
- Limitations:
  - Highly subjective nature of the observations and interpretations.
  - Limited generalizability to other cultures or groups.
Psychophysiological Methods:
- Goal: To understand biological processes involved in perception, cognition, and emotion.
- Measures Used: Heart rate, Event-Related Potentials (ERPs), Functional Magnetic Resonance Imaging (fMRI), eye tracking.
- Limitations:
  - Expensive equipment and procedures.
  - May be difficult to determine which specific aspect of a stimulus drives a biological response.
  - Susceptible to interference from other uncontrollable biological processes.
Archival Research:
- Employs existing data (e.g., hospital records, school achievement tests, census data).
- Benefits:
  - Cost-effective.
  - Allows the study of long-term trends.
  - Can utilize large and representative samples.
- Limitations:
  - Records may be incomplete.
  - Limits study to only available data; researchers cannot collect new data.
  - Lack of control over the original method of data collection.
  - Ethical concerns may arise from using sensitive past data.
Meta-analysis:
- A statistical technique that combines the results of many independent studies on a similar topic.
- Benefits:
  - Increases statistical power by pooling data.
  - Helps provide more consistent results and a clearer picture of phenomena.
  - Useful for evidence-based practice and policy decision-making.
- Limitations:
  - Garbage in, garbage out: if the included studies are poorly designed, the pooled result will be unreliable too.
  - Can be difficult to compare study results if there are significant differences in study designs or methodologies.
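
To make "pooling" concrete, here is a minimal sketch of one common way to combine study results: a fixed-effect, inverse-variance weighted average. The notes do not say which pooling method is meant, so this choice is an assumption. The function name `pool_fixed_effect` and the effect sizes are hypothetical.

```python
import math


def pool_fixed_effect(effects, variances):
    """Combine per-study effect sizes with inverse-variance weights.

    Returns (pooled_effect, pooled_standard_error).
    """
    if len(effects) != len(variances) or not effects:
        raise ValueError("need one variance per effect, and at least one study")
    weights = [1.0 / v for v in variances]  # more precise studies count for more
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se


# Hypothetical effect sizes and sampling variances from three studies.
effects = [0.30, 0.10, 0.25]
variances = [0.04, 0.01, 0.02]
pooled, se = pool_fixed_effect(effects, variances)
print(f"pooled effect = {pooled:.3f}, standard error = {se:.3f}")
```

The pooled standard error is smaller than that of any single study, which is the sense in which pooling increases statistical power.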

General Research Designs

Correlational Design:
- Information on individuals is gathered in natural life circumstances without altering participants' experiences.
- Reveals relationships (associations) between participants' characteristics and their behavior or development.
- Does not permit the researcher to infer cause and effect.
- Correlation Coefficient ( $r$ ):
  - A number describing how two variables are associated with each other.
  - Magnitude (size) shows the strength of the relationship (closer to $+1$ or $-1$ is stronger, closer to $0$ is weaker).
  - Sign ( $+$ , $-$ ) shows the direction of the relationship.
    - Positive Correlation: As one variable increases, the other also increases (e.g., weight and height).
    - Negative Correlation: As one variable increases, the other decreases (e.g., tiredness and hours of sleep).
    - No Correlation: No systematic relationship between the variables (e.g., shoe size and hours of sleep).
- Scatterplot:
  - A graphed cluster of dots, where each dot represents the values of two variables.
  - Slope: Shows the direction of the relationship.
  - Scatter: Shows the strength of the correlation (little scatter indicates high correlation).
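
As a rough illustration of how $r$ captures both direction and strength, the sketch below computes the Pearson correlation coefficient by hand. The function name `pearson_r` and all data values are invented for the example.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and the two sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


# Made-up data for illustration only.
height_cm = [150, 160, 170, 180, 190]
weight_kg = [52, 58, 69, 74, 88]
hours_of_sleep = [8, 7, 6, 5, 4]
tiredness = [2, 3, 5, 6, 9]  # self-rated, higher = more tired

print(round(pearson_r(height_cm, weight_kg), 2))       # positive sign
print(round(pearson_r(hours_of_sleep, tiredness), 2))  # negative sign
```

The sign of each result gives the direction of the relationship, and the distance from 0 gives its strength.
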
Correlation and Cause-Effect:
- Correlation indicates the possibility of causation but does not prove it.
- Challenges in Inferring Causation:
  - Third Variable Problem: An unmeasured third variable might be causing the observed correlation (e.g., distressing events or biological predisposition causing both low self-esteem and depression).
  - Directionality Problem: It's unclear which variable causes the other (e.g., does low self-esteem cause depression, or does depression cause low self-esteem?).
  - Example: Obesity and TV watching might be correlated, but this doesn't automatically mean TV watching causes obesity, or vice-versa; a third factor like diet or activity level could be at play.
Experimental Design:
- A research design where the investigator manipulates an independent variable to determine its effect on a dependent variable.
- Participants are assigned to control and experimental conditions, typically by chance.
- Control over Participant Awareness:
  - Double-blind Procedure: Neither the participants nor the researchers know which group (experimental or control) participants are in, minimizing bias.
  - Single-blind Procedure: Participants do not know which group they are in, but the researchers do.
- Sometimes combined with matching participants on key characteristics (e.g., age, gender, aggression level) to ensure groups are comparable.
- Key Variables:
  - Independent Variable (IV): The variable that the investigator manipulates and expects to cause changes in another variable.
  - Dependent Variable (DV): The variable that the investigator expects to be influenced by the independent variable; the outcome measured.
  - Confounding Variable: A variable so closely associated with the independent variable that the researcher cannot tell which one is responsible for changes in the dependent variable.
- Random Assignment: An unbiased, chance-based procedure that distributes participant characteristics (including potential confounding variables) roughly evenly across treatment groups.
- Ecological Validity: In experimental designs, this assesses whether conclusions drawn from controlled laboratory studies apply to the real world.
- Example: TV Violence and Aggression
  - Independent Variable: Type of TV show watched (violent vs. non-violent).
  - Dependent Variable: Number of aggressive behaviors shown by each group after the show.
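
Random assignment for the TV example above could be sketched as follows. This is one simple way to do it (shuffle, then split in half), not a prescribed procedure. The function name `randomly_assign` and the participant IDs are hypothetical.

```python
import random


def randomly_assign(participants, seed=None):
    """Shuffle participants and split them into two groups of (nearly) equal size.

    Returns (experimental_group, control_group). With an odd number of
    participants, the second group gets the extra person.
    """
    rng = random.Random(seed)  # a seed makes the assignment reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical participant IDs.
participant_ids = [f"P{n:02d}" for n in range(1, 21)]
violent_show, nonviolent_show = randomly_assign(participant_ids, seed=42)
print("violent show:", violent_show)
print("non-violent show:", nonviolent_show)
```

Because chance alone decides who watches which show, traits such as prior aggression level should be spread across both groups rather than concentrated in one.
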
Modified Experimental Designs:
- Field Experiments: Make use of rare opportunities for random assignment in natural settings, increasing ecological validity.
- Natural, or Quasi-, Experiments: Compare treatments or conditions that already exist, so the independent variable is determined by circumstances rather than manipulated by the researcher (e.g., comparing existing educational interventions).

Designs for Studying Development

Longitudinal Design:
- Definition: Participants are studied repeatedly over an extended period, and changes are noted as they get older.
- Cohort Effects: A limitation where children born at the same time are influenced by particular cultural and historical conditions that may not apply to children developing at other times.
- Advantages:
  - Allows testing the stability or instability of traits over time.
  - Provides insights into individual as well as group developmental trends.
  - Can help in identifying predictors and inferring cause-effect relationships (though still challenging).
- Disadvantages:
  - Time-consuming and expensive.
  - Biased Sampling: People willing to take part in a long-term study may differ from the general population, so the sample may be unrepresentative.
  - Outdated Theory: The original theoretical framework may become obsolete during the long study period.
  - Cohort Effect: Specific to the generation being studied, limiting generalizability.
  - Selective Attrition: Participants dropping out over time, potentially biasing the sample.
Cross-sectional Design:
- Definition: Groups of people differing in age are studied at the same point in time.
- Advantages:
  - Relatively quick and cheap to conduct.
  - Provides information about group (age-related) differences.
  - Not affected by selective attrition or practice effects from repeated testing.
- Disadvantages:
  - Does not provide evidence about change at the individual level.
  - Also subject to cohort effects, as different age groups come from different generations.
Sequential Designs:
- Definition: Researchers conduct several similar cross-sectional or longitudinal studies (called sequences), combining elements of both approaches.
- Purpose: To determine if cohort effects exist and study longer developmental periods in less actual time.
- Advantages:
  - Efficient, allowing for both longitudinal and cross-sectional comparisons; if results are similar, conclusions are stronger.
  - Can directly determine whether cohort effects are operating.
- Disadvantages:
  - Still time-consuming.
  - Can be expensive.
- Example: Following three cohorts (e.g., born 2005, 2006, 2007) longitudinally for three years. Testing them at overlapping ages lets researchers check for cohort effects by comparing participants born in different years but reaching the same age. This setup can infer a developmental trend across five years (e.g., ages 11 to 15) in just three years of actual study.
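
The arithmetic behind the three-cohort example can be checked with a short sketch. The birth years come from the example; the first testing year (2018) is an assumption chosen so that the youngest cohort starts at age 11, and ages are approximated as testing year minus birth year. The function name `ages_at_each_wave` is hypothetical.

```python
def ages_at_each_wave(birth_years, first_wave_year, n_waves):
    """Map each cohort's birth year to its age at every annual testing wave."""
    wave_years = [first_wave_year + i for i in range(n_waves)]
    return {born: [year - born for year in wave_years] for born in birth_years}


# first_wave_year=2018 is an assumed value, not given in the notes.
schedule = ages_at_each_wave([2005, 2006, 2007], first_wave_year=2018, n_waves=3)
for born, ages in schedule.items():
    print(f"born {born}: tested at ages {ages}")

all_ages = sorted({age for ages in schedule.values() for age in ages})
print("ages covered:", all_ages)

# Ages reached by more than one cohort allow a cohort-effect check.
shared = [a for a in all_ages
          if sum(a in ages for ages in schedule.values()) > 1]
print("ages shared by two or more cohorts:", shared)
```

Three testing waves cover ages 11 through 15, and ages 12 to 14 are each reached by at least two cohorts, which is where cohort effects can be checked.
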
Microgenetic Design:
- Definition: An adaptation of the longitudinal approach where children are presented with a novel task, and their mastery is tracked over a series of closely spaced sessions.
- Often, research combines various experimental strategies.

Statistical Reasoning in Everyday Life: Statistical Literacy

Statistical Literacy:
- Involves understanding statistics and what they mean.
- Crucial for making informed decisions (e.g., understanding COVID-19 risks and vaccine protectiveness).
- Helps combat statistical misinformation, which often arises from off-the-top-of-the-head estimates or big, round, undocumented numbers (e.g., "10 percent of brain used," "10,000 steps a day").
- Example (Vaccines): Interpreting risk correctly means comparing rates, not raw counts. A few "breakthrough" cases in a large vaccinated group can represent a much lower risk than a high mortality rate in a small unvaccinated group.
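
A small sketch of the arithmetic behind the vaccine example. Every number here is invented purely to show why group size matters; none of it is real data. The function name `rate_per_100k` is hypothetical.

```python
def rate_per_100k(cases, group_size):
    """Cases per 100,000 people in a group."""
    return cases / group_size * 100_000


# Entirely made-up counts, chosen only to show the arithmetic.
vaccinated = {"people": 900_000, "severe_cases": 90}
unvaccinated = {"people": 100_000, "severe_cases": 60}

vax_rate = rate_per_100k(vaccinated["severe_cases"], vaccinated["people"])
unvax_rate = rate_per_100k(unvaccinated["severe_cases"], unvaccinated["people"])

print(f"vaccinated:   {vaccinated['severe_cases']} cases, {vax_rate:.0f} per 100,000")
print(f"unvaccinated: {unvaccinated['severe_cases']} cases, {unvax_rate:.0f} per 100,000")
```

In this made-up scenario the vaccinated group has more cases in absolute terms (90 vs. 60) but a rate six times lower (10 vs. 60 per 100,000), because it is nine times larger.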

Statistical Reasoning in Everyday Life: Descriptive Statistics

Descriptive Statistics: Use of statistical methods to provide a simple summary of data.
Bar Graphs:
- It is easy to design a graph that makes a difference appear big or small by manipulating the vertical scale (y-axis).
- Key: When interpreting graphs, always consider the scale labels and note their range to avoid misinterpretation.
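
The effect of the y-axis can be put in numbers without drawing anything. The sketch below compares how tall two bars would be drawn relative to each other when the axis starts at 0 and when it is truncated. The function name `drawn_height_ratio` and the scores are hypothetical.

```python
def drawn_height_ratio(value_a, value_b, axis_start):
    """How many times taller bar B is drawn than bar A on a given y-axis.

    A bar's drawn height is its value minus the point where the axis starts.
    """
    if min(value_a, value_b) <= axis_start:
        raise ValueError("axis must start below both values")
    return (value_b - axis_start) / (value_a - axis_start)


# Hypothetical scores for two groups.
group_a, group_b = 97, 99

print(drawn_height_ratio(group_a, group_b, axis_start=0))   # axis from 0
print(drawn_height_ratio(group_a, group_b, axis_start=96))  # truncated axis
```

With the axis starting at 0 the bars look almost identical (ratio about 1.02). Starting it at 96 makes the second bar three times as tall, although the data are unchanged.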

Statistical Reasoning in Everyday Life: Measures of Central Tendency

Measures of Central Tendency: Each is a single score that represents a whole set of scores.
- Mode: The most frequently occurring score(s) in a distribution.
- Mean: The arithmetic average of a distribution, obtained by adding all scores and then dividing by the number of scores. Can be distorted by a few atypical or extreme scores (outliers).
- Median: The middle score in a distribution; half the scores are above it, and half are below it. Less affected by outliers than the mean.
- Skewed Distribution: In a skewed distribution (e.g., income), the mean, median, and mode may differ significantly. For instance, a few very high incomes can pull the mean upward, making it a less representative measure than the median.
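
The three measures, and the way an outlier pulls the mean, can be seen with Python's standard `statistics` module. The income figures are hypothetical.

```python
from statistics import mean, median, multimode

# Hypothetical annual incomes (in thousands); one very high earner.
incomes = [30, 35, 35, 40, 45, 50, 500]

print("mode:  ", multimode(incomes))  # most frequent score(s)
print("median:", median(incomes))     # middle score
print("mean:  ", mean(incomes))       # arithmetic average

# Dropping the outlier moves the mean a lot but the median only a little.
without_outlier = incomes[:-1]
print("mean without outlier:  ", mean(without_outlier))
print("median without outlier:", median(without_outlier))
```

Here the single income of 500 lifts the mean to 105, well above what six of the seven people earn, while the median stays at 40.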

Statistical Reasoning in Everyday Life: Measures of Variation

Measures of Variation: Reveal the similarity or diversity in scores within a distribution.
- Range: The difference between the highest and lowest scores in a distribution.
- Standard Deviation: A computed measure of how much scores differ from the mean score. A larger standard deviation indicates greater variability.
- Normal Curve (Normal Distribution):
  - A symmetrical, bell-shaped curve that describes the distribution of many types of data.
  - Most scores fall near the mean (about $68$ percent fall within one standard deviation of it), and fewer and fewer occur near the extremes.
  - Example: Scores on aptitude tests (like the Wechsler Adult Intelligence Scale) tend to form a normal curve, with an average score typically set at $100$ .
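
A short sketch of range, standard deviation, and the "about 68 percent" figure, using the standard `statistics` module. The score lists are hypothetical, and the standard deviation of 15 used for the normal curve is an assumption (the usual IQ-scale convention); the notes give only the mean of $100$.

```python
from statistics import NormalDist, pstdev

# Two hypothetical sets of scores with the same mean (100) but different spread.
tight = [98, 99, 100, 101, 102]
spread_out = [70, 85, 100, 115, 130]

for name, scores in [("tight", tight), ("spread out", spread_out)]:
    score_range = max(scores) - min(scores)
    # pstdev treats the list as the whole distribution (population SD).
    print(f"{name}: range = {score_range}, standard deviation = {pstdev(scores):.1f}")

# Normal curve with mean 100; sigma=15 is an assumed value, not from the notes.
iq = NormalDist(mu=100, sigma=15)
within_one_sd = iq.cdf(115) - iq.cdf(85)
print(f"share within one standard deviation: {within_one_sd:.1%}")
```

The share within one standard deviation is the same for any normal curve; the mean and standard deviation only shift and stretch it.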

Inferential Statistics

Purpose: Used to determine whether an observed difference in a sample can be generalized to the population from which the sample was drawn.
Statistical Significance:
- A statistical statement of how likely it is that an obtained result (such as a difference between samples) occurred by chance.
- Assumes there is no difference between the populations being studied (the null hypothesis).
Principles for Inferring Population Difference from Sample Difference:
- Representative Samples: Are better than biased (unrepresentative) samples.
- Bigger Samples: Are better than smaller ones, as they tend to be more representative.
- More Estimates: Are better than fewer estimates; generalizations based on a few unrepresentative cases are unreliable.
Statistical Testing:
- Researchers use statistical tests to estimate the probability of the result occurring by chance, assuming the null hypothesis (that no difference exists between groups) is true.
- When estimates are reliable and the difference between them is relatively large, it is more likely that the difference is statistically significant.
- $p$ -values: Indicate the probability of observing a result at least as extreme as the one obtained, given the null hypothesis is true.
- Strong evidence to reject the null (no-difference) hypothesis occurs when the $p$ -value is very low; the cutoff is conventionally set at $5$ percent ( $p < .05$ ).
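
One way to see what "the probability of the result occurring by chance" means is an exact permutation test, sketched below. The notes do not name a specific test, so this choice is an assumption. The function name `permutation_p_value` and the recall scores (loosely based on the sleep and memory example earlier) are hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test for a difference in group means.

    Under the null hypothesis the group labels are arbitrary, so every way
    of splitting the pooled scores into two groups is equally likely.
    """
    pooled = list(group_a) + list(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    n_a = len(group_a)
    as_extreme = 0
    total = 0
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point rounding.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            as_extreme += 1
    return as_extreme / total


# Hypothetical words recalled after ample vs. shortened sleep.
ample_sleep = [14, 12, 15, 13, 16]
short_sleep = [10, 11, 9, 12, 8]
p = permutation_p_value(ample_sleep, short_sleep)
print(f"p = {p:.3f}")
print("statistically significant at .05" if p < 0.05 else "not significant at .05")
```

The returned value is the fraction of all possible regroupings that produce a difference at least as large as the observed one, which is the $p$ -value under the no-difference assumption. Enumerating every split is only practical for small samples like these.
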
Statistical Significance vs. Practical Significance:
- Statistical significance indicates the likelihood that a result would have happened by chance if the null hypothesis were true.
- A statistically significant result may have little practical significance (e.g., a tiny effect size).
- Statistical significance by itself says nothing about the importance or magnitude of a result in a real-world context.
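
Effect size can be quantified separately from significance. The sketch below uses Cohen's $d$ (the difference in means divided by the pooled standard deviation), a common effect-size measure that these notes do not name, so the choice is an assumption. The function name `cohens_d` and the scores are hypothetical.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample variance."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b))
                  / (n_a + n_b - 2))
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


# Hypothetical test scores: the groups differ by half a point on average.
group_a = [100.5, 85.5, 115.5, 92.5, 108.5]
group_b = [100.0, 85.0, 115.0, 92.0, 108.0]
print(f"d = {cohens_d(group_a, group_b):.3f}")
```

By the usual rule of thumb (roughly 0.2 small, 0.5 medium, 0.8 large), a value this close to 0 is negligible. With a large enough sample, even a difference this small could reach $p < .05$ , which is why the two kinds of significance need to be judged separately.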

Ethical Research Guidelines

Overarching Principle: Protect the participants (human or non-human animals) involved in the study.
Research Ethics Board (REB):
- An ethics oversight group that evaluates research to protect the rights of participants.
- Composed of a mixture of researchers from inside and outside of the specific field (e.g., an Interdisciplinary Committee on Ethics in Human Research (ICEHR) or a Health Research Ethics Board (HREB)).
REB Requirements for Human Participants:
- Informed Consent: Researchers must provide as much information as possible (purpose, procedure, risks, benefits) so that people can make informed decisions to participate. Participants must be of majority age.
- Protection from Harm: Participants must be protected from medical and physical risk, as well as undue emotional stress.
- Confidentiality: Participant identities and their personal data must be protected.
- Voluntary Participation: Participation must be completely voluntary, and participants must be able to end their involvement at any point during the study without penalty.
- Complete Disclosure/Debriefing: Researchers should not deceive participants unless deception is essential to the study. If a study requires deception, participants must be fully debriefed at the end, with the true purpose explained and any misconceptions addressed.
Ethical Guidelines for Research with Non-Human Animals (as they cannot provide consent):
- Use for research, teaching, and testing is acceptable only if it contributes to understanding fundamental biological principles or to knowledge that can be expected to benefit animals and humans.
- Animals may be used only if researchers' best efforts to find an alternative (one that does not involve animals) have failed.
- Researchers must employ the most humane methods, using the smallest number of appropriate subjects to obtain valid information.
- Efforts must be made to limit pain and distress and ensure proper recovery periods for the animals.