Data analysis planning and reproducibility in science

  • We start with planning how to analyze data rather than throwing together analyses based on preferences; documentation and writing are essential for clarity.

  • Experimentation involves manipulating one or a few specific factors in a situation while carefully observing and reporting how other factors change.

  • Reproducibility is central: scientists aim to communicate results and, crucially, how those results were obtained so others can double-check the work.

  • Replication is the process of applying the original methodology (or a close variation) to see if the same results emerge, reinforcing the validity of conclusions.

  • Papers typically include sections detailing the questions asked, data sources, and the steps undertaken so others can reproduce the study.

  • Science is a deeply collaborative enterprise across disciplines, with progress built on the cumulative work of many researchers rather than single “great minds.”

  • The scientific method in psychology mirrors other sciences: theory → hypothesis → data → analysis → interpretation.

  • A theory is a broad, well-supported explanation backed by extensive observation; it’s more rigorous than colloquial uses of the word "theory".

  • From a theory, we derive a hypothesis to test in specific situations.

  • Data are collected, analyzed, and interpreted to evaluate the hypothesis; inductive reasoning from samples underpins this process.

  • This lecture outlines a sequence: samples, inductive reasoning, hypotheses, data collection, data analysis, and interpretation.

The scientific method: theory, hypothesis, data, analysis, interpretation

  • Theory: overarching idea supported by extensive testing and observation; aims to say something broad about a phenomenon.

  • Hypothesis: a specific, testable prediction about the relationship between two or more variables.

  • Data: collected to test hypotheses; can be quantitative (numbers) or qualitative (narratives).

  • Analysis and interpretation: researchers analyze data to draw conclusions about whether the hypothesis is supported or not.

  • Inductive reasoning: reasoning from specific observations or samples to general conclusions about a population; results are probabilistic and come with uncertainty.

  • Distinction from deductive reasoning: deductive reasoning starts with a general rule to deduce specifics; inductive reasoning generalizes from specifics to broader claims.

  • Example of inductive inference: predicting that a Syracuse winter will have snow based on a subset of observed winters; any such inference from past observations carries uncertainty about future outcomes.

  • Uncertainty and margins of error are inherent in inductive conclusions; even strong historical patterns do not guarantee future events.

  • Researchers often report a margin of error or probability indicating how likely it is that the observed effect reflects a true relationship rather than random variation.
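The notes don't give a formula for the margin of error, but a common sketch for a survey proportion is the normal approximation, MOE = z · √(p(1 − p)/n). A minimal illustration in Python, using the worst-case proportion p = 0.5 and the n = 300 figure from the employment-survey example in these notes:

```python
import math

def margin_of_error(p_hat, n, z=1.96):
    """Approximate 95% margin of error for a sample proportion,
    using the normal approximation: z * sqrt(p_hat * (1 - p_hat) / n)."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

# Worst-case proportion (p = 0.5) for a sample of 300 people:
moe = margin_of_error(0.5, 300)
print(f"±{moe:.1%}")  # about ±5.7 percentage points
```

Note how the margin shrinks as n grows, which is why even modest samples can support population-level inferences, provided they are representative.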

Samples, populations, and representativeness

  • When asking about human thoughts, emotions, or behavior, researchers aim to generalize to larger populations (e.g., humans, Americans, English speakers, students at Syracuse University).

  • It is generally impractical to survey or test everyone; instead, researchers study a sample.

  • Sample: a subset of the population that is studied to make inferences about the population. Example: asking about employment with a sample of 300 people rather than 2,000,000.

  • Population vs. sample: Population is the entire group of interest; sample is a smaller subset studied to draw conclusions.

  • Representativeness: a sample should have a demographic mix similar to the population of interest; otherwise, generalizations may be biased.

  • College students as a sample can differ from the general population in age, employment status, income, and other demographics.

  • Implications: findings from college student samples may not fully generalize to the broader population; researchers must consider how representative their sample is and discuss limitations.

  • Example concerns: younger age distribution among students affects questions about political views, health concerns, sleep, exercise, and diet; employment status can affect answers about work-related questions.

  • The census is one way governments collect data from a very large population, illustrating the scale needed for broad generalizations.

Inductive vs. deductive reasoning

  • Deductive reasoning (Sherlock Holmes style): start with broad observations or rules and deduce specific conclusions from them.

  • Inductive reasoning: start with specific observations (often from a sample) and infer general conclusions about the population.

  • In science, conclusions are drawn from sampled data to generalize about populations, with acknowledged uncertainty.

  • Example of inductive inference: “It will snow in Syracuse this winter” based on observed winters and lived experience; the conclusion is probabilistic, not certain.

  • Important caveat: conclusions are always conditional on the sample and data; they are educated guesses subject to revision with new information.

Hypotheses: testable and falsifiable

  • A hypothesis is a specific, testable prediction about the relationship between two or more variables that can be measured or categorized.

  • Measurable variables: anything that can be quantified or categorized (events, conditions, characteristics, behaviors).

  • Testable: the hypothesis must be measurable in a way that yields numerical data or categorical data.

  • Non-ideal hypotheses: statements about metaphysical or intangible notions are difficult to test and are poor scientific hypotheses.

  • Falsifiability: a hypothesis must have a defined failure state—clear criteria for what would count as disproving it.

    • Example: “There is a relationship between how late it is in the evening and how sleepy people report being.” A falsifying outcome would be finding no relationship between time of evening and sleepiness.

  • Some claims are difficult to falsify (e.g., certain conspiracy theories) because there is no agreed-upon way to disprove them; this is a mark of pseudoscience.

  • Distinguishing science from pseudoscience:

    • Scientific claims are testable, potentially falsifiable, and open to replication and scrutiny by others.

    • Pseudoscience often resists criticism, relies on anecdotes, and claims that cannot be disproven or updated with new data.

  • How to evaluate scientific claims:

    • Look for falsifiability and controlled testing (e.g., randomized controlled trials).

    • Compare claims that rely on controlled experiments versus anecdotes or selective data.

    • Be cautious of claims that rely on selective data or that cannot be tested or challenged.

  • Illustrative practice questions (testable and falsifiable or not):

    • The moon is made of green cheese: Testable and falsifiable via moon rocks and spectroscopy.

    • Depression is caused by invisible gremlins in the brain: If the gremlins are defined as undetectable, the claim cannot be tested or falsified; not a robust scientific claim.

    • Cupcakes are delicious: Not a scientific hypothesis (subjective, value-laden).

    • We should eat cupcakes: While related questions (health effects) can be studied, the statement itself is not a testable hypothesis.

    • Classical music increases intelligence: Testable and falsifiable via controlled comparisons of groups exposed to music vs. no music.

  • When constructing hypotheses, researchers start from a theory and derive a specific, testable statement about a particular situation that can be studied with measurable outcomes.

Descriptive, correlational, and experimental research

  • Descriptive research: describes “what’s going on” by asking people questions, collecting surveys, and summarizing data (e.g., average happiness on a scale).

  • Correlational research: investigates how one variable changes as another changes; assesses relationships but cannot establish causation.

    • Correlations can be positive (the variables move in the same direction) or negative (as one increases, the other decreases).

    • Used to predict how changes in one variable relate to changes in another, but not to establish mechanism.
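The strength and direction of a correlation are usually summarized with Pearson's r, which runs from −1 to +1. A minimal sketch with made-up study-hours vs. quiz-score data (the numbers are hypothetical, purely for illustration):

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient: +1 is a perfect positive
    relationship, -1 a perfect negative one, 0 no linear relationship."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical data: hours studied vs. quiz score.
hours = [1, 2, 3, 4, 5]
scores = [2, 1, 4, 3, 5]
print(round(pearson_r(hours, scores), 3))            # 0.8  (positive)
print(round(pearson_r(hours, [5, 4, 3, 2, 1]), 3))   # -1.0 (perfect negative)
```

Even a strong r says nothing about mechanism or causation; that is what the experimental designs below are for.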

  • Experimental research: establishes cause-and-effect by manipulating one or more independent variables and holding other variables constant.

    • Key goal: determine how changing a variable causes changes in another variable.

    • Methods for data collection: surveys, performance tests, observations, physiological measures (heart rate, blood pressure, hormones), or group-level observations.

  • Data types in research:

    • Quantitative data: numerical measurements (e.g., height in cm, weight, test scores, happiness on a 1–10 scale). Variables are defined in advance and measured consistently across participants.

    • Qualitative data: descriptive, narrative data that capture the quality or experience (e.g., interview transcripts, descriptions of feelings, themes emerging from responses).

  • Methodological complementarity:

    • Qualitative data can illuminate phenomena that are hard to measure quantitatively and can guide later quantitative research.

    • Quantitative data provide precise, comparable measurements and statistical analyses.

  • Often, researchers start with descriptive or qualitative work to understand a phenomenon and then move to quantitative methods to test hypotheses more rigorously.

Data types: quantitative vs. qualitative

  • Quantitative methods:

    • Involve measurements that yield numbers (e.g., heights, scores, counts).

    • Require clearly defined variables and predefined measurement procedures for consistency across participants.

    • Produce numerical results and enable statistical analysis.

  • Qualitative methods:

    • Focus on the quality or nature of experiences, not reducible to simple numbers.

    • Produce narratives, descriptions, and themes (e.g., interviews about concert experiences).

    • Variables may emerge from the data rather than being predefined; analysis focuses on identifying patterns and themes.

  • Relationship between the two:

    • They are complementary; many studies use a mixed-methods approach, starting with qualitative data to identify themes and then collecting quantitative data to test those themes.

Experimental design: controlled manipulation and measurement

  • Illustration of an experiment (candy and happiness):

    • A small demonstration with volunteers to illustrate manipulation and measurement.

    • Independent variable: candy vs. no candy (the presence or absence of candy).

    • Dependent variable: happiness level (self-reported on a scale of 1 to 10).

    • Design issues in the demonstration:

      • Sample: 10 volunteers (n = 10).

      • Group assignment: 5 in the candy group and 5 in the no-candy group.

      • Random assignment: Ideally, participants would be assigned to groups by randomization (e.g., flipping a coin); in the demonstration it was not strictly random, described as a convenience/ad hoc setup.

      • Baseline measurement: Happiness was measured before the manipulation to establish a baseline for each participant.

      • Baseline results: Participants showed varying baseline happiness (e.g., some reported around 6, others higher or lower), illustrating why a baseline helps interpret changes due to the manipulation.

      • Post-manipulation measurement: Happiness was assessed again after the manipulation to determine the effect of candy.

    • Potential confounds and limitations:

      • Volunteer bias: volunteers may be more motivated or have different preferences (e.g., some dislike candy).

      • Stage presence and the experimental setting may influence responses (demand characteristics, social desirability, excitement on stage).

      • Not all participants may respond the same way to candy due to individual differences.

      • The lack of strict randomization and other controls means the design is not a rigorous controlled experiment.

  • Core teaching point: A well-designed experiment carefully controls conditions, randomly assigns participants to groups, and manipulates one or more independent variables to observe effects on a dependent variable.

  • Key question in this example: What did we manipulate between conditions, and what variable did we observe changing?

    • Manipulated variable: whether a group received candy or not.

    • Observed variable: happiness level.
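The coin-flip idea from the demonstration can be sketched as a simple shuffle-and-split; the participant labels here are hypothetical:

```python
import random

# Hypothetical participant IDs for a candy demonstration (n = 10).
participants = [f"P{i}" for i in range(1, 11)]

def randomly_assign(people, seed=None):
    """Shuffle participants and split them evenly into two conditions,
    so each person has an equal chance of landing in either group."""
    rng = random.Random(seed)
    shuffled = people[:]          # copy; leave the original order intact
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"candy": shuffled[:half], "no_candy": shuffled[half:]}

groups = randomly_assign(participants, seed=42)
print(groups["candy"])
print(groups["no_candy"])
```

Randomization like this spreads individual differences (baseline happiness, candy preferences) roughly evenly across conditions, which is what the ad hoc assignment in the demonstration failed to guarantee.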

Practical guidance for evaluating scientific versus pseudoscientific claims

  • In scientific claims, researchers aim for falsifiability and replication; data are published with openness to review and criticism.

  • Pseudoscience often relies on anecdotes, selective data, and resistance to falsification or replication.

  • A strong scientific claim typically involves:

    • A testable hypothesis with defined measures.

    • A design that allows for potential falsification.

    • Methods and data that enable replication and verification by others.

  • Be mindful of how data are presented in media and online: look for whether claims reference controlled experiments, peer review, and whether alternative explanations have been considered.

Connections to foundational principles and real-world relevance

  • The process of scientific inquiry is cumulative and collaborative, building on prior work rather than relying on a single breakthrough.

  • Representativeness is crucial for external validity: conclusions drawn from a sample are most trustworthy when the sample closely matches the population of interest.

  • Inductive reasoning and uncertainty are inherent in scientific claims; researchers quantify the likelihood that results reflect true effects rather than chance.

  • Distinguishing descriptive, correlational, and experimental approaches helps researchers answer different kinds of questions:

    • Descriptive: What is happening?

    • Correlational: How are variables related?

    • Experimental: What causes what?

  • Ethical and practical implications include transparency, openness to replication, careful interpretation of generalizability, and avoidance of overclaiming from limited data.