Week 2 Notes: Research Questions and Ethics (STM 1,001)

  • Overview and context

    • This week focuses on study design for the Science Health Stream: research questions and ethics.
    • Readings introduce terminology around research questions and include a section on ethics; readings are accessible via the LMS under the Important Information tile.
    • Ethics are fundamental to sound scientific research and should be built into the study design rather than ignored.
    • In labs, readings will be referenced and questions will be embedded in activities.
  • Core message: start with clear, answerable research questions that align with data collection

    • Questions should be reasonably answerable; otherwise data collection may be misaligned with the question.
    • The type of data collected is determined by the research question.
    • If a question is poorly formed, data collected might be inappropriate for the study.
    • It can be challenging to design strong research questions; time spent on a solid foundation pays off in valid results.
    • It is often useful to break a broad research question into smaller sub-questions, potentially separating quantitative from qualitative approaches; a multifaceted question can meet its overall objective through several sub-questions.
  • Conceptual vs operational definitions

    • Conceptual definition: articulates what is being measured or what the value represents in a theoretical sense.
    • Operational definition: specifies how measurement is actually carried out (what instrument, where, how long, what procedures).
    • Why definitions matter: clarity avoids ambiguity, especially in cross-language or cross-country contexts where translation can create confusion.
    • Example: temperature
    • Conceptual: what is temperature?
    • Operational: how is temperature measured (thermometer type, placement, duration of measurement, units)?
    • Real-world illustration: NASA’s experience with differing measurement systems (metric vs imperial) across teams and countries; lack of a clear, shared definition can cause component mismatch.
    • In water-temperature examples: conceptual definition of temperature vs the concrete method of measuring temperature (thermometer placement, duration, etc.).
    • Labs will provide practice with distinguishing and applying both definitions.
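
One way to keep the two definitions side by side is to record them in a small structure. The sketch below is only an illustration: the class name `OperationalDefinition`, its fields, and every value in the water-temperature example (instrument, depth, duration) are hypothetical and are not taken from the readings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperationalDefinition:
    """Pairs a conceptual definition with the concrete measurement protocol."""
    variable: str
    conceptual_definition: str  # what the value represents
    instrument: str             # what is used to measure it
    placement: str              # where the measurement is taken
    duration_seconds: int       # how long the instrument is left in place
    units: str                  # units in which the value is recorded

    def protocol(self) -> str:
        """Return a one-line protocol that another researcher could follow."""
        return (
            f"Measure {self.variable} with a {self.instrument}, "
            f"{self.placement}, for {self.duration_seconds} s; "
            f"record in {self.units}."
        )


# Hypothetical values, chosen only to illustrate the structure.
water_temperature = OperationalDefinition(
    variable="water temperature",
    conceptual_definition="how hot or cold the water is",
    instrument="digital probe thermometer",
    placement="10 cm below the surface",
    duration_seconds=60,
    units="degrees Celsius",
)

print(water_temperature.protocol())
```

Writing the protocol down in this much detail means two people measuring the same water should follow the same steps and report in the same units.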
  • The POCI framework for research questions (Population, Outcome, Comparison/Connection, Intervention)

    • The acronym POCI stands for the four components: P, O, C, I.
    • Population (P): the group or set of units being studied; can be humans, objects, materials, etc. Population is not limited to people (e.g., bamboo flooring material produced in Queensland).
    • Outcome (O): the result or variable of interest that will be measured or observed, typically expressed as a summary measure (mean, proportion, etc.). The outcome is usually not a single individual's value but a population parameter to be inferred.
    • Comparison (C) or Connection (also called a relational component): a difference between groups (comparison) or a relationship between variables (connection).
    • Intervention (I): an explicit condition introduced by the researchers to assess its effect on the outcome (common in experimental studies).
    • If you have only Population + Outcome, the question is descriptive.
    • If you add a Comparison or Connection, it becomes relational (you’re examining differences or associations).
    • If you include an Intervention, you have an interventional research question (often with causal aims).
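
The three rules above can be sketched as a small classifier. This is a minimal illustration, assuming every question already has a Population and an Outcome; the class `ResearchQuestion` and its field names are hypothetical, and the example questions are paraphrased from these notes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResearchQuestion:
    """The four POCI components; C and I are optional."""
    population: str
    outcome: str
    comparison_or_connection: Optional[str] = None
    intervention: Optional[str] = None

    def question_type(self) -> str:
        """Classify the question by which optional components are present."""
        if self.intervention is not None:
            return "interventional"
        if self.comparison_or_connection is not None:
            return "relational"
        return "descriptive"


height = ResearchQuestion(
    population="STM 1,001 students",
    outcome="average height",
)
height_by_stream = ResearchQuestion(
    population="STM 1,001 students",
    outcome="average height",
    comparison_or_connection="Science Health stream vs Data Science students",
)
learning_method = ResearchQuestion(
    population="STM 1,001 Science Health students",
    outcome="learning outcomes",
    comparison_or_connection="method A vs method B",
    intervention="learning method assigned by the researchers",
)

for question in (height, height_by_stream, learning_method):
    print(question.outcome, "->", question.question_type())
```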
  • Detailed exploration of the four components

    • Population (P)
    • Not limited to people; can be any set of units of interest (e.g., rock samples, flooring materials, a specific animal population).
    • The goal is to generalize findings from the sampled units to the overall population of interest.
    • Analogy: the sample mean is used as an estimator of the population mean, in the expectation that results from the sample generalize to the population.
    • Outcome (O)
    • The metric you want to learn about; the result you summarize or test.
    • Typical outcomes are quantitative (e.g., averages, proportions) but could be qualitative in some study designs.
    • Important point: the outcome is about population characteristics, not a single individual's measurement.
    • Comparison/Connection (C)
    • Comparison: difference between two or more groups (e.g., wood vs bamboo flooring wear).
    • Connection: a relationship between variables (e.g., caffeine intake and average heart rate).
    • When the data are continuous (e.g., caffeine dose), the relationship can sometimes be framed as a comparison by creating subgroups (0, 1–2 cups, 3+ cups).
    • Intervention (I)
    • An explicit condition introduced by the researcher to provoke a change in the outcome (e.g., a drug, a training method, a new fertilizer, etc.).
    • Examples mentioned in class include drug trials, different materials/treatments, controlled burns to reduce bushfires, or providing different incentives.
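
The idea of turning a connection into a comparison by grouping (the caffeine example above) can be sketched as follows. The cut-offs 0, 1-2, and 3+ cups come from the notes; the function names and the six data records are hypothetical and exist only to make the sketch runnable.

```python
from statistics import mean


def caffeine_group(cups: int) -> str:
    """Map a daily cup count to one of the three subgroups from the notes."""
    if cups < 0:
        raise ValueError("cups cannot be negative")
    if cups == 0:
        return "0 cups"
    if cups <= 2:
        return "1-2 cups"
    return "3+ cups"


def mean_heart_rate_by_group(records):
    """Average heart rate within each caffeine subgroup."""
    groups = {}
    for cups, heart_rate in records:
        groups.setdefault(caffeine_group(cups), []).append(heart_rate)
    return {group: mean(rates) for group, rates in groups.items()}


# Hypothetical (cups per day, heart rate in bpm) pairs; not real data.
records = [(0, 62), (0, 66), (1, 68), (2, 70), (3, 74), (5, 78)]

group_means = mean_heart_rate_by_group(records)
print(group_means)
```

Grouping makes the question easier to state as a comparison, but it discards information about the exact dose, so the choice of cut-offs belongs in the operational definition.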
  • Types of research questions with examples

    • Descriptive research question (P + O):
    • Example: Among STM 1,001 students, what is the average height? (This is an illustrative placeholder; the idea is to summarize a characteristic of the population.)
    • Relational research question (P + O + C or P + O + Connection):
    • Example with comparison: Among STM 1,001 students, is there a difference in average height between Science Health stream students and Data Science students?
    • Example with connection: Is there a relationship between caffeine intake and average heart rate?
    • Interventional research question (P + O + C + I):
    • Example: Among STM 1,001 Science Health students, does learning method A (e.g., R) versus method B (e.g., Jamovi) affect learning outcomes? (An explicit intervention is introduced to evaluate its effect.)
  • Population and generalization concepts

    • The population is the total set of units of interest; the sample is the subset observed.
    • The goal is for sample results to generalize to the population, under appropriate sampling and study design.
    • When sampling, consider if subpopulations exist and whether there are meaningful differences to examine (subsets for segmentation).
  • Data types and outcomes in practice

    • Outcomes are typically expressed as a mean, proportion, or percentage, which summarize the population characteristic of interest.
    • The outcome should describe a parameter of the population (e.g., average height, population mean μ, or proportion p).
    • Notation and formula conventions (used below):
    • Population mean:
      \mu = E[X] = \frac{1}{N}\sum_{i=1}^{N} x_i
    • Sample mean as an estimator of μ:
      \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i
    • Expectation of the sample mean (unbiased estimator):
      E[\bar{X}] = \mu
    • Population proportion:
      p = P(Y = 1)
    • Sample proportion:
      \hat{p} = \frac{1}{n}\sum_{i=1}^{n} Y_i, \quad E[\hat{p}] = p
    • Conceptual vs operational clarity helps ensure that the measures align with the question and can be implemented consistently across researchers and sites.
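
The estimator formulas above can be checked with a small simulation. This is a sketch under stated assumptions: the population is simulated (1,000 hypothetical heights in cm), the 180 cm threshold defining Y is arbitrary, and simple random sampling without replacement is assumed. Averaging many sample means (or sample proportions) should land close to the population value, which is what unbiasedness means.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical finite population of N = 1000 heights (cm); simulated, not real data.
population = [random.gauss(170, 10) for _ in range(1000)]

# Population parameters: mu = (1/N) * sum of x_i, and p = P(Y = 1),
# where Y = 1 if a height exceeds 180 cm (an arbitrary threshold).
mu = mean(population)
p = mean([1 if x > 180 else 0 for x in population])


def draw_sample(n):
    """Simple random sample of n units, without replacement."""
    return random.sample(population, n)


# Repeat the sampling many times and record the statistic each time.
sample_means = []
sample_proportions = []
for _ in range(2000):
    sample = draw_sample(25)
    sample_means.append(mean(sample))                                     # x-bar
    sample_proportions.append(mean([1 if x > 180 else 0 for x in sample]))  # p-hat

average_of_sample_means = mean(sample_means)
average_of_sample_proportions = mean(sample_proportions)

print(f"mu = {mu:.2f}, average of x-bar = {average_of_sample_means:.2f}")
print(f"p = {p:.3f}, average of p-hat = {average_of_sample_proportions:.3f}")
```

Any single sample mean will differ from μ; it is the average over repeated samples that matches it.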
  • Practical notes on measurement, translation, and clarity

    • Cross-language translation can introduce ambiguities; precise definitions mitigate misunderstanding across teams in different regions.
    • Consistent units and measurement protocols prevent misalignment (e.g., NASA example with mixed measurement systems).
  • Examples discussed in the session

    • Water temperature: distinction between a conceptual definition (what is temperature?) and an operational definition (how exactly is temperature measured, where is the thermometer placed, how long is it left, etc.).
    • Flooring materials example: examining wear difference between standard wood flooring and bamboo flooring; a way to frame an engineering or quality-control question.
    • Caffeine consumption and heart rate: converting a continuous exposure (cups of coffee) into subgroups (no coffee, 1–2 cups, 3+ cups) to form a comparison or to illustrate a connection.
  • Ethical considerations in research design

    • Ethics are essential for reliability, trustworthiness, and reproducibility of results.
    • Ethical risk categories to consider:
    • Physical risks: potential for harm or injury to participants.
    • Psychosocial risks: stress, stigma, or social harm.
    • Social risks and environmental risks: impacts on communities or ecosystems.
    • Confidentiality and data storage: how data are stored, who can access them, and long-term handling.
    • Plagiarism and attribution: respecting intellectual property and proper credit.
    • Explicit consent: when human subjects are involved, obtaining explicit informed consent is typically required; deception or withholding information is generally unethical.
    • An example of unethical practice: a video example in which babies are separated at birth without parental knowledge in order to study outcomes, used to highlight why such a design is unethical.
    • How to assess ethics in a study: check potential benefits versus risks, inform participants about risks, ensure voluntary participation and the ability to withdraw, and apply appropriate data protection measures.
    • The role of ethics in reproducibility: transparent methods enable other researchers to reproduce results; unethical design compromises reliability and trust.
  • In-session activities and exam context

    • A Kahoot activity was used to practice identifying Population, Outcome, and Intervention in a concrete scenario (online vs face-to-face shopping among STM 1,001 students).
    • Example discussed in the activity:
    • Population: STM 1,001 students (the cohort under study in the class setting).
    • Outcome: the average number of items bought online (or similar quantitative outcome).
    • Intervention: not present in that particular scenario; demonstration focused on descriptive vs relational framing.
    • Teachers emphasized that there is no need to worry about the exam yet: there will be revision sessions and practice exams in weeks 11–12. The exam contributes 25% of the grade, with assignments contributing more (around 45%).
    • Ethical discussion prompts included: what makes a research question ethical or unethical; a discussion about how to structure an ethics submission and how to distinguish benign research from potentially harmful setups.
  • Key takeaways to remember for exams and labs

    • Always align the data type with the research question; avoid collecting data that do not support the intended outcomes.
    • Use the POCI framework to structure research questions (Population, Outcome, Comparison/Connection, Intervention).
    • Distinguish between descriptive (P + O), relational (P + O + C or P + O + Connection), and interventional (P + O + C + I) questions.
    • Distinguish between conceptual and operational definitions; provide precise protocol details to avoid ambiguity.
    • When in doubt, break complex questions into smaller sub-questions that can be addressed with clear data collection plans.
    • Ethically evaluate the study design: minimize harm, protect confidentiality, obtain informed consent, and ensure the study design is able to yield trustworthy and reproducible results.
  • Quick recap of definitions and notations

    • Population: the complete set of units of interest; the target for generalization.
    • Sample: a subset of the population actually observed.
    • Parameter: a numerical characteristic of the population (e.g., μ, p).
    • Statistic: a numerical characteristic of the sample (e.g., x̄, p̂).
    • Descriptive question: P + O.
    • Relational question: P + O + C or P + O + Connection.
    • Interventional question: P + O + C + I.
    • Conceptual definition: abstract meaning of a concept.
    • Operational definition: concrete method of measurement and data collection.
  • Suggested follow-ups for study planning

    • Before collecting data, write down the four POCI elements for your proposed study.
    • Draft both the conceptual and operational definitions for your key variables.
    • Sketch how you would analyze the data (which summary statistics or models might be appropriate) to tie back to the chosen outcome and population.
    • Create a simple ethics checklist for your study design, including potential risks, consent procedures, and data protection plans.
  • Reminder about readings and resources

    • Readings are available on LMS; consult the Important Information tile for direct links.
    • Labs will connect the readings to practical questions and assessment items.
  • Closing note

    • The week emphasizes that strong research questions and ethical considerations are foundational to producing valid, reliable, and reproducible scientific results across disciplines.