Week 2 Notes: Research Questions and Ethics (STM 1,001)
Overview and context
- This week focuses on study design for the Science Health Stream: research questions and ethics.
- Readings introduce terminology around research questions and include a section on ethics; readings are accessible via the LMS under the Important Information tile.
- Ethics are fundamental to proper scientific research and should be integrated rather than ignored.
- In labs, readings will be referenced and questions will be embedded in activities.
Core message: start with clear, answerable research questions that align with data collection
- Questions should be reasonably answerable; otherwise data collection may be misaligned with the question.
- The type of data collected is determined by the research question.
- If a question is poorly formed, data collected might be inappropriate for the study.
- It can be challenging to design strong research questions; time spent on a solid foundation pays off in valid results.
- It is often useful to break a broad research question into smaller sub-questions, which may call for different approaches (quantitative or qualitative); a multifaceted question can meet its overall objective through several sub-questions.
Conceptual vs operational definitions
- Conceptual definition: articulates what is being measured or what the value represents in a theoretical sense.
- Operational definition: specifies how measurement is actually carried out (what instrument, where, how long, what procedures).
- Why definitions matter: clarity avoids ambiguity, especially in cross-language or cross-country contexts where translation can create confusion.
- Example: temperature
- Conceptual: what is temperature?
- Operational: how is temperature measured (thermometer type, placement, duration of measurement, units)?
- Real-world illustration: NASA’s experience with differing measurement systems (metric vs imperial) across teams and countries; lack of a clear, shared definition can cause component mismatch.
- In water-temperature examples: conceptual definition of temperature vs the concrete method of measuring temperature (thermometer placement, duration, etc.).
- Labs will provide practice with distinguishing and applying both definitions.
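One way to make an operational definition concrete is to write it down as a structured record, so that every site measures the same way and units are never left implicit. The following is a minimal sketch, not a prescribed method: the class name `OperationalDefinition`, its fields, the helper `to_celsius`, and all example values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperationalDefinition:
    """How a variable is measured; all field values used below are hypothetical."""
    variable: str
    instrument: str
    placement: str
    duration_seconds: int
    unit: str  # "C" (Celsius) or "F" (Fahrenheit)


def to_celsius(value: float, unit: str) -> float:
    """Convert a temperature reading to degrees Celsius."""
    if unit == "C":
        return value
    if unit == "F":
        return (value - 32.0) * 5.0 / 9.0
    raise ValueError(f"unknown temperature unit: {unit!r}")


# Two sites follow the same protocol but record in different units.
site_a = OperationalDefinition("water temperature", "digital probe thermometer",
                               "10 cm below the surface", 60, "C")
site_b = OperationalDefinition("water temperature", "digital probe thermometer",
                               "10 cm below the surface", 60, "F")

# The same water read at both sites: the raw numbers differ only because the units do.
reading_a, reading_b = 20.0, 68.0
print(to_celsius(reading_a, site_a.unit), to_celsius(reading_b, site_b.unit))
```

Recording the unit as part of the definition means readings from different teams can be converted to a common scale before they are compared.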
The POCI framework for research questions (Population, Outcome, Comparison/Connection, Intervention)
- The acronym POCI stands for the four components: P, O, C, and I.
- Population (P): the group or set of units being studied; can be humans, objects, materials, etc. Population is not limited to people (e.g., bamboo flooring material produced in Queensland).
- Outcome (O): the result or variable of interest that will be measured or observed; typically a summary measure (mean, proportion, etc.). It is usually not a value for a single individual but a population parameter to be inferred.
- Comparison (C) or Connection (also called a relational component): a difference between groups (comparison) or a relationship between variables (connection).
- Intervention (I): an explicit condition introduced by the researchers to assess its effect on the outcome (common in experimental studies).
- If you have only Population + Outcome, the question is descriptive.
- If you add a Comparison or Connection, it becomes relational (you’re examining differences or associations).
- If you include an Intervention, you have an interventional research question (often with causal aims).
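The three rules above amount to a small decision procedure. The sketch below expresses it in Python, assuming that Population and Outcome are always present; the function name `classify_question` and its parameters are hypothetical, introduced only to illustrate the logic.

```python
def classify_question(has_relational_component: bool, has_intervention: bool) -> str:
    """Classify a research question from the components it contains.

    Population (P) and Outcome (O) are assumed to be present in every question.
    `has_relational_component` means a Comparison or Connection (C) is present.
    """
    if has_intervention:
        return "interventional"
    if has_relational_component:
        return "relational"
    return "descriptive"


print(classify_question(False, False))  # P + O
print(classify_question(True, False))   # P + O + C
print(classify_question(True, True))    # P + O + C + I
```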
Detailed exploration of the four components
- Population (P)
- Not limited to people; can be any set of units of interest (e.g., rock samples, flooring materials, a specific animal population).
- The goal is to generalize findings from the sampled units to the overall population of interest.
- Analogy: sample mean as an estimator for the population mean, with hopes that results generalize to the population.
- Outcome (O)
- The metric you want to learn about; the result you summarize or test.
- Typical outcomes are quantitative (e.g., averages, proportions) but could be qualitative in some study designs.
- Important point: the outcome is about population characteristics, not a single individual's measurement.
- Comparison/Connection (C)
- Comparison: difference between two or more groups (e.g., wood vs bamboo flooring wear).
- Connection: a relationship between variables (e.g., caffeine intake and average heart rate).
- When the data are continuous (e.g., caffeine dose), the relationship can sometimes be framed as a comparison by creating subgroups (0, 1–2 cups, 3+ cups).
- Intervention (I)
- An explicit condition introduced by the researcher to provoke a change in the outcome (e.g., a drug, a training method, a new fertilizer, etc.).
- Examples mentioned in class include drug trials, different materials/treatments, controlled burns to reduce bushfires, or providing different incentives.
Types of research questions with examples
- Descriptive research question (P + O):
- Example: Among STM 1,001 students, what is the average height? (This is an illustrative example only; the point is to summarize one characteristic of the population.)
- Relational research question (P + O + C or P + O + Connection):
- Example with comparison: Among STM 1,001 students, is there a difference in average height between Science Health stream students and Data Science students?
- Example with connection: Is there a relationship between caffeine intake and average heart rate?
- Interventional research question (P + O + C + I):
- Example: Among STM 1,001 Science Health students, does learning method A (e.g., R) versus method B (e.g., Jamovi) affect learning outcomes? (An explicit intervention is introduced to evaluate its effect.)
Population and generalization concepts
- The population is the total set of units of interest; the sample is the subset observed.
- The goal is for sample results to generalize to the population, under appropriate sampling and study design.
- When sampling, consider if subpopulations exist and whether there are meaningful differences to examine (subsets for segmentation).
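The idea of generalizing from a sample to a population can be explored with a small simulation. This is only a sketch under stated assumptions: the population of heights is synthetic (drawn from a normal distribution with made-up parameters), and the sample size of 50 and the 2,000 repetitions are arbitrary choices.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: heights (cm) of 5,000 units, invented for illustration.
population = [rng.gauss(170, 10) for _ in range(5000)]
population_mean = statistics.fmean(population)

# One simple random sample of 50 units (sampling without replacement).
sample = rng.sample(population, 50)
sample_mean = statistics.fmean(sample)

# Repeat the sampling many times: individual sample means vary,
# but their average sits close to the population mean.
sample_means = [statistics.fmean(rng.sample(population, 50)) for _ in range(2000)]
mean_of_sample_means = statistics.fmean(sample_means)

print(f"population mean:      {population_mean:.2f}")
print(f"one sample mean:      {sample_mean:.2f}")
print(f"mean of sample means: {mean_of_sample_means:.2f}")
```

A single sample mean will usually miss the population mean by a little, which is why the quality of the sampling method matters for generalization.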
Data types and outcomes in practice
- Outcomes are typically expressed as a mean, proportion, or percentage, which summarize the population characteristic of interest.
- The outcome should describe a parameter of the population (e.g., average height, population mean μ, or proportion p).
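- Notation and formula conventions (used below):
- Population mean:
\mu = E[X] = \frac{1}{N}\sum_{i=1}^{N} x_i
- Sample mean as an estimator of μ:
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i
- Expectation of the sample mean (unbiased estimator):
E[\bar{X}] = \mu
- Population proportion:
p = \frac{\text{number of population units with the characteristic}}{N}
- Sample proportion:
\hat{p} = \frac{\text{number of sampled units with the characteristic}}{n}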
- Conceptual vs operational clarity helps ensure that the measures align with the question and can be implemented consistently across researchers and sites.
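The sample statistics x̄ and p̂ can be computed directly from data. The following sketch uses a made-up sample of eight heights; the values and the characteristic "taller than 170 cm" are assumptions chosen only to keep the arithmetic easy to follow.

```python
# Hypothetical sample of n = 8 heights in cm (made-up values).
heights = [162.0, 175.5, 168.0, 181.0, 159.5, 170.0, 177.5, 166.5]
n = len(heights)

# Sample mean: x-bar = (1/n) * sum of x_i
x_bar = sum(heights) / n

# Sample proportion: p-hat = (number of sampled units with the characteristic) / n
# Here the characteristic is "taller than 170 cm" (an arbitrary choice).
count_tall = sum(1 for h in heights if h > 170.0)
p_hat = count_tall / n

print(f"x_bar = {x_bar:.2f} cm")
print(f"p_hat = {p_hat:.3f}")
```

Both values are statistics: they describe this sample and serve as estimates of the population parameters μ and p.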
Practical notes on measurement, translation, and clarity
- Cross-language translation can introduce ambiguities; precise definitions mitigate misunderstanding across teams in different regions.
- Consistent units and measurement protocols prevent misalignment (e.g., NASA example with mixed measurement systems).
Examples discussed in the session
- Water temperature: distinction between a conceptual definition (what is temperature?) and an operational definition (how exactly is temperature measured, where is the thermometer placed, how long is it left, etc.).
- Flooring materials example: examining wear difference between standard wood flooring and bamboo flooring; a way to frame an engineering or quality-control question.
- Caffeine consumption and heart rate: converting a continuous exposure (cups of coffee) into subgroups (no coffee, 1–2 cups, 3+ cups) to form a comparison or to illustrate a connection.
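The caffeine example can be sketched in code: a continuous exposure is mapped to subgroups, and the outcome (mean heart rate) is then compared across them. The records below are invented for illustration, and the function name `caffeine_group` is hypothetical; only the three subgroup boundaries come from the session.

```python
from collections import defaultdict
from statistics import fmean

# Hypothetical records: (cups of coffee per day, resting heart rate in bpm).
records = [(0, 62), (0, 66), (1, 68), (2, 70), (2, 72), (3, 75), (4, 77), (5, 80)]


def caffeine_group(cups: int) -> str:
    """Map a cup count to one of the three subgroups used in the example."""
    if cups == 0:
        return "no coffee"
    if cups <= 2:
        return "1-2 cups"
    return "3+ cups"


# Collect heart rates per subgroup, then summarize each group with its mean.
heart_rates_by_group = defaultdict(list)
for cups, bpm in records:
    heart_rates_by_group[caffeine_group(cups)].append(bpm)

group_means = {group: fmean(rates) for group, rates in heart_rates_by_group.items()}
for group, mean_bpm in group_means.items():
    print(f"{group}: mean heart rate {mean_bpm:.1f} bpm")
```

Grouping turns a connection question into a comparison question, at the cost of discarding the variation within each subgroup.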
Ethical considerations in research design
- Ethics are essential for reliability, trustworthiness, and reproducibility of results.
- Ethical risk categories to consider:
- Physical risks: potential for harm or injury to participants.
- Psychosocial risks: stress, stigma, or social harm.
- Social risks and environmental risks: impacts on communities or ecosystems.
- Confidentiality and data storage: how data are stored, who can access them, and long-term handling.
- Plagiarism and attribution: respecting intellectual property and proper credit.
- Explicit consent: when human subjects are involved, obtaining explicit informed consent is typically required; deception or withholding information is generally unethical.
- Example of unethical practice: a video example in which babies were separated at birth without their parents' knowledge so that outcomes could be studied; it was used to highlight why such a design is unethical.
- How to assess ethics in a study: check potential benefits versus risks, inform participants about risks, ensure voluntary participation and the ability to withdraw, and apply appropriate data protection measures.
- The role of ethics in reproducibility: transparent methods enable other researchers to reproduce results; unethical design compromises reliability and trust.
In-session activities and exam context
- A Kahoot activity was used to practice identifying Population, Outcome, and Intervention in a concrete scenario (online vs face-to-face shopping among STM 1,001 students).
- Example discussed in the activity:
- Population: STM 1,001 students (the cohort under study in the class setting).
- Outcome: the average number of items bought online (or similar quantitative outcome).
- Intervention: not present in that particular scenario; demonstration focused on descriptive vs relational framing.
- Teachers emphasized that there is no need to worry about the exam yet: there will be revision sessions and practice exams in weeks 11–12. The exam contributes 25% of the grade, with assignments contributing more (around 45%).
- Ethical discussion prompts included: what makes a research question ethical or unethical; a discussion about how to structure an ethics submission and how to distinguish benign research from potentially harmful setups.
Key takeaways to remember for exams and labs
- Always align the data type with the research question; avoid collecting data that do not support the intended outcomes.
- Use the POCI framework to structure research questions (Population, Outcome, Comparison/Connection, Intervention).
- Distinguish between descriptive (P + O), relational (P + O + C or P + O + Connection), and interventional (P + O + C + I) questions.
- Distinguish between conceptual and operational definitions; provide precise protocol details to avoid ambiguity.
- When in doubt, break complex questions into smaller sub-questions that can be addressed with clear data collection plans.
- Ethically evaluate the study design: minimize harm, protect confidentiality, obtain informed consent, and ensure the study design is able to yield trustworthy and reproducible results.
Quick recap of definitions and notations
- Population: the complete set of units of interest; the target for generalization.
- Sample: a subset of the population actually observed.
- Parameter: a numerical characteristic of the population (e.g., μ, p).
- Statistic: a numerical characteristic of the sample (e.g., x̄, p̂).
- Descriptive question: P + O.
- Relational question: P + O + C or P + O + Connection.
- Interventional question: P + O + C + I.
- Conceptual definition: abstract meaning of a concept.
- Operational definition: concrete method of measurement and data collection.
Suggested follow-ups for study planning
- Before collecting data, write down the four POCI elements for your proposed study.
- Draft both the conceptual and operational definitions for your key variables.
- Sketch how you would analyze the data (which summary statistics or models might be appropriate) to tie back to the chosen outcome and population.
- Create a simple ethics checklist for your study design, including potential risks, consent procedures, and data protection plans.
Reminder about readings and resources
- Readings are available on LMS; consult the Important Information tile for direct links.
- Labs will connect the readings to practical questions and assessment items.
Closing note
- The week emphasizes that strong research questions and ethical considerations are foundational to producing valid, reliable, and reproducible scientific results across disciplines.