The goal of this chapter is to explore and establish causal relationships between variables, which is fundamental in understanding phenomena within life and health sciences.
This chapter shifts from analyzing single variables within one population to examining two or more variables across different populations. The multivariable focus allows for deeper insights into how different factors interact.
Important Concepts:
Statistical Significance: A result is statistically significant when it would be unlikely to arise from random chance alone.
Practical Significance: Addresses whether a statistically significant result is large enough to matter in the real world.
Relationships and Associations: Understanding both the correlations and the underlying reasons behind observed differences is crucial.
Causal relationship determination is emphasized as a key objective.
Preliminary Tool: The Contingency Table:
Univariate Data: Data from one variable, helpful in providing a singular perspective.
Bivariate Data: Data from two variables, summarized in a contingency table (two-way table) that displays joint occurrences. Each cell counts one joint event (one combination of the two variables' categories), which allows associations to be assessed.
Joint Probabilities: The probabilities that two events occur together, obtained from the individual cells of the table.
Marginal Probabilities: The probabilities of individual events, obtained from the margins of the table.
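To make the two definitions concrete, here is a minimal Python sketch that computes joint and marginal probabilities from a 3 x 3 table of counts. The counts and category labels are invented for illustration only; they are not the data used in this chapter.

```python
# Hypothetical counts (invented for illustration, not the chapter's data).
# Rows are happiness levels, columns are income levels.
counts = {
    "very happy":    {"below": 20, "average": 50,  "above": 30},
    "pretty happy":  {"below": 60, "average": 120, "above": 40},
    "not too happy": {"below": 40, "average": 30,  "above": 10},
}

# Grand total of all cells.
total = sum(sum(row.values()) for row in counts.values())

# Joint probability: one cell count divided by the grand total.
joint = {
    (happiness, income): n / total
    for happiness, row in counts.items()
    for income, n in row.items()
}

# Marginal probabilities: row or column totals divided by the grand total.
row_marginal = {h: sum(row.values()) / total for h, row in counts.items()}
col_marginal = {}
for row in counts.values():
    for income, n in row.items():
        col_marginal[income] = col_marginal.get(income, 0.0) + n / total

print(joint[("very happy", "above")])
print(row_marginal["very happy"], col_marginal["above"])
```

The joint probabilities sum to 1 over all nine cells, and each marginal probability is the sum of the joint probabilities in its row or column.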
Example: A contingency table of happiness by family income level is used to investigate whether the two variables are related.
The example demonstrates how a contingency table supports this kind of analysis.
Questions to analyze contingency table data include:
(a) Total number of cells: 9 (3 x 3)
(b) Proportion of Americans identifying as very happy: approximately 26.2% (0.262)
(c) Proportion of Americans with above-average income: approximately 20.44% (0.2044)
(d) Proportion who are both very happy and have above-average income: approximately 6.9% (0.069)
Causation is a pivotal domain of scientific inquiry; relevant examples include:
The strong relationship between smoking and lung cancer.
Observing correlations between traits such as hair color and eye color.
Analyzing data helps determine whether a relationship is causal or simply associative.
Conditional proportions of above-average income, given happiness status:
(a) Among the very happy: approximately 26.42% (0.2642)
(b) Among the not too happy: approximately 9.87% (0.0987)
Because these two proportions differ substantially, the data suggest an association between perceived happiness and income level.
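The conditional proportion for the very happy group can be roughly recovered from the rounded figures quoted earlier by dividing the joint proportion by the marginal proportion. The sketch below uses only the values reported in these notes; the variable names are illustrative.

```python
# Rounded values as reported in the notes.
p_very_happy = 0.262   # marginal: very happy
p_above = 0.2044       # marginal: above-average income
p_both = 0.069         # joint: very happy AND above-average income

# Conditional proportion: P(above-average income | very happy).
p_above_given_very_happy = p_both / p_very_happy

# If happiness and income were unrelated, the joint proportion would be
# close to the product of the two marginal proportions.
p_both_if_independent = p_very_happy * p_above

print(round(p_above_given_very_happy, 4))
print(round(p_both_if_independent, 4))
```

The result is about 0.263, close to the 26.42% quoted for the very happy group; the small gap is presumably due to rounding in the reported inputs. The observed joint proportion (0.069) is also larger than the value expected under no association (about 0.054), which points in the same direction.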
Definition: Association means that two variables are related, so that knowing the value of one gives information about the other. An association may hint at a causal link but does not establish one.
Response Variable: The dependent variable that is of interest.
Explanatory Variables: Independent variables that may explain or influence the response variable.
Example: Studying may correlate with high test scores, but failure to study does not guarantee low scores.
Establishing causation requires more than mere association; comprehensive experimental design is essential.
Example: The misleading association between ice cream sales and violent crime can stem from a confounding variable, such as warm weather, which influences both quantities.
Definition: A confounding variable is one that is linked to both the response and the explanatory variables, making it hard to tell which variable is actually driving the response.
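A small simulation can illustrate how a confounder produces an association with no causal link. In the sketch below, temperature drives both ice cream sales and crime, and neither affects the other. All coefficients, noise levels, and the helper names `pearson` and `residuals` are assumptions made for illustration.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def residuals(ys, xs):
    """Residuals of ys after a simple linear regression on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]

rng = random.Random(0)  # fixed seed so the run is repeatable

# Invented model: temperature influences both quantities independently.
temperature = [rng.uniform(0, 35) for _ in range(2000)]
ice_cream = [2.0 * t + rng.gauss(0, 5) for t in temperature]
crime = [0.5 * t + rng.gauss(0, 3) for t in temperature]

# Raw association between the two outcomes.
raw_r = pearson(ice_cream, crime)

# Association left over once temperature is accounted for.
adjusted_r = pearson(residuals(ice_cream, temperature),
                     residuals(crime, temperature))

print(raw_r, adjusted_r)
```

The raw correlation is strong, but it largely disappears once the shared dependence on temperature is removed, which is the pattern a confounding variable produces.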
Association is symmetric, but causation is directional: it follows a logical sequence ("if...then"), so the direction of an observed association can be misread (reverse causation).
For example, smoking is the cause and lung cancer is the effect, not the other way around, which illustrates the one-directional nature of causation.
Observational Study: The analyst observes outcomes without manipulating any variables, which limits conclusions about causation.
Designed Study: Involves controlling treatments and assignments to better determine relationships, thereby establishing causality more reliably.
Experimental Units: The subjects being tested in an experiment.
Definitions:
Factors: Variables that could impact the response variable.
Levels: Specific values within those factors.
Treatments: Combinations of factor levels applied in experiments.
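As a sketch of how factors, levels, and treatments relate, the snippet below enumerates every treatment as one combination of factor levels. The factor names and levels are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical factors and their levels (not taken from the chapter).
factors = {
    "dose": ["low", "high"],
    "schedule": ["daily", "weekly"],
}

# Each treatment is one combination of a level from every factor.
treatments = list(product(*factors.values()))

for treatment in treatments:
    print(dict(zip(factors.keys(), treatment)))
```

With two factors at two levels each, there are 2 x 2 = 4 treatments; in general the number of treatments is the product of the number of levels of each factor.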
Randomization balances potential confounding variables across the groups being compared, so that every experimental unit has the same chance of receiving each treatment.
Completely Randomized Design: Assigns subjects randomly to treatment levels, often including a control group receiving a placebo for more robust comparisons.
Example: In a classroom experiment, students are randomly split into control and experimental groups to assess an intervention.
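A completely randomized assignment can be sketched in a few lines of Python. The function name `completely_randomized`, the student labels, and the group names are hypothetical; the only idea taken from the notes is that units are assigned to treatments purely at random.

```python
import random

def completely_randomized(units, treatments, seed=None):
    """Shuffle the units, then deal them out to treatments in turn."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

# Hypothetical class of 20 students split into two groups.
students = [f"student_{i}" for i in range(1, 21)]
groups = completely_randomized(students, ["control", "intervention"], seed=42)

for name, members in groups.items():
    print(name, len(members))
```

Shuffling first and then dealing units out in turn keeps the group sizes as equal as possible while leaving group membership to chance.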
Matched-Pairs Design: Reduces the effect of lurking variables by pairing similar experimental units, or by measuring the same unit under both conditions.
Example: Evaluating driving performance of students using a driving simulator while distracted by a cell phone.
A block design groups subjects that are known to be similar into blocks and randomizes treatments within each block, so that the blocking variable cannot be confounded with the treatment.
Example: An advertising efficacy study highlights the importance of separating populations (e.g., children vs. adults) into blocks, so that differences between the populations do not interfere with the measured ad responses.
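The block idea can be sketched as randomization carried out separately inside each block. In the snippet below, the function name `randomized_block`, the viewer labels, and the two ads are hypothetical; the children and adults blocks follow the advertising example.

```python
import random

def randomized_block(units_by_block, treatments, seed=None):
    """Randomly assign treatments separately within each block."""
    rng = random.Random(seed)
    assignment = {}
    for block, units in units_by_block.items():
        shuffled = list(units)
        rng.shuffle(shuffled)
        for i, unit in enumerate(shuffled):
            assignment[unit] = (block, treatments[i % len(treatments)])
    return assignment

# Hypothetical viewers, blocked by age group.
viewers = {
    "children": [f"child_{i}" for i in range(1, 7)],
    "adults": [f"adult_{i}" for i in range(1, 7)],
}
assignment = randomized_block(viewers, ["ad_A", "ad_B"], seed=1)

for unit, (block, treatment) in sorted(assignment.items()):
    print(unit, block, treatment)
```

Because each block receives every treatment in equal numbers, a difference between children and adults cannot be mistaken for a difference between the ads.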
Placebo Effect: A placebo can significantly alter responses through participants' expectations rather than through any actual effectiveness of the treatment.
Unconscious Bias: Survey inputs can be distorted by the evaluator's biases.
Statistical significance (the observed result is unlikely to be due to chance) must be distinguished from practical significance (the result has real-world implications).
Challenges include issues such as sample refusals, non-compliance, and participant dropouts, which can impact the integrity of study results.