Notes on Independence in Contingency Tables and Simulation-based Inference
The Independence Model for Day of Week and Precipitation
Problem setup: two variables—day of the week (Weekday vs Weekend) and precipitation (Yes/No). If the variables are independent, the probability of precipitation is the same across the day categories.
Contingency table (theoretical under independence):
Total days: N = 700; Weekdays: N_{\text{wkday}} = 500; Weekends: N_{\text{wknd}} = 200.
Overall probability of precipitation: p = \frac{384}{700} \approx 0.549, rounded to p = 0.55 in the example; under independence this same p applies to both day types.
Expected counts under independence (2x2 table):
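E_{\text{wkday},\text{Yes}} = \frac{N_{\text{wkday}} \times N_{\text{Yes}}}{N} = \frac{500 \times 384}{700} \approx 274.3 (the example's table uses the rounded value 500 \times 0.55 = 275)
E_{\text{wkday},\text{No}} = \frac{N_{\text{wkday}} \times N_{\text{No}}}{N} = \frac{500 \times 316}{700} \approx 225.7 (the example's table uses the rounded value 500 \times 0.45 = 225)
E_{\text{wknd},\text{Yes}} = \frac{N_{\text{wknd}} \times N_{\text{Yes}}}{N} = \frac{200 \times 384}{700} = 109.714\ldots \approx 110
E_{\text{wknd},\text{No}} = \frac{N_{\text{wknd}} \times N_{\text{No}}}{N} = \frac{200 \times 316}{700} = 90.285\ldots \approx 90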
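The expected-count formula can be checked numerically. The following is a minimal sketch in plain Python; the function name `expected_counts` and the dictionary layout are illustrative choices, not part of the original example. It uses the exact margins (384 Yes, 316 No), so the weekday cells come out slightly different from the rounded 275 and 225 that the example's table obtains with p = 0.55.

```python
from fractions import Fraction


def expected_counts(row_totals, col_totals):
    """Expected cell counts under independence: row total * column total / N."""
    n = sum(row_totals.values())
    if n != sum(col_totals.values()):
        raise ValueError("row and column totals must add up to the same N")
    return {
        (r, c): Fraction(rt * ct, n)
        for r, rt in row_totals.items()
        for c, ct in col_totals.items()
    }


# Margins taken from the notes: 500 weekdays, 200 weekend days, 384 Yes, 316 No.
rows = {"Weekday": 500, "Weekend": 200}
cols = {"Yes": 384, "No": 316}
expected = expected_counts(rows, cols)

for cell, value in expected.items():
    print(cell, round(float(value), 2))
```

Exact fractions are used so that the four expected counts add up to exactly 700 before any rounding for display.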
Observed data from Lansing Airport (700-day sample):
- Weekdays with precipitation: 265; Weekdays without: 235
- Weekends with precipitation: 119; Weekends without: 81
- Totals: Weekdays 500, Weekends 200, Precipitation Yes 265+119=384, Precipitation No 316
Compare observed vs. theoretical:
- Weekday Yes difference: 265 - 275 = -10
- Weekend Yes difference: 119 - 110 = +9
- Total precipitation observed: 384 vs. theoretical total: 385 (since 275+110=385)
Probabilities to compare:
- Theoretical: P(\text{Yes} \mid \text{Weekday}) = \frac{275}{500} = 0.55, P(\text{Yes} \mid \text{Weekend}) = \frac{110}{200} = 0.55
- Observed: \frac{265}{500} = 0.53 for Weekday, \frac{119}{200} = 0.595 for Weekend
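The comparison of observed and theoretical values can be reproduced in a few lines. This is a sketch only; the helper name `conditional_yes_rate` and the variable names are assumptions made for illustration, and the theoretical table is the rounded one from the notes (275 and 110).

```python
# Observed counts from the Lansing Airport sample in the notes.
observed = {
    ("Weekday", "Yes"): 265, ("Weekday", "No"): 235,
    ("Weekend", "Yes"): 119, ("Weekend", "No"): 81,
}
# Rounded theoretical "Yes" counts used in the notes (p = 0.55).
theoretical_yes = {"Weekday": 275, "Weekend": 110}


def conditional_yes_rate(table, day):
    """Proportion of days with precipitation within one day type."""
    yes = table[(day, "Yes")]
    return yes / (yes + table[(day, "No")])


p_weekday = conditional_yes_rate(observed, "Weekday")   # 265 / 500
p_weekend = conditional_yes_rate(observed, "Weekend")   # 119 / 200
rate_gap = p_weekend - p_weekday                        # difference in proportions
count_gaps = {
    day: observed[(day, "Yes")] - theoretical_yes[day]
    for day in theoretical_yes
}

print(f"P(Yes | Weekday) = {p_weekday:.3f}")
print(f"P(Yes | Weekend) = {p_weekend:.3f}")
print(f"difference in proportions = {rate_gap:.3f}")
print("observed minus theoretical Yes counts:", count_gaps)
```

The difference in proportions (about 0.065) is the single number that a later simulation can compare against what independence alone would produce.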
Key question: Are these observed differences due to natural variation or do they indicate dependence between day type and precipitation?
Conceptual model and visualization:
- Mosaic plots help assess independence: under independence, the heights of the bars (colored by precipitation status) are the same across day categories.
- If the observed mosaic shows a difference in the proportions, this hints at a potential association, but we need inference to decide if it’s due to chance.
Population vs. sample: the observed table is a sample from the population; we infer about whether the two variables are independent in the population from this sample.
Why this matters: a difference seen in one sample may be due to randomness alone. A larger or repeated sample could show a similar pattern or a different one, so a single table cannot settle the question of independence without formal inference.
Connection to broader ideas:
- Independence testing for two categorical variables in 2x2 tables.
- The idea that even with independence, there will be sampling variability around the expected counts.
- The concept that a model (independence) is a simplification that can be tested against data.
The Mosaic Plot and How to Interpret It
- Definition: A mosaic plot displays a contingency table by partitioning a rectangle into tiles whose areas are proportional to cell probabilities or counts.
- Under independence: the fractions (heights/widths) corresponding to the conditional probabilities are the same across the rows and columns.
- In the precipitation example: the height of the blue bar for "Yes" precipitation should be the same whether the day is Weekday or Weekend if independence holds.
- If the observed plot shows different heights, that suggests dependence, but we still need a formal test to assess significance.
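To make the geometry concrete, the tile sizes of a mosaic plot can be computed directly from the table without drawing anything. The sketch below assumes one common layout (column widths proportional to day-type totals, heights proportional to the conditional precipitation rates within each day type); the function name `mosaic_tiles` is hypothetical.

```python
def mosaic_tiles(table, row_levels, col_levels):
    """Return {cell: (x_start, width, y_start, height)} inside the unit square.

    Width of a column  = share of all observations in that row level.
    Height of a tile   = conditional proportion of the column level
                         within that row level.
    """
    n = sum(table.values())
    tiles = {}
    x = 0.0
    for r in row_levels:
        row_total = sum(table[(r, c)] for c in col_levels)
        width = row_total / n
        y = 0.0
        for c in col_levels:
            height = table[(r, c)] / row_total
            tiles[(r, c)] = (x, width, y, height)
            y += height
        x += width
    return tiles


# Observed Lansing counts from the notes.
observed = {
    ("Weekday", "Yes"): 265, ("Weekday", "No"): 235,
    ("Weekend", "Yes"): 119, ("Weekend", "No"): 81,
}
tiles = mosaic_tiles(observed, ["Weekday", "Weekend"], ["Yes", "No"])

for day in ("Weekday", "Weekend"):
    _, width, _, height = tiles[(day, "Yes")]
    print(f"{day}: column width = {width:.3f}, 'Yes' height = {height:.3f}")
```

Each tile's area (width times height) equals that cell's share of the 700 days, and under independence the two "Yes" heights would be equal.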
Real Data: Lansing Airport 700-Day Sample and Independence Assessment
- Data: 700 consecutive days; 500 Weekdays, 200 Weekends.
- Observed counts: Weekdays Yes = 265; Weekends Yes = 119; Weekdays No = 235; Weekends No = 81.
- Observed weekend precipitation probability: \frac{119}{200} = 0.595; observed weekday probability: \frac{265}{500} = 0.53.
- The independence model predicts the same precipitation probability for both day types, p = 0.55; the expected weekend probability under independence is therefore 0.55, which is close to but not identical to the observed 0.595.
- Question raised: Are these small differences just due to natural variation, or do they indicate a real association between day of the week and precipitation?
- Concept of sampling variability: different samples of 700 days would yield somewhat different counts; a single sample does not prove independence or dependence.
- Next step: use simulations to assess how extreme the observed difference is under the independence model.
The Role of Simulation in Testing Independence
- Purpose: to determine whether the observed difference could plausibly arise by chance under the independence model.
- Approach (conceptual): generate many simulated contingency tables under independence (i.e., with the same row/column totals but with precipitation status allocated independently of day type), and see how often the simulated difference in proportions is as large as the observed difference.
- Mosaic-plot intuition: the observed difference in the real data is compared to the distribution of differences produced by the independence model.
- Key idea: the null hypothesis is that the two variables are independent; the alternative is that they are not.
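The approach described above can be sketched for the precipitation data as follows. This is an illustration under stated assumptions rather than the procedure used in the original example: the number of simulations (2000), the seed, the two-sided comparison, and the function name `simulate_null_differences` are all choices made here.

```python
import random


def simulate_null_differences(n_weekday, n_weekend, n_yes, n_sims, seed=0):
    """Simulate P(Yes | Weekend) - P(Yes | Weekday) under independence,
    holding the row and column totals fixed."""
    rng = random.Random(seed)
    n = n_weekday + n_weekend
    labels = [1] * n_yes + [0] * (n - n_yes)  # 1 = precipitation, 0 = none
    diffs = []
    for _ in range(n_sims):
        # Choose the weekend days at random, ignoring precipitation status.
        weekend_yes = sum(rng.sample(labels, n_weekend))
        weekday_yes = n_yes - weekend_yes
        diffs.append(weekend_yes / n_weekend - weekday_yes / n_weekday)
    return diffs


observed_diff = 119 / 200 - 265 / 500  # 0.595 - 0.53
null_diffs = simulate_null_differences(500, 200, 384, n_sims=2000, seed=1)

# Two-sided (an assumption): count simulated differences at least as far
# from zero as the observed one; the tolerance guards against float rounding.
extreme = sum(abs(d) >= abs(observed_diff) - 1e-12 for d in null_diffs)
p_value = extreme / len(null_diffs)
print(f"observed difference = {observed_diff:.3f}")
print(f"share of simulations at least as extreme = {p_value:.3f}")
```

Because the precipitation labels are dealt out without regard to day type, every simulated table satisfies the independence model by construction, and the spread of `null_diffs` shows how much the difference varies from sampling alone.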
Memory Experiment: Sleep vs. Caffeine and Memory Performance
- Experimental design: 24 students randomly assigned to two groups of 12 each.
- Group 1 (Sleep): group slept for 1.5 hours.
- Group 2 (Caffeine): group stayed awake and received a caffeine pill.
- Outcome: whether each student scored above 60% on a memory test.
- Sleep group: 7 of 12 scored >60%
- Caffeine group: 3 of 12 scored >60%
- Observed difference in rates:
- Sleep: \frac{7}{12} \approx 0.583
- Caffeine: \frac{3}{12} = 0.25
- Difference: \frac{7}{12} - \frac{3}{12} = \frac{4}{12} = 0.333\ldots \approx 0.33
- The example reports the difference as about 0.33, which suggests stronger memory performance with sleep than with caffeine.
- Goal: assess whether this observed difference could occur by chance if assignment to sleep vs caffeine has no effect on memory performance.
- Simulation setup (permutation test framework):
- Keep the total number of “successes” (scores >60%) fixed at 10 out of 24, regardless of group.
- Randomly reassign who is in Sleep vs Caffeine groups (12 in each) and recalculate the difference in success rates for each simulated dataset.
- Repeat for many simulations (e.g., 150 simulations in the example).
- Build the null distribution of the difference in proportions under the independence model.
- Compute the p-value as the proportion of simulated differences that are as extreme or more extreme than the observed difference (0.33).
- Visualization idea: use colored sticks to represent group assignments and outcomes in the simulation, illustrating how random reassignment spreads the successes across the two groups under the null.
- Interpretation: if the p-value is small, we reject the independence model (that is, the claim that group assignment has no effect on memory performance) at the chosen significance level.
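The permutation steps above can be sketched in Python. The notes do not say whether "as extreme or more extreme" is meant one-sided or two-sided; this sketch assumes a one-sided comparison (sleep minus caffeine at least 0.33). The function names and the seed are hypothetical, and the exact hypergeometric calculation at the end is an added cross-check, not part of the original example.

```python
import random
from math import comb


def rate_difference(sleep_successes, total_successes=10, group_size=12):
    """Sleep success rate minus caffeine success rate."""
    caffeine_successes = total_successes - sleep_successes
    return sleep_successes / group_size - caffeine_successes / group_size


def permutation_p_value(n_sims=150, seed=0):
    """One-sided permutation p-value for the sleep vs. caffeine experiment."""
    rng = random.Random(seed)
    outcomes = [1] * 10 + [0] * 14      # 10 students scored >60%, 14 did not
    observed = rate_difference(7)       # 7/12 - 3/12
    hits = 0
    for _ in range(n_sims):
        rng.shuffle(outcomes)           # random reassignment to groups
        sleep_successes = sum(outcomes[:12])  # first 12 form the Sleep group
        if rate_difference(sleep_successes) >= observed:
            hits += 1
    return hits / n_sims


simulated = permutation_p_value(n_sims=150, seed=0)

# Cross-check: with the margins fixed, the number of successes in the Sleep
# group is hypergeometric, so P(7 or more) can be computed exactly.
exact = sum(comb(10, k) * comb(14, 12 - k) for k in range(7, 11)) / comb(24, 12)

print(f"simulated one-sided p-value (150 runs): {simulated:.3f}")
print(f"exact one-sided p-value: {exact:.3f}")
```

With only 150 simulations the estimate is noisy; the exact one-sided value is about 0.107, so a difference of 0.33 is not unusual enough to reject the null at the 0.05 level under these assumptions.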
Hypotheses and Testing Procedure (General Framework)
- State the hypotheses:
- Null hypothesis (H0): The two variables are independent (no association). In experiments, this often translates to “treatment has no effect.”
- Alternative hypothesis (Ha): The two variables are not independent (there is an association).
- Use the sample to compute an observed statistic that measures the strength of association (e.g., difference in proportions for a 2x2 table).
- Generate the null distribution under the independence model via simulations or permutations.
- Compare the observed statistic to the null distribution to obtain a p-value:
- \text{p-value} = \frac{\text{number of simulations with statistic as extreme or more extreme than observed}}{\text{total number of simulations}}
- Decision rule: if the p-value is below the chosen significance level (e.g., \alpha=0.05), reject H0; otherwise, do not reject H0.
- Important caveats:
- The null is a model; it is a simplifying assumption that can be false in the population.
- Differences in a sample can arise from natural variation even when the null is true.
- Real-world factors (confounding variables) can produce apparent associations even when the primary relationship is absent.
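The p-value formula and decision rule from the framework above can be written as two small helpers. This is a generic sketch: the function names, the `tail` parameter, and the toy list of null statistics are made up for illustration and do not come from any dataset in these notes.

```python
def p_value_from_null(null_stats, observed, tail="two-sided"):
    """Proportion of simulated statistics as extreme or more extreme than observed."""
    if not null_stats:
        raise ValueError("need at least one simulated statistic")
    if tail == "two-sided":
        extreme = sum(abs(s) >= abs(observed) for s in null_stats)
    elif tail == "greater":
        extreme = sum(s >= observed for s in null_stats)
    elif tail == "less":
        extreme = sum(s <= observed for s in null_stats)
    else:
        raise ValueError("tail must be 'two-sided', 'greater', or 'less'")
    return extreme / len(null_stats)


def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 only if the p-value is below alpha."""
    return "reject H0" if p_value < alpha else "do not reject H0"


# Made-up null statistics, purely to show the mechanics.
toy_null = [-0.33, -0.17, 0.0, 0.0, 0.17, 0.17, 0.33, 0.5, -0.17, 0.0]
p_greater = p_value_from_null(toy_null, observed=0.33, tail="greater")
p_two_sided = p_value_from_null(toy_null, observed=0.33, tail="two-sided")
print(p_greater, decide(p_greater))
print(p_two_sided, decide(p_two_sided))
```

Which tail to use depends on the alternative hypothesis; "not independent" with no stated direction corresponds to the two-sided count.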
Extensions and Real-World Cautions
- Confounding and context: the example given compares past grades across different course types (e.g., advanced course vs English students in Peru), with a reported 25.5% difference in past grades; it warns that observed differences may reflect underlying factors other than the treatment or primary variable.
- Practical implications: When evaluating effects (e.g., caffeine vs sleep on memory, or survey participation influenced by incentives), randomization and permutation-based inference help separate true effects from random fluctuations.
- Measurement and sampling concerns:
- The samples we study are samples of the population; repeated samples can yield different results due to random variation.
- The independence model does not prove anything by itself; it is a model whose fit is assessed by how likely the observed data would be if the model were true.
- Real-world experimental design considerations mentioned in the transcript:
- Studies move from observational ideas (contingency tables) toward controlled experiments with randomization to infer causal effects.
- Examples include caffeine and memory, surveys with gift cards to increase participation, and other related questions where independence testing and inference play a key role.
- Summary practical takeaway:
- To assess independence between two categorical variables, compare observed contingency tables with the expected counts under independence, often visualized via mosaic plots.
- When the observed table deviates from the expected counts, use simulation-based or permutation-based tests to quantify how likely such deviations are under independence.
- Always consider alternative explanations such as sampling variability, confounding factors, and real-world context before concluding a true association.