Chapter 9: Inference
The author shares an experience from college with a statistics professor.
Took a statistics course reluctantly, in exchange for permission to go on a family trip to the USSR.
Initially struggled but improved through dedication and study.
Scored an A on the final exam, which made the professor suspect cheating.
Statistical inference relies on observing patterns and using probability for explanations.
Statistics cannot prove anything conclusively; it determines likely explanations based on observed data.
Hypothetical case of a gambler rolling a six repeatedly (ten times):
Probability of this happening with a fair die: (1/6)^10, or about 1 in 60 million.
Possible explanations: luck vs. cheating.
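The 1-in-60-million figure can be checked directly. A minimal sketch, assuming a fair six-sided die and independent rolls; the function name prob_all_sixes is made up for this illustration.

```python
from fractions import Fraction

def prob_all_sixes(rolls: int) -> Fraction:
    """Exact probability that a fair die shows a six on every one of `rolls` independent rolls."""
    return Fraction(1, 6) ** rolls

p = prob_all_sixes(10)
print(p)                          # 1/60466176
print(f"about 1 in {round(1 / p):,}")
```

The tiny probability does not prove cheating. It only makes luck a much less plausible explanation than cheating.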
Case of Linda Cooper, struck by lightning four times:
Statistical improbability alone is not grounds for denying her insurance claims; unlikely events do happen.
Statistical inference uses data to inform questions:
Questions like drug efficacy or health risks (e.g., cell phones and cancer).
A difference in outcomes between groups does not by itself show that a drug works; some difference is expected by chance.
Inference requires understanding how much random variation in outcomes to expect.
Hypothesis testing: determining if observed data is statistically significant.
Null Hypothesis (H0): Initial assumption (e.g., no drug effect).
Alternative Hypothesis (H1): Logical complement of the null; accepted if the null is rejected.
Example:
H0: new drug has no effect on malaria.
H1: new drug reduces malaria.
Courtroom analogy: presumption of innocence as the null hypothesis.
Examples of medical research:
Comparing outcomes from treatment and placebo groups.
Before calling a finding statistically significant, researchers must ask how likely the observed result would be by chance alone.
5% significance level is a common threshold for rejecting the null hypothesis.
Reject H0 when there is less than a 5% chance of observing results at least this extreme if H0 is true.
Examples of threshold levels:
.05 (standard), .01 (more stringent), .10 (less stringent).
Calculate p-value: Probability of observing data at least as extreme as the actual data if H0 is true.
Determine if p-value meets the chosen significance level.
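The two steps above (compute a p-value, then compare it with the chosen significance level) can be sketched for the malaria example. This is an illustration under stated assumptions, not the chapter's own calculation: the case counts are hypothetical, and the test used is a pooled two-proportion z-test, which relies on a normal approximation.

```python
from math import sqrt
from statistics import NormalDist

def one_sided_p_value(cases_treat, n_treat, cases_placebo, n_placebo):
    """p-value for H0: same malaria rate in both groups, against H1: lower rate with the drug.

    Pooled two-proportion z-test (normal approximation; needs reasonably large groups).
    """
    rate_treat = cases_treat / n_treat
    rate_placebo = cases_placebo / n_placebo
    pooled = (cases_treat + cases_placebo) / (n_treat + n_placebo)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_treat + 1 / n_placebo))
    z = (rate_treat - rate_placebo) / std_err
    # Chance of a gap at least this far in the drug's favor if H0 is true.
    return NormalDist().cdf(z)

ALPHA = 0.05  # chosen significance level

# Hypothetical counts: 30 of 200 treated patients and 50 of 200 placebo patients get malaria.
p_value = one_sided_p_value(cases_treat=30, n_treat=200, cases_placebo=50, n_placebo=200)
reject_h0 = p_value < ALPHA
print(f"p-value = {p_value:.4f}; reject H0 at {ALPHA}: {reject_h0}")
```

With these made-up counts the p-value is roughly 0.006, so H0 would be rejected at the .05 and .01 levels alike.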
Example: Atlanta standardized test erasure patterns suggested misconduct.
H0: no cheating. Classrooms showed far more wrong-to-right erasures than normal variation would predict, leading to rejection of H0 and acknowledgment of cheating.
Probability of outcomes as rare as those found was extremely low without foul play.
Discussing Type I and Type II errors:
Type I Error: Rejecting a true null hypothesis (false positive).
Type II Error: Failing to reject a false null hypothesis (false negative).
Balancing thresholds in research and legal scenarios:
Risk of wrongly convicting innocent versus letting guilty go free.
Example: Cancer drug approval – implications of statistical thresholds in efficacy.
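The trade-off between the two error types can be made concrete. A minimal sketch, assuming a one-sided z-test and a true effect of a known size measured in standard errors; both the setup and the effect size of 2.0 are hypothetical choices for illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def type_ii_error_rate(alpha: float, effect_in_std_errors: float) -> float:
    """Chance of failing to reject H0 in a one-sided z-test when the true effect
    lies `effect_in_std_errors` standard errors above the H0 value."""
    critical_z = STD_NORMAL.inv_cdf(1 - alpha)  # rejection cutoff implied by alpha
    return STD_NORMAL.cdf(critical_z - effect_in_std_errors)

# alpha is the Type I error rate the researcher agrees to tolerate.
for alpha in (0.10, 0.05, 0.01):
    beta = type_ii_error_rate(alpha, effect_in_std_errors=2.0)
    print(f"Type I rate {alpha:.2f} -> Type II rate {beta:.3f}")
```

Tightening the threshold from .10 to .01 lowers the chance of a false positive but raises the chance of missing a real effect, which is the same tension as convicting the innocent versus freeing the guilty.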
Statistical inference is a powerful tool, but not an infallible one.
Allows reasonable interpretations of patterns in complex phenomena.
Distinction between correlation and causation remains critical:
Misinterpretations can lead to incorrect conclusions (e.g., bran muffin study).
Formula for comparing two means.
Example dataset: Differing average brain volumes in autism research.
Inferential statistics raise essential questions in biological research.
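These notes do not reproduce the formula itself. The standard large-sample form, assumed here, divides the difference between the two sample means by the standard error of that difference: z = (mean_x - mean_y) / sqrt(s_x^2/n_x + s_y^2/n_y). A minimal sketch follows; the summary statistics are hypothetical and are not the brain-volume figures from the autism study.

```python
from math import sqrt
from statistics import NormalDist

def difference_of_means_test(mean_x, sd_x, n_x, mean_y, sd_y, n_y):
    """Return (z, two_sided_p) for H0: the two population means are equal.

    Large-sample normal approximation; small samples would call for a t distribution.
    """
    std_err = sqrt(sd_x ** 2 / n_x + sd_y ** 2 / n_y)
    z = (mean_x - mean_y) / std_err
    two_sided_p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, two_sided_p

# Hypothetical groups: means 105 and 100, standard deviation 15, 100 subjects each.
z, p = difference_of_means_test(mean_x=105, sd_x=15, n_x=100,
                                mean_y=100, sd_y=15, n_y=100)
print(f"z = {z:.2f}, two-sided p-value = {p:.4f}")
```

A gap of more than two standard errors, as in this made-up case, would be significant at the .05 level.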
Statistical inference is necessary for drawing conclusions from research, serving as a framework to evaluate hypotheses in numerous fields.
Statistical Inference: The process of using data analysis to infer properties of an underlying probability distribution.
Null Hypothesis (H0): The hypothesis stating that there is no effect or difference, and any observed effect is due to sampling or experimental error.
Alternative Hypothesis (H1): The hypothesis that there is an effect or difference, and it is what researchers aim to support.
p-value: The probability of obtaining results at least as extreme as those observed, assuming the null hypothesis is true; used to determine statistical significance.
Type I Error: Incorrectly rejecting a true null hypothesis (false positive).
Type II Error: Failing to reject a false null hypothesis (false negative) and concluding that there is no effect when there is one.
Significance Level: A threshold in hypothesis testing, often set at 0.05, that determines the cutoff for rejecting the null hypothesis.