DIS reading Wheelan-2013-CHAPTER9_Inference summary
Chapter 9: Inference
Introduction to Statistics and Personal Anecdote
The author shares an experience from college with a statistics professor.
Took statistics course reluctantly in exchange for a family trip to the USSR.
Initially struggled but improved through dedication and study.
Scored an A in the final exam, which raised suspicion from the professor about possible cheating.
Understanding Statistical Inference
Statistical inference relies on observing patterns in data and using probability to identify the most likely explanation for them.
Key Concept:
Statistics cannot prove anything conclusively; it determines likely explanations based on observed data.
Example: Unlikely Patterns
Hypothetical case of a gambler rolling a six repeatedly (ten times):
Probability of this happening with a fair die: about 1 in 60 million, i.e. (1/6)^10.
Possible explanations: luck vs. cheating.
Case of Linda Cooper, struck by lightning four times:
A statistically improbable event can still be genuine; improbability alone is not grounds for denying an insurance claim.
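The gambler figure above can be checked directly. A minimal sketch, assuming a single fair six-sided die and independent rolls (the function name is illustrative, not from the chapter):

```python
from fractions import Fraction

def prob_all_sixes(rolls: int) -> Fraction:
    """Probability that a fair die shows a six on every one of `rolls` independent rolls."""
    return Fraction(1, 6) ** rolls

p = prob_all_sixes(10)
print(p.denominator)  # 60466176, i.e. roughly 1 in 60 million
```

The calculation only says how rare the streak is under fair play; choosing between luck and cheating as the explanation is the inference step.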
Characteristics of Statistical Inference
Statistical inference uses data to inform questions:
Questions like drug efficacy or health risks (e.g., cell phones and cancer).
Conclusion Drawing:
A difference in outcomes between groups does not by itself show that a drug is effective.
Importance of understanding random variation in outcomes.
Examples of Inference Results
Hypothesis testing: determining whether an observed result is statistically significant.
Hypotheses in Research
Null Hypothesis (H0): Initial assumption (e.g., no drug effect).
Alternative Hypothesis (H1): Logical complement of the null; accepted if the null is rejected.
Example:
H0: new drug has no effect on malaria.
H1: new drug reduces malaria.
Real-World Applications
Courtroom analogy: presumption of innocence as the null hypothesis.
Examples of medical research:
Comparing outcomes from treatment and placebo groups.
Common Result:
Statistically significant findings must undergo careful evaluation, including odds of results occurring by chance.
Significance Levels and Testing
Defining Significance Levels
5% significance level is a common threshold for rejecting the null hypothesis.
Reject H0 when there is less than a 5% chance of observing results at least as extreme as those found if H0 is true.
Examples of threshold levels:
.05 (standard), .01 (more stringent), .10 (less stringent).
The Process of Rejecting Null Hypothesis
Calculate p-value: Probability of observing data at least as extreme as the actual data if H0 is true.
Determine if p-value meets the chosen significance level.
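The two steps above can be sketched with an exact binomial calculation. This is an illustrative example, not one from the chapter: the coin-flip data, the function names, and the one-sided test are all assumptions.

```python
from math import comb

def binomial_p_value(successes: int, trials: int, p_null: float = 0.5) -> float:
    """One-sided p-value: P(X >= successes) when X ~ Binomial(trials, p_null)."""
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )

def reject_null(p_value: float, alpha: float = 0.05) -> bool:
    """Reject H0 when the p-value falls below the chosen significance level."""
    return p_value < alpha

# Hypothetical data: 15 heads in 20 flips; H0 says the coin is fair.
p = binomial_p_value(15, 20)
print(round(p, 4))                   # about 0.0207
print(reject_null(p, alpha=0.05))    # True: rejected at the .05 level
print(reject_null(p, alpha=0.01))    # False: not rejected at the .01 level
```

The same data lead to different decisions at .05 and .01, which is why the significance level should be chosen before looking at the results.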
Example of Statistical Outcome
Atlanta standardized test erasure patterns suggesting misconduct.
Erasure counts far above what chance would plausibly produce led to rejection of H0 (no cheating) and acknowledgment of cheating.
Probability of outcomes as rare as found was extremely low without foul play.
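To see how quickly such probabilities shrink, here is a minimal sketch that assumes the measured quantity is approximately normally distributed (an assumption made for illustration, not a claim from the chapter). The z values are hypothetical and are not the Atlanta figures.

```python
from math import erfc, sqrt

def upper_tail_probability(z: float) -> float:
    """P(Z >= z) for a standard normal variable Z."""
    return 0.5 * erfc(z / sqrt(2))

# Hypothetical distances above the mean, measured in standard deviations.
for z in (2, 5, 10):
    print(z, upper_tail_probability(z))
```

At 2 standard deviations the tail probability is a little over 2%; by 5 it is below one in a million, so chance becomes an implausible explanation.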
Errors in Statistical Testing
Discussing Type I and Type II errors:
Type I Error: Rejecting a true null hypothesis (false positive).
Type II Error: Failing to reject a false null hypothesis (false negative).
Real-World Implications of Errors
Balancing thresholds in research and legal scenarios:
Risk of wrongly convicting innocent versus letting guilty go free.
Example: Cancer drug approval – implications of statistical thresholds in efficacy.
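The trade-off between the two error types can be made concrete with a small calculation. This sketch uses an assumed setup that is not from the chapter: a one-sided test on 20 coin flips, with H0 a fair coin and a true heads probability of 0.7. All names and numbers are hypothetical.

```python
from math import comb

def upper_tail(c: int, n: int, p: float) -> float:
    """P(X >= c) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c, n + 1))

def error_rates(alpha: float, n: int = 20, p_null: float = 0.5, p_true: float = 0.7):
    """Return (type_i, type_ii) for the test that rejects H0 when X >= c,
    where c is the smallest cutoff whose false-positive rate is at most alpha."""
    c = next(c for c in range(n + 2) if upper_tail(c, n, p_null) <= alpha)
    type_i = upper_tail(c, n, p_null)        # rejecting H0 although it is true
    type_ii = 1 - upper_tail(c, n, p_true)   # failing to reject although H0 is false
    return type_i, type_ii

for alpha in (0.10, 0.05, 0.01):
    type_i, type_ii = error_rates(alpha)
    print(f"alpha={alpha:.2f}  Type I={type_i:.3f}  Type II={type_ii:.3f}")
```

Tightening the threshold lowers the Type I rate but raises the Type II rate, the same tension as convicting the innocent versus freeing the guilty.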
Conclusion: Importance of Statistical Inference
Recognized as a powerful tool, not infallible.
Allows reasonable interpretations of patterns in complex phenomena.
Distinction between correlation and causation remains critical:
Misinterpretations can lead to incorrect conclusions (e.g., bran muffin study).
Appendix: Calculating Standard Error
Formula for comparing two means.
Example dataset: Differing average brain volumes in autism research.
Inferential statistics raise essential questions in biological research.
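For two independent samples, the standard error of the difference in means is commonly written as SE = sqrt(s_x^2/n_x + s_y^2/n_y), and the test statistic is the difference in means divided by SE. A minimal sketch of that calculation follows; the measurements are made up for illustration and are not the brain-volume data from the chapter.

```python
from math import sqrt
from statistics import mean, variance

def difference_of_means_test(x, y):
    """Return (difference, standard_error, test_statistic) for two independent samples."""
    diff = mean(x) - mean(y)
    # Sample variance of each group divided by its size, summed, then square-rooted.
    se = sqrt(variance(x) / len(x) + variance(y) / len(y))
    return diff, se, diff / se

# Made-up measurements for illustration only.
group_a = [10.0, 12.0, 14.0, 16.0, 18.0]
group_b = [8.0, 9.0, 10.0, 11.0, 12.0]
diff, se, stat = difference_of_means_test(group_a, group_b)
print(diff, round(se, 3), round(stat, 3))
```

A large test statistic means the observed gap is many standard errors wide, which makes it harder to attribute to sampling variation alone.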
Takeaway:
Statistical inference is necessary for drawing conclusions from research, serving as a framework to evaluate hypotheses in numerous fields.
Concept List:
Statistical Inference: The process of using data analysis to infer properties of an underlying probability distribution.
Null Hypothesis (H0): The hypothesis stating that there is no effect or difference, and any observed effect is due to sampling or experimental error.
Alternative Hypothesis (H1): The hypothesis that there is an effect or difference, and it is what researchers aim to support.
p-value: The probability of obtaining results at least as extreme as those observed, assuming the null hypothesis is true; used to determine statistical significance.
Type I Error: Incorrectly rejecting a true null hypothesis (false positive).
Type II Error: Failing to reject a false null hypothesis (false negative), that is, concluding that there is no effect when there is one.
Significance Level: A threshold in hypothesis testing, often set at 0.05, that determines the cutoff for rejecting the null hypothesis.