DIS reading Wheelan-2013-CHAPTER9_Inference summary

Chapter 9: Inference

Introduction to Statistics and Personal Anecdote

  • The author shares an experience from college with a statistics professor.

  • Took a statistics course reluctantly, in exchange for a family trip to the USSR.

  • Initially struggled but improved through dedication and study.

  • Scored an A on the final exam, which made the professor suspect cheating.

Understanding Statistical Inference

  • Statistical inference means observing a pattern in data and using probability to identify the most likely explanation for it.

Key Concept:

  • Statistics cannot prove anything conclusively; it determines likely explanations based on observed data.

Example: Unlikely Patterns

  • Hypothetical case of a gambler rolling a six repeatedly (ten times):

    • Probability of this happening with a fair die: about 1 in 60 million (1 in 6^10).

    • Possible explanations: luck vs. cheating.

  • Case of Linda Cooper, struck by lightning four times:

    • A statistically improbable event can still happen; improbability alone is not grounds for denying her insurance claims.
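The "1 in 60 million" figure in the gambler example can be checked directly. A minimal sketch, assuming a fair six-sided die and independent rolls (the variable names are illustrative):

```python
from fractions import Fraction

# Chance that a fair six-sided die shows a six on a single roll.
p_six = Fraction(1, 6)

# Independent rolls multiply, so ten sixes in a row has probability (1/6)^10.
rolls = 10
p_streak = p_six ** rolls

print(f"1 in {p_streak.denominator:,}")  # 1 in 60,466,176
```

The result does not prove cheating. It only shows how unlikely the streak is if the die is fair.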

Characteristics of Statistical Inference

  • Statistical inference uses data to answer questions:

    • Questions like drug efficacy or health risks (e.g., cell phones and cancer).

Conclusion Drawing:

  • A difference in outcomes between groups does not by itself show that a drug works.

  • Outcomes vary randomly, so the difference must be larger than chance alone would plausibly produce.

Examples of Inference Results

  • Hypothesis testing: determining whether an observed result is statistically significant, i.e., unlikely to be due to chance alone.

Hypotheses in Research

  • Null Hypothesis (H0): Initial assumption (e.g., no drug effect).

  • Alternative Hypothesis (H1): The logical complement of the null; it is the conclusion accepted if the null is rejected.

    • Example:

      • H0: new drug has no effect on malaria.

      • H1: new drug reduces malaria.

Real-World Applications

  • Courtroom analogy: presumption of innocence as the null hypothesis.

  • Examples of medical research:

    • Comparing outcomes from treatment and placebo groups.

Common Result:

  • A statistically significant finding still needs careful evaluation, including how likely the result would be by chance alone.

Significance Levels and Testing

Defining Significance Levels

  • 5% significance level is a common threshold for rejecting the null hypothesis.

    • Reject H0 when there is less than a 5% chance of observing results at least this extreme if H0 is true.

  • Examples of threshold levels:

    • .05 (standard), .01 (more stringent), .10 (less stringent).

The Process of Rejecting the Null Hypothesis

  • Calculate p-value: Probability of observing results at least as extreme as the data if H0 is true.

  • Reject H0 if the p-value is below the chosen significance level; otherwise fail to reject it.
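The two steps above can be sketched with an exact binomial test. This is an illustration only: the coin-flip data (61 heads in 100 flips) and the helper name binomial_p_value are hypothetical, not taken from the chapter.

```python
from math import comb

def binomial_p_value(successes: int, trials: int, p_null: float = 0.5) -> float:
    """One-sided p-value: P(X >= successes) if H0 (success rate p_null) is true."""
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )

# Hypothetical data: 61 heads in 100 flips; H0 says the coin is fair.
p_value = binomial_p_value(successes=61, trials=100)

# Compare the same p-value against the three thresholds listed in the notes.
decisions = {alpha: p_value < alpha for alpha in (0.10, 0.05, 0.01)}
for alpha, reject in decisions.items():
    verdict = "reject H0" if reject else "fail to reject H0"
    print(f"alpha={alpha:.2f}: p={p_value:.4f} -> {verdict}")
```

The same data can lead to different decisions depending on the threshold, which is why the significance level should be chosen before looking at the results.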

Example of Statistical Outcome

  • Atlanta standardized test erasure patterns suggesting misconduct.

  • Erasure counts far outside the normal range led to rejection of H0 (no cheating) and a finding that cheating had occurred.

  • Without foul play, the probability of patterns this extreme arising by chance was vanishingly small.
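To get a feel for how quickly such probabilities shrink, the sketch below computes the chance of landing a given number of standard deviations above the mean. It assumes the quantity is approximately normally distributed, which is an assumption made here for illustration and not a description of how the Atlanta analysis was actually done.

```python
from math import erfc, sqrt

def upper_tail(z: float) -> float:
    """P(Z >= z) for a standard normal variable Z."""
    return 0.5 * erfc(z / sqrt(2))

# Illustrative distances from the mean, in standard deviations.
for z in (2, 3, 5):
    print(f"{z} SDs above the mean: about 1 in {1 / upper_tail(z):,.0f}")
```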

Errors in Statistical Testing

  • Two kinds of error are possible:

    • Type I Error: Rejecting a true null hypothesis (false positive).

    • Type II Error: Failing to reject a false null hypothesis (false negative).

Real-World Implications of Errors

  • Balancing thresholds in research and legal scenarios:

    • Risk of wrongly convicting the innocent versus letting the guilty go free.

    • Example: Cancer drug approval. A lenient threshold risks approving an ineffective drug (Type I); a strict threshold risks rejecting an effective one (Type II).
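The trade-off can be made concrete with a small calculation. The sketch below uses a one-sided binomial test; the sample size, the 60% "true" success rate, and the function names are all hypothetical choices for illustration.

```python
from math import comb

def upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def critical_value(n: int, alpha: float, p_null: float) -> int:
    """Smallest count c whose one-sided p-value under H0 is below alpha."""
    return next(c for c in range(n + 2) if upper_tail(c, n, p_null) < alpha)

N = 100        # hypothetical sample size
P_NULL = 0.5   # success rate if H0 is true
P_TRUE = 0.6   # hypothetical real success rate when H0 is false

results = {}
for alpha in (0.10, 0.05, 0.01):
    c = critical_value(N, alpha, P_NULL)
    type_1 = upper_tail(c, N, P_NULL)      # reject H0 although it is true
    type_2 = 1 - upper_tail(c, N, P_TRUE)  # keep H0 although it is false
    results[alpha] = (c, type_1, type_2)
    print(f"alpha={alpha:.2f}: reject at {c}+ successes, "
          f"Type I={type_1:.3f}, Type II={type_2:.3f}")
```

Tightening the threshold lowers the Type I rate but raises the Type II rate, so the right threshold depends on which mistake is more costly.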

Conclusion: Importance of Statistical Inference

  • Statistical inference is a powerful tool, but not an infallible one.

  • Allows reasonable interpretations of patterns in complex phenomena.

  • Distinction between correlation and causation remains critical:

    • Misinterpretations can lead to incorrect conclusions (e.g., bran muffin study).

Appendix: Calculating Standard Error

  • Formula for the standard error of the difference between two sample means, used to test whether the means differ.

  • Example dataset: Differing average brain volumes in autism research.

  • Inferential statistics raise essential questions in biological research.
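The notes do not reproduce the formula. A commonly used form for two large, independent samples is (mean_x - mean_y) / sqrt(s_x^2/n_x + s_y^2/n_y). The sketch below assumes that form; the numbers are invented for illustration and are not the figures from the autism study.

```python
from math import sqrt

def standard_error_of_difference(sd_x: float, n_x: int, sd_y: float, n_y: int) -> float:
    """Standard error of (mean_x - mean_y) for two independent samples."""
    return sqrt(sd_x**2 / n_x + sd_y**2 / n_y)

def z_statistic(mean_x: float, mean_y: float, standard_error: float) -> float:
    """Difference in means, measured in standard errors."""
    return (mean_x - mean_y) / standard_error

# Hypothetical groups: sample standard deviations and sizes.
se = standard_error_of_difference(sd_x=15, n_x=100, sd_y=15, n_y=100)

# Hypothetical sample means of 105 and 100.
z = z_statistic(mean_x=105, mean_y=100, standard_error=se)

print(f"standard error = {se:.3f}, z = {z:.2f}")
```

A z value beyond roughly 1.96 in either direction corresponds to a two-sided p-value below .05 under the normal approximation.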

Takeaway:

  • Statistical inference is necessary for drawing conclusions from research, serving as a framework to evaluate hypotheses in numerous fields.

Concept List:

  • Statistical Inference: The process of using sample data to infer properties of an underlying population or probability distribution.

  • Null Hypothesis (H0): The hypothesis stating that there is no effect or difference, and any observed effect is due to sampling or experimental error.

  • Alternative Hypothesis (H1): The hypothesis that there is an effect or difference, and it is what researchers aim to support.

  • p-value: The probability of obtaining results at least as extreme as those observed when the null hypothesis is true; used to determine statistical significance.

  • Type I Error: Incorrectly rejecting a true null hypothesis (false positive).

  • Type II Error: Failing to reject a false null hypothesis (false negative), that is, concluding that there is no effect when there is one.

  • Significance Level: A threshold in hypothesis testing, often set at 0.05, that determines the cutoff for rejecting the null hypothesis.
