Chapter 1: Introduction to Data

  • Course material modified from slides developed by Mine Çetinkaya-Rundel of OpenIntro.

  • Slides can be copied, edited, and shared under the CC BY-SA license.

  • Certain images are included under fair use for educational purposes.

Case Study

Treating Chronic Fatigue Syndrome

  • Objective: Evaluate the effectiveness of cognitive-behavior therapy for chronic fatigue syndrome.

  • Participant Pool: 142 patients recruited through referrals from primary care physicians and consultants.

  • Actual Participants: Only 60 patients entered the study; exclusions included not meeting diagnostic criteria, other health issues, and refusals.

  • Reference: Deale et al. Cognitive behavior therapy for chronic fatigue syndrome: A randomized controlled trial. The American Journal of Psychiatry 154.3 (1997).

Study Design

  • Patients randomly assigned to:

    • Treatment Group: Cognitive behavior therapy focusing on collaboration, education, and behavior change to safely increase activity.

    • Control Group: Relaxation techniques without advice to increase activity (e.g., muscle relaxation, visualization).

Results

  • Follow-up Outcomes: 7 patients dropped out (3 from treatment, 4 from control).

  • Good Outcome Distribution:

    • Treatment: 19 Yes, 8 No (Total 27)

    • Control: 5 Yes, 21 No (Total 26)

  • Proportions with Good Outcomes:

    • Treatment: 19/27 ≈ 70%

    • Control: 5/26 ≈ 19%
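The proportions above can be reproduced from the counts. This is a minimal sketch: the dictionary layout and function name are illustrative choices, and only the counts come from the study.

```python
# Counts of good outcomes at follow-up, taken from the results above.
counts = {
    "treatment": {"yes": 19, "no": 8},
    "control": {"yes": 5, "no": 21},
}


def good_outcome_proportion(group):
    """Share of a group's patients who had a good outcome."""
    total = group["yes"] + group["no"]
    return group["yes"] / total


proportions = {name: good_outcome_proportion(g) for name, g in counts.items()}

# Patients with follow-up data: 60 entered the study, 7 dropped out.
completed = sum(g["yes"] + g["no"] for g in counts.values())

for name, p in proportions.items():
    print(f"{name}: {p:.0%}")
print("patients with follow-up data:", completed)
```

The totals also serve as a consistency check: 27 + 26 = 53, which matches 60 participants minus 7 dropouts.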

Understanding the Results

  • Real Difference Evaluation:

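    • Coin-flip analogy: 100 flips of a fair coin rarely give exactly 50 heads, so some difference between groups is expected from natural variation alone.

    • Difference between groups (70% - 19% = 51 percentage points) may be real or due to natural variation.

    • Statistical tools are needed to judge whether the difference is larger than chance alone would explain.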
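One way to make "beyond chance" concrete is a shuffling simulation. The sketch below is not the analysis used in the original paper; it assumes the 24 good outcomes would have occurred regardless of group, reshuffles them across the 27 treatment and 26 control patients many times, and counts how often chance alone gives a difference at least as large as the observed one. The function name, seed, and number of simulations are arbitrary choices.

```python
import random


def simulate_difference(rng, n_treatment=27, n_control=26, n_good=24):
    """Difference in good-outcome proportions after shuffling outcomes
    across groups, i.e. assuming the treatment has no effect."""
    outcomes = [1] * n_good + [0] * (n_treatment + n_control - n_good)
    rng.shuffle(outcomes)
    treatment = outcomes[:n_treatment]
    control = outcomes[n_treatment:]
    return sum(treatment) / n_treatment - sum(control) / n_control


observed = 19 / 27 - 5 / 26  # difference reported in the study

rng = random.Random(1)  # fixed seed so the sketch is repeatable
n_sims = 10_000
simulated = [simulate_difference(rng) for _ in range(n_sims)]

# Small tolerance guards against floating-point ties with the observed value.
extreme = sum(d >= observed - 1e-12 for d in simulated)

print(f"observed difference: {observed:.3f}")
print(f"share of shuffles at least as large: {extreme / n_sims:.4f}")
```

If only a tiny share of shuffles reaches the observed difference, chance alone is an unconvincing explanation for it.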

Generalizing the Results

  • Generalizability Concern:

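    • Participants had specific characteristics and were recruited by referral, so they may not represent all patients with the condition.

    • Results cannot be generalized to all patients with chronic fatigue syndrome, but they are promising for patients similar to those in the study.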

Data Basics

Classroom Survey

  • A survey on statistics students included questions on:

    • Gender

    • Introversion/Extraversion

    • Average sleep hours

    • Bedtime

    • Number of countries visited

    • Dread level (on a 1-5 scale)

Data Matrix

  • Example data collected from students on various demographic and psychological variables.

Types of Variables

  • Numerical: Continuous (e.g., hours of sleep) or discrete (e.g., number of countries visited).

  • Categorical: Regular (e.g., gender) or ordinal (e.g., dread scale).
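A small sketch of how a data matrix and its variable types might be represented. The rows below are hypothetical values invented for illustration, not data from the classroom survey, and the column names are assumptions.

```python
# Hypothetical survey rows, invented purely for illustration.
survey = [
    {"gender": "female", "sleep_hours": 7.5, "countries": 3, "dread": 2},
    {"gender": "male", "sleep_hours": 6.0, "countries": 0, "dread": 4},
    {"gender": "female", "sleep_hours": 8.25, "countries": 12, "dread": 1},
]

# Variable types following the classification in these notes.
variable_types = {
    "gender": "categorical (regular)",
    "sleep_hours": "numerical (continuous)",
    "countries": "numerical (discrete)",
    "dread": "categorical (ordinal)",
}


def supports_mean(variable):
    """Averages are only meaningful for numerical variables."""
    return variable_types[variable].startswith("numerical")


for name, kind in variable_types.items():
    print(f"{name}: {kind}, mean is meaningful: {supports_mean(name)}")

mean_sleep = sum(row["sleep_hours"] for row in survey) / len(survey)
print("mean sleep hours (hypothetical rows):", mean_sleep)
```

Each row is one observation (a student) and each key is one variable, which mirrors the layout of a data matrix.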

Practice Question

  • Example of variable categorization: Is a telephone area code numerical or categorical?

Relationships Among Variables

  • Correlation & Data Points:

    • Examines relationship between GPA and study hours.

    • One data point with a GPA above 4.0 stands out as an anomaly and may be a data error.

Explanatory and Response Variables

  • Variable Classification:

    • Identifying which variable affects the other.

  • Caution Against Assumptions:

    • Correlation does not imply causation.

Types of Data Collection

  • Observational Studies:

    • Data collected without interference.

    • Cannot establish causality but can identify associations.

  • Experiments:

    • Random assignment of subjects to treatments to establish causal relations.

Association vs. Causation

  • Two variables that show a relationship with each other are associated (dependent); variables that show no relationship are independent.

Experimental Design Principles

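  • Control: Compare the treatment of interest to a control group and hold other conditions constant to limit outside effects.

  • Randomization: Randomly assign subjects to treatments to balance out variables that cannot be controlled.

  • Replication: Collect a sufficiently large sample, or repeat the entire study, so that effects can be estimated reliably.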

  • Blocking: Group subjects by known variables prior to random assignment.

Experimental Design Example

  • Testing energy gels on runners: block by pro/amateur status, then randomly assign treatments within each block, so that pros and amateurs are equally represented in each treatment group.
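Blocking followed by random assignment can be sketched as below. The runner names, the treatment labels, and the function `blocked_assignment` are hypothetical and serve only to illustrate the idea of randomizing separately within each block.

```python
import random


def blocked_assignment(block_of, treatments, rng):
    """Randomly assign treatments separately within each block."""
    # Group subjects by their block (here: pro or amateur).
    blocks = {}
    for subject, block in block_of.items():
        blocks.setdefault(block, []).append(subject)

    # Shuffle each block, then deal treatments out in turn so that
    # every block is split as evenly as possible across treatments.
    assignment = {}
    for members in blocks.values():
        shuffled = list(members)
        rng.shuffle(shuffled)
        for i, subject in enumerate(shuffled):
            assignment[subject] = treatments[i % len(treatments)]
    return assignment


# Hypothetical runners, invented for illustration.
runner_status = {
    "runner_1": "pro", "runner_2": "pro",
    "runner_3": "pro", "runner_4": "pro",
    "runner_5": "amateur", "runner_6": "amateur",
    "runner_7": "amateur", "runner_8": "amateur",
}

assignment = blocked_assignment(
    runner_status, ["energy gel", "no gel"], random.Random(0)
)

for runner, treatment in sorted(assignment.items()):
    print(runner, runner_status[runner], treatment)
```

Because the shuffle happens inside each block, pro/amateur status cannot end up unevenly split between the gel and no-gel groups.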

Additional Experimental Terms

  • Placebo: Fake treatment for control groups.

  • Placebo Effect: Improvement due to belief in treatment.

  • Blinding: Participants do not know which group they are in; in a double-blind study, the researchers who interact with participants do not know either.

Practice Question: Observational vs. Experimental Studies

  • Key distinction: Experiments randomly assign subjects to treatments; observational studies do not.

Conclusion

  • Each section highlights crucial aspects of data analysis, variable classification, experiment design, and the importance of well-structured studies in drawing valid conclusions.

What is the process of statistical investigation? Identify a question or problem, collect relevant data, analyze the data, and form a conclusion.