Module 3: Statistics, Part 2

Overview of the Student t Distribution

  • The Student t distribution is foundational in statistics, most directly as the reference distribution for the t test.

  • Focus on understanding its shape, properties, and parameters.

Degrees of Freedom

  • Definition: Degrees of freedom (df) are the number of values that remain free to vary once the constraints on the data (such as a fixed mean or fixed totals) are imposed.

    • For continuous data: Example with 5 data points having a mean of 20.

      • 4 values can vary freely (indicated by vases), while the 5th is fixed by the requirement that the mean stay at 20 (e.g., it must be 18).

      • Degrees of freedom in this case = 4.

    • For categorical data: Example of a 3x2 contingency table.

      • With the row totals and column totals fixed, only some cells can be filled in freely; every remaining cell is then determined by subtraction from the totals.

      • Degrees of freedom here = 2.

      • Formula: (Number of Rows - 1) × (Number of Columns - 1).
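
The two counting rules above can be checked with a short Python sketch. The four free values and the table totals below are made-up illustrations, not data from the lecture, and the helper names (`fixed_last_value`, `contingency_df`, `complete_3x2_table`) are hypothetical.

```python
def fixed_last_value(free_values, target_mean):
    """Return the single value that forces the overall mean to target_mean."""
    n = len(free_values) + 1
    return target_mean * n - sum(free_values)


def contingency_df(n_rows, n_cols):
    """Degrees of freedom for an r x c contingency table."""
    return (n_rows - 1) * (n_cols - 1)


def complete_3x2_table(row_totals, col_totals, a11, a21):
    """Fill a 3x2 table from its totals and the two freely chosen cells."""
    a12 = row_totals[0] - a11
    a22 = row_totals[1] - a21
    a31 = col_totals[0] - a11 - a21
    a32 = row_totals[2] - a31
    return [[a11, a12], [a21, a22], [a31, a32]]


# Continuous case: four freely chosen values (assumed), mean must be 20.
free = [22, 19, 21, 20]
last = fixed_last_value(free, 20)      # 5 * 20 - 82 = 18
df_continuous = len(free)              # 4 values were free to vary

# Categorical case: totals and the two free cells are assumed values.
df_table = contingency_df(3, 2)        # (3 - 1) * (2 - 1) = 2
table = complete_3x2_table([10, 20, 30], [25, 35], a11=4, a21=9)

print(last, df_continuous, df_table)
print(table)
```

Once the two free cells are chosen, the other four cells follow from the totals, which is why the 3x2 table has only 2 degrees of freedom.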

T Test as a Parametric Test

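  • The t test is a parametric test: it assumes the data come from an approximately normal distribution. With large samples (n > 30), the sample mean is approximately normal by the central limit theorem, so the test tolerates moderate departures from normality.

  • With smaller samples, the test statistic must be referred to the t distribution rather than the normal distribution, because estimating the standard deviation from few observations adds uncertainty.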
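
As a rough illustration of what "the test statistic" means here, the sketch below computes the standard one-sample t statistic, t = (sample mean - mu0) / (s / sqrt(n)). The notes do not spell this formula out, and the sample values, the hypothesised mean `mu0`, and the function name `one_sample_t` are all assumptions made for the example.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for a one-sample t test."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)          # sample SD, uses n - 1 in the denominator
    standard_error = sd / math.sqrt(n)
    return (mean - mu0) / standard_error, n - 1


# Hypothetical small sample; mu0 = 19 is an assumed null-hypothesis mean.
sample = [18, 22, 19, 21, 20]
t_stat, df = one_sample_t(sample, mu0=19)
print(f"t = {t_stat:.4f}, df = {df}")
```

With only 5 observations, this statistic would be compared against a t distribution with 4 degrees of freedom, not against the normal distribution.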

The Shape of the T Distribution

  • Compared to the normal distribution, the t distribution displays fatter tails:

    • The probability of extreme values (e.g., around -3 or -4) is higher under a t distribution than under the normal distribution.

  • Degrees of Freedom Effect:

    • The heaviness of the tails is influenced by degrees of freedom (df).

    • For the one sample t test, df = sample size - 1.

    • A larger sample size means a larger df, and the larger the df, the more closely the t distribution resembles the normal distribution.

    • Beyond roughly df = 30, the t distribution approximates the normal distribution closely.
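
The heavier tails can be made concrete by comparing densities at the extreme values mentioned above. This is a minimal sketch using only the Python standard library; it writes out the textbook density formulas directly, and the function names `normal_pdf` and `t_pdf` are hypothetical.

```python
import math


def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)


def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    log_coef = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))


# Ratio of t density to normal density at extreme values, for growing df.
ratios = {}
for x in (-3, -4):
    for df in (3, 10, 30):
        ratios[(x, df)] = t_pdf(x, df) / normal_pdf(x)
        print(f"x = {x}, df = {df}: t density is {ratios[(x, df)]:.2f}x the normal density")
```

A ratio above 1 means the t distribution puts more weight at that point than the normal distribution does; the ratio shrinks toward 1 as df grows.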

Visual Representation of Distributions

  • T Distribution with df = 3: Noticeable differences in tails compared to normal distribution.

  • T Distribution with df = 10: Similar shape to normal distribution.

  • T Distribution with df = 30: Almost identical to the normal distribution, which supports using the normal distribution as an approximation for large sample sizes.
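
In place of the plots, one numeric stand-in for "how close the curves are" is the area each distribution places between -1.96 and 1.96 (about 0.95 for the normal distribution). The sketch below is an illustration under that assumption, not the lecture's own method; it integrates the t density with Simpson's rule, and the names `simpson`, `central_area_t`, and `central_area_normal` are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    log_coef = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))


def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def central_area_t(df, z=1.96):
    """Area under the t density between -z and z."""
    return simpson(lambda x: t_pdf(x, df), -z, z)


def central_area_normal(z=1.96):
    """Area under the standard normal density between -z and z."""
    return math.erf(z / math.sqrt(2))


areas = {df: central_area_t(df) for df in (3, 10, 30)}
for df, area in areas.items():
    print(f"df = {df:2d}: central area = {area:.4f}")
print(f"normal : central area = {central_area_normal():.4f}")
```

The central area climbs toward the normal value as df increases, which is the numeric counterpart of the curves overlapping more and more in the plots.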

Summary of Key Points

  • Defined degrees of freedom in both continuous and categorical contexts.

  • Analyzed properties of the t distribution, its relationship to degrees of freedom, and implications for statistical analysis.

  • Next topic: practical analysis comparing the means of two groups or datasets.