Student's t distribution is foundational in statistics, particularly for the t test.
Focus on understanding its shape, properties, and parameters.
Definition: Degrees of freedom (df) are the number of values in a calculation that remain free to vary once the constraints on the data (such as a known mean or known totals) are fixed.
For continuous data: Example with 5 data points having a mean of 20.
Four values can vary (indicated by vases), while the fifth is fixed by the mean: the five values must sum to 5 × 20 = 100, so if the first four sum to 82, the fifth must be 18.
Degrees of freedom in this case = 4.
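The continuous example can be checked with a few lines of Python. This is a minimal sketch: only the sample size (5), the mean (20), and the fixed value (18) come from the notes; the four free values are made up for illustration.

```python
# Hypothetical free values; any four numbers would work here.
n = 5
target_mean = 20
free_values = [22, 19, 25, 16]

# The mean fixes the total, so the last value has no freedom left.
required_total = n * target_mean
last_value = required_total - sum(free_values)

values = free_values + [last_value]
print(last_value)        # 18
print(sum(values) / n)   # 20.0
print(n - 1)             # degrees of freedom: 4
```

Changing any of the four free values changes the fifth, but the fifth can never be chosen independently.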
For categorical data: Example of a 3x2 contingency table.
With the row totals and column totals fixed, only some cells can be chosen freely; once those are filled in, every remaining cell follows from the totals.
Degrees of freedom here = 2.
Formula: (Number of Rows - 1) × (Number of Columns - 1).
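The categorical case can be sketched the same way. The totals and the two free cell counts below are hypothetical; the sketch assumes the free cells are the ones outside the last row and last column, and shows that the other four cells of a 3x2 table are then forced.

```python
# Hypothetical margins for a 3x2 table (both sets of totals sum to 100).
row_totals = [30, 50, 20]
col_totals = [60, 40]

n_rows, n_cols = len(row_totals), len(col_totals)
df = (n_rows - 1) * (n_cols - 1)

# The df free cells: everything outside the last row and last column.
free_cells = {(0, 0): 18, (1, 0): 32}

table = [[None] * n_cols for _ in range(n_rows)]
for (r, c), count in free_cells.items():
    table[r][c] = count

# Last column of each non-final row follows from the row total.
for r in range(n_rows - 1):
    table[r][n_cols - 1] = row_totals[r] - sum(table[r][:n_cols - 1])

# Every cell in the last row follows from the column total.
for c in range(n_cols):
    table[n_rows - 1][c] = col_totals[c] - sum(table[r][c] for r in range(n_rows - 1))

print(df)     # 2
print(table)  # [[18, 12], [32, 18], [10, 10]]
```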
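The t test assumes that the data are approximately normally distributed. With large samples (n > 30), the test statistic is close to normally distributed, so the normal distribution is often used as an approximation.
With smaller samples, the test statistic follows a t distribution rather than a normal distribution.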
Compared to the normal distribution, the t distribution displays fatter tails:
Extreme values (e.g., below -3 or -4) are more likely under a t distribution than under a normal distribution.
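To put rough numbers on the fatter tails, the sketch below compares the probability of falling below -3 and -4 under a t distribution and under the standard normal. It uses only the standard library: the t tail is obtained by Simpson's rule on the t density, which is an illustrative choice rather than a recommended method, and df = 3 is picked here only because the notes use it later.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)

def t_lower_tail(x, df, steps=2000):
    """P(T <= x) for x <= 0: integrate the density over [x, 0], subtract from 0.5."""
    h = (0.0 - x) / steps  # steps must be even for Simpson's rule
    total = t_pdf(x, df) + t_pdf(0.0, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(x + i * h, df)
    return 0.5 - total * h / 3

def normal_lower_tail(x):
    """P(Z <= x) for a standard normal variable."""
    return 0.5 * math.erfc(-x / math.sqrt(2))

for cutoff in (-3, -4):
    print(cutoff, t_lower_tail(cutoff, df=3), normal_lower_tail(cutoff))
```

For both cutoffs the t probability is larger than the normal one, which is what "fatter tails" means in practice.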
Degrees of Freedom Effect:
The heaviness of the tails is influenced by degrees of freedom (df).
For the one sample t test, df = sample size - 1.
Larger sample sizes give larger df, which give lighter tails, so the t distribution more closely resembles the normal distribution.
Beyond roughly df = 30, the t distribution approximates the normal distribution closely.
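As a small illustration of df = sample size - 1 for the one-sample t test, the sketch below computes the test statistic t = (sample mean - hypothesized mean) / (s / sqrt(n)). The sample values and the hypothesized mean mu0 are hypothetical.

```python
import math
import statistics

# Hypothetical sample and hypothesized population mean.
sample = [18.5, 21.0, 19.2, 22.4, 20.1, 17.8]
mu0 = 20.0

n = len(sample)
df = n - 1                       # degrees of freedom for the one-sample t test

mean = statistics.mean(sample)
sd = statistics.stdev(sample)    # sample standard deviation (n - 1 denominator)
t_stat = (mean - mu0) / (sd / math.sqrt(n))

print(df)      # 5
print(t_stat)  # compared against a t distribution with df = 5
```

With only six observations, the statistic is judged against a t distribution with 5 degrees of freedom, whose tails are noticeably heavier than the normal's.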
T Distribution with df = 3: Noticeable differences in tails compared to normal distribution.
T Distribution with df = 10: Similar shape to normal distribution.
T Distribution with df = 30: Almost identical to the normal distribution, which supports the common practice of using the normal distribution for large sample sizes.
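The three cases above can be compared numerically. This sketch measures the largest vertical gap between the t density and the standard normal density on a grid from -5 to 5; the grid and the choice of "largest gap" as the measure of similarity are assumptions made for illustration.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)

def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

grid = [i / 100 for i in range(-500, 501)]  # -5.00 to 5.00 in steps of 0.01

gaps = {}
for df in (3, 10, 30):
    gaps[df] = max(abs(t_pdf(x, df) - normal_pdf(x)) for x in grid)
    print(df, gaps[df])
```

The gap shrinks as df grows, in line with the progression from df = 3 to df = 30 described above.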
This section defined degrees of freedom in both continuous and categorical contexts.
It also covered the properties of the t distribution, how its shape depends on degrees of freedom, and what that implies for statistical analysis.
Next topic entails practical analysis comparing means of two groups or datasets.