Standard Error and 95% Confidence Intervals
Probability and Frequency Distributions
* Probability Theory: Fundamental to understanding variability and uncertainty. The sum of all probabilities in a distribution always equals $1$.
* Frequency Tables: In a histogram or frequency curve, the area under any specific part represents the number of observations, or the probability, of those values occurring.
* Proportions and Percentages: Frequency distributions can be expressed as numbers, proportions, or percentages. The total area under a proportional curve sums to $1$.
* Central Tendency: Most data points cluster around the mean. Observations in the "tails" of the distribution (extremely high or low values) have a significantly lower probability of occurrence.
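As a small illustration of the points above, the following Python sketch turns raw counts into proportions and percentages and checks that the proportions sum to 1. The observations and the helper name `relative_frequencies` are invented for illustration and do not come from the notes.

```python
from collections import Counter

# Hypothetical observations, made up purely for illustration.
observations = [12, 15, 15, 18, 22, 22, 22, 27, 31, 35]


def relative_frequencies(values):
    """Map each distinct value to its share of all observations."""
    counts = Counter(values)
    total = len(values)
    return {value: count / total for value, count in counts.items()}


proportions = relative_frequencies(observations)
percentages = {value: 100 * p for value, p in proportions.items()}

# The proportions of a complete frequency distribution sum to 1
# (up to floating-point rounding).
total_proportion = sum(proportions.values())
```

The same distribution can therefore be read as counts (3 observations of 22), as a proportion (0.3), or as a percentage (30%).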
The Normal Distribution
Mathematical Properties: Normal distributions are perfectly symmetrical. The mean, median, and mode are identical.
Variance and Curve Shape:
* Small variance/standard deviation results in a peaked distribution with data clustered tightly around the mean.
* Large variance/standard deviation results in a flatter curve with higher probabilities in the tails.
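The effect of variance on curve shape can be checked numerically. This is a minimal sketch using `statistics.NormalDist` from the Python standard library; the cutoff of 2 units and the two standard deviations (1 and 5) are arbitrary choices for illustration.

```python
from statistics import NormalDist


def tail_probability(sd, cutoff=2.0, mean=0.0):
    """Probability of a value lying more than `cutoff` units from the mean."""
    dist = NormalDist(mu=mean, sigma=sd)
    lower_tail = dist.cdf(mean - cutoff)
    upper_tail = 1 - dist.cdf(mean + cutoff)
    return lower_tail + upper_tail


# Same mean, different spread (variance 1 versus variance 25).
narrow_tails = tail_probability(sd=1.0)
wide_tails = tail_probability(sd=5.0)

# Height of each curve at the mean: the narrow curve is more peaked.
narrow_peak = NormalDist(0, 1).pdf(0)
wide_peak = NormalDist(0, 5).pdf(0)
```

With these inputs the narrow distribution has the higher peak and the wide distribution puts far more probability beyond the cutoff.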
Z-Scores: To calculate the probability of a specific value ($x$) being drawn from a population, the value is converted into a standard normal deviate ($z$):
$$z = \frac{x - \mu}{\sigma}$$
* $\mu$ is the population mean; $\sigma$ is the population standard deviation.
* $z$ indicates how many standard deviations a value is from the mean. Values of $z$ are used with standardized statistical tables to find associated probabilities.
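A short Python sketch of the conversion, with `statistics.NormalDist` standing in for the printed table. The values $x = 130$, $\mu = 100$ and $\sigma = 15$ are hypothetical and chosen only so the arithmetic is easy to follow.

```python
from statistics import NormalDist


def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the population mean."""
    return (x - mu) / sigma


# Hypothetical inputs: x = 130, population mean 100, population SD 15.
z = z_score(x=130, mu=100, sigma=15)  # (130 - 100) / 15 = 2.0

# Probability of drawing a value at least this large, read from the
# standard normal distribution instead of a printed z-table.
p_above = 1 - NormalDist().cdf(z)  # approximately 0.0228
```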
Standard Error of the Mean
Distribution of Means: If multiple random samples of size $n$ are drawn from a population, the sample means themselves form an approximately normal distribution (exactly normal when the population itself is normal).
Effect of Sample Size: As the sample size ($n$) increases, the variance of the distribution of means decreases, meaning larger samples provide a more precise estimate of the population mean.
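The effect of sample size can be seen in a small simulation. This is only a sketch: the population (mean 50, standard deviation 10), the sample sizes 4 and 64, the number of draws and the seeds are all assumptions made for illustration.

```python
import random
from statistics import fmean, pvariance


def variance_of_sample_means(population, n, draws=2000, seed=0):
    """Draw `draws` random samples of size n and return the variance of their means."""
    rng = random.Random(seed)
    means = [fmean(rng.choices(population, k=n)) for _ in range(draws)]
    return pvariance(means)


# Hypothetical population: 10,000 values with mean 50 and SD 10.
population_rng = random.Random(42)
population = [population_rng.gauss(50, 10) for _ in range(10_000)]

var_small_samples = variance_of_sample_means(population, n=4)
var_large_samples = variance_of_sample_means(population, n=64)
```

Since the population variance is about 100, the variance of the means should come out near 100/4 = 25 for the small samples and near 100/64 (about 1.6) for the large ones, so the large-sample means sit much closer to the population mean.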
Standard Error ($SE$): The standard deviation of the distribution of sample means, calculated as: $SE = \sigma / \sqrt{n}$
Z-Deviate for Means: When testing sample means against a known population mean, the formula is: $z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
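The two formulas in this section can be sketched together in Python. The standard deviation of 20 and sample size of 16 match the standard-error example given later in these notes; the sample mean of 110 and population mean of 100 are hypothetical.

```python
import math


def standard_error(sigma, n):
    """Standard deviation of the distribution of sample means."""
    return sigma / math.sqrt(n)


def z_for_mean(sample_mean, mu, sigma, n):
    """Z-deviate of a sample mean against a known population mean."""
    return (sample_mean - mu) / standard_error(sigma, n)


se = standard_error(sigma=20, n=16)  # 20 / 4 = 5.0

# Hypothetical sample mean of 110 tested against a population mean of 100.
z = z_for_mean(sample_mean=110, mu=100, sigma=20, n=16)  # 10 / 5 = 2.0
```

Note that the same difference of 10 units would give a $z$ of only 0.5 for a single observation, because the divisor would be 20 instead of 5.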
The T-Distribution and Hypothesis Testing
T-Distribution: Used when the sample size is small or the true population standard deviation is unknown. It is more spread out than the $z$-distribution, but its shape approaches the normal distribution as the degrees of freedom ($df$) increase.
Hypothesis Testing Steps:
1. Define Hypotheses: Establish the Null Hypothesis ($H_0$) and the Alternative Hypothesis ($H_1$).
2. Calculate Test Statistic: Perform a $t$-test using sample data: $t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$, where $s$ is the sample standard deviation.
3. Determine Significance: Compare the calculated $t$ against a critical value from $t$-tables based on the alpha level (usually $0.05$) and degrees of freedom.
P-Values:
* $p < 0.05$: The event is unlikely to occur by chance (< 5% probability); the result is statistically significant, and $H_0$ is rejected.
* $p > 0.05$: Results are not statistically significant, and $H_0$ is not rejected.
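The testing steps can be sketched as a one-sample $t$-test using only the Python standard library. The sample values and the hypothesized mean of 50 are invented for illustration. Because the standard library has no $t$-distribution, the critical value 2.262 (two-tailed, alpha 0.05, 9 degrees of freedom) is typed in from a standard $t$-table, not computed.

```python
import math
from statistics import mean, stdev


def one_sample_t(sample, mu0):
    """Return the t statistic and degrees of freedom for H0: population mean == mu0."""
    n = len(sample)
    se = stdev(sample) / math.sqrt(n)  # sample SD over sqrt(n)
    return (mean(sample) - mu0) / se, n - 1


# Step 1: H0 says the population mean is 50; H1 says it differs from 50.
sample = [52, 55, 49, 58, 61, 53, 57, 50, 56, 59]  # hypothetical data

# Step 2: calculate the test statistic.
t_stat, df = one_sample_t(sample, mu0=50)

# Step 3: compare against the tabulated critical value for df = 9.
T_CRIT_DF9 = 2.262  # two-tailed, alpha = 0.05, taken from a t-table
reject_null = abs(t_stat) > T_CRIT_DF9
```

For this sample the statistic is roughly 4.0, which exceeds the critical value, so $H_0$ would be rejected at the 0.05 level.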
One-tailed vs. Two-tailed Tests: A two-tailed test checks for any difference from the mean, while a one-tailed test checks for a difference in a specific direction (e.g., "shorter" or "longer").
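The difference between the two kinds of test shows up directly in the p-value. The sketch below uses the standard normal distribution for simplicity, which is an assumption made here (a $t$-distribution would be used for small samples), and the test statistic of 1.8 is hypothetical.

```python
from statistics import NormalDist


def p_values(z):
    """One-tailed and two-tailed p-values for a standard normal test statistic."""
    std_normal = NormalDist()
    greater = 1 - std_normal.cdf(z)  # one-tailed: is the value larger?
    less = std_normal.cdf(z)         # one-tailed: is the value smaller?
    two_tailed = 2 * min(greater, less)  # any difference, either direction
    return {"greater": greater, "less": less, "two_tailed": two_tailed}


result = p_values(1.8)  # hypothetical test statistic
```

With a statistic of 1.8 the one-tailed "greater" p-value falls below 0.05 while the two-tailed p-value does not, which is why the direction of the test has to be chosen before looking at the data.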
95% Confidence Intervals
Definition: The range around a sample mean within which one is 95% confident that the true population mean lies.
Calculation: The margin of error is the standard error multiplied by the critical $t$-value, and the interval is the sample mean plus or minus that margin: $\bar{x} \pm t_{crit} \times SE$
Example Case: For a sample mean of , (), and , the critical value is . The margin of error is . The expression is written as .
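The interval calculation can be sketched in Python as follows. The inputs are hypothetical and are not the values of the example case above: a sample mean of 50, a sample standard deviation of 20 and $n = 16$, with the critical value 2.131 (two-tailed, alpha 0.05, 15 degrees of freedom) copied from a standard $t$-table.

```python
import math


def confidence_interval(sample_mean, sd, n, t_crit):
    """Return (lower, upper) bounds: sample mean +/- t_crit * standard error."""
    margin = t_crit * sd / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Hypothetical inputs; t_crit for df = 15 is read from a t-table.
lower, upper = confidence_interval(sample_mean=50, sd=20, n=16, t_crit=2.131)
margin_of_error = (upper - lower) / 2  # approximately 2.131 * 5 = 10.655
```

With these numbers the result would be reported as 50 ± 10.655, an interval of roughly 39.3 to 60.7.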
Probability Theory: Fundamental to understanding variability and uncertainty. The sum of all probabilities in a distribution always equals $1$. Example: For a six-sided die, the probability of rolling a 3 is $1/6$.
Frequency Tables: In a histogram or frequency curve, the area under any specific part represents the number of observations, or the probability, of those values occurring. Example: If there are 100 observations and 30 of them are in the range of 10-20, the relative frequency of that range is $30/100 = 0.30$, or 30%.
Proportions and Percentages: Frequency distributions can be expressed as numbers, proportions, or percentages. The total area under a proportional curve sums to $1$. Example: If you have a dataset of 200 students and 80 are male, the proportion of males is $80/200 = 0.40$, or 40%.
Central Tendency: Most data points cluster around the mean. Observations in the "tails" of the distribution (extremely high or low values) have a significantly lower probability of occurrence. Example: If the mean of a data set is 50 and the standard deviation is 10, values less than 30 or greater than 70 lie more than two standard deviations from the mean and are in the tails.
The Normal Distribution
Mathematical Properties: Normal distributions are perfectly symmetrical. The mean, median, and mode are identical. Example: For a normal distribution with a mean of 100 and a standard deviation of 15, approximately 68% of data falls within 85 and 115 (mean ± 1 standard deviation).
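The 68% figure in this example can be verified with `statistics.NormalDist` from the Python standard library. The sketch uses the same mean (100) and standard deviation (15) as the example; the two-standard-deviation check is an extra added for comparison.

```python
from statistics import NormalDist

dist = NormalDist(mu=100, sigma=15)

# Share of values within one standard deviation of the mean (85 to 115).
within_1sd = dist.cdf(115) - dist.cdf(85)  # approximately 0.6827

# Share within two standard deviations (70 to 130), for comparison.
within_2sd = dist.cdf(130) - dist.cdf(70)  # approximately 0.9545

# Symmetry: the area below 85 matches the area above 115,
# and the median (50th percentile) coincides with the mean.
lower_tail = dist.cdf(85)
upper_tail = 1 - dist.cdf(115)
median = dist.inv_cdf(0.5)
```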
Variance and Curve Shape:
* Small variance/standard deviation results in a peaked distribution with data clustered tightly around the mean.
* Large variance/standard deviation results in a flatter curve with higher probabilities in the tails.
Example: A distribution with a variance of 1 is more peaked than one with a variance of 25.
Z-Scores: To calculate the probability of a specific value ($x$) being drawn from a population, the value is converted into a standard normal deviate ($z$): $z = \frac{x - \mu}{\sigma}$. Example: For , , and , the calculation is .
Standard Error of the Mean
Standard Error ($SE$): The standard deviation of the distribution of sample means, calculated as: $SE = \sigma / \sqrt{n}$. Example: If the standard deviation of a sample is 20 and the sample size is 16, then $SE = 20 / \sqrt{16} = 5$.
Z-Deviate for Means: When testing sample means against a known population mean, the formula is: $z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$. Example: If sample mean , population mean , and , then .
The T-Distribution and Hypothesis Testing
Hypothesis Testing Steps:
1. Define Hypotheses: Establish the Null Hypothesis ($H_0$) and the Alternative Hypothesis ($H_1$). Example: $H_0$: The new teaching method is no better than the traditional one.
2. Calculate Test Statistic: Perform a $t$-test using sample data: $t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$. Example: If , , and , then .
3. Determine Significance: Compare the calculated $t$ against a critical value from $t$-tables based on the alpha level (usually $0.05$) and degrees of freedom.
95% Confidence Intervals
Calculation: The interval is defined by multiplying the standard error by the critical $t$-value: $\bar{x} \pm t_{crit} \times SE$. Example: If , , and critical , then CI = , resulting in the interval .