Standard Error and 95% Confidence Intervals

Probability Theory: Fundamental to understanding variability and uncertainty. The sum of all probabilities in a distribution always equals $1$.
Frequency Tables: In a histogram or frequency curve, the area under any specific part represents the number of observations or the probability of those values occurring.
Proportions and Percentages: Frequency distributions can be expressed as numbers, proportions, or percentages. The total area under a proportional curve sums to $1$.
Central Tendency: Most data points cluster around the mean. Observations in the "tails" of the distribution (extremely high or low values) have a significantly lower probability of occurrence.

Mathematical Properties: Normal distributions are perfectly symmetrical. The mean, median, and mode are identical.
Variance and Curve Shape:
* Small variance/standard deviation results in a peaked distribution with data clustered tightly around the mean.
* Large variance/standard deviation results in a flatter curve with higher probabilities in the tails.
Z-Scores: To find the probability associated with a specific value ( $x$ ) drawn from a normally distributed population, the value is converted into a standard normal deviate ( $z$ ): $z = \frac{x - \mu}{\sigma}$
* $\mu$ is the population mean; $\sigma$ is the population standard deviation.
* $z$ indicates how many standard deviations a value is from the mean. Values of $z$ are looked up in standard normal tables to find the associated probabilities.
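The table lookup can also be done in software. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the input values (130, 100, 15) and the helper name `z_score` are illustrative assumptions, not taken from a particular dataset.

```python
from statistics import NormalDist


def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the population mean."""
    return (x - mu) / sigma


std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

z = z_score(130, 100, 15)      # illustrative values only
p_below = std_normal.cdf(z)    # P(Z <= z), the area to the left of z
p_above = 1 - p_below          # P(Z > z), the upper-tail area

print(f"z = {z:.2f}, P(Z <= z) = {p_below:.4f}, P(Z > z) = {p_above:.4f}")
```

The cumulative probability plays the role of the printed table: the upper-tail area is the chance of drawing a value at least this far above the mean.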

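Distribution of Means: If multiple random samples of size $n$ are drawn from a population, the sample means are themselves approximately normally distributed around the population mean, provided $n$ is reasonably large (and exactly so if the population itself is normal).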
Effect of Sample Size: As the sample size ( $n$ ) increases, the variance of the distribution of means decreases, meaning larger samples provide a more precise estimate of the population mean.
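The effect of sample size can be checked by simulation. This sketch assumes a hypothetical normal population (mean 50, standard deviation 10) and a fixed random seed; the function name `sd_of_sample_means` is made up for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

POP_MEAN, POP_SD = 50.0, 10.0  # hypothetical population parameters


def sd_of_sample_means(n, trials=2000):
    """Draw `trials` random samples of size n and return the SD of their means."""
    means = [
        statistics.fmean(random.gauss(POP_MEAN, POP_SD) for _ in range(n))
        for _ in range(trials)
    ]
    return statistics.stdev(means)


for n in (4, 16, 64):
    observed = sd_of_sample_means(n)
    expected = POP_SD / n ** 0.5  # sigma / sqrt(n)
    print(f"n={n:3d}  observed SD of means={observed:.2f}  sigma/sqrt(n)={expected:.2f}")
```

The observed spread of the sample means should track $\sigma / \sqrt{n}$ closely, shrinking by about half each time $n$ is quadrupled.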
Standard Error ( $s_{\bar{x}}$ , also written $SE$ ): The standard deviation of the distribution of sample means, estimated from the sample standard deviation $s$ as: $s_{\bar{x}} = SE = \frac{s}{\sqrt{n}}$
Z-Deviate for Means: When testing sample means against a known population mean, the formula is: $z = \frac{\bar{x} - \mu}{SE}$

T-Distribution: Used when the sample size is small or the true population standard deviation is unknown. It is more spread out than the $z$ -distribution, but its shape approaches the normal distribution as the degrees of freedom ( $v = n - 1$ ) increase.
Hypothesis Testing Steps:
1. Define Hypotheses: Establish the Null Hypothesis ( $H_0$ ) and the Alternative Hypothesis ( $H_1$ ).
2. Calculate Test Statistic: Perform a $t$-test using sample data: $t = \frac{\bar{x} - \mu}{s_{\bar{x}}}$
3. Determine Significance: Compare the calculated $t$ against a critical value from $t$-tables based on the alpha level (usually $0.05$ ) and degrees of freedom.
P-Values:
* p < 0.05: If $H_0$ were true, a result at least this extreme would occur by chance less than $5\%$ of the time; the result is statistically significant, and $H_0$ is rejected.
* p ≥ 0.05: The result is not statistically significant, and $H_0$ is not rejected.
One-tailed vs. Two-tailed Tests: A two-tailed test checks for any difference from the mean, while a one-tailed test checks for a difference in a specific direction (e.g., "shorter" or "longer").
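The testing steps can be sketched for a two-tailed, one-sample $t$-test as follows. The sample values and the hypothesised mean of 48 are invented for illustration, and the critical values are a small excerpt copied from a standard two-tailed $t$-table at $\alpha = 0.05$; the names `one_sample_t` and `T_CRIT_05_TWO_TAILED` are hypothetical.

```python
import math
import statistics

# Two-tailed critical t-values at alpha = 0.05, keyed by degrees of freedom
# (excerpt from a standard t-table).
T_CRIT_05_TWO_TAILED = {5: 2.571, 10: 2.228, 15: 2.131, 20: 2.086, 25: 2.060, 30: 2.042}


def one_sample_t(sample, mu):
    """Return (t, degrees of freedom) for a one-sample t-test against mu."""
    n = len(sample)
    se = statistics.stdev(sample) / math.sqrt(n)  # standard error, s / sqrt(n)
    t = (statistics.fmean(sample) - mu) / se
    return t, n - 1


# Hypothetical measurements: n = 11, so v = 10.
sample = [51, 49, 53, 47, 52, 50, 54, 48, 55, 46, 56]
t, df = one_sample_t(sample, mu=48)

# Two-tailed decision: reject H0 when |t| exceeds the critical value.
reject_h0 = abs(t) > T_CRIT_05_TWO_TAILED[df]
print(f"t = {t:.2f}, v = {df}, reject H0: {reject_h0}")
```

A one-tailed test would use the same $t$ statistic but compare it, with its sign, against the one-tailed critical value for the chosen direction.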

Definition: A $95\%$ confidence interval is the range around a sample mean within which one is $95\%$ confident that the true population mean lies.
Calculation: The interval is defined by multiplying the standard error by the critical $t$ -value: $\text{Confidence Interval} = \bar{x} \pm (s_{\bar{x}} \times t_{v, \alpha/2})$
Example Case: For a sample mean of $42.3\,mm$ , $n = 26$ ( $v = 25$ ), and $SE = 2.15$ , the critical $t$ value is $2.06$ . The margin of error is $2.15 \times 2.06 = 4.43$ . The expression is written as $42.3\,mm \pm 4.43\,mm$ .
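The example case can be reproduced with a few lines of code. This is a minimal sketch; the function name `confidence_interval` is hypothetical, and the critical $t$ value is passed in from a table rather than computed.

```python
def confidence_interval(mean, se, t_crit):
    """Return (lower, upper) bounds of the interval mean ± se * t_crit."""
    margin = se * t_crit
    return mean - margin, mean + margin


# Values from the example case: mean 42.3 mm, SE 2.15, critical t 2.06 (v = 25).
lower, upper = confidence_interval(mean=42.3, se=2.15, t_crit=2.06)
print(f"95% CI: {lower:.2f} mm to {upper:.2f} mm")
```

Reporting the two bounds (about 37.87 mm and 46.73 mm) carries the same information as the $\pm$ form.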

Probability Theory: Fundamental to understanding variability and uncertainty. The sum of all probabilities in a distribution always equals $1$ . Example: For a six-sided die, the probability of rolling a 3 is $P(X=3) = \frac{1}{6}$ .
Frequency Tables: In a histogram or frequency curve, the area under any specific part represents the number of observations or the probability of those values occurring. Example: If there are 100 observations and 30 of them fall in the range 10-20, the relative frequency of that range is $\frac{30}{100} = 0.3$ or $30\%$.
Proportions and Percentages: Frequency distributions can be expressed as numbers, proportions, or percentages. The total area under a proportional curve sums to $1$. Example: If you have a dataset of 200 students and 80 are male, the proportion of males is $\frac{80}{200} = 0.4$ or $40\%$.
Central Tendency: Most data points cluster around the mean. Observations in the "tails" of the distribution (extremely high or low values) have a significantly lower probability of occurrence. Example: If the mean of a data set is 50 and the standard deviation is 10, values less than 30 or greater than 70 are within the tails.
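The relationship between counts, proportions, and percentages can be sketched in code. The observations below are invented for illustration, and the bin width of 10 is an arbitrary choice.

```python
from collections import Counter

# Hypothetical observations, grouped into bins of width 10 (10-19, 20-29, ...).
observations = [12, 15, 18, 23, 27, 31, 34, 36, 38, 45]
bins = Counter((x // 10) * 10 for x in observations)

total = sum(bins.values())
# Convert each count into a proportion of all observations.
proportions = {start: count / total for start, count in sorted(bins.items())}

for start, p in proportions.items():
    print(f"{start}-{start + 9}: count={bins[start]}, proportion={p:.1f}, percentage={p:.0%}")
```

Whatever the counts are, the proportions add up to $1$, which is the discrete counterpart of the total area under a proportional curve.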

Mathematical Properties: Normal distributions are perfectly symmetrical. The mean, median, and mode are identical. Example: For a normal distribution with a mean of 100 and a standard deviation of 15, approximately 68% of data falls within 85 and 115 (mean ± 1 standard deviation).
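The 68% figure can be verified numerically. This sketch uses the standard-library `statistics.NormalDist` with the mean of 100 and standard deviation of 15 from the example; the helper `share_within` is a hypothetical name.

```python
from statistics import NormalDist

dist = NormalDist(mu=100, sigma=15)


def share_within(k):
    """Proportion of the distribution within k standard deviations of the mean."""
    lower = dist.mean - k * dist.stdev
    upper = dist.mean + k * dist.stdev
    return dist.cdf(upper) - dist.cdf(lower)


# Symmetry: mean, median, and mode coincide for a normal distribution.
print(dist.mean, dist.median, dist.mode)

for k in (1, 2, 3):
    print(f"within ±{k} SD: {share_within(k):.4f}")
```

The same calculation gives roughly 95% within two standard deviations and about 99.7% within three.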
Variance and Curve Shape:
* Small variance/standard deviation results in a peaked distribution with data clustered tightly around the mean.
* Large variance/standard deviation results in a flatter curve with higher probabilities in the tails.
Example: A distribution with a variance of 1 is more peaked than one with a variance of 25.
Z-Scores: To calculate the probability of a specific value ( $x$ ) being drawn from a population, the value is converted into a standard normal deviate ( $z$ ): $z = \frac{x - \mu}{\sigma}$ Example: For $x = 130$ , $\mu = 100$ , and $\sigma = 15$ , the calculation is $z = \frac{130 - 100}{15} = 2$ .

Standard Error ( $s_{\bar{x}}$ ): The standard deviation of the distribution of sample means, calculated as: $SE = \frac{s}{\sqrt{n}}$ . Example: If the standard deviation of a sample is 20 and the sample size is 16, then $SE = \frac{20}{\sqrt{16}} = 5$ .
Z-Deviate for Means: When testing sample means against a known population mean, the formula is: $z = \frac{\bar{x} - \mu}{SE}$ . Example: If sample mean $\bar{x} = 22$ , population mean $\mu = 20$ , and $SE = 5$ , then $z = \frac{22 - 20}{5} = 0.4$ .
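Both formulas are simple enough to wrap in small helpers. The sketch below reuses the numbers from the two examples above; the function names `standard_error` and `z_for_mean` are hypothetical.

```python
import math


def standard_error(s, n):
    """Standard error of the mean: s / sqrt(n)."""
    return s / math.sqrt(n)


def z_for_mean(sample_mean, mu, se):
    """z-deviate of a sample mean relative to a known population mean."""
    return (sample_mean - mu) / se


se = standard_error(s=20, n=16)           # 20 / 4
z = z_for_mean(sample_mean=22, mu=20, se=se)
se_larger_sample = standard_error(s=20, n=64)  # same s, four times the sample size

print(f"SE = {se}, z = {z}, SE with n=64 = {se_larger_sample}")
```

Quadrupling the sample size from 16 to 64 halves the standard error, which is why the same difference in means yields a larger $z$ with a bigger sample.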

Hypothesis Testing Steps:
1. Define Hypotheses: Establish the Null Hypothesis ( $H_0$ ) and the Alternative Hypothesis ( $H_1$ ). Example: $H_0$ : The new teaching method is no better than the traditional one.
2. Calculate Test Statistic: Perform a $t$-test using sample data: $t = \frac{\bar{x} - \mu}{s_{\bar{x}}}$. Example: If $\bar{x} = 50$ , $\mu = 48$ , and $s_{\bar{x}} = 2$ , then $t = \frac{50 - 48}{2} = 1$.
3. Determine Significance: Compare the calculated $t$ against a critical value from $t$-tables based on the alpha level (usually $0.05$ ) and degrees of freedom.

Calculation: The interval is defined by multiplying the standard error by the critical $t$ -value: $\text{Confidence Interval} = \bar{x} \pm (s_{\bar{x}} \times t_{v, \alpha/2})$ . Example: If $\bar{x} = 42.3$ , $SE = 2.15$ , and critical $t = 2.06$ , then CI = $42.3 \pm (2.15 \times 2.06) = 42.3 \pm 4.43$ , resulting in the interval $[37.87, 46.73]$ .