This flashcard set covers key concepts and vocabulary associated with hypothesis testing in statistics.
Z statistic
A Z statistic (or Z-score) quantifies how many standard deviations an individual data point or, more commonly, a sample mean or sample proportion lies from the population mean or hypothesized proportion. It is a key component of hypothesis testing when the population standard deviation is known, and of large-sample proportion tests. To answer a statistical question using the Z statistic:
Identify the population mean ($\mu$) and population standard deviation ($\sigma$) or the hypothesized population proportion ($p$).
Calculate the sample mean ($\bar{x}$) or sample proportion ($\hat{p}$) from your collected data.
Apply the Z-score formula, where $n$ is the sample size:
For a sample mean: $Z = (\bar{x} - \mu) / (\sigma / \sqrt{n})$
For a sample proportion: $Z = (\hat{p} - p) / \sqrt{p(1-p)/n}$
Interpret the Z-score: A positive Z-score means the sample statistic is above the population mean/proportion, while a negative Z-score means it's below. The magnitude indicates how unusual the observation is.
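The two formulas above can be sketched in Python as follows. This is a minimal illustration, not a prescribed implementation: the function names and the input numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math

def z_for_mean(sample_mean, pop_mean, pop_sd, n):
    """Z = (x_bar - mu) / (sigma / sqrt(n))."""
    return (sample_mean - pop_mean) / (pop_sd / math.sqrt(n))

def z_for_proportion(p_hat, p, n):
    """Z = (p_hat - p) / sqrt(p(1 - p) / n), with p the hypothesized proportion."""
    return (p_hat - p) / math.sqrt(p * (1 - p) / n)

# Hypothetical inputs, for illustration only.
z_mean = z_for_mean(sample_mean=102.0, pop_mean=100.0, pop_sd=5.0, n=25)
z_prop = z_for_proportion(p_hat=0.56, p=0.50, n=100)

print(z_mean)  # 2 / (5 / 5) = 2.0, so the sample mean is above the population mean
print(z_prop)  # about 1.2, so the sample proportion is above the hypothesized value
```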
P value
The P value is the probability of observing a test statistic as extreme as, or more extreme than, the one calculated from the sample data, assuming that the null hypothesis is true. It helps determine the statistical significance of the observed results. To use a P value to answer a statistical question:
Calculate the test statistic (e.g., Z statistic) based on your sample data.
Determine the P value associated with your calculated test statistic. This typically involves looking up the Z-score in a standard normal (Z) table or using statistical software. The P value will depend on whether it's a one-tailed or two-tailed test.
Compare the P value to the chosen significance level ($\alpha$):
If the P value $\le \alpha$, you reject the null hypothesis, concluding there is sufficient evidence to support the alternative hypothesis.
If the P value $> \alpha$, you fail to reject the null hypothesis, concluding there is not enough evidence to support the alternative hypothesis. (Note: failing to reject is not the same as accepting the null hypothesis.)
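As a rough sketch of the steps above, the P value for a Z statistic can be read from the standard normal distribution with Python's standard-library `statistics.NormalDist` in place of a printed Z table. The helper names `p_value_from_z` and `decide`, and the value `z = 2.0`, are assumptions made for this example.

```python
from statistics import NormalDist

def p_value_from_z(z, tail):
    """P value for a Z statistic; tail is 'left', 'right', or 'two'."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "left":
        return std_normal.cdf(z)
    if tail == "right":
        return 1 - std_normal.cdf(z)
    if tail == "two":
        # Both tails count as "as extreme or more extreme".
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right', or 'two'")

def decide(p_value, alpha):
    """Apply the decision rule: reject H0 when the P value is <= alpha."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"

z = 2.0  # hypothetical test statistic
p_right = p_value_from_z(z, "right")
p_two = p_value_from_z(z, "two")

print(round(p_right, 4), decide(p_right, alpha=0.05))
print(round(p_two, 4), decide(p_two, alpha=0.05))
```

The same Z statistic gives a two-tailed P value twice as large as the one-tailed value, which is why the tail choice has to be fixed before looking at the data.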
Null hypothesis
The null hypothesis (denoted as $H_0$) is a statement of no effect, no difference, or no relationship between variables in the population. It represents the status quo or the assumption we are trying to test and potentially reject. It always includes an equality (e.g., $=$, $\le$, $\ge$). To formulate a null hypothesis for a question:
Identify the population parameter you are interested in (e.g., population mean ($\mu$), population proportion ($p$)).
State the claim or the existing belief about this parameter, always including an equality.
For example, if a company claims that the average weight of their product is 100 g, the null hypothesis would be $H_0: \mu = 100$ g. If the assumed proportion is 50%, it is $H_0: p = 0.50$.
This hypothesis serves as the baseline for your statistical test, and you will gather evidence to see if you can statistically refute it.
Alternative hypothesis
The alternative hypothesis (denoted as $H_a$ or $H_1$) is a statement that contradicts the null hypothesis. It represents the researcher's claim or what they are trying to prove: that there is an effect, a difference, or a relationship. It never includes an equality. To establish an alternative hypothesis for a statistical question:
Based on the research question or the direction of the expected effect, formulate a statement that is opposite to the null hypothesis.
Choose one of three forms for the alternative hypothesis:
Not equal to ($\ne$): For a two-tailed test, when you are simply looking for a difference in either direction (e.g., $H_a: \mu \ne 100$ g).
Greater than ($>$): For a right-tailed test, when you expect the parameter to be larger than the null value (e.g., $H_a: \mu > 100$ g).
Less than ($<$): For a left-tailed test, when you expect the parameter to be smaller than the null value (e.g., $H_a: \mu < 100$ g).
The alternative hypothesis guides the type of test (one-tailed or two-tailed) and the interpretation of the results.
Right tail test
A right tail test (also known as an upper tail test) is a type of hypothesis test where the alternative hypothesis states that the population parameter is greater than a specified value. Consequently, the critical region (the area where you would reject the null hypothesis) is located entirely in the right (upper) tail of the sampling distribution. To solve a question using a right tail test:
Formulate your hypotheses: The alternative hypothesis will be in the form of $H_a: \text{parameter} > \text{value}$. The null hypothesis uses $H_0: \text{parameter} \le \text{value}$.
Choose a significance level ($\alpha$) (e.g., 0.05).
Calculate the test statistic (e.g., Z statistic) from your sample data.
Find the critical value: Using your chosen $\alpha$, find the Z-score from the Z-table that corresponds to having $\alpha$ area in the upper tail. For example, if $\alpha = 0.05$, the critical Z-value is approximately +1.645.
Make a decision: If your calculated test statistic is greater than the critical value ($Z_{\text{calc}} > Z_{\text{critical}}$), or if your P value is less than or equal to $\alpha$, then you reject the null hypothesis. Otherwise, you fail to reject it.
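The right tail procedure above might look like this for a test on a mean with known population standard deviation. It is a sketch under stated assumptions: the function name `right_tailed_z_test` and all sample figures are hypothetical, and `statistics.NormalDist().inv_cdf` stands in for the Z-table lookup.

```python
import math
from statistics import NormalDist

def right_tailed_z_test(sample_mean, null_mean, pop_sd, n, alpha=0.05):
    """Right tail Z test of H0: mu <= null_mean against Ha: mu > null_mean."""
    std_normal = NormalDist()
    z_calc = (sample_mean - null_mean) / (pop_sd / math.sqrt(n))
    # Critical value leaving an area of alpha in the upper tail.
    z_critical = std_normal.inv_cdf(1 - alpha)
    p_value = 1 - std_normal.cdf(z_calc)
    reject = z_calc > z_critical
    return z_calc, z_critical, p_value, reject

# Hypothetical data: does the mean weight exceed 100 g?
z_calc, z_critical, p_value, reject = right_tailed_z_test(
    sample_mean=101.5, null_mean=100.0, pop_sd=5.0, n=36, alpha=0.05
)
print(f"Z_calc = {z_calc:.3f}, Z_critical = {z_critical:.3f}, P = {p_value:.4f}")
print("reject H0" if reject else "fail to reject H0")
```

With these numbers the critical-value comparison and the P value comparison lead to the same decision.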
Two tail test
A two tail test (or two-sided test) is used when the alternative hypothesis states that the population parameter is different from (not equal to) a specified value. This means that extreme results in either direction (significantly higher or significantly lower than the null hypothesis value) would lead to the rejection of the null hypothesis. Therefore, there are two critical regions, one in each tail of the sampling distribution. To solve a question using a two tail test:
Formulate your hypotheses: The alternative hypothesis will be in the form of $H_a: \text{parameter} \ne \text{value}$. The null hypothesis uses $H_0: \text{parameter} = \text{value}$.
Choose a significance level ($\alpha$) (e.g., 0.05).
Divide the significance level by two: Since there are two tails, each tail will have an area of $\alpha/2$.
Calculate the test statistic (e.g., Z statistic) from your sample data.
Find the critical values: Using $\alpha/2$ for each tail, find the two Z-scores (one positive, one negative) from the Z-table that define the critical regions. For example, if $\alpha = 0.05$, then $\alpha/2 = 0.025$, and the critical Z-values are approximately $\pm 1.96$.
Make a decision: If the absolute value of your calculated test statistic is greater than the positive critical value ($|Z_{\text{calc}}| > |Z_{\text{critical}}|$), or if your P value is less than or equal to $\alpha$, then you reject the null hypothesis. Otherwise, you fail to reject it.
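For comparison, here is a sketch of the two tail steps applied to a proportion. The function name `two_tailed_proportion_test` and the counts (60 successes out of 100, tested against a hypothesized proportion of 0.50) are invented for illustration.

```python
import math
from statistics import NormalDist

def two_tailed_proportion_test(successes, n, null_p, alpha=0.05):
    """Two tail Z test of H0: p = null_p against Ha: p != null_p."""
    std_normal = NormalDist()
    p_hat = successes / n
    z_calc = (p_hat - null_p) / math.sqrt(null_p * (1 - null_p) / n)
    # alpha is split across both tails, so look up alpha / 2 in the upper tail.
    z_critical = std_normal.inv_cdf(1 - alpha / 2)
    p_value = 2 * (1 - std_normal.cdf(abs(z_calc)))
    reject = abs(z_calc) > z_critical
    return z_calc, z_critical, p_value, reject

z_calc, z_critical, p_value, reject = two_tailed_proportion_test(
    successes=60, n=100, null_p=0.50, alpha=0.05
)
print(f"Z_calc = {z_calc:.3f}, critical values = +/-{z_critical:.3f}, P = {p_value:.4f}")
print("reject H0" if reject else "fail to reject H0")
```

Because the rejection rule uses the absolute value of the statistic, 40 successes out of 100 would produce the same decision as 60.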
Significance level (alpha)
The significance level (denoted as $\alpha$) is a pre-determined threshold for statistical significance. It represents the maximum probability of making a Type I error, which is the error of rejecting a true null hypothesis. Commonly set values include 0.05 (5%), 0.01 (1%), or 0.10 (10%). To use the significance level in addressing a statistical question:
Select a value for $\alpha$ before conducting the hypothesis test. This choice reflects how much risk you're willing to take of incorrectly rejecting the null hypothesis.
It defines the critical region(s): For a given test, $\alpha$ establishes the boundary (critical value) beyond which the calculated test statistic (e.g., Z-score) is considered statistically significant enough to reject the null hypothesis.
It's used to make a decision based on the P value: After calculating the P value, you compare it to $\alpha$. If the P value $\le \alpha$, you reject $H_0$. If the P value $> \alpha$, you fail to reject $H_0$. This comparison directly answers whether your results are statistically significant at the chosen level.
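One way to see what $\alpha$ means as a Type I error rate is a small simulation: draw many samples from a population where the null hypothesis really is true, run a two tail Z test on each, and count how often it rejects. Everything in this sketch is an assumption made for illustration (a normal population with mean 100 and standard deviation 5, samples of size 30, a fixed random seed); the observed rate should land near $\alpha$ but will not equal it exactly.

```python
import math
import random
from statistics import NormalDist, fmean

def type_one_error_rate(alpha, trials=4000, n=30, seed=0):
    """Fraction of true-null samples that a two tail Z test rejects."""
    rng = random.Random(seed)
    mu0, sigma = 100.0, 5.0  # hypothetical population; H0: mu = 100 is true
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / (sigma / math.sqrt(n))
        if abs(z) > z_critical:
            rejections += 1  # rejecting a true H0 is a Type I error
    return rejections / trials

rate = type_one_error_rate(alpha=0.05)
print(f"observed Type I error rate: {rate:.3f}")
```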
Critical value
A critical value is a point on the sampling distribution of a test statistic (e.g., Z-score, t-score) that defines the threshold for rejecting the null hypothesis. It marks the boundary of the critical region (or rejection region). If the calculated test statistic falls within this critical region, the null hypothesis is rejected. To find and use a critical value to solve a statistical question:
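As an illustration of the definition above, critical values for a Z test can be computed from the inverse of the standard normal CDF rather than read from a table. This sketch covers only the Z case (not the t-score), and the function name `z_critical_values` is hypothetical.

```python
from statistics import NormalDist

def z_critical_values(alpha, tail):
    """Critical Z value(s) bounding the rejection region for a given alpha."""
    std_normal = NormalDist()
    if tail == "right":
        return (std_normal.inv_cdf(1 - alpha),)
    if tail == "left":
        return (std_normal.inv_cdf(alpha),)
    if tail == "two":
        upper = std_normal.inv_cdf(1 - alpha / 2)
        return (-upper, upper)
    raise ValueError("tail must be 'left', 'right', or 'two'")

for alpha in (0.10, 0.05, 0.01):
    right = z_critical_values(alpha, "right")
    two = z_critical_values(alpha, "two")
    print(alpha, [round(v, 3) for v in right], [round(v, 3) for v in two])
```

A smaller $\alpha$ pushes the critical values further into the tails, so a more extreme test statistic is needed to reject the null hypothesis.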
Sample proportion (p hat)
The sample proportion (denoted as $\hat{p}$) is a statistic that represents the proportion of "successes" or observations with a specific characteristic within a sample. It is calculated as the number of successes divided by the total sample size. It serves as an estimate of the true population proportion ($p$). To calculate and use sample proportion in a statistical question:
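The calculation in the definition above is a single division. A minimal sketch follows; the function name `sample_proportion` and the counts are made up for the example.

```python
def sample_proportion(successes, n):
    """p_hat = number of successes / total sample size."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must be between 0 and n")
    return successes / n

# Hypothetical survey: 45 of 120 respondents have the characteristic.
p_hat = sample_proportion(successes=45, n=120)
print(p_hat)  # 0.375
```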
Standard deviation (sigma)
The standard deviation (denoted as $\sigma$ for a population and $s$ for a sample) is a widely used measure of the amount of variation or dispersion of a set of values around the mean. A small standard deviation indicates that the data points tend to be close to the mean, while a large standard deviation indicates that the data points are spread out over a wider range of values. To utilize standard deviation in solving a statistical question:
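The difference between $\sigma$ and $s$ can be shown with Python's standard-library `statistics` module, where `pstdev` treats the data as a whole population (dividing by $n$) and `stdev` treats it as a sample (dividing by $n - 1$). The data set below is invented for illustration.

```python
from statistics import mean, pstdev, stdev

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values with mean 5

sigma = pstdev(data)  # population standard deviation: sqrt(32 / 8)
s = stdev(data)       # sample standard deviation: sqrt(32 / 7)

print(mean(data))
print(sigma)
print(round(s, 3))
```

The sample version is slightly larger because dividing by $n - 1$ compensates for estimating the mean from the same data.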