Statistics Hypothesis Testing Review

Description and Tags

This flashcard set covers key concepts and vocabulary associated with hypothesis testing in statistics.

10 Terms

1

Z statistic

A Z statistic (or Z-score) measures how many standard deviations an individual data point lies from the population mean or, more commonly in hypothesis testing, how many standard errors a sample mean or sample proportion lies from its hypothesized population value. It is used in hypothesis testing when the population standard deviation is known, or for proportion tests with large samples. To answer a statistical question using the Z statistic:

  1. Identify the population mean ($\mu$) and population standard deviation ($\sigma$) or the hypothesized population proportion ($p$).

  2. Calculate the sample mean ($\bar{x}$) or sample proportion ($\hat{p}$) from your collected data.

  3. Apply the Z-score formula:

    • For a sample mean: $Z = (\bar{x} - \mu) / (\sigma / \sqrt{n})$

    • For a sample proportion: $Z = (\hat{p} - p) / \sqrt{p(1-p)/n}$

  4. Interpret the Z-score: A positive Z-score means the sample statistic is above the hypothesized population mean or proportion, while a negative Z-score means it is below. The larger the magnitude, the more unusual the sample result would be if the hypothesized value were true.
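
The sample-mean branch of the steps above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `z_for_mean` and the product-weight data are hypothetical, and the population standard deviation is assumed to be known.

```python
import math
from statistics import fmean

def z_for_mean(sample, mu, sigma):
    """Z = (x_bar - mu) / (sigma / sqrt(n)), with sigma assumed known."""
    n = len(sample)
    x_bar = fmean(sample)                 # step 2: sample mean
    standard_error = sigma / math.sqrt(n)
    return (x_bar - mu) / standard_error  # step 3: Z-score formula

# Hypothetical data: four product weights in grams,
# hypothesized mean 100 g, assumed known sigma of 2 g.
weights = [101.0, 103.0, 102.0, 102.0]
z = z_for_mean(weights, mu=100.0, sigma=2.0)
print(f"Z = {z:.2f}")  # positive, so the sample mean is above 100 g
```

Here the sample mean is 102 g and the standard error is 1 g, so the sample mean sits two standard errors above the hypothesized value.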

2

P value

The P value is the probability of observing a test statistic as extreme as, or more extreme than, the one calculated from the sample data, assuming that the null hypothesis is true. It helps determine the statistical significance of the observed results. To use a P value to answer a statistical question:

  1. Calculate the test statistic (e.g., Z statistic) based on your sample data.

  2. Determine the P value associated with your calculated test statistic. This typically involves looking up the Z-score in a standard normal (Z) table or using statistical software. The P value will depend on whether it's a one-tailed or two-tailed test.

  3. Compare the P value to the chosen significance level ($\alpha$):

    • If $P \text{ value} \le \alpha$, you reject the null hypothesis, concluding there is sufficient evidence to support the alternative hypothesis.

    • If $P \text{ value} > \alpha$, you fail to reject the null hypothesis, concluding there is not enough evidence to support the alternative hypothesis. (Note: Failing to reject is not the same as accepting the null hypothesis.)
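
As a rough sketch of steps 2 and 3, the standard library's `statistics.NormalDist` can stand in for a Z-table. The helper names `p_value_from_z` and `decide` and the `tail` labels are made up for this illustration.

```python
from statistics import NormalDist

def p_value_from_z(z, tail="two"):
    """P value for a Z statistic under the standard normal distribution."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "right":
        return 1.0 - std_normal.cdf(z)               # area to the right of z
    if tail == "left":
        return std_normal.cdf(z)                     # area to the left of z
    if tail == "two":
        return 2.0 * (1.0 - std_normal.cdf(abs(z)))  # both tails combined
    raise ValueError("tail must be 'right', 'left', or 'two'")

def decide(p_value, alpha=0.05):
    """Compare the P value to the significance level."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"

p = p_value_from_z(2.0, tail="right")
print(f"P value = {p:.4f}: {decide(p)}")
```

The same Z statistic gives a different P value depending on the tail, which is why the direction of the test has to be fixed before looking at the result.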

3

Null hypothesis

The null hypothesis (denoted as $H_0$) is a statement of no effect, no difference, or no relationship between variables in the population. It represents the status quo or the assumption we are trying to test and potentially reject. It always includes an equality ($=$, $\le$, or $\ge$). To formulate a null hypothesis for a question:

  1. Identify the population parameter you are interested in (e.g., population mean ($\mu$), population proportion ($p$)).

  2. State the claim or the existing belief about this parameter, always including an equality.

    • For example, if a company claims that the average weight of its product is 100 g, the null hypothesis would be $H_0: \mu = 100$ g. If the assumed proportion is 50%, it would be $H_0: p = 0.50$.

  3. This hypothesis serves as the baseline for your statistical test, and you will gather evidence to see if you can statistically refute it.

4

Alternative hypothesis

The alternative hypothesis (denoted as $H_a$ or $H_1$) is a statement that contradicts the null hypothesis. It represents the researcher's claim, the thing they are seeking evidence for: that there is an effect, a difference, or a relationship. It never includes an equality. To establish an alternative hypothesis for a statistical question:

  1. Based on the research question or the direction of the expected effect, formulate a statement that is opposite to the null hypothesis.

  2. Choose one of three forms for the alternative hypothesis:

    • Not equal to ($\ne$): For a two-tailed test, when you are simply looking for a difference in either direction (e.g., $H_a: \mu \ne 100$ g).

    • Greater than ($>$): For a right-tailed test, when you expect the parameter to be larger than the null value (e.g., $H_a: \mu > 100$ g).

    • Less than ($<$): For a left-tailed test, when you expect the parameter to be smaller than the null value (e.g., $H_a: \mu < 100$ g).

  3. The alternative hypothesis guides the type of test (one-tailed or two-tailed) and the interpretation of the results.

5

Right tail test

A right tail test (also known as an upper tail test) is a type of hypothesis test where the alternative hypothesis states that the population parameter is greater than a specified value. Consequently, the critical region (the area where you would reject the null hypothesis) is located entirely in the right (upper) tail of the sampling distribution. To solve a question using a right tail test:

  1. Formulate your hypotheses: The alternative hypothesis will be in the form of $H_a: \text{parameter} > \text{value}$. The null hypothesis uses $H_0: \text{parameter} \le \text{value}$.

  2. Choose a significance level ($\alpha$) (e.g., 0.05).

  3. Calculate the test statistic (e.g., Z statistic) from your sample data.

  4. Find the critical value: Using your chosen $\alpha$, find the Z-score from the Z-table that corresponds to having $\alpha$ area in the upper tail. For example, if $\alpha = 0.05$, the critical Z-value is approximately +1.645.

  5. Make a decision: If your calculated test statistic is greater than the critical value ($Z_{\text{calc}} > Z_{\text{critical}}$), or if your P value is less than or equal to $\alpha$, then you reject the null hypothesis. Otherwise, you fail to reject it.
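
A worked right-tailed test might look like the sketch below. All the numbers (hypothesized mean 100, known standard deviation 15, sample of 36 with mean 105) are invented for illustration, and the variable names are arbitrary.

```python
import math
from statistics import NormalDist

# Hypothetical numbers, chosen only for illustration.
mu_0 = 100.0   # value in H0: mu <= 100 (Ha: mu > 100)
sigma = 15.0   # population standard deviation, assumed known
n = 36         # sample size
x_bar = 105.0  # observed sample mean
alpha = 0.05   # step 2: significance level

# Step 3: test statistic.
z_calc = (x_bar - mu_0) / (sigma / math.sqrt(n))

# Step 4: critical value with alpha area in the upper tail.
z_critical = NormalDist().inv_cdf(1 - alpha)

# Step 5: decision by critical value, with the P value as a cross-check.
p_value = 1 - NormalDist().cdf(z_calc)
reject_h0 = z_calc > z_critical

print(f"Z = {z_calc:.3f}, critical = {z_critical:.3f}, P = {p_value:.4f}")
print("reject H0" if reject_h0 else "fail to reject H0")
```

With these numbers the test statistic is 2.0, which exceeds the critical value of about 1.645, so both decision rules reject the null hypothesis.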

6

Two tail test

A two tail test (or two-sided test) is used when the alternative hypothesis states that the population parameter is different from (not equal to) a specified value. This means that extreme results in either direction (significantly higher or significantly lower than the null hypothesis value) would lead to the rejection of the null hypothesis. Therefore, there are two critical regions, one in each tail of the sampling distribution. To solve a question using a two tail test:

  1. Formulate your hypotheses: The alternative hypothesis will be in the form of $H_a: \text{parameter} \ne \text{value}$. The null hypothesis uses $H_0: \text{parameter} = \text{value}$.

  2. Choose a significance level ($\alpha$) (e.g., 0.05).

  3. Divide the significance level by two: Since there are two tails, each tail will have an area of $\alpha/2$.

  4. Calculate the test statistic (e.g., Z statistic) from your sample data.

  5. Find the critical values: Using $\alpha/2$ for each tail, find the two Z-scores (one positive, one negative) from the Z-table that define the critical regions. For example, if $\alpha = 0.05$, then $\alpha/2 = 0.025$, and the critical Z-values are approximately $\pm 1.96$.

  6. Make a decision: If the absolute value of your calculated test statistic is greater than the positive critical value ($|Z_{\text{calc}}| > |Z_{\text{critical}}|$), or if your P value is less than or equal to $\alpha$, then you reject the null hypothesis. Otherwise, you fail to reject it.
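
The two-tailed procedure can be sketched the same way. The figures below (hypothesized mean 100, known standard deviation 15, sample of 36 with mean 96) are hypothetical and were picked so that the sample mean falls below the null value.

```python
import math
from statistics import NormalDist

# Hypothetical numbers, chosen only for illustration.
mu_0 = 100.0   # value in H0: mu = 100 (Ha: mu != 100)
sigma = 15.0   # population standard deviation, assumed known
n = 36         # sample size
x_bar = 96.0   # observed sample mean
alpha = 0.05   # step 2: significance level

tail_area = alpha / 2                                # step 3: area per tail
z_calc = (x_bar - mu_0) / (sigma / math.sqrt(n))     # step 4: test statistic
z_critical = NormalDist().inv_cdf(1 - tail_area)     # step 5: positive critical value

# Step 6: decision by critical value, with the two-sided P value as a cross-check.
p_value = 2 * (1 - NormalDist().cdf(abs(z_calc)))
reject_h0 = abs(z_calc) > z_critical

print(f"Z = {z_calc:.3f}, critical = +/-{z_critical:.3f}, P = {p_value:.4f}")
print("reject H0" if reject_h0 else "fail to reject H0")
```

Here the test statistic is -1.6, whose absolute value is below 1.96, and the P value is about 0.11, so the null hypothesis is not rejected at the 0.05 level.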

7

Significance level (alpha)

The significance level (denoted as $\alpha$) is a pre-determined threshold for statistical significance. It represents the maximum probability of making a Type I error, which is the error of rejecting a true null hypothesis. Commonly set values include 0.05 (5%), 0.01 (1%), or 0.10 (10%). To use the significance level in addressing a statistical question:

  1. Select a value for $\alpha$ before conducting the hypothesis test. This choice reflects how much risk you're willing to take of incorrectly rejecting the null hypothesis.

  2. It defines the critical region(s): For a given test, $\alpha$ establishes the boundary (critical value) beyond which the calculated test statistic (e.g., Z-score) is considered statistically significant enough to reject the null hypothesis.

  3. It's used to make a decision based on the P value: After calculating the P value, you compare it to $\alpha$. If $P \text{ value} \le \alpha$, you reject $H_0$. If $P \text{ value} > \alpha$, you fail to reject $H_0$. This comparison directly answers whether your results are statistically significant at the chosen level.
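
One way to see $\alpha$ as a Type I error rate is a small simulation: draw many samples from a population where the null hypothesis really is true and count how often a two-tailed Z-test rejects it anyway. The sketch below is illustrative only; the function name, sample size, trial count, and seed are arbitrary choices.

```python
import math
import random
from statistics import NormalDist, fmean

def type_i_error_rate(alpha=0.05, n=20, trials=10_000, seed=0):
    """Fraction of two-tailed Z-tests that reject H0 when H0 is actually true."""
    rng = random.Random(seed)   # fixed seed so the run is repeatable
    mu_0, sigma = 0.0, 1.0      # H0: mu = 0 is true by construction
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu_0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu_0) / (sigma / math.sqrt(n))
        if abs(z) > z_critical:
            rejections += 1     # a Type I error: a true H0 was rejected
    return rejections / trials

rate = type_i_error_rate(alpha=0.05)
print(f"Observed Type I error rate: {rate:.3f}")
```

The observed rate should land close to the chosen $\alpha$, and lowering $\alpha$ lowers it accordingly.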

8

Critical value

A critical value is a point on the sampling distribution of a test statistic (e.g., Z-score, t-score) that defines the threshold for rejecting the null hypothesis. It marks the boundary of the critical region (or rejection region). If the calculated test statistic falls within this critical region, the null hypothesis is rejected. To find and use a critical value to solve a statistical question:

  1. Specify the significance level ($\alpha$) and determine whether the test is one-tailed (right or left) or two-tailed.
  2. Consult the appropriate statistical table (e.g., Z-table for Z-tests) or use statistical software.
    • For a right-tailed test, find the Z-score associated with an area of $1 - \alpha$ to its left.
    • For a left-tailed test, find the Z-score associated with an area of $\alpha$ to its left.
    • For a two-tailed test, find the Z-scores associated with areas of $\alpha/2$ and $1 - \alpha/2$ to their left. These will be two values, one positive and one negative.
  3. Compare your calculated test statistic to the critical value(s):
    • If the test statistic falls in the critical region (e.g., for a right-tail test, $Z_{\text{calc}} > Z_{\text{critical}}$; for a two-tail test, $|Z_{\text{calc}}| > |Z_{\text{critical}}|$), then you reject $H_0$. Otherwise, you fail to reject $H_0$.
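
The table lookups in step 2 can be approximated in code with the inverse CDF of the standard normal distribution. The function name `critical_values` and its `tail` labels below are hypothetical, introduced only for this sketch.

```python
from statistics import NormalDist

def critical_values(alpha, tail):
    """Return the Z critical value(s) for the given significance level and tail."""
    inv_cdf = NormalDist().inv_cdf  # maps an area to the left onto a Z-score
    if tail == "right":
        return (inv_cdf(1 - alpha),)                      # area 1 - alpha to the left
    if tail == "left":
        return (inv_cdf(alpha),)                          # area alpha to the left
    if tail == "two":
        return (inv_cdf(alpha / 2), inv_cdf(1 - alpha / 2))
    raise ValueError("tail must be 'right', 'left', or 'two'")

for tail in ("right", "left", "two"):
    values = ", ".join(f"{v:+.3f}" for v in critical_values(0.05, tail))
    print(f"{tail:>5}-tailed, alpha = 0.05: {values}")
```

For $\alpha = 0.05$ this gives roughly +1.645 for a right-tailed test, -1.645 for a left-tailed test, and $\pm 1.96$ for a two-tailed test.
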
9

Sample proportion (p hat)

The sample proportion (denoted as $\hat{p}$) is a statistic that represents the proportion of "successes" or observations with a specific characteristic within a sample. It is calculated as the number of successes divided by the total sample size. It serves as an estimate of the true population proportion ($p$). To calculate and use sample proportion in a statistical question:

  1. Identify the number of "successes" ($x$) in your sample (e.g., number of people who agree with a statement, number of defective items).
  2. Determine the total sample size ($n$) (e.g., total number of people surveyed, total items inspected).
  3. Calculate the sample proportion using the formula: $\hat{p} = x/n$.
  4. Use $\hat{p}$ in hypothesis tests or confidence intervals for proportions: It is a crucial component in calculating the Z statistic for proportions: $Z = (\hat{p} - p) / \sqrt{p(1-p)/n}$.
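
A short sketch of these four steps, starting from raw yes/no responses, is given below. The survey data (60 of 100 respondents agreeing) and the hypothesized proportion of 0.50 are made up, as are the helper names.

```python
import math

def sample_proportion(outcomes):
    """p_hat = x / n, where x counts the successes (truthy outcomes)."""
    x = sum(1 for outcome in outcomes if outcome)  # step 1: number of successes
    n = len(outcomes)                              # step 2: sample size
    return x / n                                   # step 3: p_hat = x / n

def z_for_proportion(p_hat, p, n):
    """Z = (p_hat - p) / sqrt(p(1 - p) / n), with p the hypothesized proportion."""
    return (p_hat - p) / math.sqrt(p * (1 - p) / n)

# Hypothetical survey: 100 respondents, 60 of whom agree with a statement.
responses = [True] * 60 + [False] * 40
p_hat = sample_proportion(responses)
z = z_for_proportion(p_hat, p=0.50, n=len(responses))  # step 4
print(f"p_hat = {p_hat:.2f}, Z = {z:.2f}")
```

Note that the standard error in the denominator uses the hypothesized proportion $p$, not $\hat{p}$, matching the formula in step 4.
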
10

Standard deviation (sigma)

The standard deviation (denoted as $\sigma$ for a population and $s$ for a sample) is a widely used measure of the amount of variation or dispersion of a set of values around the mean. A small standard deviation indicates that the data points tend to be close to the mean, while a large standard deviation indicates that the data points are spread out over a wider range of values. To utilize standard deviation in solving a statistical question:

  1. Understand the context: Identify if you are working with a population standard deviation ($\sigma$, typically known or assumed for Z-tests) or a sample standard deviation ($s$, an estimate from sample data).
  2. For Z-tests (when $\sigma$ is known): The population standard deviation $\sigma$ is directly used in the denominator of the Z-statistic formula for sample means, often divided by the square root of the sample size to get the standard error of the mean: $Z = (\bar{x} - \mu) / (\sigma / \sqrt{n})$.
  3. For Z-tests for proportions: While $\sigma$ isn't directly used, the concept of spread is captured by the standard error of the proportion, which is calculated based on the hypothesized proportion $p$ and sample size $n$: $\text{SE}(\hat{p}) = \sqrt{p(1-p)/n}$. This term acts as the equivalent of 'standard deviation' for the sampling distribution of proportions, central to the Z statistic for proportions.
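
The distinction between $\sigma$, $s$, and the two standard errors can be illustrated with Python's `statistics` module. The data set and the values $p = 0.5$, $n = 100$ are invented for this sketch, and treating the same eight numbers first as a whole population and then as a sample is purely for comparison.

```python
import math
from statistics import pstdev, stdev

# Hypothetical data set used only for illustration.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

sigma = pstdev(data)  # population standard deviation: divides by n
s = stdev(data)       # sample standard deviation: divides by n - 1

# Standard error of the mean, the denominator of the Z statistic for means.
se_mean = sigma / math.sqrt(len(data))

# Standard error of a proportion for a hypothesized p and sample size n.
p, n = 0.5, 100
se_proportion = math.sqrt(p * (1 - p) / n)

print(f"sigma = {sigma:.3f}, s = {s:.3f}")
print(f"SE(mean) = {se_mean:.3f}, SE(p_hat) = {se_proportion:.3f}")
```

The sample standard deviation comes out slightly larger than the population version because it divides by $n - 1$ rather than $n$.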