Hypothesis testing - two sample independent and dependent, type 1/2 errors

Topic 6

Last updated 6:02 PM on 3/23/26

16 Terms

New cards

Hypothesis

Statement about the value of a population parameter that is subject to verification

New cards

Hypothesis testing

A procedure based on sample evidence and probability theory to determine whether the hypothesis is a reasonable statement

New cards

One and two sample testing

One: One sample against population

Two: Two samples against each other

New cards

Three assumptions needed to conduct one sample hypothesis testing

Random sampling is employed

Level of measurement is interval or ratio - in order to calculate the mean

Sampling distribution is normal - we can be sure of this if the sample size is large enough - as per the central limit theorem

New cards

Null and alternate hypothesis

Null: Statement about the value of a population parameter developed for the purpose of testing numerical evidence - equalities are always part of the null (=, \le, \ge) - NEVER use the word ‘accept’ for a null hypothesis; say ‘fail to reject the null hypothesis’ instead

Alternate: Statement that is ACCEPTED if the sample data provides sufficient evidence that H₀ is false - inequalities are always part of the alternate (\ne,<,> )

Always assume H₀ is true

Null hypothesis is NOT the same as a research hypothesis

New cards

Steps for one sample hypothesis test

State null (H₀) - and Alternate (H₁) hypothesis
Select a level of significance = the probability of rejecting the null hypothesis when it is true

Typically choose \alpha=0.05 (same as 95% confidence level for CI)

Select the test statistic - A value determined from sample info used to decide whether to reject the null hypothesis

If the population standard deviation is known we use standard normal distribution (z)
If the population standard deviation is unknown but the sample is large, s is used to substitute for the population s.d. and we still use the standard normal distribution (z)
If the population standard deviation is unknown and the sample is small we use the t-distribution - use the t table with n-1 degrees of freedom at the 0.05 sig. level to get the critical t value - which we use to decide if we reject H₀

Formulate the decision rule - involves determining the rejection area of the sampling distribution of the test statistic e.g. find the cut-off values where 5% of the area under the distribution is in the tails - for a one-tailed test all 5% is in one tail, for a two-tailed test 2.5% is in each tail. If the test statistic falls in the rejection region (cumulative probability in the extreme 5% for one-tail, or below 0.025/above 0.975 for two-tail) we reject H₀
Make a decision - and state the conclusion in context e.g. on average there is/isn’t a statistically significant difference… - and interpret results
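
The five steps above can be sketched in code for the known-sigma (z) case. This is a minimal illustration, not a definitive implementation: the function name `one_sample_z_test` and the example numbers are made up, and it assumes a two-tailed test using the standard statistic z = (x̄ - μ₀)/(σ/√n).

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-tailed one-sample z test (hypothetical helper, known sigma assumed)."""
    # Test statistic: distance of the sample mean from mu0 in standard errors
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Decision rule: cut-off leaving alpha/2 in each tail
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Decision: reject H0 if z falls in either tail
    return z, z_crit, abs(z) > z_crit

# Illustrative data only: H0: mu = 50, H1: mu != 50
z, z_crit, reject = one_sample_z_test(sample_mean=52.0, mu0=50.0, sigma=8.0, n=64)
print(z, round(z_crit, 2), reject)
```

With these made-up numbers z is 2.0, which is beyond the cut-off of about 1.96, so H₀ would be rejected at the 0.05 level.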

New cards

Critical value

The dividing point between the region where the null hypothesis is rejected and the region where it is not rejected
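
As a rough sketch, critical values for the z distribution can be computed rather than read from a table. The helper name `z_critical` is hypothetical; it relies on `statistics.NormalDist` from the Python standard library.

```python
from statistics import NormalDist

def z_critical(alpha=0.05, two_tailed=True):
    """Positive z cut-off for the given significance level (hypothetical helper)."""
    # Two-tailed tests split alpha across both tails
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)

two_tail = z_critical()                  # about 1.96
one_tail = z_critical(two_tailed=False)  # about 1.645
print(round(two_tail, 3), round(one_tail, 3))
```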

New cards

p-Value + how to find

The probability of observing a sample value as extreme as, or more extreme than the value observed, given that the null hypothesis is true

Use the distribution (z/t) in reverse - compare the z/t statistic we calculated with the table to find the associated probability value - p-values are usually calculated for a two-tailed hypothesis

The lower the p-value, the more confident we can be in rejecting the null hypothesis e.g. a p-value less than 0.001 is extremely strong evidence that H₀ isn’t true, whereas a p-value less than 0.10 is only some evidence that H₀ isn’t true
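
A small sketch of using the z distribution "in reverse" to get a p-value. The function name `p_value_from_z` is an assumption for illustration, and the one-tailed branch assumes the observed z lies in the direction stated by the alternate hypothesis.

```python
from statistics import NormalDist

def p_value_from_z(z, two_tailed=True):
    """p-value for an observed z statistic (hypothetical helper)."""
    # Area in one tail beyond the observed |z|
    tail = 1 - NormalDist().cdf(abs(z))
    # A two-tailed test counts both tails
    return 2 * tail if two_tailed else tail

p = p_value_from_z(2.0)
print(round(p, 4))
```

For z = 2.0 the two-tailed p-value is roughly 0.046, just under the usual 0.05 threshold.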

New cards

Two sample hypothesis test

Aims to find if there is a significant difference between two sample means

Same statistical principles as in one sample testing but instead of population data need data from each sample

New cards

Three assumptions needed to conduct two sample hypothesis testing

Two independent random samples are used

Level of measurements is interval/ratio - in order to calculate the mean

Sampling distribution is normal - we can be sure of this if the sample size is large enough (as per CLT)

New cards

Steps for two sample hypothesis test

State null hypothesis (H₀:\mu_1=\mu_2) AND alternate hypothesis (H₁:\mu_1\ne\mu_2)
Choose the level of significance - typically \alpha = 0.05
Choose the test statistic - use this formula to find z:

z=\frac{\left(\overline{x}_1-\overline{x}_2\right)}{\sqrt{\left(\frac{\sigma_1^2}{n_1}\right)+\left(\frac{\sigma_2^2}{n_2}\right)}} - where we assume H₀ to be true for the test

Formulate the decision rule - for alpha = 0.05 the critical value is ±1.96
Make a decision e.g. if z > 1.96 or z < -1.96 we can reject H₀
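
The two-sample z formula above can be sketched as follows. The function name `two_sample_z` and the example figures are invented for illustration; the population standard deviations are assumed known.

```python
from math import sqrt

def two_sample_z(mean1, mean2, sigma1, sigma2, n1, n2):
    """z statistic for the difference between two independent sample means."""
    # Standard error of the difference, from the two known variances
    standard_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return (mean1 - mean2) / standard_error

# Illustrative data only
z = two_sample_z(mean1=5.5, mean2=5.3, sigma1=0.4, sigma2=0.3, n1=50, n2=100)
reject = abs(z) > 1.96  # two-tailed decision rule at alpha = 0.05
print(round(z, 3), reject)
```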

New cards

Two sample hypothesis test with unknown but EQUAL population standard deviations (INDEPENDENT samples)

Like the one sample case we substitute the sample standard deviation for the population standard deviation

Need to pool sample variances using formula

s_{p}^2=\frac{\left(n_1-1\right)s_1^2+\left(n_2-1\right)s_2^2}{n_1+n_2-2}

Then use t-statistic as follows:

t=\frac{\overline{x}_1-\overline{x}_2}{\sqrt{s_{p}^2\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}}

If the t value falls in the critical region (using n_1+n_2-2 degrees of freedom) we CAN reject H₀
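
A minimal sketch of the pooled-variance calculation, assuming the sample standard deviations s_1 and s_2 are passed in directly. The name `pooled_t` and the numbers are hypothetical; the critical t value would still come from a t table using the returned degrees of freedom.

```python
from math import sqrt

def pooled_t(mean1, mean2, s1, s2, n1, n2):
    """Pooled-variance t statistic; returns (t, pooled variance, degrees of freedom)."""
    df = n1 + n2 - 2
    # Weighted average of the two sample variances
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    t = (mean1 - mean2) / sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, sp2, df

# Illustrative data only
t, sp2, df = pooled_t(mean1=12.0, mean2=10.0, s1=3.0, s2=1.0, n1=6, n2=6)
print(round(t, 4), sp2, df)
```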

New cards

Two sample hypothesis test with unknown DIFFERENT INDEPENDENT sample standard deviations

Use t-statistic formula:

t=\frac{\overline{x}_1-\overline{x}_2}{\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}}

But we adjust the degrees of freedom downward (increasing uncertainty)

Using this formula

df=\frac{\left\lbrack\left(\frac{s_1^2}{n_1}\right)+\left(\frac{s_2^2}{n_2}\right)\right\rbrack^2}{\frac{\left(\frac{s_1^2}{n_1}\right)^2}{n_1-1}+\frac{\left(\frac{s_2^2}{n_2}\right)^2}{n_2-1}}

Round down d.f. to be on the safe side - then find critical value for significance level

If the absolute t value is greater than the critical value from the t table for these degrees of freedom we can reject H₀
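
The unequal-variance statistic and its adjusted degrees of freedom can be sketched like this. The function name `unequal_variance_t` and the data are made up; the d.f. is rounded down as the card suggests.

```python
from math import floor, sqrt

def unequal_variance_t(mean1, mean2, s1, s2, n1, n2):
    """t statistic and rounded-down d.f. when the standard deviations differ."""
    v1 = s1**2 / n1  # variance of the first sample mean
    v2 = s2**2 / n2  # variance of the second sample mean
    t = (mean1 - mean2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, floor(df)

# Illustrative data only
t, df = unequal_variance_t(mean1=13.0, mean2=10.0, s1=3.0, s2=4.0, n1=9, n2=16)
print(round(t, 4), df)
```

Here the raw d.f. is about 20.87, which rounds down to 20, below the 23 that pooling would give.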

New cards

Two sample hypothesis test with DEPENDENT samples

Find t value using

t=\frac{\overline{d}}{\frac{s_{d}}{\sqrt{n}}} where \overline{d} is the sample mean difference in the pairs of related observations, s_d is the standard deviation of these differences and n is the number of paired observations

s_d is found using formula:

s_{d}=\sqrt{\frac{\Sigma_{i=1}^{n}\left(d_{i}-\overline{d}\right)^2}{n-1}}

Then if the absolute t value is higher than the critical t value for n-1 degrees of freedom we can reject H₀
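
A short sketch of the dependent-samples calculation. The name `paired_t` and the before/after scores are invented; `statistics.stdev` uses the n-1 denominator, matching the s_d formula above.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(first, second):
    """t statistic and d.f. for paired observations (hypothetical helper)."""
    # Difference within each pair
    d = [a - b for a, b in zip(first, second)]
    n = len(d)
    d_bar = mean(d)
    s_d = stdev(d)  # sample standard deviation of the differences
    return d_bar / (s_d / sqrt(n)), n - 1

# Illustrative data only: the same five subjects measured twice
before = [12, 15, 11, 14, 13]
after = [10, 14, 10, 11, 12]
t, df = paired_t(before, after)
print(round(t, 4), df)
```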

New cards

Errors in hypothesis testing

Type 1 error: incorrectly rejecting H₀ when it is actually true - more likely when we choose a sig. level of 0.05 instead of 0.01

Type 2 error: incorrectly failing to reject H₀ when it is actually false - chance of this is higher as sig. level decreases e.g. 0.001 instead of 0.01

New cards

Probability of making a type-2 error

Identified by Greek letter beta \beta

Use formula to find z value

z=\frac{\overline{x}_{c}-\mu_1}{\frac{\sigma}{\sqrt{n}}}
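
One way the z formula above might be used to get \beta, sketched under assumptions the card does not state: \overline{x}_c is read as the critical sample mean (the cut-off for rejecting H₀), \mu_1 as the true mean under the alternate hypothesis, and the test as upper-tailed (H₁: \mu > \mu_0). The function name and numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def type_2_error_probability(mu0, mu1, sigma, n, alpha=0.05):
    """beta for an upper-tailed z test (assumed setup, see lead-in)."""
    standard_error = sigma / sqrt(n)
    # Critical sample mean: H0 is rejected when the sample mean exceeds this
    x_c = mu0 + NormalDist().inv_cdf(1 - alpha) * standard_error
    # z value of the cut-off measured from the true mean mu1
    z = (x_c - mu1) / standard_error
    # beta = chance the sample mean lands below the cut-off when mu1 is true
    return NormalDist().cdf(z), z

# Illustrative data only
beta, z = type_2_error_probability(mu0=50.0, mu1=53.0, sigma=8.0, n=64)
print(round(beta, 4), round(z, 4))
```

With these made-up values \beta comes out near 0.09, so a true mean of 53 would go undetected about 9% of the time.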