Clip 7 statistics Comparing Two Means: T-test Notes

Hypothesis tests are used to make inferences about populations based on sample data.
Types of hypothesis tests:
- One Population:
  - Mean: t-test (Section 9.4)
  - Proportion: Z-test (Section 9.5)
  - Variance: chi-square test (Section 11.1, noted for dropping)
- Two Populations:
  - Mean: t-test (Sections 10.2 & 10.3)
  - Proportion: Z-test (Section 10.4)
  - Variance: F-test (Section 11.2, noted for dropping)
The t-tests for comparing means, and the steps for carrying them out, are discussed in detail below.

When comparing one mean to a fixed number, data are collected from a single sample of size n.
Which test to use depends on one of two scenarios:
- If the standard deviation of the population is known, a Z-test is used.
- If the standard deviation is not known and must be estimated from the sample, a t-test is used. In practice, the standard deviation is typically unknown.

A customer survey among furniture buyers provided data from 833 respondents.
Collected variables include:
- Socio-demographic variables.
- Purchase behavior over the last 12 months.
- Attitudes, among other relevant factors.

The purpose of the test is to determine whether customer expenditures differ from a specified threshold of 1000 euros per shopping basket.
Null and Alternative Hypotheses:
- H0: μ = 1000 (The mean expenditure equals 1000 euros)
- H1 (three options):
  - One-sided: μ > 1000
  - One-sided: μ < 1000
  - Two-sided: μ ≠ 1000
In this case, a two-sided test is more appropriate, because the question is whether expenditures differ from 1000 euros in either direction.
In SPSS, follow the path:
- Analyze → Compare Means → One-Sample T Test.

The test statistic is $t = (\bar{x} - 1000) / (s / \sqrt{n})$, where the standard error in the denominator is calculated from the sample as:
$s / \sqrt{n} = 744.648 / \sqrt{833} \approx 25.80$
Resulting p-value: p = 0.125, which is greater than the significance level α = 0.05.
- Therefore, we do not reject H0: the data give no evidence that the mean basket size differs from 1000 euros (the same conclusion holds at the 0.10 significance level).
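
As a rough cross-check outside SPSS, the standard error from the calculation above can be reproduced with the Python standard library. This is a minimal sketch, not the SPSS procedure: the function names are hypothetical, 744.648 is assumed to be the sample standard deviation, and the p-value uses a normal approximation to the t-distribution, which is reasonable only because n = 833 is large. The sample mean is not given in these notes, so the sketch backs out the size of the t statistic implied by the reported p = 0.125 instead of computing it directly.

```python
import math
from statistics import NormalDist


def standard_error(sample_sd, n):
    """Estimated standard error of the sample mean: s / sqrt(n)."""
    return sample_sd / math.sqrt(n)


def one_sample_t(sample_mean, sample_sd, n, mu0):
    """t statistic for H0: mu = mu0."""
    return (sample_mean - mu0) / standard_error(sample_sd, n)


def two_sided_p_normal_approx(t_stat):
    """Two-sided p-value, using the standard normal as a stand-in for
    the t-distribution (an assumption that only suits large samples)."""
    return 2 * (1 - NormalDist().cdf(abs(t_stat)))


# Values from the notes: s = 744.648 (assumed to be the sample SD), n = 833.
se = standard_error(744.648, 833)  # about 25.80

# |t| implied by the reported two-sided p = 0.125 (normal approximation).
implied_abs_t = NormalDist().inv_cdf(1 - 0.125 / 2)

# Implied distance between the sample mean and 1000 euros, in euros.
implied_gap = implied_abs_t * se
```

Under these assumptions the sample mean lies roughly one and a half standard errors away from 1000 euros, which is not far enough to reject H0 at α = 0.05.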

Satisfaction is measured on a scale of 1-7, where:
- 1 = dissatisfied
- 4 = neutral
- 7 = very satisfied
Null and Alternative Hypotheses:
- H0: μ = 4 (mean satisfaction equals neutral)
- H1 (one-sided): μ > 4
Conclusion for one-sided testing:
- Resulting p-value: p < 0.001, which is far below α = 0.05.
- Thus, we reject H0; the mean satisfaction is significantly higher than 4. The observed mean of 6.07 indicates that customers are, on average, satisfied to very satisfied.
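
Statistical software often reports a two-sided p-value even when the hypothesis is one-sided. The helper below is a small sketch of the usual conversion for a symmetric test statistic such as t: halve the two-sided p-value when the statistic points in the direction of H1, otherwise take one minus that half. The function name and the `alternative` labels are hypothetical, and whether a given SPSS version already prints the one-sided value is not covered in these notes.

```python
def one_sided_p(t_stat, p_two_sided, alternative):
    """Convert a two-sided p-value to a one-sided one.

    alternative: "greater" for H1: mu > mu0, "less" for H1: mu < mu0.
    Relies on the test statistic having a symmetric distribution under H0.
    """
    if alternative not in ("greater", "less"):
        raise ValueError("alternative must be 'greater' or 'less'")
    half = p_two_sided / 2
    if alternative == "greater":
        in_direction = t_stat > 0
    else:
        in_direction = t_stat < 0
    # If the data point the wrong way, the one-sided p-value is large.
    return half if in_direction else 1 - half
```

For the satisfaction example, the observed mean (6.07) lies above 4, so the statistic points in the direction of H1: μ > 4 and the one-sided p-value is half the two-sided one.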

A manager claims that more than 25% of purchases are made online; this claim about a proportion can be tested.
Null and Alternative Hypotheses:
- H0: p = 0.25 (the proportion of online purchases is 25%)
- H1: p > 0.25
In SPSS, follow the path:
- Analyze → Compare Means → One-Sample Proportions.

In the sample, 243 of the 833 purchases were made online (about 29.2%).
Resulting p-value: p = 0.003, which is smaller than both α = 0.05 and α = 0.01.
- Therefore, we reject H0 and conclude that more than 25% of purchases are made online.
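
The proportion test can be reproduced by hand with the counts given above. The sketch below assumes the standard large-sample Z-test, in which the standard error is computed under H0 (using p0 = 0.25); the function name is hypothetical, and SPSS may report additional variants of the test that are not shown here.

```python
import math
from statistics import NormalDist


def one_sample_proportion_z(successes, n, p0):
    """Upper-tailed Z-test for H0: p = p0 against H1: p > p0."""
    p_hat = successes / n
    # Standard error under H0, so it uses p0 rather than p_hat.
    se0 = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se0
    p_upper = 1 - NormalDist().cdf(z)
    return p_hat, z, p_upper


# Counts from the notes: 243 online purchases out of 833.
p_hat, z, p_value = one_sample_proportion_z(243, 833, 0.25)
# p_value rounds to 0.003, in line with the reported result.
```

The sample proportion of about 0.292 lies almost three standard errors above 0.25, which is why H0 is rejected even at the 0.01 level.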

To compare two means, two samples are drawn from two distinct populations. The appropriate test depends on what is known about the population standard deviations.
Main scenarios:
1) Standard deviations of both populations are known: a Z-test is used (rarely the case).
2) Standard deviations are not known and must be estimated from the samples: a t-test is used.

The second scenario is further broken down by whether the population standard deviations are assumed to:
- Be equal across the two populations.
- Be different across the two populations.

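Expected value of the difference in sample means:
$E(\bar{X}_1 - \bar{X}_2) = \mu_1 - \mu_2$
Standard deviation of the difference (standard error):
$SE = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
An interval estimate for $\mu_1 - \mu_2$ is built around the difference in sample means, with the sample variances substituted for the unknown population variances.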

With independent samples, the test statistic follows a t-distribution.
The formula depends on whether the variances are assumed equal or unequal:
- If equal (pooled, with $n_1 + n_2 - 2$ degrees of freedom):
  $t = \frac{\bar{x}_1 - \bar{x}_2 - (\mu_1 - \mu_2)}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$
- If unequal (separate variances, with approximate degrees of freedom):
  $t = \frac{\bar{x}_1 - \bar{x}_2 - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
Where the pooled variance estimator used in the equal-variance case is:
$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
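
To make the two cases concrete, the sketch below computes both test statistics from summary statistics. It is illustrative only: the function names are hypothetical, the equal-variance version uses the pooled estimator with n1 + n2 - 2 degrees of freedom, and the unequal-variance version uses the standard error sqrt(s1²/n1 + s2²/n2) with the Welch-Satterthwaite approximation for the degrees of freedom, which these notes do not spell out. The numbers in the usage lines are made up for illustration and are not from the survey.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2, delta0=0.0):
    """t statistic and df assuming equal population variances."""
    sp2 = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    se = math.sqrt(sp2) * math.sqrt(1 / n1 + 1 / n2)
    return (mean1 - mean2 - delta0) / se, n1 + n2 - 2


def welch_t(mean1, sd1, n1, mean2, sd2, n2, delta0=0.0):
    """t statistic and approximate df without assuming equal variances."""
    v1, v2 = sd1**2 / n1, sd2**2 / n2
    se = math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return (mean1 - mean2 - delta0) / se, df


# Hypothetical summary statistics (NOT from the furniture survey).
t_pooled, df_pooled = pooled_t(10.0, 2.0, 5, 8.0, 2.0, 5)
t_welch, df_welch = welch_t(10.0, 2.0, 5, 8.0, 2.0, 5)
```

When the two sample sizes and sample standard deviations are equal, as in this made-up example, both versions give the same t statistic and the same degrees of freedom; they diverge as the variances or group sizes become more unequal.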

Most computations should be performed using SPSS.
In SPSS, follow the path:
- Analyze → Compare Means → Independent Samples T Test.
Example: average expenditures by gender:
- Males: 984.58 euros
- Females: 935.04 euros
Conclusion: the difference in means is not statistically significant.
- p = 0.337, so there is insufficient evidence to reject H0 of equal mean expenditures.

Matched (paired) samples arise when the two sets of measurements are linked observation by observation, rather than coming from two independent groups.
For example:
- Comparing weekly sales between two restaurants:
  - A table lists paired sales data for 12 specific days.
In SPSS, use:
- Analyze → Compare Means → Paired Samples T Test.
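
A paired test reduces to a one-sample t-test on the within-pair differences, which the sketch below illustrates. The restaurant table is not reproduced in these notes, so the sales figures here are invented placeholders and the function name is hypothetical. Only the t statistic and degrees of freedom are computed; a p-value would additionally need the t-distribution, which the Python standard library does not provide.

```python
import math
from statistics import mean, stdev


def paired_t(first, second, delta0=0.0):
    """t statistic and df for H0: mean difference = delta0."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    se = stdev(diffs) / math.sqrt(n)  # stdev uses the n - 1 denominator
    return (mean(diffs) - delta0) / se, n - 1


# Invented placeholder data (NOT the restaurant figures from the clip).
restaurant_a = [10, 12, 9, 14]
restaurant_b = [8, 11, 9, 10]
t_stat, df = paired_t(restaurant_a, restaurant_b)
```

Working on the differences removes the day-to-day variation shared by both restaurants, which is the reason a paired design is preferred over an independent-samples test for this kind of data.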

Paired tests are used to investigate the differences in average importance among the top-3 attributes.
One-sided tests can examine, for example, whether attribute 1 is more important than attribute 2.
Conclusion: the three attributes differ significantly in importance, as shown by the corresponding p-values.
All p-values are below 0.05 (and below 0.01). The differences reach significance in part because the large sample size increases statistical power, a point covered in another knowledge clip on hypothesis testing.
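
The link between sample size and significance can be seen directly in the formula for the test statistic: for a fixed mean difference and standard deviation, t grows with the square root of n. The short sketch below uses made-up numbers (a mean difference of 0.2 scale points and a standard deviation of 1.5) purely for illustration; the function name is hypothetical.

```python
import math


def t_for_sample_size(mean_diff, sd_diff, n):
    """t statistic for a fixed mean difference and SD at sample size n."""
    return mean_diff / (sd_diff / math.sqrt(n))


# Hypothetical values, not taken from the survey.
t_small = t_for_sample_size(0.2, 1.5, 100)  # same difference, n = 100
t_large = t_for_sample_size(0.2, 1.5, 900)  # same difference, n = 900
```

With nine times as many observations the standard error is three times smaller, so the same difference yields a t statistic three times as large. This is why modest differences can be statistically significant in a sample of several hundred respondents.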