Clip 7 statistics Comparing Two Means: T-test Notes

Overview of Hypothesis Tests

  • Hypothesis tests are used to make inferences about populations based on sample data.

  • Types of hypothesis tests:

    • One Population:

      • Mean: t-test (Section 9.4)

      • Proportion: Z-test (Section 9.5)

      • Variance: chi-square test (Section 11.1, noted for dropping)

    • Two Populations:

      • Mean: t-test (Sections 10.2 & 10.3)

      • Proportion: Z-test (Section 10.4)

      • Variance: F-test (Section 11.2, noted for dropping)

  • The t-tests for comparing means, and the procedure for carrying them out, are discussed in detail below.

Comparing One Mean to a Fixed Number

  • When comparing one mean to a fixed number, data are collected from a single sample of size n.

  • Two scenarios exist for determining which test to use:

    • If the standard deviation of the population is known, a Z-test is utilized.

    • If the standard deviation is not known and must be estimated from the sample, a t-test is applied. In practice, the standard deviation is typically unknown.

Example with SPSS

  • A customer survey conducted among buyers of furniture provided data from 833 respondents.

  • Collected Variables include:

    • Socio-demographic variables.

    • Purchase behavior over the last 12 months.

    • Attitudes, among other relevant factors.

Testing Mean Expenditures

  • The purpose of the test is to determine whether customer expenditures differ from a specified threshold of 1000 euros per shopping basket.

  • Null and Alternative Hypotheses:

    • H0: μ = 1000 (The mean expenditure equals 1000 euros)

    • H1 (three options):

      • One-sided: μ > 1000

      • One-sided: μ < 1000

      • Two-sided: μ ≠ 1000

  • In this case, a two-sided test is more appropriate, because the question is whether expenditures differ from 1000 euros in either direction.

  • In SPSS, follow the path:

    • Analyze → Compare Means → One-Sample T Test.

Conclusion for Mean Expenditures Test

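  • The test statistic is based on the standard error of the sample mean, the sample standard deviation divided by the square root of the sample size:
    SE = \frac{s}{\sqrt{n}} = \frac{744.648}{\sqrt{833}} \approx 25.80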

  • Resulting p-value: p = 0.125, which is greater than the significance level α = 0.05.

    • Therefore, we do not reject H0: the mean basket size is not significantly different from 1000 euros. The conclusion also holds at the 0.10 significance level, since 0.125 > 0.10.
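The calculation above can be checked with a short sketch. It assumes that 744.648 is the sample standard deviation and 833 the sample size. The sample mean is not stated in these notes, so the value 960.0 below is purely hypothetical and does not reproduce the SPSS output. The p-value uses a normal approximation to the t-distribution, which is reasonable for n = 833 but is not the exact value SPSS reports.

```python
import math
from statistics import NormalDist


def one_sample_t(sample_mean, sample_sd, n, mu0):
    """Return (standard error, t statistic, approximate two-sided p-value)."""
    se = sample_sd / math.sqrt(n)
    t = (sample_mean - mu0) / se
    # Normal approximation to the t-distribution with n - 1 degrees of freedom;
    # adequate for large n only.
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(t)))
    return se, t, p_two_sided


# 744.648 and 833 come from the notes; 960.0 is a HYPOTHETICAL sample mean.
se, t, p = one_sample_t(sample_mean=960.0, sample_sd=744.648, n=833, mu0=1000)
print(f"SE = {se:.2f}, t = {t:.3f}, approx. two-sided p = {p:.3f}")
```

For a one-sided test in the direction of the observed difference, the two-sided p-value would be halved.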

Customer Satisfaction Testing

  • The mean satisfaction is measured on a scale of 1-7, where:

    • 1 = dissatisfied

    • 4 = neutral

    • 7 = very satisfied

  • Null and Alternative Hypotheses:

    • H0: μ = 4 (mean satisfaction equals neutral)

    • H1 (one-sided): μ > 4

  • Conclusion for One-Sided Testing:

    • Resulting p-value: p < 0.001, which is well below α = 0.05.

    • Thus, we reject H0; the mean satisfaction is significantly higher than 4, with an observed mean of 6.07, indicating customers are satisfied to very satisfied on average.

Testing Proportions of Online Purchases

  • A manager claims that more than 25% of purchases are made online. This claim is tested.

  • Null and Alternative Hypotheses:

    • H0: p = 0.25 (the proportion of online purchases is 25%)

    • H1: p > 0.25

  • In SPSS, follow the path:

    • Analyze → Compare Means → One-Sample Proportions.

Conclusion for Proportions

  • In the sample, 243 of the 833 purchases were made online (about 29.2%).

  • Resulting p-value: p = 0.003, which is smaller than both α = 0.05 and α = 0.01.

    • Therefore, we reject H0 and conclude that more than 25% of the purchases are made online.
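As a rough check of this result, the sketch below runs a one-sided Z-test for a proportion on the counts given above (243 out of 833, against p0 = 0.25). It assumes the usual large-sample test with the standard error computed under H0. SPSS offers several variants of this test, so its output may differ slightly.

```python
import math
from statistics import NormalDist


def one_sample_proportion_z(successes, n, p0):
    """One-sided (upper-tail) z-test for H0: p = p0 against H1: p > p0."""
    p_hat = successes / n
    # The standard error is computed under H0, i.e. using p0 rather than p_hat.
    se0 = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se0
    p_value = 1 - NormalDist().cdf(z)
    return p_hat, z, p_value


p_hat, z, p_value = one_sample_proportion_z(successes=243, n=833, p0=0.25)
print(f"p_hat = {p_hat:.4f}, z = {z:.2f}, one-sided p = {p_value:.4f}")
```

With these inputs the one-sided p-value comes out at roughly 0.003, in line with the reported result.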

Comparing Two Means

  • Two means are compared using two samples drawn from two distinct populations. The appropriate test depends on whether the population standard deviations are known.

  • Main scenarios:
    1) Standard deviations of both populations are known: a Z-test is used (rarely the case).
    2) Standard deviations are not known and must be estimated from the samples: a t-test is used.

Independent Samples

  • For independent samples, two cases are distinguished, depending on whether the population standard deviations are assumed to:

    • Be equal across the two populations.

    • Be different across the two populations.

Testing Procedures for Known Standard Deviations

  • Expected value of the difference in sample means:
    E(\bar{X}_1 - \bar{X}_2) = \mu_1 - \mu_2

  • Standard Deviation (Standard Error):
    SE = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}

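  • Interval estimate for μ₁ − μ₂: the difference in sample means plus or minus a margin of error based on the known population variances:
    (\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2} \, SE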
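A minimal sketch of the known-standard-deviation case follows: it computes the standard error, the Z statistic, and an interval estimate for the difference in means. The function name and all input values are hypothetical, chosen only to make the arithmetic easy to follow.

```python
import math
from statistics import NormalDist


def two_sample_z(mean1, mean2, sigma1, sigma2, n1, n2, diff0=0.0, conf=0.95):
    """z statistic, two-sided p-value and confidence interval for mu1 - mu2."""
    se = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    diff = mean1 - mean2
    z = (diff - diff0) / se
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    # Critical value z_{alpha/2} for the requested confidence level.
    z_crit = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return z, p_two_sided, (diff - z_crit * se, diff + z_crit * se)


# HYPOTHETICAL inputs: sigma1 and sigma2 are treated as known population values.
z, p, ci = two_sample_z(mean1=52.0, mean2=50.0, sigma1=6.0, sigma2=8.0, n1=100, n2=100)
print(f"z = {z:.2f}, two-sided p = {p:.4f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

Here the standard error is exactly 1, so the Z statistic equals the difference in sample means.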

Hypothesis Testing when Standard Deviations are Not Known

  • With independent samples and unknown standard deviations, the test statistic follows a t-distribution.

  • The formula depends on whether the population variances are assumed equal or unequal:

    • If equal (pooled variance, n_1 + n_2 - 2 degrees of freedom):
      t = \frac{\bar{x}_1 - \bar{x}_2 - (\mu_1 - \mu_2)}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}

    • If unequal (separate variance estimates for each sample):
      t = \frac{\bar{x}_1 - \bar{x}_2 - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}

  • Where the pooled variance estimator is given as:
    s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}
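The two cases can be compared side by side in a small sketch. The equal-variance function uses the pooled estimator; for the unequal-variance case the sketch uses the standard Welch form with separate variances and Welch-Satterthwaite degrees of freedom, which is an assumption about what the course intends. All summary statistics are hypothetical.

```python
import math


def pooled_t(mean1, mean2, s1, s2, n1, n2, diff0=0.0):
    """Equal-variance t statistic and its degrees of freedom (n1 + n2 - 2)."""
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    se = math.sqrt(sp2) * math.sqrt(1 / n1 + 1 / n2)
    return (mean1 - mean2 - diff0) / se, n1 + n2 - 2


def welch_t(mean1, mean2, s1, s2, n1, n2, diff0=0.0):
    """Unequal-variance t statistic with Welch-Satterthwaite degrees of freedom."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    t = (mean1 - mean2 - diff0) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df


# HYPOTHETICAL summary statistics for two independent samples.
stats = dict(mean1=20.0, mean2=18.0, s1=4.0, s2=6.0, n1=25, n2=36)
t_pooled, df_pooled = pooled_t(**stats)
t_welch, df_welch = welch_t(**stats)
print(f"pooled: t = {t_pooled:.3f}, df = {df_pooled}")
print(f"Welch:  t = {t_welch:.3f}, df = {df_welch:.1f}")
```

The two statistics differ only in how the standard error in the denominator is estimated; the p-value is then read from a t-distribution with the corresponding degrees of freedom.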

Utilizing SPSS for Testing Means

  • Most computations should be performed using SPSS.

  • In SPSS, follow the path:

    • Analyze → Compare Means → Independent Samples T Test.

  • Example: average expenditures by gender:

    • Males: 984.58 euros

    • Females: 935.04 euros

  • Conclusion: the difference in means is not statistically significant.

    • p = 0.337 > 0.05, so there is insufficient evidence to reject H0 (equal mean expenditures for males and females).

Matched Samples Testing

  • Matched (paired) samples consist of two related measurements on the same observations, rather than measurements from two separate sample groups.

  • For example:

    • Comparing weekly sales between two restaurants:

      • A table lists paired sales data for 12 specific days.

  • In SPSS, utilize:

    • Analyze → Compare Means → Paired Samples T Test.
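A paired-samples t-test reduces to a one-sample t-test on the pairwise differences. The sketch below shows this for 12 pairs. The sales figures are hypothetical, since the table from the clip is not reproduced in these notes, and the function name is illustrative.

```python
import math
from statistics import mean, stdev


def paired_t(sample_a, sample_b):
    """Paired-samples t statistic and degrees of freedom (n - 1)."""
    if len(sample_a) != len(sample_b):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(sample_a, sample_b)]
    n = len(diffs)
    se = stdev(diffs) / math.sqrt(n)  # stdev uses the n - 1 denominator
    return mean(diffs) / se, n - 1


# HYPOTHETICAL sales for two restaurants on the same 12 days.
restaurant_a = [110, 95, 120, 105, 98, 130, 115, 102, 125, 108, 99, 117]
restaurant_b = [104, 97, 112, 100, 101, 121, 110, 100, 118, 109, 95, 111]
t, df = paired_t(restaurant_a, restaurant_b)
print(f"t = {t:.3f}, df = {df}")
```

Because each day contributes one difference, day-to-day variation that affects both restaurants cancels out, which is why the paired test is preferred over the independent-samples test here.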

Importance of Attributes Testing

  • The differences in average importance among the top-3 attributes are tested.

  • One-sided tests can examine whether attribute 1 is more important than attribute 2, and so on.

  • Conclusion: the three attributes differ significantly in importance, as shown by the p-values.

  • All p-values are below 0.05 (and below 0.01). The large sample size contributes to this, because it increases statistical power; power is covered in another knowledge clip on hypothesis testing.
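The effect of sample size can be illustrated with a small sketch: for a fixed mean difference and standard deviation, the t statistic grows with the square root of n. The difference (0.2 scale points) and standard deviation (1.5) are hypothetical and are not taken from the survey.

```python
import math


def t_for_fixed_effect(mean_diff, sd, n):
    """t statistic for a fixed mean difference and standard deviation at sample size n."""
    return mean_diff / (sd / math.sqrt(n))


# HYPOTHETICAL effect: the same small difference evaluated at three sample sizes.
for n in (30, 200, 833):
    print(f"n = {n:4d}: t = {t_for_fixed_effect(0.2, 1.5, n):.2f}")
```

The same small difference that is far from significant at n = 30 exceeds the usual two-sided critical value of about 1.96 at n = 833.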