Lecture 4 Notes

Comparing Two Group Means

  • Previously, measures of central tendency (mean, median, mode) were discussed for single variables.

  • Now, the focus is on comparing means between two groups.

  • Example: Comparing income between residents of the North Shore and Sutherland.

Independent Samples

  • Independent samples involve comparing a dependent variable (e.g., income) across two mutually exclusive groups.

  • This is commonly used in digital marketing for A/B testing.

  • Groups must be mutually exclusive (e.g., day or night, location in Sydney or Melbourne).

  • Assessment criteria include understanding independent samples.

Requirements for Independent Samples T-Test

  • The test variable (dependent variable) must be continuous (numeric with interval values that can be continuously divided).

  • Examples: income, age, temperature, height, weight.

  • Grouping variable can be nominal (mutually exclusive) or ordinal (can also be mutually exclusive).

  • Example: Study time broken down by whether a person received an HD or not (two groups).

  • Hypothesis: Testing if people who got an HD studied more or less than those who didn't.

Statistical Significance

  • A difference between groups is not automatically statistically significant. For example, people with HDs may have studied a few minutes longer than people with Ds, yet that gap may not be statistically significant.

  • Example: If people in the North Shore make $100,000.01 on average and those in Sutherland make $100,000, the difference is technically $0.01, but it may not be statistically significant.

  • The standard deviation indicates the spread of the data, and that spread affects the test. Even if one group's mean is a little higher, large variation within the groups can mean the difference is not statistically significant.
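
To see why spread matters, here is a minimal Python sketch. The study-hour values are made up for illustration and are not from the lecture. Both pairs of groups differ by the same two hours on average, but the pair with more variation produces a much smaller t statistic, because the difference is divided by a larger standard error.

```python
import math
import statistics

def t_statistic(group_a, group_b):
    """Difference in sample means divided by its standard error."""
    mean_diff = statistics.mean(group_a) - statistics.mean(group_b)
    se = math.sqrt(statistics.variance(group_a) / len(group_a)
                   + statistics.variance(group_b) / len(group_b))
    return mean_diff / se

# Hypothetical study hours; each pair of groups differs by 2 hours on average.
tight_hd = [11, 12, 12, 13, 12]     # mean 12, little variation
tight_other = [9, 10, 10, 11, 10]   # mean 10, little variation
wide_hd = [2, 22, 12, 5, 19]        # mean 12, lots of variation
wide_other = [1, 20, 10, 3, 16]     # mean 10, lots of variation

t_tight = t_statistic(tight_hd, tight_other)
t_wide = t_statistic(wide_hd, wide_other)
print(f"low spread:  t = {t_tight:.2f}")
print(f"high spread: t = {t_wide:.2f}")
```

A larger t statistic (in absolute value) corresponds to a smaller p-value, so the low-spread pair is far more likely to come out as statistically significant.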

SPSS and Interpretation

  • Using SPSS, you don't have to do manual calculations; you just need to interpret the output.

  • A/B testing examples: website design variations, coupon usage impact on sales.

  • Coupon usage is mutually exclusive: you either used a coupon or you didn't.

  • Goal: Determine if using a coupon increased sales (marketing problem).

Review of Basic Statistics

  • Greek letters (\mu, \gamma, \sigma) represent true population parameters.

  • \mu (mu) is the population mean.

  • Due to time and budget constraints, we infer \mu using sample data.

  • Sample mean (X bar) and standard deviation (s) are used to make inferences about population parameters.

Hypothesis Testing

  • Research hypothesis is the alternative hypothesis.

  • The null hypothesis is either the counter to the alternative or the status quo.

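  • The null hypothesis always includes an equal sign (equal to, greater than or equal to, or less than or equal to).

  • The alternative hypothesis is always a strict inequality: "not equal to", "greater than", or "less than".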

Forms of Alternative Hypothesis

  • Three forms of the alternative hypothesis:

    • \mu1 - \mu2 \neq 0

    • \mu1 - \mu2 > 0

    • \mu1 - \mu2 < 0

Rejecting or Not Rejecting the Null Hypothesis

  • If the p-value is less than 0.05, reject the null hypothesis and conclude in favor of the alternative.

  • If the p-value is greater than 0.05, you cannot reject the null hypothesis.

  • Not rejecting the null does not mean accepting it; it just means there isn't enough evidence to reject it.
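
The decision rule above can be sketched as a small Python function. The function name `decide` and the treatment of a p-value exactly equal to 0.05 (not rejecting) are assumptions for illustration; the notes only cover the "less than" and "greater than" cases.

```python
def decide(p_value, alpha=0.05):
    """Return the conclusion of a hypothesis test at significance level alpha."""
    if p_value < alpha:
        return "reject the null hypothesis"
    # Assumption: p exactly equal to alpha is treated as not rejecting.
    return "fail to reject the null hypothesis"

print(decide(0.03))
print(decide(0.331))
```

Note the wording of the second outcome: it is "fail to reject", never "accept".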

Three Forms of the Test

  • Three forms for testing differences in population means:

    • \mu1 < \mu2

    • \mu1 > \mu2

    • \mu1 \neq \mu2

  • The alternative hypothesis never contains any form of "equal to".

  • The null takes the complementary form: greater than or equal to, equal to, or less than or equal to.
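
Written out as null/alternative pairs, the three forms look like this (a summary in LaTeX notation; the labels H_0 and H_1 for the null and alternative are a common convention and an assumption here, since the notes do not name them):

```latex
\begin{align*}
\text{Two-tailed:} \quad & H_0: \mu_1 = \mu_2 & H_1: \mu_1 \neq \mu_2 \\
\text{One-tailed (less than):} \quad & H_0: \mu_1 \geq \mu_2 & H_1: \mu_1 < \mu_2 \\
\text{One-tailed (greater than):} \quad & H_0: \mu_1 \leq \mu_2 & H_1: \mu_1 > \mu_2
\end{align*}
```

Each pair covers every possibility between them, and the equality always sits on the null side.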

Research Question and Alternative Hypothesis

  • The research question is the alternative hypothesis.

  • Develop the research hypothesis first, then create the null hypothesis as the counter.

Two-Tailed vs. One-Tailed Tests

  • Two-tailed test: Tests if there is a difference between groups (not equal to).

  • One-tailed test: Tests for a directional difference (greater than or less than).

  • One-tailed tests are also referred to as directional tests.

  • The "not equal to" form is the two-tailed test; the "greater than" and "less than" forms are directional.

SPSS Analysis: Gender and Spending Example

  • Using SPSS to analyze spending by gender (male and female).

  • The data set includes spending amounts and gender identification.

  • Examine the distribution of spending by gender (the groups must be mutually exclusive).

Steps for Conducting an Independent Samples T-Test in SPSS

  1. Analyze > Compare Means > Independent Samples T-Test.

  2. Select the test variable (amount spent) and the grouping variable (gender).

  3. Define the groups (e.g., male and female).

  4. Interpret the output.

Interpreting SPSS Output

  • Histograms help visualize the distribution of spending by gender.

  • Normal distribution curves overlaid on the histograms aid in interpretation.

  • Visually assess if there is a noticeable difference in spending between groups.

Examining Histograms and Distributions

  • Example: The male normal distribution appears slightly to the right, indicating higher average spending.

  • Displaying percentages on the y-axis makes the distributions more directly comparable.

Independent Samples T-Test Results: Gender and Spending

  • Males appear to spend more on average ($4.30 vs. $3.65).

  • Determine whether the difference is statistically significant.

Levene's Test for Equality of Variances

  • Levene's test checks if the variances of the two groups are equal.

    • Null hypothesis: Variances are equal.

    • Alternative: variances are not equal.

  • If the p-value (Sig. level) is less than 0.05, reject the null and assume unequal variances.

  • If the p-value is greater than 0.05, you cannot reject the null, so proceed as if the variances are equal.

  • If variances are equal, use the information on the top row. If unequal, use the bottom row.
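
For intuition about what Levene's test measures, here is a minimal Python sketch of the test statistic (usually written W) for two groups. It uses absolute deviations from each group's mean; that this matches the variant SPSS reports is an assumption, and the data values are made up for illustration. The Sig. value SPSS shows comes from comparing W to an F distribution, which this sketch does not compute.

```python
import statistics

def levene_statistic(group_a, group_b):
    """Levene's W for two groups, using absolute deviations from each group mean."""
    groups = [group_a, group_b]
    # Absolute deviation of every observation from its own group mean.
    deviations = [[abs(x - statistics.mean(g)) for x in g] for g in groups]
    n_total = sum(len(d) for d in deviations)
    grand_mean = sum(sum(d) for d in deviations) / n_total
    # Variation of the group-level average deviations around the overall average.
    between = sum(len(d) * (statistics.mean(d) - grand_mean) ** 2 for d in deviations)
    # Variation of the individual deviations within each group.
    within = sum((z - statistics.mean(d)) ** 2 for d in deviations for z in d)
    k = len(groups)
    return (n_total - k) / (k - 1) * between / within

# Hypothetical data: same mean, different spread.
narrow = [10, 12, 14, 16, 18]
wide = [4, 9, 14, 19, 24]
shifted = [x + 5 for x in narrow]  # same spread as narrow, different mean

w_different_spread = levene_statistic(narrow, wide)
w_same_spread = levene_statistic(narrow, shifted)
print(f"different spread: W = {w_different_spread:.2f}")
print(f"same spread:      W = {w_same_spread:.2f}")
```

A W near zero means the groups vary by about the same amount (top row); a large W points toward unequal variances (bottom row). Shifting a group's mean does not change W, because the test only looks at spread.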

Determining Statistical Significance

  • Check Levene's test: are the variances equal or not? Use the top row if they are equal and the bottom row if they are not.

  • Refer to the p-value from the t-test in that row.

  • If the p-value is less than 0.05, reject the null hypothesis that the two group means are equal.

  • Note: even though it's clear males spent more in the sample, this is a two-tailed test, so it checks whether spending could be either higher or lower.
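
SPSS does these calculations for you, but a rough Python sketch can show where the two rows come from. The "top row" uses a pooled variance; the "bottom row" uses separate variances with adjusted degrees of freedom (the Welch form). The function names and data below are hypothetical, and the p-value is approximated by numerically integrating the t density rather than by whatever method SPSS uses internally.

```python
import math
import statistics

def t_upper_tail(t, df, steps=2000):
    """P(T > |t|) for Student's t, via Simpson's rule on the density over [0, |t|]."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    a = abs(t)
    h = a / steps
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    # Half the density lies above zero; subtract the area between 0 and |t|.
    return 0.5 - total * h / 3

def independent_t_test(a, b, equal_variances):
    """Return (t, df, two-tailed p) for the pooled or the Welch form of the test."""
    n1, n2 = len(a), len(b)
    v1, v2 = statistics.variance(a), statistics.variance(b)
    diff = statistics.mean(a) - statistics.mean(b)
    if equal_variances:   # "top row": pooled variance
        df = n1 + n2 - 2
        pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / df
        se = math.sqrt(pooled * (1 / n1 + 1 / n2))
    else:                 # "bottom row": separate variances, adjusted df
        se = math.sqrt(v1 / n1 + v2 / n2)
        df = (v1 / n1 + v2 / n2) ** 2 / (
            (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    t = diff / se
    return t, df, 2 * t_upper_tail(t, df)

# Hypothetical spending amounts for two mutually exclusive groups.
group_a = [12, 15, 11, 14, 13, 16]
group_b = [10, 9, 12, 8, 11, 10]

for label, equal in (("top row (equal variances)", True),
                     ("bottom row (unequal variances)", False)):
    t, df, p = independent_t_test(group_a, group_b, equal)
    print(f"{label}: t = {t:.3f}, df = {df:.2f}, p = {p:.4f}")
```

With equal group sizes, as here, both rows give the same t statistic and differ only in degrees of freedom, so the choice of row changes the p-value slightly rather than the direction of the result.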

Coupon Example in SPSS

  • The coupon variable has four levels: no coupon, from newspaper, from mailer, from both.

  • Test cases: No coupon vs. newspaper coupon; no coupon vs. both sources coupon.

  • Test the efficacy of different coupon strategies from a managerial decision standpoint.

Visualizing Data with Box Plots

  • Box plots provide a simple summary of the variable by category.

  • The dark line in the middle of the box represents the median.

  • If one box doesn't overlap with the other, that suggests the groups differ, but only the t-test can confirm whether the difference is statistically significant.
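
The numbers a box plot draws can be computed directly. This is a minimal Python sketch with made-up spending values; the helper name `box_summary` is hypothetical, and quartile conventions differ between tools, so the quartiles may not match SPSS exactly.

```python
import statistics

def box_summary(values):
    """Five-number summary behind a box plot: min, Q1, median, Q3, max."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {"min": min(values), "q1": q1, "median": median,
            "q3": q3, "max": max(values)}

# Hypothetical spending amounts for one coupon category.
spending = [20, 35, 15, 40, 25, 30, 45]
summary = box_summary(spending)
print(summary)
```

The box spans Q1 to Q3, and the dark line is the median. Comparing these summaries across categories gives the same visual impression as the plot, but it is still only a first look before the t-test.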

Analysis: No Coupon vs. Newspaper Coupon

  • Visually, the box plots for "no coupon" and "from newspaper" look similar.

  • The p-value in the table is greater than 0.05, so the null hypothesis cannot be rejected. The box plots and the test results agree: the two groups look equal.

One-Tailed (Directional) Test

  • Tests the hypothesis that group A is greater than group B, or vice versa.
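
If your output only reports a two-tailed p-value, a common way to get the one-tailed value is to halve it when the sample difference points in the hypothesized direction. The Python sketch below illustrates that conversion; the function name and argument conventions are assumptions for illustration, with the t statistic computed as group 1 minus group 2.

```python
def one_tailed_p(two_tailed_p, t_statistic, alternative):
    """Convert a two-tailed p-value into a one-tailed p-value.

    alternative is "greater" (mu1 > mu2) or "less" (mu1 < mu2);
    t_statistic is based on (mean of group 1) - (mean of group 2).
    """
    half = two_tailed_p / 2
    if alternative == "greater":
        # Evidence supports "greater" only if group 1's mean came out higher.
        return half if t_statistic > 0 else 1 - half
    if alternative == "less":
        return half if t_statistic < 0 else 1 - half
    raise ValueError("alternative must be 'greater' or 'less'")

print(one_tailed_p(0.06, 2.0, "greater"))
print(one_tailed_p(0.06, 2.0, "less"))
```

When the sample difference points the wrong way, the one-tailed p-value is large and the null cannot be rejected, no matter how big the difference is.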

Testing for Greater - Coupon Example

  • If the experiment had instead tested whether no-coupon spending is greater than newspaper-coupon spending, the significance is 0.331.

  • Since the significance is still greater than 0.05, you still can't reject the null hypothesis.

Group 1 vs. Group 4: One-Tailed Test (No Coupon Less Than Both)

  • Alternative hypothesis: Is spending with coupons from both sources more than spending with no coupon?

  • For Levene's test, check whether the significance is less than 0.05 (it's 0.03). It is, so we reject the null that the variances are equal and use the bottom row.

  • The t-test significance level is less than 0.05, so we reject the null and conclude that average spending with no coupon is statistically significantly less than spending with coupons from both sources.