Chi-Square Analysis Notes

NONPARAMETRIC ANALYSIS

Chi Square Analysis

  • Analysis for research with nominal data (e.g., yes/no, democrat/republican/independent, male/female).

  • Does not typically identify cause-effect relationships; nothing is usually manipulated.

Chi-Square

  • Analyzes frequencies.

  • Data are category membership (nominal variables).

  • The data layout (groups arranged in cells) looks like ANOVA, but the cells hold frequency counts, not means.

  • Two kinds of Chi-Square:

    • Goodness of fit: for 1 nominal variable.

    • Independence: for 2 nominal variables.

  • Both compare the observed frequencies to the frequencies expected if the null hypothesis is true.

  • Not for cause and effect.

Chi-Square Goodness of Fit Example: Sex in Advertising

  • Three groups of people are each shown one type of commercial (Informative, Story, or Hot-n-Steamy) and asked if they would purchase the product.

  • Count in each cell represents the number who said they would.

  • Example data:

    • Informative: 8

    • Story: 12

    • Hot-n-Steamy: 40

Choosing the Correct Statistic

  • Variables (1 or 2) are nominal.

  • Data are frequencies.

  • Comparing observed frequencies to the frequencies expected if the null hypothesis is true calls for a Chi-Square analysis.

  • One variable: Goodness of Fit.

  • Two variables: Test of Independence.

Assumptions

  • Categories must be independent (each participant is counted in only one cell).

  • Data are frequency counts.

  • The sample must be large enough that no cell has an expected frequency below 5.

Hypotheses

  • Null Hypothesis: Consumer purchasing decisions are the same regardless of advertising type.

  • Research Hypothesis: Consumer purchasing decisions differ depending on advertising type.

Compute Chi Square

  • O = Observed frequency

  • E = Expected frequency (chance or probability)

Compute Chi Square: Sex in Advertising

  • N = 60

  • Expected frequency is N / # categories = 60 / 3 = 20

  • OR, E = some known population proportion (census, population records, etc.).

  • Sex in Advertising Example:

    • Informative: Observed = 8, Expected = 20

    • Story: Observed = 12, Expected = 20

    • Hot-n-Steamy: Observed = 40, Expected = 20
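The two ways of setting expected frequencies described above can be sketched in Python. The function name `expected_frequencies` and the unequal proportions in the second call are illustrative assumptions; only the equal-split case comes from the Sex in Advertising example.

```python
def expected_frequencies(n_total, n_categories=None, proportions=None):
    """Return expected counts under the null hypothesis.

    With no proportions given, every category gets an equal share
    (N / number of categories). Otherwise each category gets
    N times its hypothesized population proportion.
    """
    if proportions is None:
        return [n_total / n_categories] * n_categories
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("proportions must sum to 1")
    return [n_total * p for p in proportions]


observed = {"Informative": 8, "Story": 12, "Hot-n-Steamy": 40}
n_total = sum(observed.values())  # N = 60

# Equal expected frequencies, as in the Sex in Advertising example.
equal_expected = expected_frequencies(n_total, n_categories=len(observed))

# Hypothetical known population proportions (assumed for illustration only).
known_expected = expected_frequencies(n_total, proportions=[0.5, 0.25, 0.25])

print(equal_expected)  # [20.0, 20.0, 20.0]
print(known_expected)  # [30.0, 15.0, 15.0]
```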

Compute Chi Square

  • Formula: $\chi^2 = \sum \frac{(O - E)^2}{E}$

  • Sex in Advertising Example:
    $\chi^2 = \frac{(8-20)^2}{20} + \frac{(12-20)^2}{20} + \frac{(40-20)^2}{20} = \frac{144}{20} + \frac{64}{20} + \frac{400}{20} = 30.4$

  • Get p from software or calculator; p < .001.

Compute in SPSS

  • Test Statistics:

    • Chi-Square = 30.400

    • df = 2

    • Asymp. Sig. = <.001

    • 0 cells (0.0%) have expected frequencies less than 5. The minimum expected cell frequency is 20.0.

  • Commercial Data:

    • Informative: Observed N = 8, Expected N = 20.0, Residual = -12.0

    • Story: Observed N = 12, Expected N = 20.0, Residual = -8.0

    • Hot n Steamy: Observed N = 40, Expected N = 20.0, Residual = 20.0

    • Total N = 60
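As a rough cross-check of the SPSS output above, the same numbers can be reproduced in plain Python. This is a sketch rather than SPSS's own procedure, and the variable names are my own. The p-value line relies on the fact that for df = 2 the chi-square upper-tail probability is exactly exp(-chi2 / 2); it does not generalize to other df.

```python
import math

observed = {"Informative": 8, "Story": 12, "Hot n Steamy": 40}

n_total = sum(observed.values())
n_categories = len(observed)
expected = n_total / n_categories  # equal expected frequency per cell

# Residual = observed - expected, as in the SPSS table.
residuals = {name: count - expected for name, count in observed.items()}

# Chi-square = sum of (O - E)^2 / E over all categories.
chi_square = sum((count - expected) ** 2 / expected
                 for count in observed.values())

# Goodness of fit: df = number of categories - 1.
df = n_categories - 1

# Assumption: df == 2, where the upper-tail probability is exp(-chi2 / 2).
p_value = math.exp(-chi_square / 2)

# Assumption check: no cell may have an expected frequency below 5.
assumption_met = expected >= 5

print(residuals)
print(round(chi_square, 3), df, p_value < 0.001, assumption_met)
```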

Scientific Context

  • Methodological controls were adequate (participants were recruited on a college campus similar to other research of its kind).

  • Results are consistent with decades of advertising research showing that high sexual content attracts purchases for some products.

  • Confidence in the results is high.

Summarize Results

  • Consumers change their purchasing behavior depending on the degree of sex in advertisements, $\chi^2(2, N = 60) = 30.4$, p < .001.

  • These results are not likely to be due to sampling error.

  • There was a clear tendency for consumers to report an intent to purchase following “sexy” advertising.

Chi Square Test of Independence

Chi Square Example: Independence - Pets and Personality

  • Researcher wants to know if there is an association between pet ownership and personality type.

  • She asked 200 people what type of pet they preferred and administered a personality questionnaire.

  • Data:

    • Cats: Introvert = 55, Extrovert = 20, Total = 75

    • Dogs: Introvert = 22, Extrovert = 73, Total = 95

    • Birds: Introvert = 12, Extrovert = 18, Total = 30

    • Totals: Introvert = 89, Extrovert = 111, N = 200

Choosing the Correct Statistic

  • Variable(s) is/are nominal.

  • Data are frequencies.

  • Comparing observed frequencies to the frequencies expected if the null hypothesis is true calls for a Chi Square analysis.

  • One variable: Goodness of Fit.

  • Two variables: Test of Independence.

Assumptions

  • Categories must be independent (each participant is counted in only one cell).

  • Data are frequency counts.

  • The sample must be large enough that no cell has an expected frequency below 5.

Hypotheses

  • Null Hypothesis: There is no association between pet ownership and personality type.

  • Research Hypothesis: There is an association between pet ownership and personality type.

Compute Chi Square

  • Expected frequency = (Row Total x Column Total) / N

  • Pets and Personality Data:

    • Cats: Introvert = (89 * 75) / 200, Extrovert = (111 * 75) / 200, Total = 75

    • Dogs: Introvert = (89 * 95) / 200, Extrovert = (111 * 95) / 200, Total = 95

    • Birds: Introvert = (89 * 30) / 200, Extrovert = (111 * 30) / 200, Total = 30

    • Totals: Introvert = 89, Extrovert = 111, N = 200
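The (Row Total x Column Total) / N rule above can be sketched in Python for the Pets and Personality table. The dictionary layout and variable names are illustrative choices, not part of the original notes.

```python
observed = {
    "Cats":  {"Introvert": 55, "Extrovert": 20},
    "Dogs":  {"Introvert": 22, "Extrovert": 73},
    "Birds": {"Introvert": 12, "Extrovert": 18},
}

# Row totals: one per pet type.
row_totals = {pet: sum(cells.values()) for pet, cells in observed.items()}

# Column totals: one per personality type.
col_totals = {}
for cells in observed.values():
    for personality, count in cells.items():
        col_totals[personality] = col_totals.get(personality, 0) + count

n_total = sum(row_totals.values())

# Expected frequency for each cell = (row total * column total) / N.
expected = {
    pet: {p: row_totals[pet] * col_totals[p] / n_total for p in col_totals}
    for pet in observed
}

for pet, cells in expected.items():
    print(pet, cells)
```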

Compute Chi Square

  • Pets and Personality: $\chi^2 = \frac{(55-33.375)^2}{33.375} + \frac{(22-42.275)^2}{42.275} + \dots + \frac{(18-16.65)^2}{16.65} = 43.013$

  • $\chi^2(2, N = 200) = 43.013$

  • Data (Observed (Expected)):

    • Cats: Introvert = 55 (33.375), Extrovert = 20 (41.625), Total = 75

    • Dogs: Introvert = 22 (42.275), Extrovert = 73 (52.725), Total = 95

    • Birds: Introvert = 12 (13.35), Extrovert = 18 (16.65), Total = 30

    • Totals: Introvert = 89, Extrovert = 111, N = 200
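The full sum that the ellipsis above abbreviates can be checked with a short Python sketch. It applies the same (O - E)^2 / E formula to all six cells; the list layout and names are my own.

```python
observed = [
    [55, 20],  # Cats:  introvert, extrovert
    [22, 73],  # Dogs:  introvert, extrovert
    [12, 18],  # Birds: introvert, extrovert
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n_total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        # Expected frequency = (row total * column total) / N.
        exp = row_totals[i] * col_totals[j] / n_total
        chi_square += (obs - exp) ** 2 / exp

print(round(chi_square, 3))  # 43.013
```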

Compute Chi Square in SPSS

  • Personality * Pet_Choice Crosstabulation (counts):

    • Introvert: Cat = 55, Dog = 22, Bird = 12, Total = 89

    • Extrovert: Cat = 20, Dog = 73, Bird = 18, Total = 111

    • Total: Cat = 75, Dog = 95, Bird = 30, Total = 200

  • Chi-Square Tests:

    • Pearson Chi-Square = 43.013, df = 2, Asymptotic Significance (2-sided) = <.001

    • Likelihood Ratio = 44.642, df = 2, Asymptotic Significance (2-sided) = <.001

    • Linear-by-Linear Association = 22.415, df = 1, Asymptotic Significance (2-sided) = <.001

    • N of Valid Cases = 200

    • 0 cells (0.0%) have expected count less than 5. The minimum expected count is 13.35.
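Several lines of the SPSS table above can be cross-checked with a plain-Python sketch. This is not SPSS's own procedure. The likelihood ratio is computed here as 2 * sum(O * ln(O / E)), the usual definition, which requires every observed count to be above zero. The p-value shortcut exp(-chi2 / 2) is exact only for df = 2. All names are illustrative.

```python
import math

# Rows: Cats, Dogs, Birds. Columns: introvert, extrovert.
observed = [[55, 20], [22, 73], [12, 18]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n_total = sum(row_totals)
expected = [[r * c / n_total for c in col_totals] for r in row_totals]

# Pair each observed count with its expected count.
cells = [(o, e)
         for obs_row, exp_row in zip(observed, expected)
         for o, e in zip(obs_row, exp_row)]

pearson = sum((o - e) ** 2 / e for o, e in cells)
likelihood_ratio = 2 * sum(o * math.log(o / e) for o, e in cells)

# Test of independence: df = (rows - 1) * (columns - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Assumption: df == 2, where the upper-tail probability is exp(-chi2 / 2).
p_value = math.exp(-pearson / 2)

# Smallest expected count, for the "expected count less than 5" check.
min_expected = min(e for _, e in cells)

print(round(pearson, 3), round(likelihood_ratio, 3), df)
print(p_value < 0.001, round(min_expected, 2))
```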

Compute Effect Size

  • If two categories in either variable, use Phi. Otherwise, use Cramer’s V.

  • $V = \sqrt{\frac{\chi^2}{N(k-1)}}$, where k is the smaller of the number of categories for each variable. With k = 2, this equals $\Phi = \sqrt{\frac{\chi^2}{N}}$.

  • $V = \sqrt{\frac{43.013}{200(2-1)}} = .46$, which is a large effect.
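The effect-size formula above can be sketched as a small Python function. The name `cramers_v` and its parameters are illustrative. With k = 2 the denominator is just N, so the result coincides with Phi.

```python
import math


def cramers_v(chi_square, n_total, n_rows, n_cols):
    """Effect size sqrt(chi2 / (N * (k - 1))), k = smaller table dimension."""
    k = min(n_rows, n_cols)
    return math.sqrt(chi_square / (n_total * (k - 1)))


# Pets and Personality: 3 pet types by 2 personality types, N = 200.
v = cramers_v(43.013, 200, n_rows=3, n_cols=2)
print(round(v, 2))  # 0.46
```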

Scientific Context

  • Methodological controls were adequate (participants were recruited on a college campus similar to other research of its kind).

  • Results are consistent with literature suggesting introverted people tend to prefer more solitary companion pets than extroverts.

  • We can have confidence in the results.

Summarize Results

  • There is an association between pet type and personality, $\chi^2(2, N = 200) = 43.01$, p < .001, Cramer’s V = .46.

  • Introverts tended to prefer cats whereas extroverts tended to prefer dogs. There was no clear preference for birds.

NONPARAMETRIC ANALYSIS - SUMMARY

  • Used for nominal and/or ordinal data.

  • Makes no assumptions about population “parameters” like mu and sigma, or about normality.

  • Chi square is like a correlation: looking for a relationship or pattern.

  • Exploring patterns, not cause/effect.

  • Goodness of Fit: Relative Frequencies in different categories.

  • Independence: Looks for an association between two variables.

Popular Quiz

  • Q: A researcher asked participants, “Would you say you disapprove of increased government regulation of banks and major financial institutions?” Answers were “Approve”, “Disapprove” and “No Opinion”. What is the best statistic to analyze these data?

  • A: Chi Square Goodness of Fit

  • Q: Nominal and ordinal data are best analyzed using which CLASS of statistical test?

  • A: Nonparametric Statistics

  • Q: How many pairs of observed and expected frequencies will you compute when using a chi square?

  • A: As many pairs as there are categories in the study.

  • Q: If the sum of the relative differences (i.e., your calculated chi-square value) is greater than the critical value, you would:

  • A: Reject the null.

  • Q: The numerator of the chi-square computes the difference between the observed and expected frequencies. IF the null is true, this difference will likely be close to

  • A: 0

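  • Q: In the table below, is a chi-square test appropriate?

    • Vanilla: Observed = 4

    • Chocolate: Observed = 6

    • Strawberry: Observed = 2

  • A: No. With N = 12 and equal expected frequencies, each cell has an expected frequency of 12 / 3 = 4, which is below the minimum of 5.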

  • Q: In the table below, what is the expected frequency for FEMALES who are NOT VEGETARIAN?

    • Vegetarian: Male = 6, Female = 14

    • Not Vegetarian: Male = 24, Female = 16

  • A: 20
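The last two quiz items can be checked with a short Python sketch. The ice cream check assumes equal expected frequencies across flavors, which the quiz does not state explicitly. All variable names are illustrative.

```python
# Ice cream item: goodness of fit, assuming equal expected frequencies.
ice_cream = {"Vanilla": 4, "Chocolate": 6, "Strawberry": 2}
expected_ice_cream = sum(ice_cream.values()) / len(ice_cream)
chi_square_ok = expected_ice_cream >= 5  # every expected count must be >= 5

# Vegetarian item: expected = (row total * column total) / N.
diet = {
    "Vegetarian":     {"Male": 6,  "Female": 14},
    "Not Vegetarian": {"Male": 24, "Female": 16},
}
row_total = sum(diet["Not Vegetarian"].values())            # 40
col_total = sum(row["Female"] for row in diet.values())     # 30
n_total = sum(sum(row.values()) for row in diet.values())   # 60
expected_female_not_veg = row_total * col_total / n_total

print(expected_ice_cream, chi_square_ok)  # 4.0 False
print(expected_female_not_veg)            # 20.0
```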

Which significance test and why?

  • Has voter satisfaction with a politician changed from what it was one year ago? Repeated-measures t test

  • Do men enjoy violent movies more than women? Independent t test

  • Are socio-economic status (i.e., below average, average, and above average) and political affiliation (Democrat, Independent, and Republican) associated? Chi square for independence

  • Do socio-economic status (i.e., below average, average, and above average) and political affiliation (Democrat, Independent, and Republican) jointly affect attitudes about immigration (measured on a scale from 1 to 10)? 2-way ANOVA

  • Are the proportions of citizens in each socio-economic status category (i.e., far below average, below average, average, etc.) equal? Goodness of fit Chi square

  • Are the movie ratings different for Democrats, Independents, and Republicans? 1-way ANOVA

Simplified Overview of Key Concepts

  1. Chi-Square Analysis: Think of this as a way to check if two things are related when you have opinions or categories (like 'yes' or 'no,' or 'cat person' vs. 'dog person'). It's not for finding cause and effect, just whether there's a pattern.

  2. Types of Chi-Square Tests:

    • Goodness of Fit: Use this when you want to see if one thing (like opinions on a new product) fits a certain pattern or expectation.

    • Test of Independence: Use this when you want to see if two things (like pet preference and personality) are related to each other.

  3. Key Assumptions for Chi-Square Tests:

    • Your categories need to be separate (like, each person only gets one vote).

    • You need counts (how many people said yes, how many said no).

    • Make sure the sample is large enough that every cell has an expected frequency of at least 5.

  4. Hypotheses: You'll have a guess that there is no relationship (Null Hypothesis) and a guess that there is a relationship (Research Hypothesis).

  5. Calculating Chi-Square: Compare what you expected to see with what you actually saw. A big difference means there's probably a relationship.

  6. Effect Size: Tells you how strong the relationship is. For example, Cramer's V tells you how strong the association between the variables is.

  7. Nonparametric Analysis: This is used when your data are nominal and/or ordinal, so there are no “parameters” like mu and sigma.
