Chi-Square Analysis Notes
NONPARAMETRIC ANALYSIS
Chi Square Analysis
Analysis for research with nominal data (e.g., yes/no, Democrat/Republican/Independent, male/female).
Does not typically identify cause-effect relationships; nothing is usually manipulated.
Chi-Square
Analyzes frequencies.
Data are category membership (nominal variables).
Looks like ANOVA but is not.
Two kinds of Chi-Square:
Goodness of fit: for 1 nominal variable.
Independence: for 2 nominal variables.
Both compare the observed frequencies to the frequencies expected if the null hypothesis is true.
Not for cause and effect.
Chi-Square Goodness of Fit Example: Sex in Advertising
Three groups of people are shown commercials and asked if they would purchase the product.
Count in each cell represents the number who said they would.
Example data:
Informative: 8
Story: 12
Hot-n-Steamy: 40
Choosing the Correct Statistic
Variables (1 or 2) are nominal.
Data are frequencies.
Comparing observed frequencies to the frequencies expected if the null hypothesis is true calls for a Chi-Square analysis.
One variable: Goodness of Fit.
Two variables: Test of Independence.
Assumptions
Categories must be independent.
Data are frequency counts.
Sample size must be large enough that no cell has an expected frequency below 5.
Hypotheses
Null Hypothesis: Consumer purchasing decisions are the same regardless of advertising type.
Research Hypothesis: Consumer purchasing decisions differ depending on advertising type.
Compute Chi Square
O = Observed frequency
E = Expected frequency (chance or probability)
Compute Chi Square: Sex in Advertising
N = 60
Expected frequency is N / # categories = 60 / 3 = 20
OR, E = some known population proportion (census, population records, etc.).
Sex in Advertising Example:
Informative: Observed = 8, Expected = 20
Story: Observed = 12, Expected = 20
Hot-n-Steamy: Observed = 40, Expected = 20
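The expected frequencies above can be reproduced with a short Python sketch. The function name `expected_counts` and its `proportions` parameter are illustrative assumptions and do not come from the notes. Equal proportions are used unless known population proportions are supplied. The sketch also checks the assumption that no expected frequency falls below 5.

```python
def expected_counts(n, k, proportions=None):
    """Return expected frequencies for a goodness-of-fit test.

    n: total sample size; k: number of categories.
    proportions: optional known population proportions (hypothetical
    parameter); if omitted, all categories are treated as equally likely.
    """
    if proportions is None:
        return [n / k] * k
    if len(proportions) != k or abs(sum(proportions) - 1) > 1e-9:
        raise ValueError("need k proportions that sum to 1")
    return [n * p for p in proportions]


observed = {"Informative": 8, "Story": 12, "Hot-n-Steamy": 40}
n = sum(observed.values())                    # total N across categories
expected = expected_counts(n, len(observed))  # N / number of categories

# Assumption check: no expected frequency below 5.
assumption_met = min(expected) >= 5

for (label, o), e in zip(observed.items(), expected):
    print(f"{label}: observed = {o}, expected = {e:.1f}")
print("Expected-frequency assumption met:", assumption_met)
```

Passing something like `proportions=[0.25, 0.75]` would cover the case where E comes from known population proportions rather than equal chance.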
Compute Chi Square
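Formula: \chi^2 = \sum (O - E)^2 / E
Sex in Advertising Example: \chi^2 = (8 - 20)^2 / 20 + (12 - 20)^2 / 20 + (40 - 20)^2 / 20 = 7.2 + 3.2 + 20.0 = 30.4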
Get p from software or calculator; p < .001.
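As a rough check on the hand calculation, the goodness-of-fit statistic (the sum of (O - E)^2 / E over the categories) can be computed in plain Python. The function name `chi_square_gof` is an illustrative assumption. The degrees of freedom are taken as the number of categories minus 1. The p value relies on the fact that, for df = 2 only, the upper-tail probability of the chi-square distribution is exp(-chi2 / 2); other degrees of freedom would need a statistics package.

```python
import math


def chi_square_gof(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [8, 12, 40]    # Informative, Story, Hot-n-Steamy
expected = [20, 20, 20]   # N / number of categories = 60 / 3

chi2 = chi_square_gof(observed, expected)
df = len(observed) - 1    # categories - 1

# For df = 2 ONLY, the upper-tail probability is exp(-chi2 / 2).
p = math.exp(-chi2 / 2)

print(f"chi-square = {chi2:.3f}, df = {df}, p = {p:.2e}")
```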
Compute in SPSS
Test Statistics:
Chi-Square = 30.400
df = 2
Asymp. Sig. = <.001
0 cells (0.0%) have expected frequencies less than 5. The minimum expected cell frequency is 20.0.
Commercial Data:
Informative: Observed N = 8, Expected N = 20.0, Residual = -12.0
Story: Observed N = 12, Expected N = 20.0, Residual = -8.0
Hot n Steamy: Observed N = 40, Expected N = 20.0, Residual = 20.0
Total N = 60
Scientific Context
Methodological controls were adequate (participants were recruited on a college campus similar to other research of its kind).
Results are consistent with decades of advertising research showing that high sexual content attracts purchases for some products.
Confidence in the results is high.
Summarize Results
Consumers change their purchasing behavior depending on the degree of sex in advertisements, \chi^2 (2, N = 60) = 30.4, p < .001.
These results are not likely to be due to sampling error.
There was a clear tendency for consumers to report an intent to purchase following “sexy” advertising.
Chi Square Test of Independence
Chi Square Example: Independence - Pets and Personality
Researcher wants to know if there is an association between pet ownership and personality type.
She asked 200 people what type of pet they preferred and administered a personality questionnaire.
Data:
Cats: Introvert = 55, Extrovert = 20, Total = 75
Dogs: Introvert = 22, Extrovert = 73, Total = 95
Birds: Introvert = 12, Extrovert = 18, Total = 30
Totals: Introvert = 89, Extrovert = 111, N = 200
Choosing the Correct Statistic
Variable(s) is/are nominal.
Data are frequencies.
Comparing observed frequencies to the frequencies expected if the null hypothesis is true calls for a Chi Square analysis.
One variable: Goodness of Fit.
Two variables: Test of Independence.
Assumptions
Categories must be independent.
Data are frequency counts.
Sample size must be large enough that no cell has an expected frequency below 5.
Hypotheses
Null Hypothesis: There is no association between pet ownership and personality type.
Research Hypothesis: There is an association between pet ownership and personality type.
Compute Chi Square
Expected frequency = (Row Total x Column Total) / N
Pets and Personality Data:
Cats: Introvert = (89 * 75) / 200, Extrovert = (111 * 75) / 200, Total = 75
Dogs: Introvert = (89 * 95) / 200, Extrovert = (111 * 95) / 200, Total = 95
Birds: Introvert = (89 * 30) / 200, Extrovert = (111 * 30) / 200, Total = 30
Totals: Introvert = 89, Extrovert = 111, N = 200
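The (Row Total x Column Total) / N rule can be applied to every cell at once. The following is a minimal Python sketch; the variable names are illustrative, and the layout (pet types as rows, personality types as columns) is an assumption taken from the data listing.

```python
# Observed counts: rows are pet types, columns are (Introvert, Extrovert).
observed = {
    "Cats": [55, 20],
    "Dogs": [22, 73],
    "Birds": [12, 18],
}

row_totals = {pet: sum(counts) for pet, counts in observed.items()}
col_totals = [sum(col) for col in zip(*observed.values())]
n = sum(row_totals.values())

# Expected frequency = (row total x column total) / N
expected = {
    pet: [row_totals[pet] * c / n for c in col_totals]
    for pet in observed
}

for pet, values in expected.items():
    print(pet, [round(v, 3) for v in values])
```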
Compute Chi Square
Pets and Personality:
Data (Observed (Expected)):
Cats: Introvert = 55 (33.375), Extrovert = 20 (41.625), Total = 75
Dogs: Introvert = 22 (42.275), Extrovert = 73 (52.725), Total = 95
Birds: Introvert = 12 (13.35), Extrovert = 18 (16.65), Total = 30
Totals: Introvert = 89, Extrovert = 111, N = 200
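With observed and expected frequencies in hand, the test of independence sums (O - E)^2 / E over all six cells. The sketch below is one way to check the arithmetic in plain Python. It assumes df = (rows - 1) x (columns - 1), and it uses exp(-chi2 / 2) for the p value, which is valid only because df = 2 here; a statistics package would be needed in general.

```python
import math

# Rows: Cats, Dogs, Birds; columns: Introvert, Extrovert.
observed = [[55, 20], [22, 73], [12, 18]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n   # expected frequency
        chi2 += (o - e) ** 2 / e

df = (len(row_totals) - 1) * (len(col_totals) - 1)
p = math.exp(-chi2 / 2)   # upper-tail probability, df = 2 ONLY

print(f"chi-square = {chi2:.3f}, df = {df}, p = {p:.2e}")
```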
Compute Chi Square in SPSS
Personality * Pet_Choice Crosstabulation:
Introvert: Cat = 55, Dog = 22, Bird = 12, Total = 89
Extrovert: Cat = 20, Dog = 73, Bird = 18, Total = 111
Total: Cat = 75, Dog = 95, Bird = 30, Total = 200
Chi-Square Tests:
Pearson Chi-Square = 43.013, df = 2, Asymptotic Significance (2-sided) = <.001
Likelihood Ratio = 44.642, df = 2, Asymptotic Significance (2-sided) = <.001
Linear-by-Linear Association = 22.415, df = 1, Asymptotic Significance (2-sided) = <.001
N of Valid Cases = 200
0 cells (0.0%) have expected count less than 5. The minimum expected count is 13.35.
Compute Effect Size
If both variables have two categories (a 2 x 2 table), use Phi. Otherwise, use Cramer’s V.
Cramer’s V = \sqrt{\chi^2 / (N (k - 1))}, where k is the smaller of the number of categories for each variable.
Cramer’s V = \sqrt{43.013 / (200 (2 - 1))} = .46, which is a large effect.
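Both effect sizes are simple functions of the chi-square value and N. The sketch below assumes the standard definitions, Phi = sqrt(chi2 / N) and Cramer's V = sqrt(chi2 / (N (k - 1))) with k the smaller number of categories. The function names `phi` and `cramers_v` are illustrative and not from the notes.

```python
import math


def phi(chi2, n):
    """Phi coefficient: sqrt(chi2 / N)."""
    return math.sqrt(chi2 / n)


def cramers_v(chi2, n, n_rows, n_cols):
    """Cramer's V: sqrt(chi2 / (N * (k - 1))), k = smaller category count."""
    k = min(n_rows, n_cols)
    return math.sqrt(chi2 / (n * (k - 1)))


# Pets and Personality: chi-square = 43.013, N = 200,
# 2 personality types by 3 pet types.
v = cramers_v(43.013, 200, n_rows=2, n_cols=3)
print(f"Cramer's V = {v:.2f}")
```

When the smaller variable has only two categories, k - 1 = 1 and Cramer's V reduces to the same number as Phi, which is why the two measures agree for this table.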
Scientific Context
Methodological controls were adequate (participants were recruited on a college campus similar to other research of its kind).
Results are consistent with literature suggesting introverted people tend to prefer more solitary companion pets than extroverts.
We can have confidence in the results.
Summarize Results
There is an association between pet type and personality, \chi^2 (2, N = 200) = 43.01, p < .001, Cramer’s V = .46.
Introverts tended to prefer cats whereas extroverts tended to prefer dogs. There was no clear preference for birds.
NONPARAMETRIC ANALYSIS - SUMMARY
Used for nominal and/or ordinal data.
No “parameters” like mu and sigma, normality, etc.
Chi square is like a correlation: looking for a relationship or pattern.
Exploring patterns, not cause/effect.
Goodness of Fit: Compares relative frequencies across the categories of one variable.
Independence: Looks for an association between two variables.
Pop Quiz
Q: A researcher asked participants, “Would you say you approve or disapprove of increased government regulation of banks and major financial institutions?” Answers were “Approve”, “Disapprove” and “No Opinion”. What is the best statistic to analyze these data?
A: Chi Square Goodness of Fit
Q: Nominal and ordinal data are best analyzed using which CLASS of statistical test?
A: Nonparametric Statistics
Q: How many pairs of observed and expected frequencies will you compute when using a chi square?
A: As many pairs as there are categories in the study.
Q: If the summed relative differences (i.e., your calculated chi square value) is greater than the critical value, you would
A: Reject the null.
Q: The numerator of the chi-square computes the difference between the observed and expected frequencies. IF the null is true, this difference will likely be close to
A: 0
Q: IN THE TABLE BELOW, IS A CHI SQUARE TEST APPROPRIATE?
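Vanilla: Observed = 4
Chocolate: Observed = 6
Strawberry: Observed = 2
A: NO. Assuming equally likely categories, N = 12 gives an expected frequency of 12 / 3 = 4 per category, which is below the minimum of 5.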
Q: In the table below, what is the expected frequency for FEMALES who are NOT VEGETARIAN?
Vegetarian: Male = 6, Female = 14
Not Vegetarian: Male = 24, Female = 16
A: 20, from (Row Total x Column Total) / N = (40 x 30) / 60.
Which significance test and why?
Has voter satisfaction with a politician changed from what it was one year ago? Repeated-measures t test
Do men enjoy violent movies more than women? Independent-samples t test
Are socio-economic status (i.e., below average, average, and above average) and political affiliation (Democrat, Independent, and Republican) associated? Chi square test of independence
Do socio-economic status (i.e., below average, average, and above average) and political affiliation (Democrat, Independent, and Republican) jointly affect attitudes about immigration (measured on a scale from 1 to 10)? 2-way ANOVA
Are the proportions of citizens in each socio-economic status category (i.e., far below average, below average, average, etc.) equal? Chi square goodness of fit
Are the movie ratings different for Democrats, Independents, and Republicans? 1-way ANOVA
Here's a simplified overview of the key concepts from the notes:
Chi-Square Analysis: Think of this as a way to check for a pattern when your data are counts of people in categories (like 'yes' or 'no,' or 'cat person' vs. 'dog person'). It's not for finding cause and effect, just whether there's a pattern.
Types of Chi-Square Tests:
Goodness of Fit: Use this when you want to see if one thing (like opinions on a new product) fits a certain pattern or expectation.
Test of Independence: Use this when you want to see if two things (like pet preference and personality) are related to each other.
Key Assumptions for Chi-Square Tests:
Your categories need to be separate (like, each person only gets one vote).
You need counts (how many people said yes, how many said no).
Make sure you have enough people overall that every cell has an expected frequency of at least 5.
Hypotheses: You'll have a guess that there is no relationship (Null Hypothesis) and a guess that there is a relationship (Research Hypothesis).
Calculating Chi-Square: Compare what you expected to see with what you actually saw. A big difference means there's probably a relationship.
Effect Size: Tells you how strong the relationship is. For example, Cramer's V tells you how strong the association between the variables is.
Nonparametric Analysis: This is used when your data are nominal or ordinal.
Still to review: the difference between Phi and Cramer's V, when to use each, and why they are used.