Chi square goodness of fit
tests whether the observed distribution of one categorical variable matches an expected distribution
Null hypothesis for chi square goodness of fit
the observed distribution matches what is expected
alternate hypothesis for chi square goodness of fit
the observed distribution does not match what is expected
degrees of freedom of chi square goodness of fit
(number of categories)-1
Assumptions for chi square goodness of fit
-all expected counts are at least 5 -random sampling -independent observations
formula for chi square statistic
χ² = Σ (observed-expected)^2/expected, summed over all categories
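A minimal Python sketch of the goodness of fit calculation, assuming the observed and expected counts are given as lists in the same category order. The function name `chi_square_gof` and the die-roll counts are made up for illustration; they are not from these notes.

```python
def chi_square_gof(observed, expected):
    """Return (chi square statistic, degrees of freedom) for a goodness of fit test."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected need the same number of categories")
    # Sum (observed - expected)^2 / expected over every category.
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Degrees of freedom: (number of categories) - 1.
    df = len(observed) - 1
    return statistic, df


# Hypothetical data: 60 rolls of a die, so 10 rolls per face are expected if it is fair.
observed = [8, 12, 9, 11, 10, 10]
expected = [10] * 6
statistic, df = chi_square_gof(observed, expected)
print(round(statistic, 3), df)
```

Here every expected count is 10, so the expected-count assumption holds. The statistic would then be compared against a chi square distribution with `df` degrees of freedom, which is not shown here.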
contingency tables
look at the bivariate relationship between 2 categorical variables
marginal distribution
looks at the probability of events happening for only one of the variables, ignoring the other one (row or column total divided by grand total) -always sums to 100%
conditional probabilities
the probability that someone randomly selected from a particular group (the reference group) has a certain characteristic (cell count divided by that group's total)
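To make the difference between marginal and conditional concrete, here is a small Python sketch. The 2x2 table, its labels, and the variable names are hypothetical and chosen only for illustration.

```python
# Hypothetical contingency table of observed counts:
# rows are groups, columns are outcomes (no total row or total column).
table = {
    "group A": {"yes": 30, "no": 20},
    "group B": {"yes": 10, "no": 40},
}

grand_total = sum(sum(row.values()) for row in table.values())

# Marginal distribution of the row variable: row total / grand total.
row_marginal = {g: sum(row.values()) / grand_total for g, row in table.items()}

# Marginal distribution of the column variable: column total / grand total.
outcomes = list(next(iter(table.values())))
col_marginal = {
    o: sum(row[o] for row in table.values()) / grand_total for o in outcomes
}

# Conditional distribution P(outcome | group): cell count / that group's row total.
conditional = {
    g: {o: count / sum(row.values()) for o, count in row.items()}
    for g, row in table.items()
}

print(row_marginal)
print(col_marginal)
print(conditional)
```

Each marginal distribution sums to 1 (100%), and so does each group's conditional distribution. If the conditional distributions differ noticeably between groups, that suggests the two variables may not be independent.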
Chi-square test of independence
tests whether two categorical variables are independent of one another
bivariate relationship
when there are 2 categorical variables we can make a bar chart or a pie chart to compare the conditional distributions to see if they differ
assumptions of chi square test of independence
-random sampling -independent observations -all expected counts are at least 5
null hypothesis for chi square test of independence
x is independent of y
alternate hypothesis for chi square test of independence
x is not independent of y
first step to analyze chi square of independence
fill in contingency table with observed values
second step to analyze chi square of independence
determine the expected count for each cell: ((row total)(column total))/grand total, using the marginal totals
third step to analyze chi square of independence
calculate chi square statistic
degrees of freedom for chi square test of independence
(rows-1)(columns-1) -this excludes the total row and the total column
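The three steps and the degrees of freedom rule can be sketched in Python as follows. This is an illustration under stated assumptions, not a full test: the function name `chi_square_independence` and the observed counts are hypothetical, and the table is assumed to be a list of rows with the totals left out.

```python
def chi_square_independence(observed):
    """observed: list of rows of counts, without the total row or total column.

    Returns (expected counts, chi square statistic, degrees of freedom).
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    # Step 2: expected count for each cell = (row total)(column total) / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    # Step 3: sum (observed - expected)^2 / expected over every cell.
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )

    # Degrees of freedom: (rows - 1)(columns - 1).
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return expected, statistic, df


# Step 1: hypothetical observed counts for a 2x2 table.
observed = [[30, 20], [10, 40]]
expected, statistic, df = chi_square_independence(observed)
print(expected, round(statistic, 3), df)
```

A large statistic relative to a chi square distribution with `df` degrees of freedom is evidence against the null hypothesis that x is independent of y. Computing the p-value itself is not shown here.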
to calculate the expected count for a cell
((row total)(column total))/grand total
probability notation
P(x | reference group) -the probability of x given membership in the reference group