VIDEO TAKEAWAYS
These Chi-square Tests require
specification of
categories.
Data that is analyzed is in
count or frequency form.
We will look at two applications:
• Goodness-of-Fit Tests.
• Contingency Tests.
Given the hypothesis,
is there evidence that what
we are observing in the sample is incompatible with
what we expect?
Observed counts from
sample data – one for each
category
Expected counts are calculated as
np – one for each
category
Create test statistic:
χ² = Σ (o − e)² / e, summed over all categories
Large values of the test statistic will provide
evidence
against the null hypothesis.
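As a minimal sketch of the statistic described above, the snippet below computes χ² from observed counts and null-hypothesis probabilities. The die-roll counts and the function name chi_square_statistic are made up for illustration and do not come from the video.

```python
def chi_square_statistic(observed, probabilities):
    """Sum (o - e)^2 / e over all categories, with e = n * p under the null."""
    n = sum(observed)
    expected = [n * p for p in probabilities]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical data: 60 rolls of a die; null hypothesis: the die is fair.
observed = [5, 8, 9, 8, 10, 20]
probabilities = [1 / 6] * 6

stat = chi_square_statistic(observed, probabilities)
print(round(stat, 2))  # every expected count is 10, so this is 134 / 10 = 13.4
```

The larger this number, the further the sample is from what the null hypothesis predicts.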
REC
This is a class of tests for
categorically represented data.
The tests are based on a comparison of
what is actually
observed in categories to what is theoretically expected to occur in those categories under the condition set
forth in the null hypothesis.
“If the null is true, what
SHOULD happen?”
Test of Independence to determine if
two categorical variables are independent of one another (one
population and one sample):
Test of Independence Ex
H0: “Variable A” and “Variable B” are independent of (or NOT associated with) each other
HA: “Variable A” and “Variable B” are NOT independent of (or are associated with) each other
Test of Independence Ex 2
H0: Preferred mobile phone carrier is independent of gender
HA: Preferred mobile phone carrier is not independent of gender
Test of Homogeneity to determine if
the probability distribution for categories of a variable is the same
for multiple populations (multiple populations with a sample independently drawn from each)
Test of Homogeneity Ex
H0: The probabilities of the different categories occurring are the same in all populations being compared.
HA: The probabilities of the different categories occurring are NOT the same in all populations being
compared.
Test of Homogeneity Ex 2
H0: The proportion of blue candies is the same for Plain, Peanut Butter, and Peanut M&M candies
HA: The proportion of blue candies is not the same for Plain, Peanut Butter, and Peanut M&M candies
Test of Goodness-of-Fit to determine if
observed values of a single variable can be accurately modeled
with a theoretical probability distribution:
Test of Goodness of Fit Ex
H0: “Variable of interest” is accurately modeled with “description of distribution”
HA: “Variable of interest” is not accurately modeled with “description of distribution”
Test of Goodness of Fit Ex 2
H0: Grades earned by students in BUSOBA 2320 can be modeled with a Normal distribution
HA: Grades earned by students in BUSOBA 2320 cannot be modeled with a Normal distribution
Test Statistic Equation
χ² = Σ (Obs − Exp)² / Exp
Obs =
observed sample count or frequency in a category
Exp =
np = the theoretical expected count in a category if the null hypothesis is true
n =
sample size
p =
probability of being in the category if the null hypothesis is true
Reject the null hypothesis if the test statistic is
sufficiently large (greater than the critical value), or equivalently if the P-value is smaller than α.
Chi-square distribution has
1 parameter, df, which depends on the type of test
Independence and Homogeneity tests have
df = (r – 1)(c – 1)
r =
# categories in the row variable
c =
# categories in the column variable
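A hedged sketch of how these pieces fit together for a contingency table. The cards do not spell out the expected-count formula for a table; the code assumes the standard result that, under independence, Exp = np with p = (row total / n)(column total / n), which simplifies to row total × column total / n. The counts and the function name contingency_chi_square are hypothetical.

```python
def contingency_chi_square(table):
    """Return (statistic, df, expected) for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Expected count under independence: row total * column total / n.
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df, expected

# Hypothetical counts: rows = gender, columns = preferred carrier.
table = [[30, 20, 10],
         [20, 20, 20]]

stat, df, expected = contingency_chi_square(table)
print(round(stat, 3), df)
print(expected)
```

Here r = 2 and c = 3, so df = (2 − 1)(3 − 1) = 2.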
Goodness-of-fit tests have
df = k – m – 1
k =
number of categories
m =
number of parameters for the hypothesized distribution
that are estimated with the sample data
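The goodness-of-fit df rule can be written as a one-line helper. This is only an illustrative sketch; the function name gof_degrees_of_freedom and the example values of k are assumptions.

```python
def gof_degrees_of_freedom(k, m=0):
    """df = k - m - 1, where m parameters were estimated from the sample."""
    df = k - m - 1
    if df < 1:
        raise ValueError("too few categories for the number of estimated parameters")
    return df

# Fully specified distribution (nothing estimated), 6 categories.
print(gof_degrees_of_freedom(k=6))       # 5

# Normal model with mean and standard deviation both estimated, 8 categories.
print(gof_degrees_of_freedom(k=8, m=2))  # 5
```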
Curve is
right-skewed (most of the area is on the left, with a long tail to the right). The test is right-tailed.
Required Data Conditions PART 1
▪ Count data (refers to the actual number of times something happened)
▪ SRS (simple random sample)
Required Data Conditions PART 2
Expected counts ≥ 5
n ≥ 30 is recommended for GoF
Expected counts ≥ 5: This is a
rule of thumb; use accordingly. A single cell with expected count less than this will not necessarily invalidate the test results.
Expected counts ≥ 5: A single expected count < 5 that also has an extreme residual is
cause for concern.
Expected counts ≥ 5: Multiple expected counts < 5
invalidate the results.
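A small sketch of how the expected-count rule of thumb might be checked before running a test. The expected counts and the helper name small_expected_cells are hypothetical.

```python
def small_expected_cells(expected, threshold=5):
    """Return (index, value) pairs for expected counts below the threshold."""
    return [(i, e) for i, e in enumerate(expected) if e < threshold]

expected = [12.0, 9.5, 4.2, 6.3]  # hypothetical expected counts
flagged = small_expected_cells(expected)

if not flagged:
    print("All expected counts are at least 5.")
elif len(flagged) == 1:
    print("One small expected count; check its residual:", flagged)
else:
    print("Multiple small expected counts; results are not trustworthy:", flagged)
```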
MATH
Critical Number is found using
df, α, and the chi-square table. The numbers in the body of the table are the critical values.
Test Statistic =
χ²obs, the value of χ² = Σ (Obs − Exp)² / Exp computed from the sample
Numbers on the first row of the chi-square table are
α levels (right-tail areas), which also serve as bounds on the P-value.
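On the chi-square table, the numbers in the first row (α / P-value) run from
GREATEST to smallest, left to right, while the critical values increase. So a test statistic that falls between two critical values has a P-value between the two column headings above them.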
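To illustrate reading a P-value range off one row of the table, the sketch below rebuilds the df = 2 row and brackets a test statistic. It relies on a fact that holds only for df = 2: the right-tail area beyond x is exp(−x/2), so the critical value for α is −2 ln α. The α columns chosen and the function name bracket_p_value are assumptions; a printed table may use different columns.

```python
import math

# Column headings, greatest to smallest, as on the table.
alphas = [0.10, 0.05, 0.025, 0.01, 0.005]

# Critical values for the df = 2 row only (closed form valid only for df = 2).
critical = [-2 * math.log(a) for a in alphas]

def bracket_p_value(stat):
    """Return (lower, upper) bounds on the P-value from the df = 2 row."""
    upper = 1.0
    for a, c in zip(alphas, critical):
        if stat < c:
            return (a, upper)
        upper = a
    return (0.0, upper)

print([round(c, 3) for c in critical])
print(bracket_p_value(5.33))  # falls between the 0.10 and 0.05 columns
```

A statistic of 5.33 lies between the critical values for α = 0.10 and α = 0.05, so its P-value is between 0.05 and 0.10.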
A Chi-Square Contingency Table is a
grid of counts used to see if two categorical variables (like "Sport Preference" and "Age Group") are actually related or if the pattern could just be random chance. GA
"Test of Association" is often used as a
general umbrella term for independence and homogeneity. GA
“If a significant association exists, use the standardized residuals to interpret/explain the association.”
Look at each of the counts, compare them, and focus on the ones that stand out. Then look at the standardized residuals and find the cell (ex. male or female) with the largest absolute value; that cell has the greatest impact on the association.
Standardized Residuals tell us
exactly which specific cells in the table are causing the association by being much higher or lower than expected. GA
If a standardized residual is greater than 2 or less than -2, the cell is considered a
"significant" contributor to the overall association. GA
GA Standardized Chi Square Residual
Between -2 and +2:
This cell is "boring." The difference between what you saw and what you expected is probably just due to random chance. It didn't contribute much to the Chi-Square result.
GA Standardized Chi Square Residual Above +2.0:
You have significantly more people in this cell than expected. This group is "over-represented."
GA Standardized Chi Square Residual Below -2.0:
You have significantly fewer people in this cell than expected. This group is "under-represented."
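A minimal sketch of flagging cells by standardized residual. It assumes the residual is defined as (Obs − Exp) / √Exp; some software reports an adjusted version instead, so check which one your course or output uses. The counts and function name are hypothetical.

```python
import math

def standardized_residuals(observed, expected):
    """(Obs - Exp) / sqrt(Exp) for every cell of the table."""
    return [
        [(o - e) / math.sqrt(e) for o, e in zip(obs_row, exp_row)]
        for obs_row, exp_row in zip(observed, expected)
    ]

# Hypothetical 2x2 table and its expected counts under independence.
observed = [[40, 20],
            [10, 30]]
expected = [[30.0, 30.0],
            [20.0, 20.0]]

residuals = standardized_residuals(observed, expected)

# Cells outside the -2 to +2 band are the notable contributors.
flagged = [
    (i, j)
    for i, row in enumerate(residuals)
    for j, r in enumerate(row)
    if abs(r) > 2
]
print([[round(r, 2) for r in row] for row in residuals])
print(flagged)
```

In this made-up table only the second row's cells fall outside the band: one is under-represented and the other over-represented.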
m
The number of parameters estimated from the sample data. Ex. If you estimate both the mean and the standard deviation, _ would be 2.
Uniform means
every category has the same probability: p = 1/k (that is, 100/k percent) for each of the k categories.
Frequency on the graph is
the observed count.