Chapter 9: Categorical data analysis
- Categorical data analysis: refers to a collection of tools that you can use when your data are nominal scale
The χ² (chi-square) goodness-of-fit test
- It tests whether an observed frequency distribution of a nominal variable matches an expected frequency distribution
- A goodness-of-fit test could be used to determine whether the numbers in each category match the numbers that would be expected given the standard treatment option
- The cards data
- Try as we might to act random, we think in terms of patterns and structure and so, when asked to do something at random, what people actually do is anything but random
- Ask people to pick two random cards from a deck (mentally). Look at the data and figure out whether or not the cards that people pretended to select were really random
- The null hypothesis and the alternative hypothesis
- Null hypothesis corresponds to a vector of probabilities in which all the probabilities are equal to one another
- The most common use of the goodness of fit test is to test a null hypothesis that all the categories are equally likely
- Alternative hypothesis is demonstrating that the probabilities involved aren't all identical
- The goodness of fit test statistic
- We have our observed frequencies and a collection of probabilities corresponding to the null hypothesis that we want to test
- Expected frequencies
- In order to convert this into a useful test statistic, we take the squared difference between each observed and expected frequency, divide by the expected frequency, and add these numbers up. This is called the goodness-of-fit statistic
- The sampling distribution of the GOF statistic
- To determine whether or not a particular value of χ² is large enough to justify rejecting the null hypothesis, we're going to need to figure out what the sampling distribution of χ² would be if the null hypothesis were true
- The number of things you're adding up is k
- You are supposed to be looking at the number of genuinely independent things that are getting added together
- Degrees of freedom
- Calculate it by counting up the number of distinct quantities that are used to describe your data then subtracting off all of the constraints that those data must satisfy
- df = k - 1
- Testing the null hypothesis
- The final step in the process of constructing our hypothesis test is to figure out what the reject region is
- What values of χ² would lead us to reject the null hypothesis?
- The chi-square goodness-of-fit test is always a one-sided test
- If our calculated χ² statistic is bigger than the critical value, then we can reject the null hypothesis
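The whole goodness-of-fit procedure above can be sketched by hand. The suit counts below are invented for illustration (in the spirit of the cards example), and 7.815 is the standard χ² critical value for df = 3 at α = .05:

```python
# Chi-square goodness-of-fit test, computed step by step.
# The suit counts are made-up illustrative numbers, not real data.
observed = [35, 51, 64, 50]           # clubs, diamonds, hearts, spades
N = sum(observed)                     # total number of observations
k = len(observed)                     # number of categories

# Null hypothesis: all four suits equally likely, so E_i = N / k
expected = [N / k] * k

# GOF statistic: sum over categories of (observed - expected)^2 / expected
chi2_stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

df = k - 1                            # degrees of freedom
critical_value = 7.815                # chi-square cutoff for df = 3, alpha = .05

reject_null = chi2_stat > critical_value
```

With these numbers the statistic comes out around 8.44, which exceeds the critical value, so the null hypothesis of equally likely suits would be rejected.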
The χ² test of independence
- Constructing our hypothesis test
The continuity correction
- Change that you need to make to your calculations whenever you only have 1 degree of freedom, called the continuity correction or sometimes the Yates correction
- The main reason for this is that the true sampling distribution for the χ² statistic is actually discrete, but the χ² distribution is continuous
- Systematic problems
- Especially when N is small and df = 1, the goodness-of-fit statistic tends to be too big, meaning that you actually have a bigger alpha value than you think
Assumptions of the test
- All statistical tests make assumptions
- For the Chi-square tests, the assumptions are
- Expected frequencies are sufficiently large
- Data are independent of one another
Summary
- The χ² goodness-of-fit test is used when you have a table of observed frequencies of different categories, and the null hypothesis gives you a set of known probabilities to compare to them
- The χ² test of independence is used when you have a contingency table of two categorical variables. The null hypothesis is that there is no relationship or association between variables
- Effect size for a contingency table can be measured in several ways -- Cramer's V statistic
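The test of independence and the Cramér's V effect size can both be sketched by hand. The 2x2 table below is made up for illustration:

```python
import math

# Chi-square test of independence on a small contingency table.
# The counts are invented for illustration.
table = [[10, 20],
         [30, 40]]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
N = sum(row_totals)

# Under independence, the expected frequency is E_ij = (row total * col total) / N
chi2_stat = 0.0
for i, row in enumerate(table):
    for j, obs in enumerate(row):
        exp = row_totals[i] * col_totals[j] / N
        chi2_stat += (obs - exp) ** 2 / exp

# Cramer's V: sqrt(chi2 / (N * (k - 1))), where k = min(#rows, #cols)
k = min(len(table), len(table[0]))
cramers_v = math.sqrt(chi2_stat / (N * (k - 1)))
```

For a 2x2 table k - 1 = 1, so Cramér's V reduces to sqrt(χ²/N).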
Chapter 10: Comparing two means
One-sample z-test
- The inference problem that the test addresses
- In this scenario, the research hypothesis relates to the population mean for the psychology students' grades
- Constructing the hypothesis test
- Null and alternative hypothesis
- We could look at the difference between the sample mean and the value that the null hypothesis predicts for the population mean
- If the quantity equals or is very close to 0, things are looking good for the null hypothesis.
- The z-score is always equal to the number of standard errors that separate the observed sample mean from the population mean predicted by the null hypothesis
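The z-score calculation is just a couple of lines; the sample mean, null-hypothesis mean, known sigma, and sample size below are hypothetical numbers:

```python
import math

# One-sample z-test sketch with made-up numbers.
sample_mean = 72.3
mu0 = 67.5                      # population mean claimed by the null hypothesis
sigma = 9.5                     # population standard deviation, assumed known
N = 20

sem = sigma / math.sqrt(N)      # standard error of the mean
z = (sample_mean - mu0) / sem   # how many standard errors from mu0 we are
```

A z around 2.26 is bigger than the two-sided 5% cutoff of 1.96, so in this made-up case the null would be rejected.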
- Assumptions of the z-test
- Normality: the z-test assumes that the true population distribution is normal
- Independence: the observations in your data set are not correlated with each other, or related to each other in some funny way
- Known standard deviation: the true standard deviation of the population is known to the researcher
One-sample t-test
- Because we are relying on an estimate of the population standard deviation, we need to make some adjustment for the fact that we have some uncertainty about what the true population standard deviation actually is
- Introducing the t test
- The full name of the t-test is the Student's t-test
- How we should accommodate the fact that we aren't completely sure what the true standard deviation is
- The t-distribution is very similar to the normal distribution but has heavier tails
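The one-sample t-statistic is the same calculation as the z, except the standard deviation is estimated from the sample. The scores below are invented:

```python
import math

# One-sample t-test sketch: sigma is unknown, so estimate it from the data
# and compare the statistic to a t distribution with N - 1 df.
scores = [5, 6, 7, 8, 9]        # made-up data
mu0 = 6                         # population mean under the null hypothesis
N = len(scores)
xbar = sum(scores) / N

# Sample standard deviation (divide by N - 1, not N)
s = math.sqrt(sum((x - xbar) ** 2 for x in scores) / (N - 1))

t = (xbar - mu0) / (s / math.sqrt(N))
df = N - 1
```

The only structural difference from the z-test is the estimated s in the denominator and the use of the heavier-tailed t-distribution with df = N - 1.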
- Assumptions of the one sample t-test
- Normality: assuming that the population distribution is normal
- Independence: assume that the observations in our sample are generated independently of one another
The independent samples t-test
- Two groups of observations (different participants in each group)
- Introducing the test
- Two forms
- Student's
- Simpler, but relies on much more restrictive assumptions
- Welch's
- Assuming you want to run a two-sided test, the goal is to determine whether two independent samples of data are drawn from populations with the same mean or with different means
- If we have an experimental design where participants are randomly allocated to one of two groups and we want to compare the two groups' mean performance on some outcome measure, then an independent samples t-test is what we're after
- T-statistic
- Pooled estimate of the standard deviation
- In a Student t-test, we make the assumption that the two groups have the same population standard deviation
- What we do is take the weighted average of the variance estimates, which we use as our pooled estimate of the variance
- The weight assigned to each sample is equal to the number of observations in that sample minus 1
- Completing the test
- We have our pooled estimate of the standard deviation
- Interested in the difference between the two means
- The standard error that we need to divide by is in fact the standard error of the difference between means
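The pooled estimate and the resulting t-statistic can be sketched from summary numbers; the group sizes, means, and variances below are hypothetical:

```python
import math

# Student independent-samples t-test from summary statistics (made-up numbers).
n1, xbar1, var1 = 5, 10.0, 4.0    # group 1: size, mean, variance estimate
n2, xbar2, var2 = 11, 8.0, 9.0    # group 2

# Pooled variance: weighted average of the two variance estimates,
# each weighted by its number of observations minus 1.
pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
pooled_sd = math.sqrt(pooled_var)

# Standard error of the difference between the two means
se_diff = pooled_sd * math.sqrt(1 / n1 + 1 / n2)

t = (xbar1 - xbar2) / se_diff
df = n1 + n2 - 2
```

Note that the bigger group (n2 = 11) pulls the pooled variance toward its own estimate, exactly as the weighting rule says it should.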
- Assumptions of the test
- Normality
- Independence
- Homogeneity of variance (homoscedasticity): population standard deviation is the same in both groups. Use the Levene test
The independent samples t-test (Welch test)
- Doesn’t rely on the homogeneity assumption
- The t-statistic is calculated in much the same way
- Main difference is that the standard error calculations are different
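The difference in the standard error calculation is easy to see side by side. Using the same hypothetical summary numbers as in the Student version, but with no pooling:

```python
import math

# Welch test sketch: each group keeps its own variance estimate,
# so no homogeneity-of-variance assumption is needed. Numbers are made up.
n1, xbar1, var1 = 5, 10.0, 4.0
n2, xbar2, var2 = 11, 8.0, 9.0

# Standard error: combine the two per-group variances directly
se_diff = math.sqrt(var1 / n1 + var2 / n2)
t = (xbar1 - xbar2) / se_diff

# Welch-Satterthwaite approximation for the degrees of freedom
df = (var1 / n1 + var2 / n2) ** 2 / (
    (var1 / n1) ** 2 / (n1 - 1) + (var2 / n2) ** 2 / (n2 - 1)
)
```

Unlike the Student test, the Welch degrees of freedom generally come out as a non-integer and are smaller than n1 + n2 - 2.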
- Assumptions
- Normality
- Independence
The paired samples t-test
- Each participant appears in both groups
- If we were to try to do an independent samples t-test, we would be conflating the within subject differences with the between subject variability
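Because each participant appears in both groups, the paired test is just a one-sample t-test on the difference scores. The before/after values here are invented:

```python
import math

# Paired-samples t-test sketch: reduce to a one-sample test on differences.
before = [5, 6, 7, 8]             # made-up scores, same participants twice
after = [6, 8, 8, 10]

diffs = [a - b for a, b in zip(after, before)]
N = len(diffs)
dbar = sum(diffs) / N             # mean difference

# Standard deviation of the difference scores (N - 1 denominator)
s_d = math.sqrt(sum((d - dbar) ** 2 for d in diffs) / (N - 1))

t = dbar / (s_d / math.sqrt(N))   # null hypothesis: mean difference is zero
df = N - 1
```

Working with difference scores is what removes the between-subject variability that would otherwise be conflated with the within-subject effect.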
Effect size
- The most commonly used measure of effect size for a t-test is Cohen's d
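For two independent groups, Cohen's d is the difference between the means divided by the pooled standard deviation. Reusing the same hypothetical summary numbers:

```python
import math

# Cohen's d sketch for two independent groups (made-up summary numbers).
n1, xbar1, var1 = 5, 10.0, 4.0
n2, xbar2, var2 = 11, 8.0, 9.0

pooled_sd = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
cohens_d = (xbar1 - xbar2) / pooled_sd   # mean difference in sd units
```

A common rule of thumb reads d around 0.2 as small, 0.5 as medium, and 0.8 as large.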
Checking the normality of a sample
- One way to check whether a sample violates the normality assumption is to draw a QQ plot
- Allows you to visually check whether you're seeing any systematic violations
- Each observation is plotted as a single dot
- Shapiro-Wilk tests
- The null hypothesis being tested is that a set of N observations is normally distributed
Testing non-normal data with Wilcoxon tests
- Nonparametric tests.
- The Wilcoxon test is usually less powerful than the t-test
- Two sample Mann-Whitney U test
- As long as there are no ties, the test that we want to do is surprisingly simple
- Construct a table that compares every observation in group A against every observation in group B
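That pairwise table is the whole calculation: the Mann-Whitney U statistic is just the number of comparisons that group A "wins". The data below are invented and tie-free:

```python
# Mann-Whitney U sketch (no ties): compare every observation in group A
# against every observation in group B and count the wins for A.
group_a = [3, 5, 7]    # made-up data
group_b = [2, 4, 6]

u_stat = 0.0
for a in group_a:
    for b in group_b:
        if a > b:
            u_stat += 1      # a win for group A
        elif a == b:
            u_stat += 0.5    # a tie would count as half a win (none here)
```

With n_A * n_B = 9 comparisons in total, a U near 0 or near 9 would suggest the groups differ, while a U near 4.5 would not.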
- One sample Wilcoxon test
- There is no fundamental difference between doing a paired samples test using before and after scores and doing a one-sample test on the difference scores
Chapter 12: Comparing several means (one-way ANOVA)
- Concerned with investigating differences in means
How ANOVA works
- Null hypothesis and alternative hypothesis
- Why an analysis of variances will help us learn anything useful about the means is one of the biggest conceptual difficulties that people have when first encountering ANOVA
- Variances to sum of squares
- Instead of averaging the squared deviations, we just add them up
- Within group sum of squares - see how different each individual person is from their group mean
- In order to quantify the extent of this variation, what we do is calculate the between group sum of squares
- From sum of squares to the F-test
- In order to convert our SS values into an F-ratio, the first thing we need to calculate is the degrees of freedom associated with the SSb and SSw values
- Bigger values of F mean that the between groups variation is large relative to the within groups variation
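The whole SS-to-F pipeline can be sketched in a few lines; the three groups below are made-up data:

```python
# One-way ANOVA sketch: split variability into between-group and
# within-group sums of squares, then form the F-ratio. Made-up data.
groups = [[1, 2, 3], [2, 3, 4], [4, 5, 6]]

all_obs = [x for g in groups for x in g]
grand_mean = sum(all_obs) / len(all_obs)

# Within-group SS: how different each person is from their own group mean
ss_w = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)

# Between-group SS: how different each group mean is from the grand mean
ss_b = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)

df_b = len(groups) - 1                 # G - 1
df_w = len(all_obs) - len(groups)      # N - G

# F-ratio: mean square between divided by mean square within
f_ratio = (ss_b / df_b) / (ss_w / df_w)
```

Dividing each SS by its degrees of freedom turns the sums of squares back into variance-like "mean square" quantities, which is what makes the ratio interpretable.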
Effect size
- There are a few different ways you could measure the effect size in an ANOVA, but the most commonly used measure is η² (eta squared)
Multiple comparisons and post hoc tests
- Corrections for multiple testing
- Theory free search for group differences is referred to as post hoc analysis
- The usual solution to this problem is to introduce an adjustment to the p-value, which aims to control the total error rate across the family of tests - often referred to as a correction for multiple comparisons
- Bonferroni correction
- Simplest of these adjustments is called the Bonferroni correction
- If we want to ensure that the total probability of making any Type I error at all is at most alpha, then the Bonferroni correction just says "multiply all your raw p-values by m", where m is the number of tests
- Holm corrections
- Another often-used adjustment is the Holm correction
- Pretend that you're doing the tests sequentially, starting with the smallest raw p-value and moving on to the largest one
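Both corrections are simple enough to sketch directly; the raw p-values below are hypothetical:

```python
# Bonferroni and Holm corrections on some made-up raw p-values.
raw_p = [0.04, 0.01, 0.02]
m = len(raw_p)

# Bonferroni: multiply every raw p-value by the number of tests m
bonferroni = [min(1.0, p * m) for p in raw_p]

# Holm: process p-values from smallest to largest; the j-th smallest is
# multiplied by (m - j), and adjusted values are never allowed to decrease.
order = sorted(range(m), key=lambda i: raw_p[i])
holm = [0.0] * m
running_max = 0.0
for j, i in enumerate(order):
    adjusted = min(1.0, raw_p[i] * (m - j))
    running_max = max(running_max, adjusted)
    holm[i] = running_max
```

Because the multipliers shrink as the p-values grow, the Holm adjustment is never larger than the Bonferroni one, which is why it is at least as powerful.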
Assumptions of one-way ANOVA
- Homogeneity of variance: we've only got the one value for the population standard deviation, rather than allowing each group to have its own value
- Assumes that the population standard deviation is the same for all groups
- Normality
- Independence: knowing one residual tells you nothing about any other residual
Checking the homogeneity of variance assumption
- Levene test
- Brown-Forsythe test
Removing the normality assumption
- Kruskal-Wallis rank sum test
- When you’ve got three or more groups
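The Kruskal-Wallis statistic works on ranks rather than raw scores. A minimal sketch for the tie-free case, with invented data:

```python
# Kruskal-Wallis sketch (no ties): rank all observations together, then
# H = 12 / (N (N + 1)) * sum(R_j^2 / n_j) - 3 (N + 1),
# where R_j is the sum of the ranks in group j. Data are made up.
groups = [[1, 3, 5], [2, 4, 6], [7, 8, 9]]

all_obs = sorted(x for g in groups for x in g)
rank = {x: i + 1 for i, x in enumerate(all_obs)}   # rank 1 = smallest value
N = len(all_obs)

rank_sums = [sum(rank[x] for x in g) for g in groups]
h_stat = 12 / (N * (N + 1)) * sum(
    r ** 2 / len(g) for r, g in zip(rank_sums, groups)
) - 3 * (N + 1)
```

Under the null hypothesis H is approximately chi-square distributed with G - 1 degrees of freedom, so it is compared to a χ² critical value just like the goodness-of-fit statistic.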