Why do you divide by the expected frequencies or probabilities in chi-squared tests?
In chi-squared tests, each squared deviation between observed and expected counts is divided by the expected count: the statistic is the sum of (O − E)²/E across categories. Dividing by E puts every category's deviation on a common scale, so that a deviation of, say, 10 counts weighs more heavily in a category where only 25 were expected than in one where 1,000 were expected. Without this scaling, large categories would dominate the statistic and the result would not follow the chi-squared distribution.
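A minimal sketch of this idea in Python, using scipy.stats (the counts are hypothetical, chosen so two categories have the same raw deviation but very different expected sizes):

```python
import numpy as np
from scipy.stats import chisquare

# Hypothetical counts: two categories of very different sizes.
observed = np.array([110, 15])
expected = np.array([100, 25])

# Without dividing by E, both categories contribute 100 to the sum of
# squared deviations, even though a miss of 10 is far more surprising
# when only 25 were expected.
squared_dev = (observed - expected) ** 2      # [100, 100]

# Dividing by E scales each deviation relative to its category size.
chi2_terms = squared_dev / expected           # [1.0, 4.0]
chi2_stat = chi2_terms.sum()                  # 5.0

stat, p = chisquare(observed, expected)       # matches the manual value
print(chi2_stat, stat, p)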
Why do the critical values for a chi-squared distribution get larger as the degrees of freedom get larger (as opposed to the t and F distributions, in which larger degrees of freedom yield smaller critical values)?
As degrees of freedom increase, chi-squared critical values rise because the whole distribution shifts to the right: a chi-squared variable is a sum of df squared standard normal variables, so its mean equals its degrees of freedom, and adding more terms makes the sum larger on average. A larger critical value is therefore needed to mark the upper 5% of the distribution. The t and F distributions behave in the opposite way: as degrees of freedom increase they concentrate (t toward the standard normal, F toward 1), so their critical values shrink.
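A quick sketch using scipy's quantile functions makes the contrast visible (alpha = .05 throughout; the df values are arbitrary):

```python
from scipy.stats import chi2, t

# Upper-tail critical values at alpha = .05 for increasing df.
for df in (1, 5, 10, 30):
    chi2_crit = chi2.ppf(0.95, df)   # grows with df (mean of chi2 is df)
    t_crit = t.ppf(0.975, df)        # shrinks toward the normal's 1.96
    print(f"df={df:>2}  chi2={chi2_crit:6.2f}  t={t_crit:5.2f}")
```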
How do outliers affect the results of a t-test versus a chi-squared test versus a correlation and regression analysis?
Outliers affect the three analyses very differently. In a t-test, an outlier inflates both a group's mean and its standard deviation, which can push the t-statistic and p-value in either direction. Correlation and regression are the most vulnerable: a single high-leverage point can dramatically change r, the slope, and the predicted values. Chi-squared tests are the least sensitive, because they operate on category counts rather than raw scores; an extreme value simply falls into some category and shifts one count, although outliers can still matter if they change which categories cases land in.
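A small simulation in Python illustrates the sensitivity of r and the t-test (the data are synthetic, and the exact numbers will vary with the seed):

```python
import numpy as np
from scipy.stats import pearsonr, ttest_ind

rng = np.random.default_rng(0)
x = rng.normal(50, 5, 30)
y = x + rng.normal(0, 2, 30)           # strong linear relationship

print(pearsonr(x, y)[0])               # r close to 1

# A single extreme, off-trend point has huge leverage on r.
x_out = np.append(x, 200.0)
y_out = np.append(y, 0.0)
print(pearsonr(x_out, y_out)[0])       # r drops sharply

# A t-test is also pulled, because the outlier inflates the mean and SD.
a, b = rng.normal(50, 5, 30), rng.normal(53, 5, 30)
print(ttest_ind(a, b).pvalue)
print(ttest_ind(np.append(a, 500.0), b).pvalue)   # p changes markedly
```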
Explain how overgeneralization can affect your predicted values in a regression.
Overgeneralization (extrapolation) occurs in regression when predictions are made outside the range of the original data. The fitted line describes the relationship only where data were actually observed; a relationship that is linear within the sampled range may flatten, curve, or reverse beyond it, so extrapolated predictions can be badly wrong. Overfitting and ignoring external factors can further degrade the reliability of predictions.
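A toy sketch of the problem (the data and range are invented for illustration):

```python
import numpy as np

# Fit a line on x in [0, 10]; suppose the relationship happens to be
# linear only over that observed range (hypothetical data).
x = np.linspace(0, 10, 50)
y = 2 * x + np.random.default_rng(1).normal(0, 1, 50)

slope, intercept = np.polyfit(x, y, 1)

# Interpolation: a prediction inside the observed range is defensible.
print(slope * 5 + intercept)

# Extrapolation: nothing in the data guarantees the line still holds here.
print(slope * 100 + intercept)   # a prediction far outside [0, 10]
```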
Why do we need to test for linearity when conducting a correlation and regression analysis? Explain and discuss the concept of residuals in your answer.
Pearson's r and least-squares regression both assume a linear relationship, so checking linearity is essential: if the true relationship is curved, a straight line will systematically mis-predict. Residuals, the differences between observed and predicted values (observed − predicted), are the main diagnostic. If the linear model is adequate, the residuals scatter randomly around zero; a curved or otherwise systematic pattern in the residuals signals that a different, non-linear model is needed.
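A minimal residual check in Python, on deliberately curved synthetic data, shows the telltale pattern:

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(0, 10, 100)
y = x ** 2 + rng.normal(0, 2, 100)       # a truly curved relationship

# Fit a straight line anyway, then inspect the residuals.
slope, intercept = np.polyfit(x, y, 1)
residuals = y - (slope * x + intercept)  # observed minus predicted

# An adequate model leaves residuals scattered randomly around 0.
# Here they are systematically negative in the middle and positive at
# the ends -- the signature of a missed curve.
for lo, hi in ((0, 3), (3, 7), (7, 10)):
    mask = (x >= lo) & (x < hi)
    print(f"x in [{lo},{hi}): mean residual = {residuals[mask].mean():+.1f}")
```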
What are the similarities and/or differences between Pearson's r and Cohen's d?
Both are standardized, unit-free effect sizes, but they answer different questions. Pearson's r measures the strength and direction of a linear relationship between two continuous variables and is bounded between −1 and +1. Cohen's d quantifies the difference between two group means in standard-deviation units and has no upper bound. Use r when the question is about association between variables, and d when it is about the size of a difference between groups.
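A side-by-side sketch in Python (synthetic data; the pooled-SD formula shown assumes equal group sizes):

```python
import numpy as np

rng = np.random.default_rng(3)

# Pearson's r: strength of a linear relationship between two variables.
x = rng.normal(0, 1, 100)
y = 0.5 * x + rng.normal(0, 1, 100)
r = np.corrcoef(x, y)[0, 1]               # bounded between -1 and +1

# Cohen's d: standardized difference between two group means.
g1, g2 = rng.normal(0, 1, 100), rng.normal(0.5, 1, 100)
pooled_sd = np.sqrt((g1.var(ddof=1) + g2.var(ddof=1)) / 2)
d = (g2.mean() - g1.mean()) / pooled_sd   # unbounded; ~0.5 is "medium"

print(f"r = {r:.2f}, d = {d:.2f}")
```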
Explain the similarities and/or differences between the chi-squared, t and F distributions.
The chi-squared, t, and F distributions all depend on degrees of freedom and all derive from the normal distribution, but they behave differently as degrees of freedom grow: the chi-squared distribution shifts right (its mean equals its df), while the t-distribution converges to the standard normal and the F-distribution concentrates around 1, so t and F critical values shrink. They also serve different purposes: chi-squared tests categorical data, t-tests compare two means, and F-tests compare variances (as in ANOVA). The three are closely related: a squared t with ν degrees of freedom is F(1, ν), and an F-ratio is a ratio of two chi-squared variables each divided by its df.
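The contrasting behavior is easy to verify with scipy's quantile functions (alpha = .05; for F, the same df is used for numerator and denominator just for illustration):

```python
from scipy.stats import chi2, t, f

for df in (2, 10, 50):
    print(f"df={df:>2}  "
          f"chi2_crit={chi2.ppf(0.95, df):7.2f}  "  # grows with df
          f"t_crit={t.ppf(0.975, df):5.2f}  "       # shrinks toward 1.96
          f"F_crit={f.ppf(0.95, df, df):5.2f}")     # shrinks toward 1
```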
What is the difference between an ANOVA and an independent samples t-test? Explain.
An independent samples t-test compares the means of exactly two groups, whereas ANOVA compares means across three or more groups (with exactly two groups the tests are equivalent: F = t²). Running multiple t-tests across several groups inflates the Type I error rate; a single ANOVA avoids that inflation by testing all group means at once. Because a significant ANOVA only says that at least one mean differs, post-hoc tests are needed afterward to identify which specific groups differ.
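The two-group equivalence can be checked directly in Python (synthetic data):

```python
import numpy as np
from scipy.stats import ttest_ind, f_oneway

rng = np.random.default_rng(4)
g1, g2 = rng.normal(10, 2, 25), rng.normal(12, 2, 25)

t_stat, t_p = ttest_ind(g1, g2)
F_stat, F_p = f_oneway(g1, g2)

# With exactly two groups the tests agree: F = t^2, same p-value.
print(t_stat ** 2, F_stat)   # identical up to rounding
print(t_p, F_p)
```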
Explain why F = 1 in an ANOVA when the null hypothesis is true.
When the null hypothesis is true, the between-groups mean square and the within-groups mean square both estimate the same quantity, the population variance, so their ratio F is expected to be close to 1. When the null is false, true differences among the means inflate the between-groups estimate but not the within-groups estimate, pushing F above 1. An F substantially greater than 1 therefore suggests real differences between group means.
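A short simulation makes this concrete, assuming a true null (all groups drawn from one population; group sizes and the number of replications are arbitrary):

```python
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(5)

# Simulate many ANOVAs in which the null is true: all three groups are
# drawn from the same population.
F_values = [
    f_oneway(*(rng.normal(0, 1, 20) for _ in range(3))).statistic
    for _ in range(5000)
]
print(np.mean(F_values))   # close to 1 (E[F] = df2/(df2-2) ~ 1.04 here)
```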
Why do we need to test for homogeneity of variances when conducting an ANOVA?
ANOVA assumes homogeneity of variances: that the variability within each group is approximately equal. The within-groups mean square pools the group variances, so if one group is much more variable than the others, the pooled estimate misrepresents them and the F-ratio (and its p-value) can be distorted, leading to incorrect conclusions. Tests like Levene's and Bartlett's check this assumption, and Welch's ANOVA, which does not assume equal variances, can be used when it fails.
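A sketch of the check in Python, with one deliberately more variable group (synthetic data):

```python
import numpy as np
from scipy.stats import levene, f_oneway

rng = np.random.default_rng(6)
g1 = rng.normal(10, 1, 30)    # small spread
g2 = rng.normal(10, 1, 30)
g3 = rng.normal(10, 5, 30)    # much larger spread

# Levene's test: H0 is that the group variances are equal.
stat, p = levene(g1, g2, g3)
print(p)   # a small p-value flags unequal variances

# If variances are unequal, the standard F-test below may be unreliable;
# a Welch-type ANOVA (e.g. statsmodels' anova_oneway with
# use_var="unequal") is the usual fallback.
print(f_oneway(g1, g2, g3).pvalue)
```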
Describe the two ways in which you estimate the population variance in an ANOVA. In what circumstances are these estimates biased versus unbiased?
ANOVA estimates the population variance in two ways. The within-groups estimate pools the variance inside each group; because it measures only random variation around each group's own mean, it is unbiased whether or not the null hypothesis is true. The between-groups estimate infers the population variance from the variability of the group means; it is unbiased only when the null hypothesis is true. When the null is false, real differences among the population means inflate it, which is exactly why the F-ratio then rises above 1.
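A hand-computed version in Python shows the two estimates side by side (equal group sizes assumed, which keeps the formulas simple; data are synthetic with the null true):

```python
import numpy as np

rng = np.random.default_rng(7)
k, n, sigma = 3, 20, 2.0

# H0 true: all groups share the same mean.
groups = [rng.normal(10, sigma, n) for _ in range(k)]

means = [g.mean() for g in groups]
grand_mean = np.mean(means)

# Within-groups estimate: pooled variance inside each group
# (with equal n, the pooled variance is the mean of the group variances).
ms_within = np.mean([g.var(ddof=1) for g in groups])

# Between-groups estimate: variance of the group means, scaled by n.
ms_between = n * np.sum((np.array(means) - grand_mean) ** 2) / (k - 1)

# Under H0 both estimate sigma^2 = 4; if the true means differed,
# ms_between would be inflated while ms_within would not.
print(ms_within, ms_between)
```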
Are ANOVAs one-tailed or two-tailed tests? Explain.
ANOVA is effectively a one-tailed test: the entire rejection region sits in the upper tail of the F-distribution, because only a between-groups variability that is large relative to the within-groups variability is evidence of mean differences. The F-ratio is always positive, and a small F simply means the group means are very similar, which never argues against the null. Note, though, that the hypothesis itself is non-directional: a significant ANOVA says that some means differ, not in which direction.
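For concreteness, the single upper-tail critical value can be pulled from scipy (df values are illustrative, e.g. three groups of 20):

```python
from scipy.stats import f

# The entire alpha = .05 rejection region sits in the upper tail of F:
# only unusually LARGE F-ratios count as evidence against H0.
df_between, df_within = 2, 57
print(f.ppf(0.95, df_between, df_within))   # single upper critical value

# Small F-ratios (near 0) just mean the group means are unusually alike;
# they are never grounds for rejecting H0 in a standard ANOVA.
```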