PSYB07H3 - Final Exam Prompts
In a chi-squared test, dividing by the expected frequencies standardizes the differences between observed and expected values. This accounts for the size of the expected frequencies, ensuring the test statistic isn't biased by large or small expected counts. Without this division, differences in categories with higher expected frequencies would disproportionately influence the test statistic.
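To make the standardization concrete, here is a minimal sketch using scipy (the observed and expected counts are made up for illustration); the statistic is the sum of (observed − expected)² / expected over all categories:

```python
from scipy import stats

# Hypothetical observed counts for four categories, with expected counts
# under a null hypothesis of equal proportions.
observed = [30, 20, 25, 25]
expected = [25, 25, 25, 25]

# Each squared difference is divided by its expected count before summing,
# so a category's contribution is scaled to the size of its expected count.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# scipy computes the same statistic plus a p-value (df = k - 1 = 3 here).
stat, p = stats.chisquare(f_obs=observed, f_exp=expected)
print(chi_sq, stat, p)  # chi_sq and stat are both 2.0
```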
Chi-squared critical values grow with degrees of freedom because the chi-squared distribution becomes more spread out as the number of categories increases. Higher degrees of freedom mean there are more independent comparisons, so the threshold for significance must increase to maintain the same significance level (e.g., α = 0.05).
In contrast, for t- and F-distributions, critical values decrease with larger sample sizes (or degrees of freedom) because the variance estimates become more precise and the tails of the distributions shrink; the t-distribution, in particular, approaches the standard normal.
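This pattern can be checked directly with the quantile (inverse-CDF) functions in scipy; a small sketch with arbitrary degrees of freedom:

```python
from scipy import stats

alpha = 0.05

# Chi-squared critical values grow as df increases.
for df in (1, 5, 10, 30):
    print("chi2 df =", df, "critical =", round(stats.chi2.ppf(1 - alpha, df), 2))

# Two-tailed t critical values shrink toward the normal value of about 1.96.
for df in (5, 10, 30, 1000):
    print("t df =", df, "critical =", round(stats.t.ppf(1 - alpha / 2, df), 2))

# F critical values also shrink as the denominator (within-group) df grows.
for df_within in (5, 10, 30, 1000):
    print("F df = (2,", df_within, ") critical =",
          round(stats.f.ppf(1 - alpha, 2, df_within), 2))
```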
t-Test: Outliers can inflate the standard deviation, reducing statistical power and potentially masking true differences. Alternatively, they can create false significance if they heavily skew the mean.
Chi-Squared Test: Outliers are less relevant since chi-squared tests rely on categorical data and frequency counts, but extreme discrepancies in observed vs. expected values may distort the test statistic.
Correlation/Regression: Outliers can strongly affect the slope and correlation coefficient, exaggerating or masking relationships. Residuals will show these deviations.
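To illustrate the correlation/regression point, a minimal sketch with synthetic data showing how a single extreme point can shift Pearson's r and the regression slope:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Roughly linear data with a slope of about 2.
x = np.arange(20, dtype=float)
y = 2.0 * x + rng.normal(0, 2, size=20)

r_clean = stats.pearsonr(x, y)[0]
slope_clean = stats.linregress(x, y).slope

# Add one point far below the trend.
x_out = np.append(x, 25.0)
y_out = np.append(y, -40.0)

r_out = stats.pearsonr(x_out, y_out)[0]
slope_out = stats.linregress(x_out, y_out).slope

print(f"r without outlier: {r_clean:.2f}, with outlier: {r_out:.2f}")
print(f"slope without outlier: {slope_clean:.2f}, with outlier: {slope_out:.2f}")
```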
Overgeneralization occurs when a regression model is used to predict values beyond the range of the observed data (extrapolation). The model assumes the same linear relationship holds outside the data range, which can lead to inaccurate predictions if the actual relationship changes or becomes non-linear.
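A small sketch of that risk, using synthetic data whose true relationship flattens outside the observed range:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# The true relationship is logarithmic, but over the narrow observed range
# (x from 1 to 10) it looks nearly linear.
x = np.linspace(1, 10, 30)
y = 10 * np.log(x) + rng.normal(0, 1, size=x.size)

fit = stats.linregress(x, y)

# Prediction inside the observed range is close to the true curve...
x_in = 8.0
print(fit.intercept + fit.slope * x_in, 10 * np.log(x_in))

# ...but extrapolating far beyond the data badly overshoots it.
x_out = 100.0
print(fit.intercept + fit.slope * x_out, 10 * np.log(x_out))
```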
Testing for linearity ensures that the assumption of a linear relationship between variables is valid. If the relationship is non-linear, the correlation coefficient (Pearson's r) or regression model may misrepresent the data.
Residuals: These are the differences between observed and predicted values. Examining residual plots helps identify non-linearity, as non-random patterns in residuals indicate violations of the linearity assumption.
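Here is a sketch of the residual check with synthetic data, where a curved (quadratic) relationship is fitted with a straight line:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# A clearly non-linear (quadratic) relationship.
x = np.linspace(0, 10, 50)
y = x ** 2 + rng.normal(0, 3, size=x.size)

# Fit a straight line anyway and compute residuals = observed - predicted.
fit = stats.linregress(x, y)
predicted = fit.intercept + fit.slope * x
residuals = y - predicted

# The residuals are positive at the ends and negative in the middle;
# plotting them against x would show a U-shape, a non-random pattern
# that flags a violation of the linearity assumption.
print(residuals[:5].round(1))     # start of the range
print(residuals[22:27].round(1))  # middle of the range
```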
Similarities: Both measure effect sizes, providing standardized metrics to quantify the strength of relationships or differences.
Differences:
Pearson's r: Measures the strength and direction of a linear relationship between two continuous variables (−1 ≤ r ≤ 1).
Cohen's d: Measures the standardized mean difference between two groups, focusing on magnitude rather than relationship.
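Both measures can be computed side by side; a minimal sketch with synthetic data (the variable names and values are made up for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Pearson's r: relationship between two continuous variables.
hours = rng.normal(5, 1.5, size=40)
score = 60 + 4 * hours + rng.normal(0, 5, size=40)
r = stats.pearsonr(hours, score)[0]

# Cohen's d: standardized difference between two group means,
# using the pooled standard deviation (equal group sizes here).
group_a = rng.normal(70, 10, size=40)
group_b = rng.normal(75, 10, size=40)
pooled_sd = np.sqrt((np.var(group_a, ddof=1) + np.var(group_b, ddof=1)) / 2)
d = (np.mean(group_b) - np.mean(group_a)) / pooled_sd

print(f"Pearson's r = {r:.2f} (strength of a linear relationship)")
print(f"Cohen's d = {d:.2f} (size of a mean difference in SD units)")
```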
Similarities: All three are sampling distributions used in hypothesis testing and depend on degrees of freedom.
Differences:
Chi-squared: Used for categorical data, always positive, and asymmetric.
t: Symmetric and used for comparing means in small samples.
F: Asymmetric, used in variance analysis, and calculated as the ratio of two variances.
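These shape properties can be verified numerically; a small sketch using scipy's distribution functions with arbitrary degrees of freedom:

```python
from scipy import stats

# Chi-squared and F have no mass below zero (always positive)...
print(stats.chi2.pdf(-1, df=5))        # 0.0
print(stats.f.pdf(-1, dfn=3, dfd=20))  # 0.0

# ...and are right-skewed: the mean sits above the median.
print(stats.chi2.mean(df=5), stats.chi2.median(df=5))

# The t-distribution is symmetric around zero: equal density at +2 and -2.
print(stats.t.pdf(2, df=10), stats.t.pdf(-2, df=10))
```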
Independent samples t-test: Compares the means of two groups.
ANOVA: Compares the means of three or more groups. While the t-test is limited to two groups, ANOVA generalizes to multiple groups by analyzing variance.
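A sketch of the connection with synthetic data: with exactly two groups the ANOVA F-statistic equals the squared t-statistic, and with three or more groups only the ANOVA applies:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
g1 = rng.normal(50, 10, size=30)
g2 = rng.normal(55, 10, size=30)
g3 = rng.normal(60, 10, size=30)

# Two groups: independent-samples t-test (equal variances assumed).
t_stat, t_p = stats.ttest_ind(g1, g2)

# The same two groups through ANOVA give F = t squared and the same p-value.
f_two, p_two = stats.f_oneway(g1, g2)
print(round(t_stat ** 2, 3), round(f_two, 3))

# Three groups: ANOVA generalizes where a single t-test cannot.
f_three, p_three = stats.f_oneway(g1, g2, g3)
print(round(f_three, 3), round(p_three, 4))
```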
Under the null hypothesis, the between-group variance (systematic variance) and the within-group variance (random error) both estimate the same error variance. Since F = between-group variance / within-group variance, the F-ratio is expected to be approximately 1.
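A quick simulation makes this visible (a sketch; the group sizes and population parameters are arbitrary):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)

# Simulate many experiments in which the null hypothesis is true:
# three groups drawn from the same population.
f_values = []
for _ in range(5000):
    groups = [rng.normal(100, 15, size=20) for _ in range(3)]
    f_values.append(stats.f_oneway(*groups)[0])

# The average F-ratio is close to 1, because the between-group and
# within-group variances are both estimating the same error variance.
print(round(np.mean(f_values), 2))
```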
Homogeneity of variances ensures that the groups being compared have similar variability. Violations can lead to misleading F-ratios, as the test assumes equal variance to partition variance correctly between and within groups.
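In practice this assumption is often checked before interpreting the ANOVA; one common choice is Levene's test, sketched here with synthetic data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

# Three groups with equal means but one with a much larger spread.
g1 = rng.normal(50, 5, size=30)
g2 = rng.normal(50, 5, size=30)
g3 = rng.normal(50, 20, size=30)

# Levene's test: the null hypothesis is that all group variances are equal,
# so a small p-value signals a violation of homogeneity of variances.
stat, p = stats.levene(g1, g2, g3)
print(round(stat, 2), round(p, 4))
```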
Between-group variance: Based on the variability of group means relative to the grand mean. It reflects error variance plus any treatment effect, so it is inflated (biased upward) if the null hypothesis is false.
Within-group variance: Based on the variability of individual scores within each group. Generally unbiased.
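The two estimates can be computed by hand and checked against scipy's one-way ANOVA; a sketch with three small made-up groups:

```python
import numpy as np
from scipy import stats

groups = [np.array([4.0, 5.0, 6.0, 5.0]),
          np.array([7.0, 8.0, 6.0, 7.0]),
          np.array([9.0, 10.0, 11.0, 10.0])]

k = len(groups)      # number of groups
n = groups[0].size   # observations per group (equal here)
grand_mean = np.mean(np.concatenate(groups))

# Between-group variance (MS_between): variability of group means
# around the grand mean, weighted by group size.
ss_between = sum(n * (g.mean() - grand_mean) ** 2 for g in groups)
ms_between = ss_between / (k - 1)

# Within-group variance (MS_within): pooled variability of scores
# around their own group means.
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
ms_within = ss_within / (k * n - k)

f_manual = ms_between / ms_within
f_scipy = stats.f_oneway(*groups)[0]
print(round(f_manual, 3), round(f_scipy, 3))  # the two values match
```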
ANOVAs are inherently two-tailed because they test for any difference among group means, regardless of direction. The test statistic only considers variance, not the sign of differences.
Parametric tests: Assume normal distribution and specific conditions (e.g., homogeneity of variances, interval/ratio data). Appropriate when assumptions are met.
Non-parametric tests: Do not assume normality and are suitable for ordinal data or when assumptions of parametric tests are violated.
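As a concrete contrast, the same two samples can be compared with a parametric t-test and its rank-based non-parametric counterpart, the Mann-Whitney U test (a sketch; the skewed data are synthetic):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)

# Heavily skewed data (e.g., reaction times), where normality is doubtful.
group_a = rng.exponential(scale=1.0, size=25)
group_b = rng.exponential(scale=1.5, size=25)

# Parametric: independent-samples t-test (assumes normality, equal variances).
t_stat, t_p = stats.ttest_ind(group_a, group_b)

# Non-parametric: Mann-Whitney U test, based on ranks, no normality assumption.
u_stat, u_p = stats.mannwhitneyu(group_a, group_b)

print(f"t-test: t = {t_stat:.2f}, p = {t_p:.3f}")
print(f"Mann-Whitney: U = {u_stat:.1f}, p = {u_p:.3f}")
```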