Between groups
Within groups
Main effects of each factor
Interaction effect between factors
Error variance
Represents the total variance in the data, encompassing all sources of variation (between and within groups).
Larger SS = higher variance
Larger difference between groups = more likely a significant difference will be found
Levene's test is used to assess equality of variances among groups.
If the test is significant, variances are different across groups
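A minimal sketch of checking homogeneity of variance with SciPy, using hypothetical group data:

```python
from scipy import stats

# hypothetical scores for three independent groups
g1 = [4, 5, 6, 5, 7]
g2 = [7, 8, 9, 8, 10]
g3 = [4, 6, 5, 5, 6]

stat, p = stats.levene(g1, g2, g3)  # default center='median'
print(p)  # p < .05 -> variances differ; homogeneity assumption violated
```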
df (between groups) = number of groups - 1
F statistic = mean square between groups / mean square within groups (each mean square = SS / df)
Larger F ratio = smaller p value = reject H0
Smaller F ratio = larger p value = fail to reject H0
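A one-way ANOVA sketch on the same hypothetical groups, showing the F ratio and p value with scipy.stats.f_oneway:

```python
from scipy import stats

g1 = [4, 5, 6, 5, 7]
g2 = [7, 8, 9, 8, 10]
g3 = [4, 6, 5, 5, 6]

F, p = stats.f_oneway(g1, g2, g3)  # F = MS_between / MS_within
print(F, p)  # large F -> small p -> reject H0 of equal means
```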
What is the main effect of variable A?
What is the main effect of variable B?
What are the interactions between variables A and B?
Non-parallel lines indicate interaction; the effect of one factor differs at levels of the other.
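A sketch of a two-way (factorial) ANOVA with statsmodels, using a hypothetical 2x2 data set; the C(A):C(B) row of the table tests the interaction:

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# hypothetical 2x2 factorial design, two observations per cell
df = pd.DataFrame({
    "A": ["a1", "a1", "a1", "a1", "a2", "a2", "a2", "a2"],
    "B": ["b1", "b1", "b2", "b2", "b1", "b1", "b2", "b2"],
    "score": [5, 6, 9, 10, 7, 8, 6, 5],
})

model = ols("score ~ C(A) * C(B)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))  # main effects of A and B, plus the A:B interaction
```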
To determine which specific group means are significantly different when ANOVA shows significant results.
Per Comparison Error Rate: Probability of a Type I error for each individual test.
Familywise Error Rate: Probability of at least one Type I error across a set of comparisons; it increases as the number of comparisons increases.
Post Hoc Comparisons: Conducted following a significant ANOVA
Planned Comparisons: Made based on hypotheses established prior to data collection, whether or not the ANOVA was significant
The smallest difference between group means that can be considered statistically significant.
Most Conservative Test: Scheffé's test.
Most Liberal Test: Fisher's Least Significant Difference (LSD) test.
A more liberal test will find more significant differences than a more conservative test for the same data.
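One common post hoc option is Tukey's HSD; a minimal sketch with SciPy on hypothetical groups, assuming a significant omnibus ANOVA:

```python
from scipy import stats

g1 = [4, 5, 6, 5, 7]
g2 = [7, 8, 9, 8, 10]
g3 = [4, 6, 5, 5, 6]

res = stats.tukey_hsd(g1, g2, g3)
print(res)  # pairwise mean differences with familywise-adjusted p values
```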
A separate analysis of each row or column within a factorial design.
Score Ranking: List of scores to rank: 7, 9, 10, 10, 14, 15, 15, 15, 20, 25, 36, 41, 43, 43, 50. Tied scores receive the average of the ranks they would occupy, as in the sketch below.
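A ranking sketch with SciPy; rankdata assigns tied scores the average rank by default:

```python
from scipy import stats

scores = [7, 9, 10, 10, 14, 15, 15, 15, 20, 25, 36, 41, 43, 43, 50]
print(stats.rankdata(scores))
# [ 1.   2.   3.5  3.5  5.   7.   7.   7.   9.  10.  11.  12.  13.5 13.5 15. ]
```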
Unpaired t-test: Mann-Whitney U Test.
Paired t-test: Wilcoxon Signed-Rank Test.
One-Way ANOVA: Kruskal-Wallis ANOVA by ranks
One-Way Repeated Measures ANOVA: Friedman two-way ANOVA by ranks
Non-normal data distribution, ordinal data, or unequal variances
Tests for Normal Distribution: Shapiro-Wilk test, Kolmogorov-Smirnov test.
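A sketch of the corresponding SciPy calls on hypothetical paired samples; check normality first, then fall back to the rank-based test if it fails:

```python
from scipy import stats

x = [4, 6, 5, 7, 6, 9]
y = [7, 8, 9, 8, 10, 12]

print(stats.shapiro(x))          # normality check (also: stats.kstest(x, 'norm'))
print(stats.mannwhitneyu(x, y))  # nonparametric unpaired t-test
print(stats.wilcoxon(x, y))      # nonparametric paired t-test
print(stats.kruskal(x, y))       # nonparametric one-way ANOVA
print(stats.friedmanchisquare(x, y, [1, 2, 3, 4, 5, 6]))  # nonparametric RM ANOVA
```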
Purpose: To evaluate if there are significant differences between observed and expected frequencies in categorical data.
Observations must be independent.
Expected frequencies must be sufficiently large (typically at least 5).
To see if observed categorical data fits a particular distribution.
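A goodness-of-fit sketch with SciPy, comparing hypothetical observed counts to expected counts under an even split:

```python
from scipy import stats

observed = [30, 14, 16]  # hypothetical category counts
expected = [20, 20, 20]  # even split across 3 categories

chi2, p = stats.chisquare(observed, f_exp=expected)
print(chi2, p)  # df = number of categories - 1 = 2
```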
Definition: The difference between observed and expected frequencies, standardized to evaluate significance.
Larger absolute value (roughly 2.0 or greater) = greater contribution to the chi-square statistic
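The standardized residual for each cell can be computed directly, as in this sketch using the same hypothetical counts:

```python
import numpy as np

observed = np.array([30, 14, 16])
expected = np.array([20, 20, 20])

std_resid = (observed - expected) / np.sqrt(expected)
print(std_resid)  # |value| >= ~2.0 flags cells driving the chi-square
```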
Definition: Examines if two categorical variables are independent.
Table Used: Contingency table.
df = (Rows - 1)(Columns - 1)
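A test-of-independence sketch with SciPy on a hypothetical 2x2 contingency table:

```python
from scipy import stats

table = [[20, 30],   # rows = categories of variable 1
         [40, 10]]   # columns = categories of variable 2

chi2, p, dof, expected = stats.chi2_contingency(table)
print(chi2, p, dof)  # dof = (2 - 1)(2 - 1) = 1
```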
Points clustered around a straight line indicate a strong correlation, while widely dispersed points indicate a weak correlation.
Range: -1 to 1.
Interpretation of -.89: Strong negative correlation; as one variable increases, the other decreases.
The correlation coefficient, r
Spearman's rank correlation and the Kendall tau.
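A sketch of the parametric and rank-based correlation coefficients in SciPy, using hypothetical paired data:

```python
from scipy import stats

x = [1, 2, 3, 4, 5, 6]
y = [2, 1, 4, 3, 7, 8]

print(stats.pearsonr(x, y))    # Pearson r (interval/ratio data)
print(stats.spearmanr(x, y))   # Spearman rank correlation (ordinal/non-normal)
print(stats.kendalltau(x, y))  # Kendall tau (ordinal/non-normal)
```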
Purpose: To predict the value of a dependent variable based on one or more independent variables.
r² : the fraction of variance in the dependent variable explained by the independent variable(s).
Equation: Y = a + bX.
Where:
Y: Predicted value (dependent variable)
a: Y-intercept (regression constant)
b: Slope (regression coefficient)
X: Independent variable value
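A simple linear regression sketch with SciPy, recovering a, b, and r² for hypothetical data:

```python
from scipy import stats

x = [1, 2, 3, 4, 5, 6]
y = [2, 1, 4, 3, 7, 8]

fit = stats.linregress(x, y)
print(fit.intercept, fit.slope)       # a and b in Y = a + bX
print(fit.rvalue ** 2)                # r-squared: variance in Y explained by X
print(fit.intercept + fit.slope * 7)  # predicted Y for X = 7
```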
It is the line of best fit to the data, i.e., the line with the smallest residuals (errors in predicting Y); it represents the average relationship between the independent and dependent variables.
Definition: Difference between observed values and predicted values.
Value Used: R² (coefficient of determination).
Purpose: Indicate the strength and direction of the effect of each predictor on the dependent variable.
Collinearity occurs when the independent (predictor) variables are correlated with each other. Some predictors may then appear less important to the prediction only because they are correlated with another variable, making them redundant in explaining variance.
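Collinearity is often screened with variance inflation factors (VIF); a sketch with statsmodels, using hypothetical predictors where x1 and x2 are nearly redundant:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

x1 = np.array([1.0, 2, 3, 4, 5, 6, 7, 8])
x2 = x1 * 2 + np.array([0.1, -0.2, 0.0, 0.3, -0.1, 0.2, 0.0, -0.3])  # nearly redundant with x1
x3 = np.array([5.0, 3, 8, 1, 9, 2, 7, 4])  # unrelated predictor

X = sm.add_constant(np.column_stack([x1, x2, x3]))
for i in range(1, X.shape[1]):  # skip the constant column
    print(variance_inflation_factor(X, i))  # large VIF (e.g., >> 10) flags collinearity
```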
Definition: A binary variable representing categorical data; typically coded as 0 or 1.
Both use independent (X) variables to explain Y, but in logistic regression Y is a dichotomous categorical variable representing the presence or absence of a condition, or group membership. Linear regression produces a predicted value; logistic regression produces a probability related to the likelihood of an individual belonging to one of the outcome categories. Logistic regression also provides estimates of odds ratios for each of the independent variables.
The outcome is coded 0 for the reference group and 1 for the target group; the target group typically represents the group with the adverse outcome.
The ratio of the odds of an event occurring in one group to the odds of it occurring in another group.
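A logistic regression sketch with statsmodels on hypothetical data; exponentiating the coefficients gives the odds ratios:

```python
import numpy as np
import statsmodels.api as sm

x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
y = np.array([0, 0, 1, 0, 1, 0, 1, 1, 1, 1])  # 0 = reference group, 1 = target group

X = sm.add_constant(x)
fit = sm.Logit(y, X).fit(disp=0)
print(fit.predict(X))      # predicted probability of membership in the target group
print(np.exp(fit.params))  # odds ratios for the intercept and predictor
```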