Choosing the Suitable Statistical Test

Non-Experimental (Observational) vs. Experimental Studies

  • This lecture covers choosing the appropriate statistical test for data analysis.

  • Non-experimental (Observational) studies are contrasted with experimental studies.

Steps of Statistical Test Selection

  • After descriptive statistics, the process moves to analytical statistics.

  • A key analytical skill is selecting the correct statistical test.

  • The goal is to answer the research question and decide whether to reject or fail to reject the null hypothesis.

Research Question and Hypothesis Testing

  • Research starts with an idea, leading to a research question.

  • A null hypothesis and an alternative hypothesis are formulated.

  • Sample data is analyzed to obtain a p-value.

  • If the p-value is less than a predefined alpha (α), the null hypothesis is rejected; otherwise, we fail to reject the null hypothesis.

  • Choosing the proper statistical test is crucial in this process: a test whose assumptions do not fit the data can give a misleading p-value.
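The decision rule above can be sketched in a few lines of Python. This is a minimal illustration, not part of the lecture: the function name `decide` and the default alpha of 0.05 are assumptions chosen for the example.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Apply the decision rule: reject H0 only when p is strictly below alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    return "reject H0" if p_value < alpha else "fail to reject H0"


# Hypothetical p-values, used only to show both outcomes of the rule.
print(decide(0.03))
print(decide(0.20))
```

Note that alpha is fixed before the data are analyzed, and that a p-value equal to alpha does not lead to rejection under the "less than" rule stated above.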

Five Steps for Choosing a Statistical Test

1. Bivariate vs. Multivariable Analysis

  • Question 1: Is it a bivariate or multivariable analysis?

    • Bivariate analysis: studies the relationship between two variables.

    • Examples:

      • Age and height.

      • Type of treatment and complication.

      • Sex and smoking.

      • Smoking and coffee consumption.

    • Multivariable analysis (regression modeling/analysis): studies the effect of multiple variables on an outcome variable.

    • Examples:

      • Effect of smoking, sex, coffee consumption on blood pressure.

      • Effect of smoking, sex, coffee consumption on having a heart attack.

    • Note: Regression can be used for bivariate analysis if examining the effect of only one variable on the outcome.
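To illustrate the note that regression with a single predictor is still a bivariate analysis, here is a minimal least-squares sketch in plain Python. The function name `simple_linear_regression` and the data values are hypothetical and serve only as an example.

```python
def simple_linear_regression(x, y):
    """Least-squares fit of y = intercept + slope * x for one predictor."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: cups of coffee per day and systolic blood pressure (mmHg).
coffee_cups = [0, 1, 2, 3, 4]
systolic_bp = [118, 121, 123, 126, 130]

slope, intercept = simple_linear_regression(coffee_cups, systolic_bp)
print(f"slope = {slope:.2f} mmHg per cup, intercept = {intercept:.1f} mmHg")
```

Adding further predictors (for example sex and smoking) would turn this into the multivariable case, which in practice is fitted with a statistics package rather than by hand.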

2. Difference vs. Correlation (Bivariate Analysis)

  • Question 2: Are we studying a difference or a correlation (if bivariate)?

    • Difference: studying the difference between two or more groups or conditions.

    • Examples:

      • The difference between males and females regarding coffee consumption.

      • The difference in body weight before and after being on a specific diet.

    • Correlation: studying the association between two variables.

    • Examples:

      • The association between age and weight.

      • The association between coffee consumption and the number of sleeping hours.
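As a sketch of what "association between two variables" means numerically, the snippet below computes Pearson's correlation coefficient by hand. The helper name `pearson_r` and the age and weight values are assumptions made up for this illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical data: age (years) and weight (kg) for five people.
ages = [20, 30, 40, 50, 60]
weights = [60, 64, 70, 73, 78]

r = pearson_r(ages, weights)
print(f"r = {r:.3f}")
```

A value near +1 or -1 indicates a strong linear association and a value near 0 indicates little linear association. A difference question, by contrast, compares the level of one variable across groups or conditions.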

3. Independent vs. Paired Data (Bivariate Analysis)

  • Question 3: Are we working with independent or paired data (if bivariate)?

    • Independent (unpaired) data: observations in each sample are unrelated.

    • No relationship between subjects in each sample.

    • Subjects in one group cannot be in the other group.

    • No subject/group can influence the other.

    • Dependent (paired) data: observations in one sample are linked to observations in the other. Paired samples include:

    • Pre-test/post-test samples (a variable measured before and after an intervention).

    • Cross-over trials.

    • Matched samples.

    • When a variable is measured twice or more on the same individual.
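The following sketch shows why the independent/paired distinction matters: the same numbers give very different test statistics depending on whether the pairing is used. The function names and the before/after weights are hypothetical; only the t statistics are computed, not p-values.

```python
from math import sqrt
from statistics import mean, stdev, variance


def paired_t(before, after):
    """t statistic for paired data, based on within-subject differences."""
    diffs = [a - b for b, a in zip(before, after)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


def independent_t(group1, group2):
    """Student's t statistic for two independent groups (pooled variance)."""
    n1, n2 = len(group1), len(group2)
    pooled = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    return (mean(group2) - mean(group1)) / sqrt(pooled * (1 / n1 + 1 / n2))


# Hypothetical body weights (kg) of the same five people before and after a diet.
before = [80, 92, 75, 101, 68]
after = [78, 89, 74, 97, 67]

t_paired = paired_t(before, after)
t_independent = independent_t(before, after)
print(f"paired t = {t_paired:.2f}, independent t = {t_independent:.2f}")
```

Because every person lost a little weight, the paired statistic is large in magnitude, while treating the two columns as unrelated groups hides the effect behind between-person variability. In practice a library would normally be used, for example `scipy.stats.ttest_rel` and `scipy.stats.ttest_ind` in SciPy.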

4. Type of Outcome and Normality of Distribution

  • Question 4: Identify the types of data variables being studied.

    • The type of data variable is crucial for choosing the suitable test.

    • Types of Data:

    • Categorical: has no unit of measurement.

      • Nominal: No order (e.g., colors, types of treatment).

      • Ordinal: Ordered (e.g., pain scale, satisfaction levels).

    • Numerical: has a unit of measurement.

      • Discrete: Counted/integer (e.g., number of children).

      • Continuous: Measured/decimals (e.g., height, weight).

        • Time to event data (survival)

    • Normality of Distribution:

    • Before applying tests that assume normality (e.g., t-test, ANOVA, Pearson's correlation), check whether the numeric variable is normally distributed.

    • A histogram can visually represent the distribution.
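A histogram can be approximated even without a plotting library. The sketch below bins a numeric variable into equal-width intervals and prints a text bar for each bin. The helper name `bin_counts`, the choice of five bins, and the height values are assumptions for illustration.

```python
def bin_counts(values, bins=5):
    """Return (lower_edge, count) pairs for equal-width bins over the data range."""
    if not values or bins < 1:
        raise ValueError("need at least one value and at least one bin")
    lo, hi = min(values), max(values)
    # Fall back to a width of 1.0 when all values are identical.
    width = (hi - lo) / bins or 1.0
    counts = [0] * bins
    for v in values:
        # The maximum value is placed in the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return [(lo + i * width, c) for i, c in enumerate(counts)]


# Hypothetical heights (cm) of 15 people.
heights = [150, 155, 158, 160, 161, 162, 163, 165, 165, 166, 168, 170, 172, 175, 180]

histogram = bin_counts(heights)
for lower_edge, count in histogram:
    print(f"{lower_edge:6.1f} | {'#' * count}")
```

A roughly symmetric, single-peaked shape is compatible with a normal distribution. A strongly skewed shape suggests using the tests listed for non-normal data.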

5. Number of Groups/Conditions

  • Question 5: Are we comparing two groups/conditions or more than two?

    • Examples:

    • Comparing two groups: diseased vs. not diseased.

    • Comparing three groups: normal, osteopenia, osteoporosis.

    • Comparing two conditions: pre-test vs. post-test.

    • Comparing three conditions: before, during, after the operation.

Guide for Choosing Common Statistical Tests

  • A table provides guidance based on the answers to the five questions.

    • Q1: Bivariate/Multivariable.

    • Q2: Difference/Correlation.

    • Q3: Independent/Paired.

    • Q4: Type of outcome (and Normality).

    • Q5: No. of groups (conditions).

  • Statistical Tests based on the above questions:

    • Difference, independent (unpaired):

      • Continuous (Normal), 2 groups: Student's t-test

      • Continuous (Normal), >2 groups: One-way ANOVA

      • Continuous (Non-normal)/Ordinal, 2 groups: Mann-Whitney U test

      • Continuous (Non-normal)/Ordinal, >2 groups: Kruskal-Wallis H test

      • Nominal, 2 groups: Chi-square test/Fisher's exact test

      • Nominal, >2 groups: Chi-square test

      • Time to event (survival): Log-Rank test (Kaplan-Meier plot)

    • Difference, dependent (paired):

    • Continuous (Normal), 2 groups: Paired t-test

    • Continuous (Normal), >2 groups: Repeated-measures ANOVA

    • Continuous (Non-normal)/Ordinal, 2 groups: Wilcoxon signed-rank test

    • Continuous (Non-normal)/Ordinal, >2 groups: Friedman test

    • Nominal, 2 groups: McNemar's test

    • Correlation:

    • Continuous (Normal): Pearson's correlation (r)

    • Continuous (Non-normal)/Ordinal: Spearman's correlation

    • Nominal (2 levels): Spearman/Kappa (Agreement)

    • Multivariable:

    • Continuous: Linear Regression

    • Ordinal: Ordered Logistic Regression

    • Nominal (2 levels): Binary Logistic Regression

    • Nominal (>2 levels): Multinomial Logistic Regression

    • Time to event (survival): Cox Regression

    • Count variable: Poisson regression
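The guide above can be sketched as a small lookup function that takes the answers to the five questions and returns the suggested test. This is an illustrative sketch only: the function name `choose_test`, its parameter names, and the string codes for outcome types are assumptions, and it covers the difference, numeric/ordinal correlation, and multivariable rows of the guide.

```python
def choose_test(analysis, outcome, goal="difference", paired=False,
                normal=True, groups=2, levels=2):
    """Return the test the guide suggests for one combination of answers.

    analysis: "bivariate" or "multivariable"                          (Q1)
    goal:     "difference" or "correlation"                           (Q2)
    paired:   True for dependent (paired) data                        (Q3)
    outcome:  "continuous", "ordinal", "nominal", "time_to_event", "count"
    normal:   whether a continuous outcome is normally distributed    (Q4)
    groups:   number of groups or conditions compared                 (Q5)
    levels:   number of categories of a nominal outcome (multivariable only)
    """
    if analysis == "multivariable":
        if outcome == "nominal":
            return ("Binary logistic regression" if levels == 2
                    else "Multinomial logistic regression")
        regression = {
            "continuous": "Linear regression",
            "ordinal": "Ordered logistic regression",
            "time_to_event": "Cox regression",
            "count": "Poisson regression",
        }
        if outcome in regression:
            return regression[outcome]
        raise ValueError(f"no multivariable entry for outcome {outcome!r}")
    if analysis != "bivariate":
        raise ValueError("analysis must be 'bivariate' or 'multivariable'")

    # Q4: ordinal and non-normal continuous outcomes share the rank-based tests.
    if outcome == "continuous" and normal:
        kind = "parametric"
    elif outcome in ("continuous", "ordinal"):
        kind = "rank"
    else:
        kind = outcome

    if goal == "correlation":
        table = {"parametric": "Pearson's correlation",
                 "rank": "Spearman's correlation"}
        key = kind
    elif goal == "difference":
        if groups < 2:
            raise ValueError("a difference needs at least two groups or conditions")
        # Keys are (kind, paired, more_than_two_groups).
        table = {
            ("parametric", False, False): "Student's t-test",
            ("parametric", False, True): "One-way ANOVA",
            ("rank", False, False): "Mann-Whitney U test",
            ("rank", False, True): "Kruskal-Wallis H test",
            ("nominal", False, False): "Chi-square test / Fisher's exact test",
            ("nominal", False, True): "Chi-square test",
            ("time_to_event", False, False): "Log-rank test",
            ("time_to_event", False, True): "Log-rank test",
            ("parametric", True, False): "Paired t-test",
            ("parametric", True, True): "Repeated-measures ANOVA",
            ("rank", True, False): "Wilcoxon signed-rank test",
            ("rank", True, True): "Friedman test",
            ("nominal", True, False): "McNemar's test",
        }
        key = (kind, paired, groups > 2)
    else:
        raise ValueError("goal must be 'difference' or 'correlation'")

    if key not in table:
        raise ValueError("this combination is not covered by the guide")
    return table[key]


# Example: body weight before and after a diet, normally distributed.
print(choose_test("bivariate", "continuous", goal="difference", paired=True))
```

Combinations that the guide does not list raise a `ValueError` rather than guessing a test.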

Conclusion

  • The lecture concludes with acknowledgments to the source material.