Notes on Statistical Analysis Concepts

Differences vs. Relationships

  • Focus on distinguishing between differences and relationships in statistical analysis.
    • Differences: Comparison between groups, e.g., satisfaction of faculty in different departments.
    • Relationships: Exploring how variables vary together, e.g., the relationship between exam scores and hours spent studying.

Correlation

  • Definition: A statistical procedure that describes the strength and direction of the relationship between two variables (Privitera, 2012).

    • Indicates whether variables change together:
      • Positive Correlation: The variables move in the same direction (both increase or both decrease).
      • Negative Correlation: The variables move in opposite directions (one increases while the other decreases).
  • Uses:

    1. Descriptive: To outline how two variables change together.
    2. Inferential: To determine if observed patterns in a sample are indicative of the population.

Correlation Coefficient

  • Definition: A numerical measure of the relationship between two variables; for most coefficients it ranges from -1 to +1.

  • Uses:

    1. Measures strength and direction of the relationship (Privitera, 2012).
    2. Helps ascertain if sample patterns hold true in the population.
  • Types of Correlation Coefficients:

    • Pearson’s r: For linear relationships.
    • Spearman’s rho: For ranked data.
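    • Point-biserial: For one dichotomous (two-category) variable and one continuous variable.
    • Chi-square: For categorical data. Strictly, chi-square is a test of association rather than a coefficient; the phi coefficient and Cramér's V are derived from it.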
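
As a rough illustration of how the first two coefficients differ, the sketch below computes Spearman's rho as Pearson's r applied to ranks. The helper names (average_ranks, pearson_r, spearman_rho) and the data are made up for this example; only the Python standard library is assumed.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Pearson's r from sums of squares and cross-products."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / math.sqrt(ss_x * ss_y)

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the ranks of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: y always rises with x, but not along a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 100]
rho = spearman_rho(x, y)  # the rank orders agree perfectly
r = pearson_r(x, y)       # the raw scores do not fall on a line
print(f"rho = {rho:.3f}, r = {r:.3f}")
```

Because rho depends only on rank order, it reaches 1 for any strictly increasing relationship, while r falls below 1 unless the points lie on a straight line.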

Important Note on Correlation

  • Correlation does not imply causation: A relationship between variables does not mean that one variable causes the other to change.

Pearson's Product-Moment Correlation Coefficient (r)

  • Definition: A measure of the linear relationship between two interval or ratio scaled variables.
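
One common way to compute the coefficient is to divide the sum of cross-products of deviations by the square root of the product of the two sums of squares. The sketch below follows that formula; the function name and the study-hours data are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson's r for two equal-length sequences of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Numerator: sum of cross-products of deviations from each mean.
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Denominator: sums of squared deviations for each variable.
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable has no variance")
    return sp / math.sqrt(ss_x * ss_y)

# Hypothetical data: hours studied and exam scores for five students.
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 61, 70, 74]
r = pearson_r(hours, scores)
print(f"r = {r:.3f}")  # r = 0.990
```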

Assumptions for Pearson's r

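  1. Linearity: The relationship between the two variables should follow a straight line.
  2. Normality: Each variable should be approximately normally distributed in the population.
  3. Bivariate Normal Distribution: The two variables should be jointly normally distributed, so that scores on one variable are normally distributed at each value of the other.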

Interpreting Pearson's r

  • Sign (direction):
    • Positive (+): Both variables increase/decrease together.
    • Negative (-): One variable increases while the other decreases.
  • Magnitude (strength), using common rule-of-thumb cutoffs:
    • |r| below 0.1: Little or no relationship
    • |r| from 0.1 to below 0.3: Weak relationship
    • |r| from 0.3 to below 0.5: Moderate relationship
    • |r| of 0.5 or above: Strong relationship
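
The sign and magnitude rules above can be sketched as a small lookup. The function name is hypothetical, and the sketch assumes that a value sitting exactly on a cutoff (0.1, 0.3, 0.5) belongs to the higher band, which the notes do not specify.

```python
def describe_r(r):
    """Return (direction, strength) labels for a correlation coefficient."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    size = abs(r)
    # Assumption: each cutoff value is assigned to the higher band.
    if size < 0.1:
        strength = "little or no"
    elif size < 0.3:
        strength = "weak"
    elif size < 0.5:
        strength = "moderate"
    else:
        strength = "strong"
    return direction, strength

print(describe_r(0.62))   # ('positive', 'strong')
print(describe_r(-0.25))  # ('negative', 'weak')
```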

Testing Hypotheses

  1. Null Hypothesis (H0): There is no linear relationship between the variables in the population (ρ = 0).
  2. Alternative Hypothesis (H1): There is a linear relationship between the variables in the population (ρ ≠ 0).
    • Directional: Specifies the nature of the relationship (positive or negative).
    • Non-directional: Simply states there is a relationship without specifying direction.

Sample Collection and Data Analysis

  • Steps include:
    1. Selecting a sample.
    2. Collecting data.
    3. Determining regions of rejection based on significance level.
    4. Calculating test statistics.
    5. Making a statistical decision based on the results.
    6. Interpreting the result, reporting the strength and direction of the correlation, the p-value, and the degrees of freedom (df = n - 2 for Pearson's r).
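
Steps 3 to 5 can be sketched for Pearson's r with the usual t test, where t = r * sqrt((n - 2) / (1 - r^2)) on n - 2 degrees of freedom. The values of r and n below are hypothetical, and the critical value 2.306 (two-tailed, α = .05, df = 8) is taken from a standard t table rather than computed.

```python
import math

def t_statistic_for_r(r, n):
    """Return (t, df) for testing H0: no linear relationship, given r and n."""
    if n < 3:
        raise ValueError("at least 3 pairs of scores are needed")
    if abs(r) >= 1:
        raise ValueError("t is undefined when |r| is 1 or more")
    df = n - 2
    t = r * math.sqrt(df / (1 - r ** 2))
    return t, df

# Hypothetical sample result: r = .70 from n = 10 pairs of scores.
r, n = 0.70, 10
t_critical = 2.306  # table value: two-tailed, alpha = .05, df = 8

t, df = t_statistic_for_r(r, n)
reject_null = abs(t) > t_critical  # statistical decision (step 5)
print(f"t({df}) = {t:.2f}, reject H0: {reject_null}")
```

For a different sample size or significance level, the critical value has to be looked up again for the new df.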

Caution on Pearson's r

  • Non-linear Relationships: Pearson's r captures only linear association. If the relationship is curved, r can be near zero even when the variables are strongly related, so inspect a scatterplot before relying on r.
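
A small made-up example shows the problem: below, y is completely determined by x (y = x squared), yet Pearson's r comes out as zero because the relationship is symmetric rather than linear. The helper function is a plain implementation of the standard formula.

```python
import math

def pearson_r(x, y):
    """Pearson's r from sums of squares and cross-products."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / math.sqrt(ss_x * ss_y)

# Hypothetical data: a perfect U-shaped (quadratic) relationship.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [value ** 2 for value in x]

r = pearson_r(x, y)
print(f"r = {r:.2f}")  # r = 0.00
```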

Errors in Hypothesis Testing

  1. Type I Error (False Positive): Rejecting a true null hypothesis.
    • Alpha (α): The probability of making a Type I Error; the researcher sets it as the significance level.
  2. Type II Error (False Negative): Retaining a false null hypothesis.
    • Beta (β): The probability of making a Type II Error.
  • Power of a Test: The probability of correctly rejecting a false null hypothesis, equal to 1 - β.
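
These probabilities can be estimated by simulation. The sketch below is an illustration under assumed settings, not a prescribed procedure: it draws many samples of n = 10 pairs, tests each Pearson's r at α = .05 using the table value 2.306 for df = 8, and counts how often the null hypothesis is rejected. The function names, the slope of 1.0 used for the "H0 false" case, and the seed are arbitrary choices.

```python
import math
import random

T_CRITICAL = 2.306  # table value: two-tailed, alpha = .05, df = 8 (n = 10 only)

def pearson_r(x, y):
    """Pearson's r from sums of squares and cross-products."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / math.sqrt(ss_x * ss_y)

def rejects_null(x, y):
    """True if the t test on r rejects H0 (valid for n = 10 pairs)."""
    n = len(x)
    r = pearson_r(x, y)
    t = r * math.sqrt((n - 2) / (1 - r ** 2))
    return abs(t) > T_CRITICAL

def rejection_rate(slope, trials=2000, n=10, seed=0):
    """Share of simulated samples in which H0 is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        x = [rng.gauss(0, 1) for _ in range(n)]
        # slope = 0 makes x and y unrelated, so H0 is true.
        y = [slope * a + rng.gauss(0, 1) for a in x]
        rejections += rejects_null(x, y)
    return rejections / trials

type_i_rate = rejection_rate(slope=0.0)  # H0 true: rejections are Type I errors
power = rejection_rate(slope=1.0)        # H0 false: rejections are correct
type_ii_rate = 1 - power                 # H0 false: retentions are Type II errors
print(type_i_rate, power, type_ii_rate)
```

With H0 true, the rejection rate should land close to α; with H0 false, the rejection rate estimates power, and its complement estimates β.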

Effect Size

  • Effect size quantifies the size of the difference or strength of a relationship.
    • Cohen's d: Expresses the difference between two means in standard deviation units.
    • Interpretations (using the absolute value of d):
      • Small effect: |d| < 0.2
      • Medium effect: 0.2 < |d| < 0.8
      • Large effect: |d| > 0.8
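
For two independent groups, d is often computed as the difference between the group means divided by the pooled standard deviation. The sketch below assumes that version of the formula; the function name and the satisfaction ratings are hypothetical.

```python
import math

def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least 2 scores")
    mean_a = sum(group_a) / n_a
    mean_b = sum(group_b) / n_b
    # Sample variances (n - 1 in the denominator).
    var_a = sum((v - mean_a) ** 2 for v in group_a) / (n_a - 1)
    var_b = sum((v - mean_b) ** 2 for v in group_b) / (n_b - 1)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (mean_a - mean_b) / pooled_sd

# Hypothetical satisfaction ratings for faculty in two departments.
department_a = [6, 7, 8, 9, 10]
department_b = [4, 5, 6, 7, 8]
d = cohens_d(department_a, department_b)
print(f"d = {d:.2f}")  # d = 1.26
```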

Confidence Intervals

  • Definition: A range of values, computed from sample data, that is likely to contain the value of an unknown population parameter.
  • Stated with a confidence level (e.g., 95%): across repeated samples, about 95% of intervals constructed this way would contain the parameter.
  • Provides an interval estimate rather than a single point estimate.
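
For a correlation, one widely used way to build such an interval is the Fisher z transformation, which these notes do not cover; it is used here only as an illustration. The sketch converts r to z = atanh(r), builds a normal-theory interval with standard error 1 / sqrt(n - 3), and converts the limits back to the r scale. The function name and the sample values are hypothetical.

```python
import math
from statistics import NormalDist

def r_confidence_interval(r, n, confidence=0.95):
    """Approximate confidence interval for a correlation via Fisher's z."""
    if n < 4:
        raise ValueError("at least 4 pairs of scores are needed")
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and +1")
    z = math.atanh(r)                # Fisher z transformation of r
    se = 1 / math.sqrt(n - 3)        # standard error on the z scale
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    lower = math.tanh(z - z_crit * se)  # back-transform to the r scale
    upper = math.tanh(z + z_crit * se)
    return lower, upper

# Hypothetical sample result: r = .70 from n = 30 pairs of scores.
low, high = r_confidence_interval(0.70, 30)
print(f"95% CI for r: [{low:.2f}, {high:.2f}]")
```

The interval is not symmetric around r because the back-transformation compresses values near +1 and -1.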

Conclusion

  • Correlation, hypothesis testing, and effect size are central to interpreting research findings. Together they indicate whether a result is statistically significant, how strong it is, and what conclusions it supports.