Quantitative Methods: Nonparametric Statistics
Introduction to Nonparametric Statistics
This final set of lectures on quantitative methods in PSY 02/1941 introduces nonparametric statistics, a crucial alternative when the assumptions of traditional statistical tests are violated. Many common statistical tests, known as parametric tests, assume that data are normally distributed. While this is often approximately true for psychological assessments, there are situations where data are not normally distributed, rendering parametric tools less useful or even invalid. In such cases, nonparametric tests become necessary.
Lecture Learning Outcomes
This week's lectures aim to equip students with the ability to:
Differentiate between parametric and nonparametric statistical tests and understand when each is appropriate.
Understand and apply the Spearman correlation analysis, the nonparametric equivalent of the Pearson correlation.
Understand and apply the Wilcoxon signed-rank test, the nonparametric equivalent of the one-sample t-test.
Understand and apply the Wilcoxon rank-sum test, the nonparametric equivalent of the independent-samples t-test.
Overview of Nonparametric Statistics
Null Hypothesis Significance Testing (NHST) and Assumptions
In inferential statistics, Null Hypothesis Significance Testing (NHST) involves assuming the null hypothesis is true and then assessing the probability of observing our sample data under this assumption. If the data are highly surprising (i.e., improbable), we reject the null hypothesis. To correctly determine how "surprised" we should be, we need to know the expected distribution of sample data if the null hypothesis were true. For many common statistical tests, such as Pearson correlation and various t-tests, calculating this expected distribution and, crucially, the p-value, relies on specific assumptions about the data.
Validity of p-values
The p-value, which quantifies our surprise, is only valid if the underlying assumptions of the statistical test are met, or at least not severely violated. If the assumptions are violated, the p-value becomes unreliable: a significant p-value in such a scenario might reflect not a false null hypothesis but rather the violation of an assumption. One of the most frequent assumptions is that the data are normally distributed.
Parametric vs. Nonparametric Tests
Parametric Statistical Tests: These tests assume that data are drawn from a specific probability distribution, typically the normal distribution. The term "parametric" refers to the estimation of parameters of this assumed distribution (e.g., the mean and standard deviation of a normal distribution). The validity of these tests, including the p-values they yield, hinges on these distributional assumptions being met.
Nonparametric Statistical Tests: These tests do not make strong assumptions about the underlying distribution of the data. They are therefore suitable for situations where data are not normally distributed. The term "nonparametric" reflects that they do not involve estimating parameters of a specific fixed distribution. These tests employ alternative methods to calculate p-values, which will be explored in subsequent lectures.
Assessing Data Distribution
Visual Inspection with Histograms
Histograms provide a visual representation of data distribution. A key assumption for parametric tests is that the histogram of the data should approximate a normal, bell-shaped curve.
Example 1: Normally Distributed Data (Extroversion)
Data sampled from participants' extroversion scores often show a rough normal distribution. While not perfectly bell-shaped due to sampling noise (e.g., 100 samples), it exhibits a general bell-like form, with the mean and median roughly centered and a symmetrical spread. This is considered "roughly normally distributed" and generally acceptable for parametric tests.
Example 2: Non-Normally Distributed Data (Anxiety Symptoms - Positive Skew)
Anxiety symptom severity in the general population typically displays a strong positive skew. This means the distribution is asymmetrical, with a peak at lower symptom levels (most people have low or no anxiety) and a long tail extending to the right (few people have severe symptoms). In such cases, the mean and median are pulled towards the lower end, and the data clearly deviate from a bell shape. Using parametric tests like Pearson correlation on such data would violate the normality assumption.
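This mean-median relationship is easy to check numerically. The following is a minimal sketch using simulated data; the exponential distribution here is just a stand-in for positively skewed symptom scores, not actual anxiety data:

```python
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(42)

# Stand-in for anxiety symptom scores: exponential draws are positively skewed
scores = rng.exponential(scale=2.0, size=200)

# With a positive skew, the long right tail pulls the mean above the median
print(f"mean     = {scores.mean():.2f}")
print(f"median   = {np.median(scores):.2f}")
print(f"skewness = {skew(scores):.2f}")  # > 0 indicates positive skew
```

A skewness near 0 would be consistent with symmetry; clearly positive values like these signal the kind of asymmetry that violates the normality assumption.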
Example 3: Non-Normally Distributed Data (Dice Rolls - Uniform, Discrete)
Rolling a six-sided die multiple times (e.g., 200 times) results in a distribution that is non-normal for two reasons:
Uniform Shape: Instead of a central peak, the data are roughly flat, with each outcome (1 to 6) having approximately equal frequency.
Discrete Values: A normal distribution assumes continuous data, meaning values like 2.5 or 4.6 are possible. Dice rolls, however, are discrete (only integers 1-6 are possible), which fundamentally violates the continuous nature of a normal distribution.
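Both properties can be seen by simulating the 200 rolls described above (a sketch with an arbitrary random seed):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate 200 rolls of a fair six-sided die
rolls = rng.integers(low=1, high=7, size=200)

# Tally each face: the counts are roughly flat (uniform), not bell-shaped,
# and only the discrete values 1..6 ever occur
faces, counts = np.unique(rolls, return_counts=True)
for face, count in zip(faces, counts):
    print(f"face {face}: {count} rolls")
```

No value like 2.5 can ever appear in the tally, and no face dominates the others, which is exactly why a normal curve is a poor model here.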
QQ Plots (Quantile-Quantile Plots)
A QQ plot is a more formal tool for assessing normality than visual inspection of a histogram alone.
Explanation of a QQ Plot
A QQ plot graphically compares the quantiles of your sample data against the quantiles of a theoretical normal distribution. It is a "quantile-quantile" plot.
The y-axis plots the actual observed data quantiles (e.g., observed extroversion scores).
The x-axis plots the theoretical quantiles (z-scores) we would expect if the data were perfectly normally distributed. Zero on the x-axis represents the mean, and ±1 represents ±1 standard deviation from the mean of this theoretical distribution.
Interpreting QQ Plots
Normally Distributed Data: If the sample data are normally distributed, the points on the QQ plot will fall approximately along a diagonal line. This indicates a good match between the observed data distribution and the theoretical normal distribution.
Non-Normally Distributed Data (Anxiety Symptoms): For positively skewed data like anxiety symptoms, the QQ plot will show significant deviations from the diagonal line. The data points will initially be flatter than the diagonal (indicating many low scores) and then curve upwards sharply at the high end, showing that extreme scores are more frequent than a normal distribution would predict.
Non-Normally Distributed Data (Dice Rolls): For discrete, uniformly distributed data like dice rolls, the QQ plot will show distinct horizontal clusters of points, particularly at the ends (for values like 1 and 6), rather than a continuous line. This stair-step pattern highlights the discrete nature of the data, which a continuous normal distribution cannot capture well.
Rule of Thumb for QQ Plots
Some researchers suggest calculating the correlation between the theoretical quantiles and the actual sample quantiles. If this correlation is above 0.95, it can be a rough indicator that the data are sufficiently normally distributed for most practical purposes.
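This correlation is straightforward to compute with SciPy's probplot, which returns the theoretical and ordered sample quantiles along with the correlation between them. The sketch below uses simulated stand-ins (normal "extroversion-like" scores and exponential "symptom-like" scores), not real data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

normal_data = rng.normal(loc=50, scale=10, size=100)  # roughly normal scores
skewed_data = rng.exponential(scale=2.0, size=100)    # positively skewed scores

# probplot returns ((theoretical quantiles, ordered sample values),
#                   (slope, intercept, r)); r is the QQ correlation
# underlying the "above 0.95" rule of thumb
_, (_, _, r_normal) = stats.probplot(normal_data, dist="norm")
_, (_, _, r_skewed) = stats.probplot(skewed_data, dist="norm")

print(f"QQ correlation, normal data: {r_normal:.3f}")
print(f"QQ correlation, skewed data: {r_skewed:.3f}")
```

The normal sample's points hug the diagonal, so its QQ correlation sits close to 1; the skewed sample's curvature away from the diagonal lowers its correlation.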
Shapiro-Wilk Test
The Shapiro-Wilk test is a formal statistical test used to determine if a sample comes from a normally distributed population.
Null Hypothesis of Shapiro-Wilk Test
The null hypothesis (H₀) for the Shapiro-Wilk test is that the data are normally distributed.
Interpreting Shapiro-Wilk Results
Significant p-value (< 0.05): If the p-value from the Shapiro-Wilk test is less than 0.05, we reject the null hypothesis. This means there is sufficient evidence to conclude that the data are NOT normally distributed.
Non-significant p-value (> 0.05): If the p-value is greater than 0.05, we fail to reject the null hypothesis. This means there is no significant evidence that the data deviate from a normal distribution; strictly speaking we do not "accept" normality, but it is reasonable to proceed under that assumption.
Examples of Shapiro-Wilk Test Outcomes
Extroversion Data (Normally Distributed): Shapiro-Wilk p-value = 0.99. This non-significant p-value supports the assumption of normality.
Anxiety Data (Non-Normally Distributed): Shapiro-Wilk p-value < 0.01. This significant p-value indicates that the data are not normally distributed.
Dice Rolling Data (Non-Normally Distributed): Shapiro-Wilk p-value < 0.01. This significant p-value also indicates non-normality.
Based on these results, for the anxiety and dice-rolling data, nonparametric statistics would be the more appropriate choice.
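The three example outcomes above can be reproduced in spirit with SciPy's shapiro function. The datasets below are simulated stand-ins for the lecture's examples, so the exact p-values will differ from those quoted, but the pattern of conclusions is the same:

```python
import numpy as np
from scipy.stats import shapiro

rng = np.random.default_rng(7)

# Simulated stand-ins for the lecture's three examples
extroversion = rng.normal(loc=50, scale=10, size=100)  # roughly normal
anxiety = rng.exponential(scale=2.0, size=100)         # positively skewed
dice = rng.integers(low=1, high=7, size=200)           # discrete, uniform

for name, data in [("extroversion", extroversion),
                   ("anxiety", anxiety),
                   ("dice rolls", dice)]:
    stat, p = shapiro(data)
    verdict = "reject normality" if p < 0.05 else "consistent with normality"
    print(f"{name}: W = {stat:.3f}, p = {p:.4f} -> {verdict}")
```

As in the lecture, the skewed and discrete samples yield significant p-values (reject normality), while the roughly normal sample does not.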
Advantages and Disadvantages of Nonparametric Statistics
Advantages
Fewer Assumptions: The primary advantage is that nonparametric tests do not assume a specific data distribution (e.g., normality), making them applicable to a wider range of datasets, especially when parametric assumptions are violated.
Robustness: They are more robust to outliers and skewed distributions.
Disadvantages
Less Statistical Power: In general, if the assumptions for a parametric test are met, parametric tests have more statistical power. This means they are more likely to correctly detect an effect (reject a false null hypothesis) than their nonparametric counterparts.
When to Use Which
Parametric tests are generally preferred when their assumptions (especially normality) are met because they offer greater statistical power. However, when parametric assumptions are significantly violated, nonparametric tests are the appropriate and recommended alternative to ensure valid statistical inferences.
Nonparametric Equivalents of Parametric Tests
This course will cover specific nonparametric counterparts to previously learned parametric tests:
The Spearman correlation analysis is the nonparametric equivalent of the Pearson correlation.
The Wilcoxon signed-rank test is the nonparametric equivalent of the one-sample t-test.
The Wilcoxon rank-sum test is the nonparametric equivalent of the independent-samples t-test.
These nonparametric tests offer alternative ways to answer similar research questions without relying on strict distributional assumptions.
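As a preview, all three tests are available in SciPy. The numbers below are made up for illustration; the Spearman example also hints at why rank-based methods relax distributional assumptions, since any perfectly monotonic relationship, even a nonlinear one, yields a coefficient of 1:

```python
from scipy import stats

# Spearman correlation: computed on ranks, so a monotonic
# (here nonlinear) relationship still gives rho = 1
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
rho, p_s = stats.spearmanr(x, y)
print(f"Spearman rho = {rho:.2f}, p = {p_s:.4f}")

# Wilcoxon signed-rank test: nonparametric analogue of the
# one-sample / paired t-test (hypothetical before/after scores)
before = [5, 7, 6, 9, 8, 7, 6, 8]
after = [6, 8, 8, 10, 9, 8, 7, 9]
w, p_w = stats.wilcoxon(before, after)
print(f"Wilcoxon signed-rank: W = {w}, p = {p_w:.4f}")

# Wilcoxon rank-sum test: nonparametric analogue of the
# independent-samples t-test (two hypothetical groups)
group_a = [3, 5, 4, 6, 5, 4]
group_b = [7, 9, 8, 10, 9, 8]
z, p_r = stats.ranksums(group_a, group_b)
print(f"Wilcoxon rank-sum: z = {z:.2f}, p = {p_r:.4f}")
```

Each of these is treated in detail in the lectures that follow.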