Lecture 10 - February 05

Continuous data refers to quantitative data that can take any value within a range, such as height, weight, or temperature.
The median is a measure of central tendency for continuous (or at least ordered) data; it is not meaningful for unordered categorical data.

Parametric Tests: Generally more powerful than nonparametric tests, but they require certain assumptions about the data.
- Normal Distribution: The data should follow a bell-shaped (normal) distribution.
- Equal Variances: The groups being compared should have comparable variability.

Visual Inspection: Look at the standard deviation and spread of the data (eyeballing it) as a first, informal check.
QQ Plot (Quantile-Quantile Plot):
- Compares two probability distributions visually; here, the quantiles of the sample against those of a theoretical normal distribution.
- Data points that lie along a straight line indicate a normal distribution.
- If the data points curve away from the line at either end, the data are not normally distributed.
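The points on a QQ plot can be computed by hand, which makes clear what is being compared. The following is a minimal sketch using only the Python standard library; the function name `qq_points`, the plotting positions (i - 0.5) / n, and the sample values are assumptions made for illustration, not something given in the lecture.

```python
from statistics import NormalDist, mean, stdev

def qq_points(sample):
    """Return (theoretical, observed) quantile pairs for a normal QQ plot."""
    ordered = sorted(sample)
    n = len(ordered)
    # Reference normal distribution with the sample's own mean and SD
    # (an assumption of this sketch).
    reference = NormalDist(mean(ordered), stdev(ordered))
    # Plotting positions (i - 0.5) / n keep every probability strictly inside (0, 1).
    theoretical = [reference.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))

# Hypothetical measurements, for illustration only.
heights_cm = [158.0, 161.5, 163.0, 165.2, 166.8, 168.1, 170.4, 172.9, 175.0, 179.3]

for expected, observed in qq_points(heights_cm):
    print(f"expected {expected:7.2f}   observed {observed:7.2f}")
```

If the data are close to normal, each observed value sits near its expected value, which is what "points along a straight line" means on the plot.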
Shapiro-Wilk Test:
- Statistical test assessing the normality of data.
- Null hypothesis: Data is normally distributed.
- Critical p-value: If p < 0.05, the null hypothesis is rejected and the data are considered non-normally distributed.
Histogram: Bar chart showing the distribution of the data, with a normal curve superimposed for comparison.

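Researchers may apply transformations (e.g. a log transformation) to deal with non-normal distributions, and report geometric means, which are the back-transformed means of the logged values.
Skewed distributions are commonly observed in nutritional studies, for example iron and B12 levels among different populations.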
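The link between a log transformation and a geometric mean can be shown in a few lines. This is a sketch under stated assumptions: the values are hypothetical, and the natural log is used (any base gives the same geometric mean once back-transformed).

```python
import math
from statistics import mean, geometric_mean

# Hypothetical right-skewed values (e.g. a serum marker), for illustration only.
values = [12.0, 15.0, 18.0, 22.0, 30.0, 45.0, 160.0]

log_values = [math.log(v) for v in values]      # natural-log transform
back_transformed = math.exp(mean(log_values))   # mean of the logs, back on the original scale

print(f"arithmetic mean: {mean(values):.1f}")
print(f"geometric mean:  {back_transformed:.1f}")
print(f"matches statistics.geometric_mean: "
      f"{math.isclose(back_transformed, geometric_mean(values))}")
```

The geometric mean is pulled less by the single large value than the arithmetic mean, which is why it is often reported for skewed measurements.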

Type of Analysis and Data: Categorized as either:
- Inferential vs Descriptive (type of statistics).
- Continuous (interval and ratio scales) vs Categorical (qualitative categories; nominal and ordinal scales).
Participant Groups: Can be classified as:
- Independent Participants
- Paired Participants
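Correlation of Measurements: Use the study design to judge whether measurements are independent (e.g. repeated measurements on the same participant are not).
Assumptions for Parametric Tests: Ensure the data meet the normality and equal variance criteria before proceeding with parametric testing.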

Independent-Samples t-Test: Compares the means of two groups (e.g. vitamin D levels between students at different universities).
Null Hypothesis: No difference in means between the groups.
Types of Sample Data: Continuous.
- Should be independently collected.

Example findings might display:
- Mean of Mount: 52; CI: 45-58
- Mean of Acadia: 75; CI: 67-83
- p-value = 0.03 suggests statistical significance (p < 0.05).
- If confidence intervals do not overlap, means are statistically different; overlapping intervals do not by themselves rule out a significant difference.
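A two-group comparison of means like the one above can be sketched as a pooled-variance (Student's) t statistic. Everything here is illustrative: the function name `pooled_t` and the raw values are hypothetical (chosen only so the group means come out at 52 and 75), and the critical value 2.228 is the standard two-tailed t-table entry for alpha = 0.05 with 10 degrees of freedom.

```python
import math
from statistics import mean, variance

def pooled_t(group_a, group_b):
    """Return (t statistic, degrees of freedom) for Student's independent-samples t-test."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance assumes the two groups share a common variance.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df

# Hypothetical vitamin D values for two independent groups (not real study data).
campus_a = [48, 55, 51, 60, 45, 53]
campus_b = [70, 78, 81, 66, 75, 80]

t_stat, df = pooled_t(campus_a, campus_b)
T_CRITICAL_DF10 = 2.228  # two-tailed, alpha = 0.05, df = 10 (from a t table)

print(f"t = {t_stat:.2f}, df = {df}")
print(f"reject null hypothesis: {abs(t_stat) > T_CRITICAL_DF10}")
```

The sketch stops at the test statistic because the standard library has no t distribution; a statistics package would normally supply the exact p-value.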

Nonparametric Tests: Apply when the parametric assumptions of normality or equal variances fail.
The Mann-Whitney U test is used for independent groups with skewed data distributions.
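The Mann-Whitney U statistic depends only on the ordering of the values, which is why skew and outliers affect it so little. The sketch below computes U by direct pairwise comparison; the function name `mann_whitney_u` and the data are hypothetical, and no p-value is computed.

```python
def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) by direct pairwise comparison; ties count as one half."""
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    # The two U values always sum to the number of pairs.
    u_b = len(group_a) * len(group_b) - u_a
    return u_a, u_b

# Hypothetical skewed samples; each group contains one extreme value.
skewed_a = [3, 4, 4, 5, 6, 30]
skewed_b = [7, 8, 9, 10, 12, 55]

u_a, u_b = mann_whitney_u(skewed_a, skewed_b)
print(f"U_a = {u_a}, U_b = {u_b}, test statistic U = {min(u_a, u_b)}")
```

A U close to 0 means one group's values are almost always below the other's; a U near half the number of pairs means the groups are well mixed.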

Chi-Square Test: Compares categorical data; examines observed vs expected frequencies to identify differences across populations, such as in nutritional status.
The p-value derived from the chi-square analysis determines statistical significance.
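The observed-versus-expected comparison behind a chi-square test can be written out directly. This is a minimal sketch: the function name `chi_square` and the counts are hypothetical, no continuity correction is applied, and the closed-form p-value shown is valid only for 1 degree of freedom (a 2x2 table).

```python
import math

def chi_square(observed):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Expected count if row and column categories were unrelated.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df

# Hypothetical counts: rows = two populations, columns = deficient / adequate status.
table = [[30, 70],
         [15, 85]]

chi2, df = chi_square(table)
# For df = 1 only: upper-tail p-value via the complementary error function.
p_value = math.erfc(math.sqrt(chi2 / 2))

print(f"chi-square = {chi2:.2f}, df = {df}, p = {p_value:.3f}")
```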

ANOVA (Analysis of Variance): Used when comparing means across three or more groups. A single overall test keeps the alpha level in place, avoiding the inflated type I error risk that comes from running multiple pairwise comparisons.
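Two short calculations illustrate the point: how quickly the type I error risk grows with repeated comparisons, and how the single ANOVA F statistic is formed. This is a sketch with hypothetical data and function names; the family-wise error formula assumes the comparisons are independent, which pairwise tests on shared groups are only approximately.

```python
from statistics import mean

def familywise_error(comparisons, alpha=0.05):
    """Chance of at least one false positive across independent comparisons."""
    return 1 - (1 - alpha) ** comparisons

def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within), df_between, df_within

# Three groups need three pairwise comparisons.
print(f"family-wise error for 3 comparisons: {familywise_error(3):.3f}")

# Hypothetical measurements for three groups.
groups = [[4, 5, 6], [6, 7, 8], [9, 10, 11]]
f_stat, df_between, df_within = one_way_f(groups)
print(f"F = {f_stat:.1f} on ({df_between}, {df_within}) degrees of freedom")
```

A large F means the group means differ by more than the within-group scatter would suggest; it does not say which groups differ.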

Post hoc tests (e.g., Tukey’s) are required to pinpoint which specific groups differ after a significant ANOVA result.

Correlation Coefficient (r): Quantifies the strength and direction of the linear relationship between two variables, from -1 (perfect negative) to +1 (perfect positive), with 0 meaning no linear correlation.
Regression Models:
- Y = mX + b: Models the outcome Y as a linear function of the independent variable X, with slope m (the change in Y per unit change in X) and intercept b.
- R-squared indicates the proportion of variance in the outcome explained by the independent variable in the model (ranging from 0 to 1, often reported as a percentage).
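The slope, intercept, r, and R-squared for a one-predictor model can all be computed from the same sums of squares. The sketch below uses hypothetical data and a hypothetical function name, `simple_linear_fit`; it fits by ordinary least squares, which is the usual method but is not named in the notes.

```python
import math
from statistics import mean

def simple_linear_fit(x, y):
    """Return (slope m, intercept b, Pearson r, R-squared) for Y = mX + b."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    r = s_xy / math.sqrt(s_xx * s_yy)
    # R-squared: 1 minus the share of variance left unexplained by the line.
    ss_residual = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    r_squared = 1 - ss_residual / s_yy
    return slope, intercept, r, r_squared

# Hypothetical paired measurements with a downward trend.
x = [1, 2, 3, 4, 5]
y = [10, 8, 7, 4, 3]

slope, intercept, r, r_squared = simple_linear_fit(x, y)
print(f"Y = {slope:.2f} X + {intercept:.2f}")
print(f"r = {r:.3f}, R-squared = {r_squared:.3f}")
```

With a single predictor, R-squared equals r squared, so a strong negative correlation still produces an R-squared close to 1.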

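Correlation (r) vs R-squared: These are commonly confused. r ranges from -1 to +1 and gives the direction and strength of a relationship; R-squared ranges from 0 to 1 and gives the proportion of variance explained.
Nonparametric correlation (Spearman’s rho) is applicable when the data do not meet parametric assumptions.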
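Spearman's rho is Pearson's r computed on ranks rather than raw values, which a short sketch makes concrete. The function names and data below are hypothetical; tied values receive the average of the ranks they span.

```python
import math
from statistics import mean

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: perfectly monotonic, but far from a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 100]

print(f"Pearson r    = {pearson_r(x, y):.3f}")
print(f"Spearman rho = {spearman_rho(x, y):.3f}")
```

Because the relationship here is monotonic, rho reaches its maximum even though the outlying last value pulls Pearson's r well below 1.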

Statistical findings should always be reported in context: p-values, confidence intervals, and descriptive statistics that convey both statistical significance and practical relevance.
Correlational results should be clear about their causal implications and state plainly the limits of what they indicate (i.e., association does not imply causation).