15.3 Inferential Statistics
Overview of Inferential Statistics
Research studies typically focus on small samples drawn from larger populations, raising questions about representation and accuracy of results.
Sampling Error: Natural discrepancies between sample statistics and population parameters; each sample produces different statistics.
Inferential Statistics Purpose: Draw broader conclusions about populations based on limited sample information, addressing sampling error and its implications.
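Sampling error can be illustrated with a small simulation. The sketch below assumes a hypothetical population of 10,000 scores (mean near 100, SD near 15) and a sample size of 25; none of these values come from the chapter. Each sample produces a different mean, and none matches the population mean exactly.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is reproducible

# Hypothetical population: 10,000 scores with mean near 100 and SD near 15.
population = [random.gauss(100, 15) for _ in range(10_000)]
mu = statistics.mean(population)  # the population parameter

# Draw five separate random samples of n = 25 and compare each mean to mu.
sample_means = []
for i in range(5):
    sample = random.sample(population, 25)
    m = statistics.mean(sample)  # the sample statistic
    sample_means.append(m)
    print(f"sample {i + 1}: M = {m:.2f}, sampling error = {m - mu:+.2f}")
```

The printed differences between M and the population mean are the sampling errors that inferential statistics must take into account.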
Hypothesis Tests
A systematic procedure for evaluating whether the sample data support the original research hypothesis.
Definition of Hypothesis Test
A statistical procedure using sample data to evaluate a hypothesis about a population.
Helps distinguish systematic relationships in the data from random variation caused by sampling error.
Elements of a Hypothesis Test
The Null Hypothesis: States there is no effect or relationship; serves as a baseline for testing.
Example: In a treatment comparison, it states no difference between treatments.
The Sample Statistic: Computed from the sample data to compare with the null hypothesis.
The Standard Error: A measure of the average size of the sampling error, that is, how much discrepancy to expect by chance between a sample statistic and the corresponding population parameter.
Definition: The average (standard) distance between a sample statistic and the population parameter.
The Test Statistic: A ratio comparing the observed result (the sample statistic minus the value specified by the null hypothesis) with the standard error, which is the difference expected by chance; the larger the ratio, the stronger the evidence against the null hypothesis.
The Alpha Level (Level of Significance): Criterion defined before testing to determine statistical significance; acts as a threshold for rejecting the null hypothesis.
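The five elements above can be put together in a short sketch. It assumes a one-sample t test, a hypothetical sample of nine scores, and a null-hypothesis mean of 50; the critical value 2.306 is the two-tailed t for df = 8 at alpha = .05 taken from a standard t table. The chapter does not specify a particular test, so this is only one possible instance.

```python
import math
import statistics

# Hypothetical data: scores for a sample of n = 9 treated participants.
scores = [54, 58, 49, 61, 57, 52, 60, 55, 58]
mu_null = 50        # population mean stated by the null hypothesis (assumed)
alpha = 0.05        # alpha level, chosen before looking at the data
t_critical = 2.306  # two-tailed critical t for df = 8 at alpha = .05 (t table)

n = len(scores)
sample_mean = statistics.mean(scores)                      # the sample statistic
standard_error = statistics.stdev(scores) / math.sqrt(n)   # estimated standard error
t_stat = (sample_mean - mu_null) / standard_error          # the test statistic

# Reject the null hypothesis only if the ratio exceeds the alpha-based criterion.
reject_null = abs(t_stat) > t_critical

print(f"M = {sample_mean:.2f}, SE = {standard_error:.3f}, t({n - 1}) = {t_stat:.2f}")
print("reject null" if reject_null else "fail to reject null")
```

Here the obtained difference is several times larger than the standard error, so the test statistic falls beyond the critical value and the null hypothesis is rejected.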
Errors in Hypothesis Testing
Type I Error: Concluding an effect exists when it does not (false positive).
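The alpha level is the probability of a Type I error when the null hypothesis is true, so the risk is controlled by choosing a small alpha (e.g., 0.05 or 0.01).
Type II Error: Failing to detect a real effect (false negative); more likely when the effect is small, the sample is small, or the scores are highly variable.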
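Both error rates can be estimated by simulation. The sketch below assumes a two-tailed z-test with a known population SD (chosen only to keep the code short), a null mean of 100, SD of 15, n = 25, and a small true effect of 3 points for the Type II case; all of these values are hypothetical.

```python
import math
import random
import statistics

random.seed(2)  # fixed seed so the run is reproducible

ALPHA = 0.05
Z_CRIT = statistics.NormalDist().inv_cdf(1 - ALPHA / 2)  # about 1.96, two-tailed
MU_NULL, SIGMA, N, REPS = 100, 15, 25, 2000  # hypothetical values


def rejects_null(true_mean):
    """Draw one sample and run a two-tailed z-test against MU_NULL."""
    sample = [random.gauss(true_mean, SIGMA) for _ in range(N)]
    z = (statistics.mean(sample) - MU_NULL) / (SIGMA / math.sqrt(N))
    return abs(z) > Z_CRIT


# Null is true (no effect): every rejection is a Type I error.
type1_rate = sum(rejects_null(100) for _ in range(REPS)) / REPS

# Null is false (small real effect of 3 points): every non-rejection is a Type II error.
type2_rate = sum(not rejects_null(103) for _ in range(REPS)) / REPS

print(f"Type I error rate:  {type1_rate:.3f}")
print(f"Type II error rate: {type2_rate:.3f}")
```

The Type I rate should land close to alpha, while the Type II rate is high because a small effect combined with a small sample is hard to detect.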
Factors Influencing Hypothesis Test Outcomes
Sample Size: Larger samples yield more stable sample means and a smaller standard error, so a real mean difference is more likely to reach significance.
The mean of a large sample is less affected by any individual extreme score.
Sample Variability: When variance is small, sample statistics are more representative and reliable; when variance is large, a real effect can be obscured.
High variance inflates the standard error, which shrinks the test statistic and makes a significant result less likely.
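The influence of both factors follows from the standard error formula for a sample mean, s / sqrt(n). A minimal sketch, assuming a one-sample t statistic and a hypothetical fixed mean difference of 3 points:

```python
import math


def t_for(mean_difference, sd, n):
    """t statistic for a one-sample test: mean difference over s / sqrt(n)."""
    return mean_difference / (sd / math.sqrt(n))


# Same 3-point mean difference, different variability and sample sizes.
for sd in (5, 20):
    for n in (4, 25, 100):
        se = sd / math.sqrt(n)
        print(f"s = {sd:>2}, n = {n:>3}: SE = {se:5.2f}, t = {t_for(3, sd, n):5.2f}")
```

Holding the mean difference constant, t grows as n increases and shrinks as the standard deviation increases.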
Effect Size
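A significant result shows only that an effect is unlikely to be due to chance, not how large it is; a measure of effect size should therefore accompany every hypothesis test.
Cohen’s d: The mean difference expressed in standard deviation units (mean difference divided by the standard deviation).
Cohen’s guidelines classify effects as small (d = 0.2), medium (d = 0.5), or large (d = 0.8).
Percentage of Variance Accounted For (r² and η²): The proportion of variance in the scores that is explained by the variable, with analogous guidelines: small (0.01), medium (0.09), large (0.25).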
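Both kinds of effect size can be computed from the same quantities used in a t test. The sketch below assumes a one-sample design with hypothetical scores and a null mean of 50, and uses the standard conversion r² = t² / (t² + df); the helper label_d is a hypothetical name introduced only for illustration.

```python
import math
import statistics

scores = [54, 58, 49, 61, 57, 52, 60, 55, 58]  # hypothetical sample
mu_null = 50                                    # assumed null-hypothesis mean

n = len(scores)
df = n - 1
mean_diff = statistics.mean(scores) - mu_null
sd = statistics.stdev(scores)

cohens_d = mean_diff / sd                  # mean difference in SD units
t_stat = mean_diff / (sd / math.sqrt(n))   # one-sample t statistic
r_squared = t_stat**2 / (t_stat**2 + df)   # proportion of variance accounted for


def label_d(d):
    """Classify d using Cohen's conventional benchmarks (0.2, 0.5, 0.8)."""
    d = abs(d)
    if d >= 0.8:
        return "large"
    if d >= 0.5:
        return "medium"
    if d >= 0.2:
        return "small"
    return "below the small benchmark"


print(f"d = {cohens_d:.2f} ({label_d(cohens_d)}), r^2 = {r_squared:.2f}")
```

Unlike the t statistic, Cohen's d does not depend on sample size, which is why it is reported alongside the significance test.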
Confidence Intervals
Definition: Technique estimating the range of an unknown population parameter based on sample statistics.
Width is determined by the standard error and the level of confidence (e.g., 95%, 99%): raising the confidence level widens the interval, so greater confidence comes at the cost of precision.
Larger samples yield narrower confidence intervals, increasing estimate precision.
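A confidence interval for a mean can be sketched as the sample mean plus or minus a critical value times the standard error. The example assumes hypothetical scores with n = 9 and uses two-tailed critical t values for df = 8 taken from a standard t table (2.306 for 95%, 3.355 for 99%).

```python
import math
import statistics

scores = [54, 58, 49, 61, 57, 52, 60, 55, 58]  # hypothetical sample, n = 9

# Two-tailed critical t values for df = 8, taken from a t table.
T_CRITICAL = {95: 2.306, 99: 3.355}

mean = statistics.mean(scores)
standard_error = statistics.stdev(scores) / math.sqrt(len(scores))

intervals = {}
for level, t_crit in T_CRITICAL.items():
    margin = t_crit * standard_error  # half-width of the interval
    intervals[level] = (mean - margin, mean + margin)
    low, high = intervals[level]
    print(f"{level}% CI: {low:.2f} to {high:.2f} (width {high - low:.2f})")
```

The 99% interval is wider than the 95% interval, and because the standard error shrinks as n grows, a larger sample would narrow both.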
The Fight Against P-Hacking
P-hacking: Manipulating data collection and analysis to yield significant results; threatens validity.
Strategies to curb p-hacking include pre-registration of studies and a greater focus on effect size estimates rather than mere significance.
Practices such as optional stopping (checking the results during data collection and stopping once they become significant) and selective exclusion of outliers can skew results, which is why stricter guidelines for researchers are needed.
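The cost of optional stopping can be shown by simulation. The sketch below assumes a null hypothesis that is actually true, a two-tailed z-test with known SD (for simplicity), and a hypothetical researcher who tests after every 10 participants up to a maximum of 100 and stops at the first significant result. These settings are illustrative and do not come from the chapter.

```python
import math
import random
import statistics

random.seed(3)  # fixed seed so the run is reproducible

Z_CRIT = statistics.NormalDist().inv_cdf(0.975)  # two-tailed, alpha = .05
SIGMA, MAX_N, STEP, REPS = 1.0, 100, 10, 2000    # hypothetical settings


def significant(scores):
    """Two-tailed z-test of the sample mean against a null mean of 0."""
    z = statistics.mean(scores) / (SIGMA / math.sqrt(len(scores)))
    return abs(z) > Z_CRIT


def run_study(peek):
    """Simulate one study in which the null hypothesis is true."""
    scores = []
    while len(scores) < MAX_N:
        scores.extend(random.gauss(0, SIGMA) for _ in range(STEP))
        if peek and significant(scores):
            return True  # stop early and report a "significant" result
    return significant(scores)  # single test at the planned sample size


fixed_rate = sum(run_study(peek=False) for _ in range(REPS)) / REPS
peeking_rate = sum(run_study(peek=True) for _ in range(REPS)) / REPS

print(f"false positives, one planned test: {fixed_rate:.3f}")
print(f"false positives, optional stopping: {peeking_rate:.3f}")
```

With a single planned test the false-positive rate stays near alpha, whereas repeated peeking pushes it well above 0.05, which is the problem pre-registration of the sample size is meant to prevent.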