Probability and Hypothesis Testing in Research

Probability Terms: Foundational probability terms, introduced through a legal analogy in which a judge's decision has two possible outcomes, "guilty" or "not guilty", much like heads or tails on a coin flip.
- Complementary Outcomes: Two outcomes that together cover every possibility and are mutually exclusive (i.e., if one occurs, the other cannot).

Example: Salary expectations after graduation.
- A comparison between a group of college students and a broader population’s salary estimates.
- Hypotheses:
- Null Hypothesis (H₀): No difference between the sample and the population; they are the same.
- Alternative Hypothesis (H₁): There is a difference between the sample and the population; they are not the same.
Mutually Exclusive Outcomes: The two hypotheses are mutually exclusive and together cover every possibility, so exactly one of them is true.

Sampling Distribution: The distribution of sample means across all possible samples of a given size drawn from a population.
- Sampling distributions of sample means follow the Central Limit Theorem: regardless of the population's shape, the distribution of sample means approaches a normal distribution as sample size increases.
Unbiased Estimator: The sample mean is an unbiased estimator of the population mean.
- If all possible samples of a given size are taken, the mean of the sampling distribution equals the population mean.
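The two properties above (approximate normality and unbiasedness) can be checked with a small simulation. This is only a sketch: the skewed population, the sample size of 50, and the function name simulate_sample_means are illustrative assumptions, not values from the lecture.

```python
import random
import statistics


def simulate_sample_means(population, sample_size, n_samples, seed=0):
    """Draw many random samples (with replacement) and return each sample's mean."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.choices(population, k=sample_size))
        for _ in range(n_samples)
    ]


# Hypothetical, strongly right-skewed population (assumption for illustration).
rng = random.Random(42)
population = [rng.expovariate(1.0) for _ in range(20_000)]
population_mean = statistics.fmean(population)
population_sd = statistics.pstdev(population)

sample_means = simulate_sample_means(population, sample_size=50, n_samples=5_000)
mean_of_means = statistics.fmean(sample_means)   # should sit close to population_mean
standard_error = statistics.stdev(sample_means)  # should sit close to sd / sqrt(n)

print(f"population mean      = {population_mean:.3f}")
print(f"mean of sample means = {mean_of_means:.3f}")
print(f"observed SE          = {standard_error:.3f}")
print(f"sd / sqrt(n)         = {population_sd / 50 ** 0.5:.3f}")
```

Even though the population here is far from normal, the sample means cluster symmetrically around the population mean, with a spread close to the population standard deviation divided by the square root of the sample size.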

Previous work with z-scores emphasized:
- About 95% of a normal distribution lies within roughly 2 standard deviations of the mean; the exact cutoffs are z-scores of -1.96 and +1.96.
- This principle also applies to sampling distributions: about 95% of sample means fall within 1.96 standard errors of the population mean.
Critical Region: The extreme 5% of the sampling distribution (z-scores greater than +1.96 or less than -1.96 for a two-tailed test at the 0.05 level). In hypothesis testing, the decision depends on whether the sample result falls inside or outside this region.
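The ±1.96 cutoffs can be recovered from the standard normal distribution rather than memorized. The sketch below uses Python's standard-library NormalDist; the helper in_critical_region is a hypothetical name introduced only for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1
alpha = 0.05

# Two-tailed test: alpha is split evenly between the two tails.
z_critical = standard_normal.inv_cdf(1 - alpha / 2)

# Proportion of the distribution within 2 SDs versus within the exact cutoffs.
within_two_sd = standard_normal.cdf(2) - standard_normal.cdf(-2)
within_critical = standard_normal.cdf(z_critical) - standard_normal.cdf(-z_critical)


def in_critical_region(z, z_crit=z_critical):
    """Return True when z falls in either tail beyond the critical value."""
    return abs(z) > z_crit


print(f"critical z       = {z_critical:.2f}")
print(f"within +/- 2 SD  = {within_two_sd:.4f}")
print(f"within +/- crit  = {within_critical:.4f}")
```

The "2 standard deviations" rule is a rounding of 1.96: two full standard deviations capture slightly more than 95% of the distribution.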

Researchers compare their sample results to the standard of evidence established in prior discussions, leading to decisions based on:
- If the probability of obtaining a sample mean at least this far from the population mean, assuming H₀ is true, is less than 5% (p < 0.05), reject the null hypothesis (H₀).
- If p > 0.05, retain the null hypothesis (H₀).
Standard of Evidence: Similar to the legal standard that requires conviction beyond a reasonable doubt.
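The decision rule above can be sketched as a one-sample z-test. The salary figures below are made-up numbers chosen for easy arithmetic, and two_tailed_p_value is a hypothetical helper name; neither comes from the lecture.

```python
from math import sqrt
from statistics import NormalDist


def two_tailed_p_value(sample_mean, population_mean, population_sd, n):
    """Return (z, p) for a two-tailed one-sample z-test."""
    standard_error = population_sd / sqrt(n)
    z = (sample_mean - population_mean) / standard_error
    # Probability of a result at least this extreme in either tail, if H0 is true.
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Hypothetical data: 64 students expect 52,000 on average; the population
# mean is 50,000 with a standard deviation of 8,000 (all values assumed).
z, p = two_tailed_p_value(
    sample_mean=52_000, population_mean=50_000, population_sd=8_000, n=64
)

# The notes leave p exactly equal to 0.05 unspecified; this sketch retains H0 there.
decision = "reject H0" if p < 0.05 else "retain H0"
print(f"z = {z:.2f}, p = {p:.4f}, decision: {decision}")
```

With these numbers the standard error is 1,000, so z = 2.00, which lies just beyond the +1.96 cutoff.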

Outcomes of hypothesis tests can lead to two types of errors:
1. Type I Error: Rejecting the null hypothesis when it is actually true (a false positive).
2. Type II Error: Retaining the null hypothesis when it is false (a false negative).
Researchers are typically more concerned with Type I errors because a false claim of an effect has broader implications, similar to a medical misdiagnosis.
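One way to see what a Type I error rate means is to simulate many studies in which the null hypothesis is true and count how often it is rejected anyway. This is a rough sketch; the population values (mean 100, standard deviation 15), the sample size, and the function name are assumptions for illustration.

```python
import random
from math import sqrt
from statistics import fmean


def type_one_error_rate(n_experiments=4000, n=25, mu=100.0, sigma=15.0,
                        z_crit=1.96, seed=1):
    """Estimate how often a two-tailed z-test rejects H0 when H0 is true."""
    rng = random.Random(seed)
    standard_error = sigma / sqrt(n)
    false_positives = 0
    for _ in range(n_experiments):
        # Every sample really does come from the null population.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        z = (fmean(sample) - mu) / standard_error
        if abs(z) > z_crit:
            false_positives += 1  # H0 rejected although it is true
    return false_positives / n_experiments


rate = type_one_error_rate()
print(f"estimated Type I error rate = {rate:.3f}")
```

The estimated rate should land near 0.05, which is what the 5% significance level controls: the long-run proportion of false positives when there is no real effect.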

The fable serves as a metaphor for Type I and Type II errors, emphasizing the danger of saying there is an effect when none exists (Type I error).

Power of a Statistical Test: The probability that the test correctly rejects the null hypothesis when it is false (1 - β, where β is the probability of a Type II error).
- Power is influenced by sample size, significance level (alpha), and effect size.
A larger sample size increases the probability of detecting a true effect.
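The influence of sample size, alpha, and effect size on power can be made concrete for the two-tailed one-sample z-test. The sketch below assumes the effect size is expressed as Cohen's d and the population standard deviation is known; z_test_power is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(effect_size, n, alpha=0.05):
    """Power of a two-tailed one-sample z-test for a true effect size d."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Under the alternative, the z statistic is centered at d * sqrt(n).
    shift = effect_size * sqrt(n)
    upper_tail = 1 - std_normal.cdf(z_crit - shift)
    lower_tail = std_normal.cdf(-z_crit - shift)
    return upper_tail + lower_tail


for n in (16, 36, 64):
    print(f"d = 0.5, n = {n:>2}: power = {z_test_power(0.5, n):.3f}")
```

Holding the effect size and alpha fixed, power rises as n grows. When the effect size is zero, the same function returns alpha, since any rejection is then a Type I error.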

Statistical Significance: A result is statistically significant (p < 0.05) when the observed difference would be unlikely to occur by chance alone if the null hypothesis were true.
Cohen's d: A measure of effect size that expresses the difference between the sample mean and the population mean in units of the population standard deviation.
- Formula: d = \frac{\bar{x} - \mu}{\sigma}
- Interpretations (using the absolute value of d):
- Small Effect: |d| < 0.20
- Medium Effect: 0.20 < |d| < 0.80
- Large Effect: |d| > 0.80
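A short sketch of the formula and the cutoffs listed above. The salary numbers are invented for illustration, and because the notes do not say how to label a value of exactly 0.20 or 0.80, this sketch assumes those boundary values count as medium.

```python
def cohens_d(sample_mean, population_mean, population_sd):
    """Difference between means expressed in population standard deviations."""
    return (sample_mean - population_mean) / population_sd


def describe_effect(d):
    """Label an effect size using the cutoffs from these notes."""
    size = abs(d)  # direction does not matter for the size label
    if size < 0.20:
        return "small"
    if size <= 0.80:  # assumption: boundary values are treated as medium
        return "medium"
    return "large"


# Hypothetical values: sample mean 52,000, population mean 50,000, SD 8,000.
d = cohens_d(52_000, 50_000, 8_000)
print(f"d = {d:.2f} ({describe_effect(d)} effect)")
```

A result can be statistically significant and still have a modest effect size, so the two measures are worth reporting together.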

Non-Directional Hypothesis: Tests for any difference (two-tailed).
Directional Hypothesis: Tests for a specific direction of difference (one-tailed).
Example: Testing whether comfort items in the classroom improve student engagement.

In a study predicting the impact of classroom comfort items on engagement:
- Null Hypothesis (H₀): Comfort items do not increase engagement (engagement stays the same or decreases).
- Alternative Hypothesis (H₁): Comfort items increase engagement.
Strong evidence against the null leads to rejecting it and supporting the alternative.
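For a directional hypothesis such as this one, the whole 5% critical region sits in one tail, so the cutoff is smaller than 1.96. The sketch below assumes an upper-tailed z-test with a known population standard deviation; the engagement scores are invented, and upper_tail_z_test is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def upper_tail_z_test(sample_mean, population_mean, population_sd, n, alpha=0.05):
    """One-tailed z-test of H1: the sample mean exceeds the population mean."""
    std_normal = NormalDist()
    z = (sample_mean - population_mean) / (population_sd / sqrt(n))
    z_crit = std_normal.inv_cdf(1 - alpha)  # all of alpha in the upper tail
    p = 1 - std_normal.cdf(z)
    return z, z_crit, p, z > z_crit


# Hypothetical engagement scores: population mean 70, SD 10; a class of 25
# students with comfort items averages 73.5 (all values assumed).
z, z_crit, p, reject = upper_tail_z_test(73.5, 70, 10, 25)
print(f"z = {z:.2f}, critical z = {z_crit:.3f}, p = {p:.4f}")
print("reject H0" if reject else "retain H0")
```

With these numbers z = 1.75, which exceeds the one-tailed cutoff but not the two-tailed cutoff of 1.96, so the same data would lead to retaining H₀ under a non-directional test. The direction therefore has to be chosen before looking at the data.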

The testing approach consists of creating hypotheses, calculating z-values, comparing against critical values, and making decisions (retain or reject H₀).
Common operational phrases will be revisited throughout the semester as we learn various statistical tests.

Because chance variation can produce misleading results, false claims about scientific findings can have serious real-world consequences.
Trust in scientific conclusions is vital for informed decision-making in public health, policy, and general societal issues.

This lecture covered the fundamental statistical terms, hypotheses, and decision-making principles needed to formulate and interpret research. Understanding statistical significance and the two types of error is crucial for reliable psychological research. The eventual goal is to communicate findings clearly and responsibly, with rigorous attention to scientific integrity and ethics.