Psychological Statistics Review

Review of Probability and Hypothesis Testing

Homework #3: Due today, Friday, October 24 by 11:59 pm.
- Submit via Moodle.
- Note: Generative AI is not allowed.
Exam 2: Scheduled for Wednesday, October 29.
- Review session on Monday, October 27, during lecture and lab.
- Recommended materials: Review slides, lab assignments, feedback on labs, and textbook readings (Chapters 6, 7, and 8).
- Similar questions will appear on the exam.
- Formula sheet for Exam 2 available on Moodle.

Focus will be on integrating probability, the distribution of sample means, and hypothesis testing.
Note: Hypothesis testing with the t statistic (as opposed to the z-score) is postponed until after Exam 2.

Probability: Establishes a link between populations and samples.
- Identifies the likelihood of obtaining specific samples if the composition of the population is known.
- In inferential statistics, the process is reversed: we utilize a sample to derive conclusions or generalizations about a population.

Population vs. Sample in Probability:
- Probability (denoted as p) gives the likelihood of drawing a particular sample from a population; when the population is normally distributed, it is found from proportions of the normal curve.
Hypothesis Testing: Based on setting critical regions determined by a pre-defined alpha level.
Inferential Statistics: Uses data from a sample (e.g., the sample mean) to draw conclusions about the broader population, in particular to evaluate the evidence against the null hypothesis.

Probability Formula:
- p(A) = \frac{\text{number of outcomes of } A}{\text{total number of possible outcomes}}
Requirements for Random Sampling:
1. Each individual in the population must have an equal chance of selection.
2. If multiple individuals are selected, the probability must remain constant across selections.
Sampling with Replacement: Ensures these conditions are met.
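
A minimal Python sketch of why sampling with replacement keeps the probability constant. The jar of marbles and the helper name p_select are hypothetical and only serve as illustration; the formula is the p(A) ratio given above.

```python
from fractions import Fraction

def p_select(population, target):
    """p(A) = number of outcomes of A / total number of possible outcomes."""
    return Fraction(population.count(target), len(population))

# Hypothetical population: a jar of 10 marbles, 4 red and 6 blue.
jar = ["red"] * 4 + ["blue"] * 6

# With replacement: the jar is unchanged, so the second draw has the same probability.
p_first = p_select(jar, "red")
p_second_with = p_select(jar, "red")

# Without replacement: a red marble drawn first is not returned to the jar.
jar_after = jar.copy()
jar_after.remove("red")
p_second_without = p_select(jar_after, "red")

print(p_first, p_second_with, p_second_without)  # 2/5 2/5 1/3
```

Without replacement the probability of red shifts from 4/10 to 3/9, which violates requirement 2.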

Probability and Frequency Distribution:
- A frequency distribution graph represents the entire population.
- The probability of selecting a score within a specific range equals the proportion of the distribution that falls in that range.
- Example:
- For a group of 10 people, if 3 have scores below 3,
  - p(X < 3) = \frac{3}{10}
- If 2 score greater than 4, then,
  - p(X > 4) = \frac{2}{10}
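
The example above can be sketched in Python. The ten scores below are made up so that 3 fall below 3 and 2 fall above 4, matching the example; the helper name proportion is also an assumption.

```python
from fractions import Fraction

# Hypothetical scores for the group of 10 people.
scores = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]

def proportion(scores, condition):
    """Proportion of scores satisfying `condition` (a function of one score)."""
    favorable = sum(1 for x in scores if condition(x))
    return Fraction(favorable, len(scores))

p_below_3 = proportion(scores, lambda x: x < 3)
p_above_4 = proportion(scores, lambda x: x > 4)

print(p_below_3, p_above_4)  # 3/10 1/5  (Fraction reduces 2/10 to 1/5)
```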

Normal Distribution: The most commonly encountered shape for the frequency distribution of population data.
- The distribution is characterized by its symmetry and bell-shaped curve.
Core Principles:
- Central region reflects higher frequency of scores, whereas tails exhibit lower frequency and probability.

Unit Normal Table: Lists the proportions of the normal distribution associated with each z-score, so proportions can be looked up directly instead of being estimated from a graph.
- The z-score acts as a divider, producing two sections: Body (larger section) and Tail (smaller section).
Table Structure:
- Column A: z-score.
- Column B: Proportion in Body.
- Column C: Proportion in Tail.
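
On the exam, use the printed table. As a rough cross-check, the body and tail proportions can be approximated with Python's standard library (statistics.NormalDist). The function name body_and_tail is an assumption made for this sketch.

```python
from statistics import NormalDist

def body_and_tail(z):
    """Return (body, tail) proportions for a z-score on the standard normal curve.

    The body is the larger section and the tail is the smaller one,
    so the sign of z does not matter.
    """
    body = NormalDist().cdf(abs(z))
    tail = 1 - body
    return body, tail

body, tail = body_and_tail(1.96)
print(round(body, 4), round(tail, 4))  # 0.975 0.025
```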

When evaluating samples larger than n=1:
- Use the sample mean instead of individual scores to derive population conclusions.
The z-score of the sample mean describes its position within the distribution of sample means:
- z = \frac{M - \mu}{\sigma_M}
Sampling error (the natural discrepancy between a sample mean and the population mean) is why we need the distribution of sample means and its associated probabilities.

Distribution of Sample Means: The collection of sample means for all possible random samples of a particular size (n) drawn from a population; it provides the basis for finding the probability of any specific sample mean.
Central Limit Theorem (CLT): Describes properties of the distribution of sample means:
- Shape: Approximately normal if the population is normal or the sample size is n ≥ 30.
- Mean: The mean of the distribution of sample means equals the population mean: \mu_M = \mu.
- Variability:
- The standard deviation for the distribution of sample means is called the Standard Error of the Mean (SEM):
  - Notation: \sigma_M
  - Formula: \sigma_M = \frac{\sigma}{\sqrt{n}}
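
A small sketch of the SEM formula, showing how the standard error shrinks as n grows. The value sigma = 20 and the function name standard_error are made up for illustration.

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean: sigma_M = sigma / sqrt(n)."""
    if n < 1:
        raise ValueError("sample size n must be at least 1")
    return sigma / math.sqrt(n)

sigma = 20  # hypothetical population standard deviation
for n in (1, 4, 25, 100):
    print(n, standard_error(sigma, n))
# 1 20.0
# 4 10.0
# 25 4.0
# 100 2.0
```

With n = 1 the standard error equals the population standard deviation; larger samples give sample means that cluster more tightly around \mu.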

Employ properties from the CLT when analyzing probabilities related to sample means.
- When the distribution of sample means is normal, convert the sample mean to a z-score and use the Unit Normal Table to find the proportion (probability) in the section defined by that z-score.
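
As an illustration, here is a sketch that finds the probability of obtaining a sample mean at least as large as some value. The numbers (mu = 100, sigma = 15, n = 25, M = 106) are hypothetical, and statistics.NormalDist stands in for the Unit Normal Table.

```python
import math
from statistics import NormalDist

def z_for_sample_mean(M, mu, sigma, n):
    """z = (M - mu) / sigma_M, where sigma_M = sigma / sqrt(n)."""
    sigma_M = sigma / math.sqrt(n)
    return (M - mu) / sigma_M

# Hypothetical values: population mu = 100, sigma = 15; sample of n = 25 with M = 106.
mu, sigma, n = 100, 15, 25
z = z_for_sample_mean(106, mu, sigma, n)

# Proportion in the tail beyond z, i.e. p(M >= 106).
p_greater = 1 - NormalDist().cdf(z)

print(z, round(p_greater, 4))  # 2.0 0.0228
```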

Hypothesis Testing Framework:
- Statistical analyses aim to discern the effects of treatments within populations based on sample evaluations.
- Key assumption: if the treatment has an effect, it adds (or subtracts) a constant to each individual's score, so the population mean may shift while the standard deviation and the shape of the distribution stay the same.
Types of Hypotheses:
- Null Hypothesis (H0): Predicts no treatment effect, represented as:
- \mu_{after} = \mu_{before}
- Alternative Hypothesis (H1): Predicts treatment effect exists, which can be:
- Non-directional: \mu_{after} \neq \mu_{before}
- Directional: If hypothesized direction (increase or decrease) is specified, express with:
  - \mu_{after} > \mu_{before} or \mu_{after} < \mu_{before}

State the Hypotheses: Confirm the null and alternative hypotheses.
Determine Evidence: Assess sample data for indications of treatment effects through probability measures.
Compare Sample Means: Convert the sample mean to a z-score and compare it with the critical values; the result is significant if the sample mean falls in the critical region.

Significance Level (alpha): Defines the boundary between sample means that are likely and those that are very unlikely if H0 is true.
- Sets the proportion of the distribution, under H0, that falls in the critical region.
- For instance, an alpha of 0.05 implies that 5% of the distribution is reserved for critical regions.
Critical z-score Example: Given alpha = 0.05, the critical z-scores correspond to +/-1.96 (non-directional hypothesis).
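
The critical value for a given alpha can be approximated in Python as a check on the table lookup. This is a sketch only; the function name critical_z and its two_tailed parameter are assumptions.

```python
from statistics import NormalDist

def critical_z(alpha, two_tailed=True):
    """Positive critical z-score for a given alpha level.

    Two-tailed: alpha is split evenly between both tails.
    One-tailed: the whole alpha sits in a single tail.
    """
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)

print(round(critical_z(0.05), 2))                    # 1.96
print(round(critical_z(0.05, two_tailed=False), 3))  # 1.645
```

The one-tailed value of about 1.645 is commonly rounded to 1.65 in printed tables.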

Data collection permits the derivation of sample means and their z-scores, with comparison against critical z-values (+/- 1.96).
The decision rule: reject H0 if the sample mean falls in the critical region (its z-score is beyond the critical value), which suggests evidence of a treatment effect; otherwise, fail to reject H0.
For directional hypotheses, place the entire critical region in the predicted tail, e.g., with alpha = 0.05, use a critical z of +1.65 for a predicted increase (or -1.65 for a predicted decrease).
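
The decision process above can be sketched end to end in Python. The function name z_test, its tail parameter, and the numbers in the usage example (mu = 80, sigma = 20, n = 16, M = 89) are all hypothetical. Critical values come from statistics.NormalDist rather than the printed table, so the one-tailed cutoff is about 1.645 instead of the rounded 1.65.

```python
import math
from statistics import NormalDist

def z_test(M, mu, sigma, n, alpha=0.05, tail="two"):
    """Return (z, reject_H0) for a z-test on a sample mean.

    tail: "two" (non-directional), "upper" (predicted increase),
          or "lower" (predicted decrease).
    """
    sigma_M = sigma / math.sqrt(n)   # standard error of the mean
    z = (M - mu) / sigma_M           # z-score of the sample mean
    std = NormalDist()
    if tail == "two":
        reject = abs(z) >= std.inv_cdf(1 - alpha / 2)
    elif tail == "upper":
        reject = z >= std.inv_cdf(1 - alpha)
    elif tail == "lower":
        reject = z <= std.inv_cdf(alpha)
    else:
        raise ValueError("tail must be 'two', 'upper', or 'lower'")
    return z, reject

# Hypothetical study: mu = 80, sigma = 20 before treatment; n = 16, M = 89 after.
z, reject = z_test(89, 80, 20, 16)
print(round(z, 2), reject)           # 1.8 False
z_up, reject_up = z_test(89, 80, 20, 16, tail="upper")
print(round(z_up, 2), reject_up)     # 1.8 True
```

The same z = 1.80 falls short of the two-tailed cutoff (1.96) but exceeds the one-tailed cutoff, which is why the directional prediction must be chosen before looking at the data.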

Hypothesis testing is a practical application of probability principles and normal distribution characteristics, providing insights into population effects post-treatment.
Because any single study is subject to sampling error, conclusions are refined as evidence accumulates across ongoing research.