Introduction to Z-Score and Normal Distribution

  • Concept of Z-scores discussed to understand the distribution of data.

  • Example given of identifying the bottom 7% of data in a Z-curve.

Finding Z-Score Corresponding to Percentiles

  • Z-score definition: A measure of how far away a point is from the mean, expressed in standard deviations.

  • The bottom 7% of the Z-curve ends at the 7th percentile: 7% of the area lies to the left of the cutoff and 93% lies to the right.

  • Calculation example:

    • Z-score for bottom 7% is approximately negative 1.475 (between negative 1.47 and negative 1.48).
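
The lookup above can be reproduced without a Z-table. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative and not part of the original notes.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1): the Z-curve.
z_curve = NormalDist(mu=0.0, sigma=1.0)

# inv_cdf takes the area to the LEFT of the cutoff and returns the Z-score.
z_bottom_7 = z_curve.inv_cdf(0.07)
print(round(z_bottom_7, 3))  # roughly -1.476, between -1.47 and -1.48

# Sanity check: the area to the right of that cutoff should be 93%.
area_right = 1.0 - z_curve.cdf(z_bottom_7)
print(round(area_right, 2))
```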

Understanding the Ball Bearings Problem

  • Introduction of the ball bearings problem to identify enough vs. too many instances in hypothesis testing.

  • The indication of effectiveness in statistics: e.g., comparing a 95% vs. 99% effectiveness rate.

Middle Percentiles Discussion

  • When looking for a middle chunk, e.g., middle 90%:

    • 5% each is allocated to the left and right extremes.

    • Negative 1.645 marks the left threshold for the bottom 5%. (Negative 2.575 is the threshold for the bottom 0.5%, which applies to a middle 99%.)

  • For calculations:

    • The middle 90% is bounded by the symmetric Z-scores negative 1.645 and positive 1.645.
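
As a rough check on the middle-chunk reasoning, the sketch below splits the leftover area evenly between the two tails and looks up both cutoffs. The helper name `middle_bounds` is hypothetical, and the middle 99% is included only for comparison.

```python
from statistics import NormalDist

def middle_bounds(middle_fraction):
    """Return (lower, upper) Z-scores that enclose the given middle fraction."""
    tail = (1.0 - middle_fraction) / 2.0  # area left over in each tail
    z_curve = NormalDist()                # standard normal by default
    lower = z_curve.inv_cdf(tail)
    upper = z_curve.inv_cdf(1.0 - tail)
    return lower, upper

low90, high90 = middle_bounds(0.90)  # 5% in each tail
low99, high99 = middle_bounds(0.99)  # 0.5% in each tail

print(round(low90, 3), round(high90, 3))
print(round(low99, 3), round(high99, 3))
```

The two bounds always come out as mirror images of each other because the Z-curve is symmetric about 0.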

Transforming Between Z-scores and P-values

  • Key concept: in a standard normal distribution, each Z-score corresponds to a P-value (an area under the Z-curve).

  • To find the area between two Z-scores, look up the P-value for each one and subtract the P-values; never subtract the Z-scores themselves.

  • Importance of Z-scores being expressed in standard deviations from the mean.
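
To illustrate why areas are subtracted rather than Z-scores, here is a small sketch. It assumes the P-value for a Z-score means the area to its left, as in a typical Z-table; the function name `area_between` is made up for this example.

```python
from statistics import NormalDist

z_curve = NormalDist()  # standard normal: mean 0, standard deviation 1

def area_between(z_low, z_high):
    """Area under the Z-curve between two Z-scores (difference of left areas)."""
    return z_curve.cdf(z_high) - z_curve.cdf(z_low)

# Correct: subtract the areas (P-values).
area = area_between(-1.645, 1.645)
print(round(area, 3))

# Incorrect: subtracting the Z-scores first and then looking up the result
# gives the area to the left of Z = 3.29, which is a different quantity.
wrong = z_curve.cdf(1.645 - (-1.645))
print(round(wrong, 4))
```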

Checking Normality in Distributions

  • Inquiry into determining if data is approximately normal:

    • Importance of visual representations to identify normal distribution from samples.

  • Normal scores introduced as a method:

    • Normal scores depend on sample size (N). For small N, it can be problematic to establish normality.

Working with Normal Scores

  • Normal scores provided for specific sample sizes.

  • For each sample size N, the normal scores (the Z-scores expected from a normal sample of that size) are compared against the Z-scores of the actual data to assess normality.

Scatter Plots for Visual Confirmation

  • A scatter plot of the actual Z-scores against the normal scores shows whether the points follow a linear pattern.

    • A roughly linear pattern suggests the data are approximately normal; a clear curve suggests they are not.
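
The comparison described above can be sketched numerically. Two assumptions are made here that are not in the notes: the normal scores are approximated with Blom's formula rather than read from a table, and the correlation coefficient stands in for visually judging the scatter plot. Both data sets are made up for illustration.

```python
from statistics import NormalDist, mean, stdev

def normal_scores(n):
    """Approximate normal scores for a sample of size n (Blom's formula, an assumption)."""
    z_curve = NormalDist()
    return [z_curve.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]

def sample_z_scores(data):
    """Z-scores of the data, sorted from smallest to largest."""
    m, s = mean(data), stdev(data)
    return [(x - m) / s for x in sorted(data)]

def pearson_r(xs, ys):
    """Pearson correlation; values near 1 mean the scatter plot is close to a line."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

roughly_normal = [4.1, 4.8, 5.0, 5.2, 5.5, 5.7, 6.0, 6.3, 6.9, 7.4]  # made-up data
skewed = [1, 1, 1, 2, 2, 3, 5, 9, 20, 60]                            # made-up data

r_normal = pearson_r(normal_scores(10), sample_z_scores(roughly_normal))
r_skewed = pearson_r(normal_scores(10), sample_z_scores(skewed))
print(round(r_normal, 3), round(r_skewed, 3))
```

The pairs `(normal_scores(n)[i], sample_z_scores(data)[i])` are the points that would be drawn on the scatter plot; the symmetric sample gives a noticeably higher correlation than the skewed one.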

Discrete vs Continuous Probability Distributions

  • Distinction reviewed:

    • Continuous probability curves like the Z-curve vs. discrete events (e.g., coin flips).

    • Discrete probability examples discussed: binomial and geometric distributions.

    • Key formulas highlighted for success probabilities in various outcomes.
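
The notes do not spell out the formulas, so the sketch below uses the standard textbook forms as an assumption: the binomial probability of exactly k successes in n trials, and the geometric probability that the first success occurs on trial k.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def geometric_pmf(k, p):
    """P(first success occurs on trial k), for k = 1, 2, 3, ..."""
    return (1 - p) ** (k - 1) * p

# Coin-flip examples with p = 0.5.
print(binomial_pmf(3, 10, 0.5))  # 120/1024 = 0.1171875
print(geometric_pmf(3, 0.5))     # 0.5 * 0.5 * 0.5 = 0.125
```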

Continuity Correction Methodology

  • Discussion on continuity correction method:

    • A continuous curve can approximate a discrete distribution if the bounds in the probability calculation are adjusted slightly.

    • Extend each bound by 0.5 (add or subtract as appropriate) so that the full bar for every included integer is captured; e.g., P(X ≤ 45) becomes the area to the left of 45.5.

    • Explanation with examples provided to illustrate methodology.
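
One way to see the effect of the correction is to compare an exact binomial probability with its normal approximation, with and without the 0.5 adjustment. The values n = 100, p = 0.5, and k = 45 below are illustrative choices, not taken from the notes.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.5  # illustrative values (assumption)
k = 45           # we want P(X <= 45)

# Exact answer: add up the binomial probabilities for 0..45.
exact = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

# Normal curve with the same mean and standard deviation as the binomial.
mu = n * p
sigma = sqrt(n * p * (1 - p))
approx = NormalDist(mu, sigma)

uncorrected = approx.cdf(k)      # stops at 45, cutting the bar for 45 in half
corrected = approx.cdf(k + 0.5)  # extends to 45.5 to include the whole bar

print(round(exact, 4), round(uncorrected, 4), round(corrected, 4))
```

In this example the corrected value lands much closer to the exact probability than the uncorrected one.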

Binomial Distribution Normality Condition

  • A binomial distribution can be treated as approximately normal only under specific conditions:

    • Both Np ≥ 10 and N(1-p) ≥ 10 must hold.

  • Np is the mean (the expected number of successes) and N(1-p) is the expected number of failures, so the condition requires at least 10 of each.
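
The condition is simple enough to check in a few lines. The function name `normal_approx_ok` and its `threshold` parameter are hypothetical, chosen only for this sketch.

```python
def normal_approx_ok(n, p, threshold=10):
    """Check the rule of thumb: expected successes and failures both >= threshold."""
    expected_successes = n * p        # the binomial mean
    expected_failures = n * (1 - p)
    return expected_successes >= threshold and expected_failures >= threshold

print(normal_approx_ok(100, 0.5))  # 50 successes, 50 failures expected
print(normal_approx_ok(50, 0.1))   # only 5 successes expected
```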

Geometry of Geometric Distributions

  • Discussion on geometric distributions and their constitution:

    • The first trial always has the highest probability and each later trial is less likely, so the distribution is right-skewed and never approximately normal.

  • Provided various probabilities and N values to illustrate how probability shapes affect normality visually.
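
The shape can be inspected directly. The sketch below assumes the standard geometric formula P(X = k) = (1-p)^(k-1) p and prints a crude text bar chart; the success probabilities used are arbitrary examples, not the ones from the lecture.

```python
def geometric_pmf(k, p):
    """P(first success occurs on trial k), for k = 1, 2, 3, ..."""
    return (1 - p) ** (k - 1) * p

def is_strictly_decreasing(values):
    """True if every value is smaller than the one before it."""
    return all(a > b for a, b in zip(values, values[1:]))

for p in (0.2, 0.5):  # arbitrary example probabilities
    probs = [geometric_pmf(k, p) for k in range(1, 11)]
    print(f"p = {p}: strictly decreasing = {is_strictly_decreasing(probs)}")
    for k, prob in enumerate(probs, start=1):
        # One '#' per 2% of probability, as a rough picture of the shape.
        print(f"  k={k:2d} {'#' * round(prob * 50)}")
```

Whatever value of p is chosen, the tallest bar is at k = 1 and the bars shrink from there, which is why no symmetric bell shape can appear.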

Conclusion

  • Recap of methodologies for determining normality through normal scores.

  • Notable pointers on continuous vs discrete distributions and pivotal importance of using Z-tables correctly to avoid calculation errors.