Study Notes on Normal Distribution and Z-Scores

Concepts of Data and Normal Distribution

  • Data Types

    • Advertised Data

    • Data used for analytical work, such as GPA of students, exam scores, and sales figures.

    • Random Variable

    • Represented by notation 'x'.

    • A normally distributed variable is fully described by two parameters: its 'mean' (average, μ) and its 'standard deviation' (variability, σ).

  • Normal Distribution

    • Characteristics:

    • Symmetrical bell-shaped curve.

    • Mean = 75, Standard Deviation = 5 (example).

    • Mean and Median are identical in a normal distribution.

    • Importance:

    • Many statistical methods rely on the assumption of normality.

    • Enables calculations of probabilities using the z-score.
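The characteristics listed above can be checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`, with the example parameters from these notes (mean 75, standard deviation 5). The variable names are illustrative assumptions.

```python
from statistics import NormalDist

# Example parameters from the notes: mean 75, standard deviation 5.
scores = NormalDist(mu=75, sigma=5)

# In a normal distribution the mean and the median coincide.
print(scores.mean, scores.median)

# Symmetry: half of the probability lies below the mean.
half_below_mean = scores.cdf(75)
print(half_below_mean)

# Symmetry: the area below (mean - 5) matches the area above (mean + 5).
left_tail = scores.cdf(70)
right_tail = 1 - scores.cdf(80)
print(round(left_tail, 4), round(right_tail, 4))
```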

Z-Score

  • Definition

    • A statistical measurement that describes a value's relationship to the mean of a group of values.

  • Formula: z = \frac{x - \mu}{\sigma}

    • Where:

    • 'x' = score

    • 'μ' = mean

    • 'σ' = standard deviation

  • Interpretation of Z-Scores

    • Positive z-score: Above average

    • Zero z-score: Average

    • Negative z-score: Below average
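As a small illustration of the formula and its interpretation, here is a sketch in Python. The function names `z_score` and `describe` are hypothetical helpers introduced for this example, and the scores use the example mean of 75 and standard deviation of 5.

```python
def z_score(x: float, mean: float, std_dev: float) -> float:
    """Return how many standard deviations x lies from the mean."""
    if std_dev <= 0:
        raise ValueError("std_dev must be positive")
    return (x - mean) / std_dev


def describe(z: float) -> str:
    """Translate the sign of a z-score into words."""
    if z > 0:
        return "above average"
    if z < 0:
        return "below average"
    return "average"


for score in (80, 75, 70):
    z = z_score(score, mean=75, std_dev=5)
    print(score, z, describe(z))
```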

Applications of Z-Scores and Normal Distribution

  • Finding Percentages Below a Score

    • To find the percentage of students scoring below 70:

    • Calculate z-score for 70: z = \frac{70 - 75}{5} = -1

      • Look up in standard normal table or use Excel command ( =NORM.DIST(70, 75, 5, TRUE) )

    • Result: Approximately 16% (0.1587) of students scored below 70.
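The same lookup can be sketched outside Excel. The example below assumes Python's standard-library `statistics.NormalDist`, whose `cdf` method returns the cumulative area below a value, which is the role played by `NORM.DIST(..., TRUE)` in these notes.

```python
from statistics import NormalDist

scores = NormalDist(mu=75, sigma=5)

# z-score for a score of 70
z = (70 - 75) / 5

# Cumulative area below 70 on the original scale.
below_70 = scores.cdf(70)

# The same area, read from the standard normal curve at z = -1.
below_via_z = NormalDist().cdf(z)

print(f"z = {z}, share below 70 = {below_70:.4f}")
```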

  • Finding Percentages Above a Score (e.g., Greater than 90)

    • Z-score for 90:
      z = \frac{90 - 75}{5} = 3

    • Use Excel command for percentages above:
      =1 - NORM.DIST(90, 75, 5, TRUE)

    • This yields about 0.13%, reflecting that very few students scored above 90.
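The "one minus the cumulative area" step can be sketched in Python as follows, again assuming the standard-library `statistics.NormalDist` and the example parameters from these notes.

```python
from statistics import NormalDist

scores = NormalDist(mu=75, sigma=5)

# Area below 90 (cumulative), then the complement for the area above 90.
below_90 = scores.cdf(90)
above_90 = 1 - below_90

print(f"share above 90 = {above_90:.5f}")
```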

Area Under the Normal Curve

  • Concept:

    • The total area under the curve equals 1, representing 100% of the probability.

    • To find the probability that a value falls between two scores, find the cumulative area below each score and subtract the lower area from the upper area.

    • Excel command example:
      =NORM.DIST(upper, mean, std_dev, TRUE) - NORM.DIST(lower, mean, std_dev, TRUE)
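The subtraction above can be sketched in Python. The helper name `prob_between` is hypothetical, and the bounds 70 and 80 are chosen here only for illustration (they are one standard deviation either side of the example mean of 75).

```python
from statistics import NormalDist


def prob_between(lower: float, upper: float, mean: float, std_dev: float) -> float:
    """Probability that a normal variable falls between lower and upper."""
    dist = NormalDist(mu=mean, sigma=std_dev)
    # Cumulative area below upper minus cumulative area below lower.
    return dist.cdf(upper) - dist.cdf(lower)


share_70_to_80 = prob_between(70, 80, mean=75, std_dev=5)
print(f"share between 70 and 80 = {share_70_to_80:.4f}")
```

With these bounds the result is roughly 68%, the familiar share of a normal distribution lying within one standard deviation of the mean.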

Practical Insights and Recommendations

  • Creating Cheat Sheets

    • Recommended to create a one-page cheat sheet with key formulas, Excel commands, and example problems for reference.

  • Excel Usage

    • Get familiar with the Excel commands used for statistical analysis:

    • For cumulative distributions use ( NORM.DIST )

    • For inverse calculations (finding the score at a given cumulative probability) use ( NORM.INV )

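    • The command changes depending on whether the distribution is standard normal (mean 0, standard deviation 1: NORM.S.DIST, NORM.S.INV) or a general normal distribution (NORM.DIST, NORM.INV).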
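To illustrate the inverse direction, the sketch below uses `inv_cdf` from Python's standard-library `statistics.NormalDist`, which plays the role that `NORM.INV` plays in Excel. The 90th-percentile question is an assumed example, not one taken from the course material.

```python
from statistics import NormalDist

scores = NormalDist(mu=75, sigma=5)

# Inverse calculation: which score has 90% of students below it?
cutoff = scores.inv_cdf(0.90)

# The same question on the standard normal scale gives a z-score instead.
z_cutoff = NormalDist().inv_cdf(0.90)

# Converting the z-score back to the original scale: x = mean + z * std_dev.
cutoff_from_z = 75 + z_cutoff * 5

print(f"90th percentile score = {cutoff:.2f}, z = {z_cutoff:.4f}")
```

Feeding `cutoff` back into `scores.cdf` returns 0.90, which is a quick way to confirm that the cumulative and inverse calculations undo each other.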

Sampling and Population Analysis

  • Population vs Sample

    • Population: Full set of data (e.g. all students at a university).

    • Sample: A subset of the population used to draw inferences about the whole population.

  • Types of Sampling

    • Random: Every member of the population has a known, nonzero chance of being included (an equal chance, in simple random sampling).

    • Non-Random: Can introduce bias and result in incorrect conclusions.
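The effect of non-random selection can be sketched with a small simulation. Everything here is an assumption made for illustration: the population is synthetic (10,000 made-up scores drawn with mean 75 and standard deviation 5), and "taking only the top 100 scorers" stands in for one possible non-random method.

```python
import random
from statistics import mean

random.seed(42)  # fixed seed so the sketch is reproducible

# Synthetic population: 10,000 made-up exam scores (not course data).
population = [random.gauss(75, 5) for _ in range(10_000)]

# Random sample: every member has the same chance of selection.
random_sample = random.sample(population, k=100)

# Non-random sample: only the 100 highest scorers, a deliberately biased choice.
biased_sample = sorted(population)[-100:]

print(f"population mean:    {mean(population):.2f}")
print(f"random sample mean: {mean(random_sample):.2f}")
print(f"biased sample mean: {mean(biased_sample):.2f}")
```

The random sample's mean should land close to the population mean, while the biased sample overstates it considerably.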

Conclusion on Normal Distribution Study

  • Proficiency in data analysis depends on mastering the normal distribution, z-scores, percentages and areas under the curve, the corresponding Excel commands, and sampling techniques.

  • Review lecture notes, practice problems, and maintain a strong grasp of statistical principles discussed in the course.

  • Regular use of Excel for computations will enhance speed and efficiency in completing analyses.