Standard Deviation

  • Definition: The standard deviation (SD) is a measure of the amount of variation or dispersion in a set of values. A low SD indicates that the values tend to be close to the mean, while a high SD indicates that the values are spread out over a wider range.

Formula for Standard Deviation

  • The formula for calculating standard deviation for a sample is:
    $S = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n - 1}}$

    • Where:

      • $x_i$: each value in the data set

      • $\bar{x}$: the mean of the data set

      • $n$: number of values in the data set

  • Example Calculation:

    • Given values:

      • $x$: [5, 0, 10, 5, 25, 3, -2, 4, 7, 2, 2, -3]

    • Step 1: Calculate the mean ($\bar{x} = 4$)

    • Step 2: Calculate the individual squared deviations from the mean:

      • $(5 - 4)^2 = 1$, $(0 - 4)^2 = 16$, etc.

    • Step 3: Sum these squared deviations:

      • Total: $SS = 46$

    • Step 4: Calculate SD:

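      • Divide the sum of squares by $n - 1$ to get the sample variance: $S^2 = \frac{46}{6 - 1} = 9.2$

      • Take the square root of the variance to get the final result:

      • $SD = \sqrt{\frac{46}{5}} = \sqrt{9.2} \approx 3.03$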
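The four steps can be sketched in Python. The six-value sample below is hypothetical: it was chosen only so that it has a mean of 4 and a sum of squares of 46, and it is not the original data list. The helper name `sample_sd` is also an assumption made for illustration. The result is cross-checked against the standard library's `statistics.stdev`, which likewise divides by n - 1.

```python
import math
import statistics


def sample_sd(values):
    """Sample standard deviation using the n - 1 denominator."""
    n = len(values)
    mean = sum(values) / n                              # Step 1: mean
    squared_devs = [(x - mean) ** 2 for x in values]    # Step 2: squared deviations
    ss = sum(squared_devs)                              # Step 3: sum of squares (SS)
    variance = ss / (n - 1)                             # Step 4a: sample variance
    return math.sqrt(variance)                          # Step 4b: square root gives SD


# Hypothetical sample (an assumption, not the source's data): mean 4, SS 46.
data = [9, 0, 2, 5, 4, 4]

sd = sample_sd(data)
print(round(sd, 2))                                  # 3.03
print(math.isclose(sd, statistics.stdev(data)))      # True
```

Dividing by n - 1 rather than n is what makes this the sample formula; `statistics.pstdev` would give the population version instead.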

Characteristics of Standard Deviation

  1. Non-Negative Value: Standard deviation can never be negative. It is either zero (if all scores are identical) or positive (if there is any variation).

  2. Quantitative Variables: The standard deviation is used to describe quantitative variables and is expressed in the same units as the data.

  3. Informative Value: Standard deviation is most informative when reported alongside the mean, as it shows how far data points typically lie from the mean. The pair is often reported as M ± SD.

  4. Sensitivity to Data: The value of the SD is influenced by every value in the dataset:

    • Constant Addition/Subtraction: Adding or subtracting a constant from all data points does not change the SD.

    • Constant Multiplication/Division: Multiplying or dividing all data points by the same constant multiplies or divides the SD by the absolute value of that constant.
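The effect of adding or multiplying by a constant can be checked with a short Python sketch. The scores below are hypothetical values made up for illustration; the check relies on the fact that multiplying every value by a constant c scales the SD by |c|, so a negative multiplier still gives a positive SD.

```python
import math
import statistics

# Hypothetical scores (an assumption, used only for illustration).
scores = [2, 4, 4, 4, 5, 5, 7, 9]
sd = statistics.stdev(scores)

shifted = [x + 10 for x in scores]   # add a constant to every score
scaled = [x * 3 for x in scores]     # multiply every score by 3
flipped = [x * -3 for x in scores]   # multiply every score by -3

# Adding a constant leaves the SD unchanged.
print(math.isclose(statistics.stdev(shifted), sd))       # True
# Multiplying by 3 triples the SD.
print(math.isclose(statistics.stdev(scaled), 3 * sd))    # True
# Multiplying by -3 also triples the SD, since the SD cannot be negative.
print(math.isclose(statistics.stdev(flipped), 3 * sd))   # True
```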

Empirical Rule (68-95-99.7 Rule)

  • This rule describes the percentage of data that falls within one, two, or three standard deviations from the mean in a normal distribution:

    • Approximately 68% of scores fall within 1 standard deviation from the mean

    • Approximately 95% fall within 2 standard deviations

    • Approximately 99.7% fall within 3 standard deviations

  • Example:

    • If the mean score is 6 and the SD is 2.5, scores typically fall between:

      • $M - 2(SD) = 6 - 5 = 1$ and $M + 2(SD) = 6 + 5 = 11$ (this interval contains about 95% of scores).
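The percentages in the rule can be checked numerically. The following is a minimal sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the function names `within_k_sd` and `k_sd_interval` are hypothetical helpers, and the last line applies the interval to the example with mean 6 and SD 2.5.

```python
from statistics import NormalDist


def within_k_sd(k):
    """Proportion of a normal distribution lying within k SDs of the mean."""
    standard = NormalDist()  # mean 0, SD 1
    return standard.cdf(k) - standard.cdf(-k)


def k_sd_interval(mean, sd, k):
    """Lower and upper bounds of the interval mean ± k * sd."""
    return (mean - k * sd, mean + k * sd)


for k in (1, 2, 3):
    # Rounded proportions: 0.6827, 0.9545, 0.9973
    print(k, round(within_k_sd(k), 4))

# Interval expected to hold about 95% of scores for M = 6, SD = 2.5.
print(k_sd_interval(6, 2.5, 2))
```

The rule only applies when the data are approximately normal; for skewed data these proportions can differ noticeably.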

Data Visualization

  • Types of Graphs Used for Data Representation:

    1. Nominal or Ordinal Scales:

      • Pie Chart: Used when categories are mutually exclusive; the slices show each category's share of the observations and must sum to 100%.

      • Bar Graph: Used for discrete or categorical data; summarizes the frequency of each unit or category.

    2. Interval or Ratio Scales:

      • Histogram: Visualizes the distribution of continuous data by representing intervals as vertical rectangles, with adjacent rectangles touching to reflect the continuous scale.

      • Line Graph: Displays data points connected by a line, typically for a time series or other continuous variable.

Standard Normal Distribution

  • A special type of normal distribution where:

    • The mean (M) equals 0

    • The standard deviation (SD) equals 1.

  • Z-Score: A z-score indicates how many standard deviations an element is from the mean.

    • Transformation formula: $Z = \frac{X - M}{SD}$

    • Example of z-score calculation:

      • If M = 12, SD = 2, and X = 14:

        • $Z = \frac{14 - 12}{2} = 1$

  • Interpretation of Z-Scores:

    • Positive z-scores indicate a score above the mean, while negative z-scores indicate a score below the mean.
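The transformation is simple enough to express as a small function. This is a minimal sketch; the name `z_score` and the guard against a non-positive SD are assumptions added for illustration.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    return (x - mean) / sd


print(z_score(14, 12, 2))   # 1.0  (one SD above the mean)
print(z_score(10, 12, 2))   # -1.0 (one SD below the mean)
print(z_score(12, 12, 2))   # 0.0  (exactly at the mean)
```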

Examples of Z-Scores Calculations

  1. Example 1: For I.Q. Data:

    • Mean IQ = 100, SD = 15

      • If X = 132:

        • $Z = \frac{132 - 100}{15} \approx 2.13$

    • The unit normal table can then be used to find the proportion of scores above, below, or between z-scores.

  2. Example 2: For another distribution with mean = 14, SD = 3:

    • If X = 19.1:

      • First, calculate the z-score:

        • $Z = \frac{19.1 - 14}{3} = 1.7$

Practical Application of Standard Deviation

  • The standard deviation is central to statistical analysis, especially when estimating the probability that a score falls within a given range. It matters wherever decisions depend on understanding variability, as in psychology, finance, and quality control.

  • Unit Normal Table: Provides probabilities for different z-scores, showing the area between the mean and a z-score, and the area beyond the z-score in the tail.

  • Example Probability Calculation: To evaluate an IQ score of 20 with mean 100 and SD 15, first convert the score to a z-score, then use the unit normal table to find the proportion of the distribution that lies above or below it.
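The two areas a unit normal table reports can be approximated in code instead of looked up. This sketch uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `unit_normal_areas` is hypothetical, and the inputs are the IQ values from Example 1 (X = 132, mean 100, SD 15).

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0, sigma=1)


def unit_normal_areas(z):
    """Return (area between the mean and z, area in the tail beyond z)."""
    below = STANDARD_NORMAL.cdf(abs(z))  # proportion below |z|
    mean_to_z = below - 0.5              # half the distribution lies below the mean
    tail = 1 - below                     # proportion beyond |z|
    return mean_to_z, tail


z = (132 - 100) / 15                     # about 2.13
mean_to_z, tail = unit_normal_areas(z)
print(round(z, 2))
print(mean_to_z, tail)
```

Because the normal curve is symmetric, the function takes the absolute value of z, so a negative z-score yields the same two areas on the other side of the mean.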