Standard Deviation
Definition: The standard deviation (SD) is a measure of the amount of variation or dispersion in a set of values. A low SD indicates that the values tend to be close to the mean, while a high SD indicates that the values are spread out over a wider range.
Formula for Standard Deviation
The formula for calculating the standard deviation of a sample is:
$S = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n - 1}}$
Where:
$x_i$: each value in the data set
$\bar{x}$: the mean of the data set
$n$: number of values in the data set
Example Calculation:
Given values:
$x$: [5, 0, 10, 5, 25, 3, -2, 4, 7, 2, 2, -3]
Step 1: Calculate the mean ($\bar{x} = 4$)
Step 2: Calculate the individual squared deviations from the mean:
$(5 - 4)^2 = 1$, $(0 - 4)^2 = 16$, etc.
Step 3: Sum these squared deviations:
Total: $SS = 46$
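Step 4: Divide $SS$ by $n - 1$ to get the sample variance (the calculation uses $n = 6$):
$S^2 = \frac{46}{6 - 1} = 9.2$
The final result is the square root of the variance:
$SD = \sqrt{\frac{46}{5}} = \sqrt{9.2} \approx 3.03$
Note: the stated mean, $SS$, and $n = 6$ do not match the twelve values listed under "Given values", so treat these figures as an illustration of the procedure.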
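The steps above can be sketched in Python. This is a minimal illustration, not a reference implementation: the function names `sample_sd` and `sd_from_ss` and the `scores` list are hypothetical, introduced here only for the example. The standard library's `statistics.stdev` is used as a cross-check.

```python
import math
import statistics


def sample_sd(values):
    """Sample standard deviation, computed step by step."""
    n = len(values)
    mean = sum(values) / n                             # Step 1: mean
    squared_devs = [(x - mean) ** 2 for x in values]   # Step 2: squared deviations
    ss = sum(squared_devs)                             # Step 3: sum of squares (SS)
    variance = ss / (n - 1)                            # Step 4: sample variance
    return math.sqrt(variance)                         # square root gives the SD


def sd_from_ss(ss, n):
    """SD from an already-computed sum of squares and sample size."""
    return math.sqrt(ss / (n - 1))


# Hypothetical data, chosen only so the arithmetic is easy to follow.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(round(sample_sd(scores), 3))
print(round(statistics.stdev(scores), 3))  # standard-library cross-check

# Summary figures quoted in the example: SS = 46, n = 6.
print(round(sd_from_ss(46, 6), 2))  # 3.03
```

Dividing by `n - 1` rather than `n` is what makes this the sample formula; `statistics.pstdev` would divide by `n` instead.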
Characteristics of Standard Deviation
Positive Value: Standard deviation can never be negative. It is either zero (if all scores are identical) or positive (if there is variation).
Quantitative Variables: The standard deviation describes quantitative variables and is expressed in the same units as the data.
Informative Value: Standard deviation is most informative when reported alongside the mean, as it reveals the distribution of data points relative to the mean, often represented as (M ± SD).
Sensitivity to Data: The value of the SD is influenced by every value in the dataset:
Constant Addition/Subtraction: Adding or subtracting a constant from all data points does not change the SD.
Constant Multiplication/Division: Multiplying or dividing all data points by the same constant multiplies or divides the SD by the absolute value of that constant.
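A quick numerical check of these two properties, as a sketch: the `scores` list and the constants 10 and 3 are hypothetical values picked for illustration.

```python
import statistics

# Hypothetical scores used only for illustration.
scores = [3, 7, 8, 12, 15]
sd = statistics.stdev(scores)

shifted = [x + 10 for x in scores]   # add a constant to every score
scaled = [x * 3 for x in scores]     # multiply every score by a constant

sd_shifted = statistics.stdev(shifted)
sd_scaled = statistics.stdev(scaled)

print(sd, sd_shifted)      # the two values agree: shifting leaves the SD alone
print(sd_scaled, 3 * sd)   # the two values agree: scaling by 3 triples the SD
```

Shifting moves every score and the mean by the same amount, so the deviations from the mean are unchanged; scaling stretches every deviation by the same factor.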
Empirical Rule (68-95-99.7 Rule)
This rule describes the percentage of data that falls within one, two, or three standard deviations from the mean in a normal distribution:
Approximately 68% of scores fall within 1 standard deviation from the mean
Approximately 95% fall within 2 standard deviations
Approximately 99.7% fall within 3 standard deviations
Example:
If the mean score is 6 and the SD is 2.5, about 95% of scores fall between:
$M - 2(SD) = 6 - 5 = 1$ and $M + 2(SD) = 6 + 5 = 11$.
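The percentages in the rule can be checked against the normal cumulative distribution function. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `proportion_within` is hypothetical. It also computes the two-SD interval for the example with M = 6 and SD = 2.5.

```python
from statistics import NormalDist


def proportion_within(k, mean=0.0, sd=1.0):
    """Proportion of a normal distribution within k SDs of the mean."""
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(mean + k * sd) - dist.cdf(mean - k * sd)


for k in (1, 2, 3):
    # Prints roughly 68.3, 95.4, and 99.7 percent.
    print(k, round(proportion_within(k) * 100, 1))

# Interval for the example: M = 6, SD = 2.5, two SDs either side of the mean.
m, sd = 6, 2.5
lower, upper = m - 2 * sd, m + 2 * sd
print(lower, upper)
```

The proportions do not depend on the particular mean and SD, which is why the rule applies to any normal distribution.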
Data Visualization
Types of Graphs Used for Data Representation:
Nominal or Ordinal Scales:
Pie Chart: Used when categories are mutually exclusive; the slices show relative frequencies of observation and must sum to 100%.
Bar Graph: Used for discrete, categorical data; summarizes the frequency of each unit or category.
Interval or Ratio Scales:
Histogram: Visualizes the distribution of continuous data by representing intervals as vertical rectangles, where adjacent rectangles touch.
Line Graph: Displays data points in a time series or continuous format.
Standard Normal Distribution
A special type of normal distribution where:
The mean (M) equals 0
The standard deviation (SD) equals 1.
Z-Score: A z-score indicates how many standard deviations an element is from the mean.
Transformation formula: $Z = \frac{X - M}{SD}$
Example of z-score calculation:
If M = 12, SD = 2, and X = 14:
$Z = \frac{14 - 12}{2} = 1$
Interpretation of Z-Scores:
Positive z-scores indicate a score above the mean, while negative z-scores indicate a score below the mean.
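The transformation is short enough to write as a single function. This is a sketch; the name `z_score` and the second call (a score of 10, below the mean) are assumptions added for illustration.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    return (x - mean) / sd


print(z_score(14, 12, 2))   # 1.0: one SD above the mean
print(z_score(10, 12, 2))   # -1.0: hypothetical score one SD below the mean
```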
Examples of Z-Scores Calculations
Example 1: For I.Q. Data:
Mean IQ = 100, SD = 15
If X = 132:
$Z = \frac{132 - 100}{15} \approx 2.13$
The unit normal table can then be used to find probabilities, such as the area between z-scores.
Example 2: For another distribution with mean = 14, SD = 3:
If X = 19.1:
First, calculate the z-score:
$Z = \frac{19.1 - 14}{3} = 1.7$
Practical Application of Standard Deviation
The standard deviation is critical in statistical analysis, especially when considering probabilities related to scores within a range. It is important in decision-making processes where understanding variability is essential, such as in fields like psychology, finance, and quality control.
Unit Normal Table: Provides probabilities for different z-scores, showing the area between a z-score and the mean, and the area between the z-score and the tail.
Example Probability Calculation: To evaluate an IQ score of 20 with mean 100 and SD 15, first find the corresponding z-score, then look it up in the unit normal table to find the proportion of the distribution in the region of interest.
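The areas a unit normal table lists can be reproduced with `statistics.NormalDist`. The sketch below assumes a table with two columns (area between the mean and z, and area in the tail beyond z); the helper name `table_row` is hypothetical. It uses the IQ example with M = 100, SD = 15, and X = 132.

```python
from statistics import NormalDist

unit_normal = NormalDist()  # standard normal: mean 0, SD 1


def table_row(z):
    """Return (area between mean and z, area in the tail beyond z)."""
    body = unit_normal.cdf(abs(z))   # area on the larger side of z
    tail = 1 - body                  # area beyond z, in the tail
    mean_to_z = body - 0.5           # area between the mean and z
    return mean_to_z, tail


# IQ example from the text: M = 100, SD = 15, X = 132.
z = (132 - 100) / 15
mean_to_z, tail = table_row(z)
print(round(z, 2), round(mean_to_z, 4), round(tail, 4))
```

Because the normal curve is symmetric, the helper takes the absolute value of z, so the same row serves a z-score and its negative; the two areas always sum to 0.5.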