Study Notes for STAT 1000: Density Curves & Normal Distributions

Density Curves

  • Definition: Density curves provide a model for the distribution of a continuous random variable.

    • They offer an overall picture without considering small irregularities or outliers.

    • Smooth curves are easier to work with than histograms.

Histogram Example

  • A histogram of test scores for 5,000 high school students taking a provincial math exam illustrates the distribution of scores.

  • Observations:

    • Scores exhibit a fairly regular, symmetric distribution.

    • Pattern descends smoothly from the center; no gaps or outliers present.

  • Fitting a Curve: A smooth curve can be fitted to approximate the histogram's distribution, representing a mathematical model of the data.

Area Representation

  • Areas under a density curve represent proportions of observations.

  • Like histograms, the area underneath the density curve can be scaled to equal one, indicating relative frequencies or proportions within intervals.

    • For example, the proportion of students scoring less than 60 can be calculated as follows:

      • If 1,600 of the 5,000 students scored below 60, the proportion is $\frac{1600}{5000} = 0.32$.

Properties of Density Curves

  • All density curves have three key properties:

    • They lie on or above the $x$-axis (the curve is never negative).

    • The area under the curve is equal to one.

    • They represent a proper function: each value of $x$ corresponds to a unique value of $y$.
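The first two properties can be checked numerically for a candidate curve. The sketch below is only an illustration: the name `is_valid_density`, the midpoint-rule integration, and the tolerance are assumptions, not something defined in these notes.

```python
def is_valid_density(f, lo, hi, n=100_000, tol=1e-6):
    """Approximate check that f behaves like a density curve on [lo, hi].

    Uses a midpoint Riemann sum, so the answer is only as good as the grid.
    """
    width = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * width   # midpoint of the i-th thin slice
        y = f(x)
        if y < 0:                    # property 1: never below the x-axis
            return False
        total += y * width           # add the area of one thin rectangle
    return abs(total - 1.0) < tol    # property 2: total area equals one


# Hypothetical curves: a flat curve of height 0.25 on [0, 4], and one that is too tall.
uniform_ok = is_valid_density(lambda x: 0.25, 0, 4)
too_tall = is_valid_density(lambda x: 0.5, 0, 4)
print(uniform_ok, too_tall)
```

The third property holds automatically here because `f` is a Python function, which returns exactly one value for each input.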

Large Populations

  • When dealing with large populations (e.g., heights, incomes, GPAs):

    • Continuous populations lead to smoother histograms approximating density curves.

    • For infinitely large populations, histograms can have as many intervals as needed, producing a density curve.

Uniform Distribution

  • Definition: A uniform distribution is one in which all values in a range are equally likely.

  • Mathematical Properties:

    • Area under the curve equals the area of a rectangle defined by base and height.

    • For example, on the interval $[0, 1]$ the height is $1$, so Area $= (1 - 0) \cdot \text{height} = 1$, confirming it is a valid density curve.

    • For the interval $[0, 0.45]$: Proportion $= \text{Area} = (0.45 - 0) \cdot 1 = 0.45$. The remaining interval $[0.45, 1]$ has proportion $(1 - 0.45) \cdot 1 = 0.55$.

Finding Proportions in a Uniform Distribution

  • Finding the proportion that falls within an interval is straightforward because the region under a uniform density is a rectangle:

    • For any range $[a, b]$ inside the distribution's interval, $\text{Area} = (b - a) \cdot \text{height}$.

Example Problems

  • Area between values: Calculate the area (proportion of observations) between 1.6 and 3.3.

    • With a uniform height of $0.25$ (a distribution spread over an interval of width $4$): $P(1.6 < X < 3.3) = (3.3 - 1.6) \cdot \text{height} = 1.7 \cdot 0.25 = 0.425$.

  • Solving Percentiles: Find the value below which a given proportion of observations falls (e.g., the value with 10% of observations below it in a uniform distribution).
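Both example problems reduce to rectangle arithmetic. The sketch below assumes the distribution is uniform on $[0, 4]$, which is one interval consistent with the height of $0.25$ used above; the function names are hypothetical.

```python
def uniform_proportion(a, b, lo, hi):
    """Proportion of a uniform distribution on [lo, hi] that falls in [a, b]."""
    height = 1 / (hi - lo)           # rectangle height that makes total area 1
    a, b = max(a, lo), min(b, hi)    # ignore any part of [a, b] outside [lo, hi]
    return max(b - a, 0) * height    # base times height


def uniform_percentile(p, lo, hi):
    """Value x such that a proportion p of the distribution lies below x."""
    return lo + p * (hi - lo)        # solve (x - lo) * height = p for x


# Assumed interval [0, 4]; not stated explicitly in the notes.
area = uniform_proportion(1.6, 3.3, 0, 4)
tenth = uniform_percentile(0.10, 0, 4)
print(round(area, 3), round(tenth, 3))  # 0.425 0.4
```

The percentile function is the area formula run backward: instead of going from an interval to an area, it goes from an area to the right-hand endpoint.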

Triangular Distribution

  • The maximum height $h$ of a triangular density is determined by requiring the total area to equal $1$. For a triangle:

    • Area $= \frac{1}{2} \cdot \text{base} \cdot h$. Consider a triangle with a base of $5$.

    • The height must equal $h = \frac{2}{5}$ so that the area is $1$.
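Written out step by step for the base of 5 used in this example, the normalization is:

```latex
\[
\text{Area} = \frac{1}{2} \cdot \text{base} \cdot h = 1
\quad\Longrightarrow\quad
\frac{1}{2} \cdot 5 \cdot h = 1
\quad\Longrightarrow\quad
h = \frac{2}{5} = 0.4
\]
```

The same steps give $h = \frac{2}{\text{base}}$ for a triangle of any base.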

Parameters & Statistics

  • Today’s statistics class defined key terms:

    • Sample mean $\bar{x}$ vs. population mean $\mu$.

    • Sample standard deviation $s$ vs. population standard deviation $\sigma$.

    • Parameters describe populations; statistics are computed from samples.
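The distinction shows up in how the standard deviation is computed. The sketch below uses Python's standard `statistics` module on a small made-up data set; the five values are hypothetical and are not data from the course.

```python
import statistics

# Hypothetical measurements, invented for illustration only.
data = [352.1, 355.4, 349.8, 354.0, 353.7]

# Treating the data as a sample drawn from a larger population:
x_bar = statistics.mean(data)     # sample mean, a statistic
s = statistics.stdev(data)        # sample standard deviation (divides by n - 1)

# Treating the same five values as the entire population:
mu = statistics.mean(data)        # population mean, a parameter
sigma = statistics.pstdev(data)   # population standard deviation (divides by n)

print(round(x_bar, 2), round(s, 3), round(sigma, 3))
```

The mean formula is the same in both cases; what changes is the interpretation. The two standard deviations differ because `stdev` divides by $n - 1$ while `pstdev` divides by $n$.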

Example Problem

  • The MGD Beer example uses the average content of cans to practice deciding whether a reported value is a parameter or a statistic.

The Normal Distribution

  • A specific type of density curve, known for its bell shape and symmetric distribution. Key characteristics include:

    • Defined by parameters: population mean $\mu$ and standard deviation $\sigma$.

    • The distribution is symmetric and has a total area under the curve equal to one.

  • Notation: The normal distribution is denoted as $X \sim N(\mu, \sigma)$.

Properties of Normal Distribution

  • Two main parameters:

    • Mean ($\mu$): Location of the center of the distribution.

    • Standard deviation ($\sigma$): Measure of spread; must be positive.

Relationship Between Parameters

  • The mean indicates central location, while the standard deviation reflects the spread:

    • For example, with $\mu = 100$ and $\sigma = 10$, the normal distribution is written $X \sim N(100, 10)$.

68-95-99.7 Rule

  • Key statistical rule for normal distributions:

    • Approximately:

      • 68% of values fall within one standard deviation of the mean ($\mu \pm \sigma$).

      • 95% of values fall within two standard deviations ($\mu \pm 2\sigma$).

      • 99.7% of values fall within three standard deviations ($\mu \pm 3\sigma$).

Example Application

  • If $X \sim N(150, 20)$, then by the rule about 68% of values fall between 130 and 170, which is $\mu \pm \sigma$.

  • Likewise, about 95% of values fall between 110 and 190, the range within 2 standard deviations of the mean.
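The rule's percentages are rounded. As a rough check, the sketch below computes the areas for $N(150, 20)$ with `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `proportion_between` is an assumption made for this illustration.

```python
from statistics import NormalDist

X = NormalDist(mu=150, sigma=20)


def proportion_between(dist, low, high):
    """Area under the normal curve between low and high."""
    return dist.cdf(high) - dist.cdf(low)


within_1 = proportion_between(X, 130, 170)   # mean plus or minus 1 sd
within_2 = proportion_between(X, 110, 190)   # mean plus or minus 2 sd
within_3 = proportion_between(X, 90, 210)    # mean plus or minus 3 sd
print(round(within_1, 4), round(within_2, 4), round(within_3, 4))
```

The computed areas are about 0.6827, 0.9545, and 0.9973, which the rule rounds to 68%, 95%, and 99.7%.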

Standard Normal Distribution

  • A special type of normal distribution with:

    • Mean $\mu = 0$, standard deviation $\sigma = 1$.

    • Denoted by the variable $Z$, where $Z \sim N(0, 1)$.

Transforming To Standard Normal

  • To convert a normal variable $X$ into a standard normal variable $Z$, calculate the z-score as follows:

    • $Z = \frac{x - \mu}{\sigma}$.

Example Calculations

  • For a height of 187 cm when $X \sim N(178, 6)$:

    • $Z = \frac{187 - 178}{6} = 1.5$. This means the height is 1.5 standard deviations above the population mean.
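The standardization step is a one-line computation. A minimal sketch follows, with a hypothetical function name `z_score` and a guard for the requirement that the standard deviation be positive:

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


z = z_score(187, 178, 6)
print(z)  # 1.5
```

A negative result would indicate a value below the mean; for instance, a height of 169 cm in the same population gives a z-score of -1.5.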

Proportion Calculations with Standard Normal Distribution

  • Techniques for finding proportions using the $Z$ transformation process:

    • Sketch the normal curve, shading the area required for the computation. Use properties and known values from the standard normal table to derive answers.

Example problems

  1. For $P(-1 < Z < 1)$ the probability is approximately $0.68$.

  2. For $P(Z > 2)$, symmetry gives the same area as the left tail $P(Z < -2)$. By the 68-95-99.7 rule, about 5% of values lie more than two standard deviations from the mean, so each tail holds approximately $0.025$.
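These two answers come from the 68-95-99.7 rule, so they are approximations. The sketch below uses `statistics.NormalDist` in place of a printed table to show the more precise areas; the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

p_within_1 = Z.cdf(1) - Z.cdf(-1)   # P(-1 < Z < 1)
p_above_2 = 1 - Z.cdf(2)            # P(Z > 2), using the complement
p_below_minus_2 = Z.cdf(-2)         # P(Z < -2), the mirror-image left tail

print(round(p_within_1, 4), round(p_above_2, 4), round(p_below_minus_2, 4))
```

The two tail areas agree, which is the symmetry argument in numerical form. The table-style value is about 0.0228, slightly less than the rule-of-thumb 0.025.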

Backward Normal: Finding z-values

  • To find the value $z$ that corresponds to a known proportion, look up the proportion in the standard normal table and read off the matching $z$, using symmetry or complements when the table gives the opposite tail.

Finding Percentiles & Quantile Ranges

  • To find the value at a given percentile, locate the corresponding proportion in the standard normal table, read off the z-value, and use symmetry where it helps.

    • Example: Determine the interquartile range of a standard normal distribution by finding Q1 and Q3.
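The backward lookup can be sketched with `NormalDist.inv_cdf`, which plays the role of reading a table in reverse. The helper `value_at_percentile` and the use of $N(178, 6)$ for the final line are assumptions chosen for illustration.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

q1 = Z.inv_cdf(0.25)   # z-value with 25% of the area to its left
q3 = Z.inv_cdf(0.75)   # z-value with 75% of the area to its left
iqr = q3 - q1


def value_at_percentile(p, mu, sigma):
    """Undo the z-score: x = mu + z * sigma for the z at proportion p."""
    return mu + Z.inv_cdf(p) * sigma


print(round(q1, 4), round(q3, 4), round(iqr, 4))
print(round(value_at_percentile(0.75, 178, 6), 2))
```

By symmetry Q1 and Q3 are negatives of each other (about -0.6745 and 0.6745), so the interquartile range is about 1.349.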

Conclusion

  • These notes covered how to find areas under the normal curve, how to standardize a normal variable using its mean and standard deviation, and how to work backward from a target proportion to a value. The examples apply these steps in several contexts.

Next Steps: Unit 05 - Probability & Sampling Distribution

Probability (Unit 05)

  • Definition: Probability quantifies the likelihood of an event occurring. It is a value between 0 and 1 (or 0% and 100%).

    • When all outcomes are equally likely: $P(\text{event}) = \frac{\text{Number of favorable outcomes}}{\text{Total number of possible outcomes}}$

  • Key Concepts:

    • Experiment: A process that leads to well-defined outcomes.

    • Outcome: A single possible result of an experiment.

    • Sample Space ($S$): The set of all possible outcomes of an experiment.

    • Event: A subset of the sample space; a collection of one or more outcomes.

Rules of Probability

  1. Rule 1: Probability Range: For any event $A$, $0 \le P(A) \le 1$.

  2. Rule 2: Sum of Probabilities: The sum of probabilities of all possible outcomes in a sample space is 1: $P(S) = 1$.

  3. Rule 3: Complement Rule: The probability that an event $A$ does not occur is $P(A^c) = 1 - P(A)$.

  4. Rule 4: Addition Rule for Disjoint Events: If two events $A$ and $B$ are disjoint (mutually exclusive), meaning they cannot occur at the same time, then $P(A \text{ or } B) = P(A) + P(B)$.

    • Example: Rolling a 1 or a 6 on a single die roll.

  5. Rule 5: General Addition Rule: For any two events $A$ and $B$ (not necessarily disjoint), $P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$.

    • $P(A \text{ and } B)$ is the probability that both $A$ and $B$ occur.

  6. Rule 6: Multiplication Rule for Independent Events: If two events $A$ and $B$ are independent, meaning the occurrence of one does not affect the probability of the other, then $P(A \text{ and } B) = P(A) \cdot P(B)$.

    • Example: Flipping a coin twice and getting heads both times.

  7. Rule 7: Conditional Probability: The probability of event $B$ occurring given that event $A$ has already occurred is denoted as $P(B \mid A)$.

    • $P(B \mid A) = \frac{P(A \text{ and } B)}{P(A)}$, provided $P(A) > 0$.

  8. Rule 8: General Multiplication Rule: For any two events $A$ and $B$ (not necessarily independent), $P(A \text{ and } B) = P(A) \cdot P(B \mid A)$.
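Several of these rules can be checked by counting outcomes for one roll of a fair die. The sketch below is an illustration under that assumption; the events `even` and `high` and the helper `prob` are invented for the example and do not come from the notes.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}   # one roll of a fair six-sided die


def prob(event):
    """P(event) when every outcome in the sample space is equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))


A = {1}             # roll a 1
B = {6}             # roll a 6
even = {2, 4, 6}    # hypothetical event: roll an even number
high = {4, 5, 6}    # hypothetical event: roll at least a 4

p_total = prob(sample_space)                               # Rule 2
p_not_a = 1 - prob(A)                                      # Rule 3
p_a_or_b = prob(A) + prob(B)                               # Rule 4 (A, B disjoint)
p_even_or_high = prob(even) + prob(high) - prob(even & high)  # Rule 5
p_high_given_even = prob(even & high) / prob(even)         # Rule 7
p_even_and_high = prob(even) * p_high_given_even           # Rule 8
p_two_heads = Fraction(1, 2) * Fraction(1, 2)              # Rule 6, two fair coin flips

print(p_not_a, p_a_or_b, p_even_or_high, p_high_given_even, p_two_heads)
```

Using `Fraction` keeps the results exact, so the Rule 5 answer can be compared directly with counting the outcomes in `even | high`.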

Random Variables

  • Definition: A random variable is a numerical outcome of a random phenomenon.

  • Types of Random Variables:

    • Discrete Random Variable: A random variable that can take on a finite or countably infinite number of values. These values are often integers.

      • Examples: Number of heads in 3 coin flips ($0, 1, 2, 3$), number of cars passing a certain point in an hour.

    • Continuous Random Variable: A random variable that can take on any value within a given range.

      • Examples: Height, weight, temperature, time. (As discussed with density curves).
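The coin-flip example of a discrete random variable can be enumerated directly. This sketch assumes a fair coin, so that all eight sequences of three flips are equally likely; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All sequences of three flips, e.g. ('H', 'T', 'H'); 2 * 2 * 2 = 8 in total.
flips = list(product("HT", repeat=3))

# The random variable X maps each sequence to its number of heads.
counts = Counter(seq.count("H") for seq in flips)

# Probability of each value of X, as exact fractions.
distribution = {x: Fraction(n, len(flips)) for x, n in sorted(counts.items())}

for x, p in distribution.items():
    print(x, p)
```

The variable takes only the values 0, 1, 2, and 3, each with its own probability, which is what makes it discrete rather than continuous.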

Probability Distributions for Discrete Random Variables

  • A probability distribution for a discrete random variable lists all possible values the variable can take and their corresponding probabilities.

  • Properties:

    • $0 \le P(X = x) \le 1$ for each possible value $x$.

    • $\sum P(X = x) = 1$ (The sum of