Group 7

Continuous versus Discrete Random Variables
  • Definitions:

    • Discrete Random Variables: Numeric values that can be listed. Examples include integer values such as the number of classes a student takes, i.e., 0, 1, 2, 3, etc. There are no values possible between integers. The probability distribution table for discrete values sums to 1 (The Law of Total Probability).

    • Continuous Random Variables: Numeric values that can take any value on the number line. Examples include measurements such as the distance driven, which can have an infinite number of decimal places between any two values.

  • Key Concepts:

    • The probabilities for discrete distributions can be displayed using bar charts and include mass at specified values.

    • For discrete probabilities:

    • P(X = 1) = \frac{64}{132} (X is exactly equal to 1)

    • P(X > 1) = \frac{56}{132} (excludes 1)

    • P(X \ge 1) = \frac{64}{132} + \frac{56}{132} = \frac{120}{132} (includes the probability of 1)

    • Understanding operators is crucial:

    • =: exactly equal to

    • > or <: excludes endpoint

    • \ge or \le: includes endpoint
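
The operator rules above can be checked with exact fractions. This is a minimal Python sketch that uses only the two probabilities stated in these notes; the variable names are illustrative and not part of the course material.

```python
from fractions import Fraction

# Probabilities taken from the notes above.
p_eq_1 = Fraction(64, 132)   # P(X = 1): exactly 1
p_gt_1 = Fraction(56, 132)   # P(X > 1): excludes 1

# ">=" includes the endpoint, so add the mass sitting at X = 1.
p_ge_1 = p_eq_1 + p_gt_1     # P(X >= 1)

# Total probability is 1, so "<" is the complement of ">=".
p_lt_1 = 1 - p_ge_1          # P(X < 1)

# "<=" includes the endpoint again.
p_le_1 = p_lt_1 + p_eq_1     # P(X <= 1)

print("P(X >= 1) =", p_ge_1)
print("P(X <  1) =", p_lt_1)
print("P(X <= 1) =", p_le_1)
```

Whether the endpoint is included changes the answer here because a discrete variable carries probability mass at X = 1.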

  • Continuous Distributions:

    • We cannot assign a nonzero probability to a single value of a continuous random variable; instead, we calculate probabilities over intervals, e.g., P(1.6 < X < 2.4).

    • The probability at a single point is always 0. Therefore, P(1.6 < X < 2.4) is equal to P(1.6 \le X \le 2.4), as there is no mass of probability at individual values.

Types of Continuous Distributions
  • The course will cover four main continuous distributions:

    1. Normal Distribution

    2. Uniform Distribution

    3. Student’s t-distribution

    4. Chi-Squared Distribution

  • It is highly recommended to print probability tables for these distributions, as they will be heavily utilized throughout the semester and are handy during tests.

Normal Distribution
  • Characteristics:

    • Bell-shaped: Known as the bell curve due to its shape.

    • Centered at Mean (\mu): The distribution is centered around the mean value.

    • Symmetric: The left half is a mirror image of the right half about the mean.

    • Infinite Range: It theoretically ranges from negative infinity to positive infinity.

  • Parameters:

    • Mean (\mu): Can take any value on the real number line and determines the center of the curve.

    • Standard Deviation (\sigma): Must be a positive value; larger values make the curve wider and flatter, while smaller values make it narrower and taller.

  • Visual Examples:

    • Curves may have the same mean but different standard deviations, showing variability in data spread.

    • Curves may have different means but the same standard deviation, illustrating the same spread around different centers.

Calculating Probabilities in Normal Distribution
  • For any continuous random variable X, the total probability (the area under the entire curve) is 1.

  • Probabilities are calculated for intervals and are represented by the area under the curve over the specified range.

    • Example 1: Employee drive distances are normally distributed with a mean (\mu) of 10 miles and a standard deviation (\sigma) of 3 miles. To find P(X > 15):

    • The relevant area is the shaded region to the right of 15 miles.

    • Example 2: Test scores have mean 80.6 and standard deviation 6.4. Finding P(X < 94.7) means identifying the area to the left of that score.

    • Example 3: Lifetimes of machine components have mean 17 months and standard deviation 3.8 months. To find the probability that a lifetime lies between 13 and 16 months, we calculate P(13 < X < 16).
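
The three areas in Examples 1 to 3 can also be computed numerically. The sketch below assumes Python's standard-library `statistics.NormalDist` as a stand-in for the printed table; the variable names are illustrative, and results may differ from table answers in the last digit because tables round the Z-score first.

```python
from statistics import NormalDist

# Example 1: drive distances, mean 10 miles, standard deviation 3 miles.
drive = NormalDist(mu=10, sigma=3)
p_example1 = 1 - drive.cdf(15)             # P(X > 15): area to the right of 15

# Example 2: test scores, mean 80.6, standard deviation 6.4.
scores = NormalDist(mu=80.6, sigma=6.4)
p_example2 = scores.cdf(94.7)              # P(X < 94.7): area to the left of 94.7

# Example 3: component lifetimes, mean 17 months, standard deviation 3.8.
life = NormalDist(mu=17, sigma=3.8)
p_example3 = life.cdf(16) - life.cdf(13)   # P(13 < X < 16): area between

for label, p in [("P(X > 15)", p_example1),
                 ("P(X < 94.7)", p_example2),
                 ("P(13 < X < 16)", p_example3)]:
    print(f"{label} = {p:.4f}")
```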

  • Important Information:

    • Total area = 1.

    • Area = probability: The larger the area represented by a shaded region, the higher the probability.

    • Empirical Rule:

    • About 68% of values lie within 1 standard deviation of the mean.

    • About 95% lie within 2 standard deviations.

    • About 99.7% (nearly all) lie within 3 standard deviations.
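
The Empirical Rule percentages are rounded areas under the standard normal curve. A short sketch, assuming Python's standard-library `statistics.NormalDist` (the helper name `within` is hypothetical), recovers them:

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def within(k):
    """Area within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {within(k):.4f}")
```

Because every normal curve standardizes to the same Z curve, the same three areas apply for any mean and standard deviation.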

  • Practice Problems and Solutions:

    • Problem 1: Test scores are normally distributed with a mean (\mu) of 75 and a standard deviation (\sigma) of 8. What is the probability that a randomly selected test score is less than 60?

    • Solution 1:

      1. Convert X = 60 to a Z-score: Z = \frac{60 - 75}{8} = \frac{-15}{8} = -1.875.

      2. Look up Z = -1.88 (rounded) in the Z-table or use a calculator to find the area to the left.

      3. P(X < 60) \approx P(Z < -1.88) = 0.0301.

Standard Normal Distribution
  • The standard normal distribution has \mu = 0 and \sigma = 1. The random variable is denoted as Z.

  • The Z-table provides areas to the left of Z-scores.

  • Finding Probabilities:

    • Example: P(Z < -2.84) involves looking up the value for the area left of Z = -2.84, resulting in P(Z < -2.84) = 0.0023.

    • Example: P(Z > -2.84) would be calculated using the total area law, resulting in P(Z > -2.84) = 1 - P(Z < -2.84) = 0.9977.

  • For positive Z-scores, use the same approach:

    • Example: The table gives P(Z < 1.15) = 0.8749; the area to the right follows from the same complement rule, P(Z > 1.15) = 1 - 0.8749 = 0.1251.

  • Practice Problems and Solutions:

    • Problem 2: Find P(-0.5 < Z < 1.75).

    • Solution 2:

      1. Find P(Z < 1.75) from the Z-table: 0.9599.

      2. Find P(Z < -0.5) from the Z-table: 0.3085.

      3. P(-0.5 < Z < 1.75) = P(Z < 1.75) - P(Z < -0.5) = 0.9599 - 0.3085 = 0.6514.
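
The table lookups in this section can be reproduced in code. This is a sketch that assumes Python's standard-library `statistics.NormalDist`, whose `cdf` method returns the area to the left, just as the Z-table does; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # Z: mean 0, standard deviation 1

# Area to the left of a negative Z-score.
p_left = std_normal.cdf(-2.84)          # P(Z < -2.84)

# Area to the right is the complement of the area to the left.
p_right = 1 - p_left                    # P(Z > -2.84)

# Same approach for a positive Z-score.
p_left_pos = std_normal.cdf(1.15)       # P(Z < 1.15)

# Area between two Z-scores: larger left area minus smaller left area.
p_between = std_normal.cdf(1.75) - std_normal.cdf(-0.5)   # P(-0.5 < Z < 1.75)

print(f"P(Z < -2.84)        = {p_left:.4f}")
print(f"P(Z > -2.84)        = {p_right:.4f}")
print(f"P(Z < 1.15)         = {p_left_pos:.4f}")
print(f"P(-0.5 < Z < 1.75)  = {p_between:.4f}")
```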

Finding Areas Using Symmetry
  • The normal distribution is symmetric, so the left-tail and right-tail areas match: P(Z < -2.01) is equal to P(Z > 2.01).

  • Example: P(Z > 2.37) can be found with either the complement or the symmetry method:

    • Complement: look up P(Z < 2.37) in the Z-table and subtract it from 1. Symmetry: look up P(Z < -2.37) directly, since it equals P(Z > 2.37).

  • Practice Problem (Symmetry):

    • Problem 3: Given P(Z < -1.5) = 0.0668, what is P(Z > 1.5)?

    • Solution 3: Due to the symmetry of the normal distribution, P(Z > 1.5) = P(Z < -1.5). Therefore, P(Z > 1.5) = 0.0668.
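
The complement and symmetry methods give the same right-tail area. The sketch below assumes Python's standard-library `statistics.NormalDist`; the helper names `left_area` and `right_area` are hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()


def left_area(z):
    """P(Z < z): the value a Z-table reports."""
    return std_normal.cdf(z)


def right_area(z):
    """P(Z > z) by the complement method."""
    return 1 - std_normal.cdf(z)


# Symmetry method: P(Z > z) equals P(Z < -z).
for z in (1.5, 2.01, 2.37):
    print(f"z = {z}: complement {right_area(z):.4f}, symmetry {left_area(-z):.4f}")
```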

Conversion to Standard Normal
  • When the data do not follow the standard normal distribution (mean 0, standard deviation 1), convert using:

    Z = \frac{X - \mu}{\sigma}

  • Example: For test scores with mean 78 and standard deviation 6, to find the probability a score is less than 70.5:

    • Convert to Z-score: Z = \frac{70.5 - 78}{6} = -1.25.

    • Look up the area to the left in the Z-table: P(X < 70.5) = P(Z < -1.25) = 0.1056.

  • Practice Problems and Solutions:

    • Problem 4: The weights of adult males are normally distributed with a mean of 180 lbs and a standard deviation of 20 lbs. What is the probability that a randomly selected adult male weighs more than 220 lbs?

    • Solution 4:

      1. Convert X = 220 to a Z-score: Z = \frac{220 - 180}{20} = \frac{40}{20} = 2.

      2. Find P(Z < 2) from the Z-table: 0.9772.

      3. P(X > 220) = 1 - P(Z < 2) = 1 - 0.9772 = 0.0228.
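
The conversion step can be written as a small function. This sketch applies it to the test-score example and to Problem 4; `z_score` is a hypothetical helper name, and `statistics.NormalDist` from Python's standard library stands in for the Z-table.

```python
from statistics import NormalDist


def z_score(x, mu, sigma):
    """Standardize x: how many standard deviations it lies from the mean."""
    return (x - mu) / sigma


std_normal = NormalDist()

# Test scores: mean 78, standard deviation 6, P(X < 70.5).
z_test = z_score(70.5, 78, 6)
p_test = std_normal.cdf(z_test)            # area to the left

# Problem 4: weights with mean 180 lbs, standard deviation 20 lbs, P(X > 220).
z_weight = z_score(220, 180, 20)
p_weight = 1 - std_normal.cdf(z_weight)    # area to the right

print(f"Z = {z_test:.2f}, P(X < 70.5) = {p_test:.4f}")
print(f"Z = {z_weight:.2f}, P(X > 220) = {p_weight:.4f}")
```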

Additional Applications of Normal Distribution
  • We can also work backward from a probability to an X value: e.g., to find the score that marks the lowest 4%, look up the Z-score whose area to the left is 0.04 in the chart, then convert back to a score by rearranging the conversion formula, X = \mu + Z\sigma.
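
A sketch of the reverse lookup follows. The notes do not give a mean or standard deviation for this case, so the code assumes the test-score values used earlier (mean 78, standard deviation 6) purely for illustration. It uses `inv_cdf` from Python's standard-library `statistics.NormalDist` in place of reading the table backward.

```python
from statistics import NormalDist

# Assumed parameters, borrowed from the earlier test-score example.
mu, sigma = 78, 6

# Z-score with 4% of the area to its left (reverse table lookup).
z_cut = NormalDist().inv_cdf(0.04)

# Undo the standardization: X = mu + Z * sigma.
x_cut = mu + z_cut * sigma

print(f"Z cutoff = {z_cut:.4f}")
print(f"Score marking the lowest 4% = {x_cut:.2f}")
```

A printed table only lists Z to two decimals, so a hand calculation would use the closest entry, roughly Z = -1.75, and land on a slightly different score.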

Normal Approximation to Binomial Distribution
  • The approximation is valid when (n)(p) > 5 and (n)(q) > 5, where q = 1 - p (some texts require 10 instead of 5).

  • Calculate the mean \mu = n \times p and the standard deviation \sigma = \sqrt{(n)(p)(q)} of the binomial distribution.

  • A continuity correction adjusts the X value by 0.5 before converting to a Z-score, because a discrete count is being approximated by a continuous normal curve; e.g., P(X \le 12) becomes P(X < 12.5).
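
To see how close the approximation is, the sketch below compares an exact binomial probability with the normal approximation. The values n = 50 and p = 0.3 are hypothetical, chosen only for illustration; the code uses only Python's standard library.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 50, 0.3          # hypothetical values, not from the notes
q = 1 - p

# Check the validity conditions before approximating.
conditions_ok = n * p > 5 and n * q > 5

mu = n * p              # binomial mean
sigma = sqrt(n * p * q) # binomial standard deviation

# Exact binomial probability P(X <= 12): add the mass at 0, 1, ..., 12.
exact = sum(comb(n, k) * p**k * q**(n - k) for k in range(13))

# Normal approximation with continuity correction: P(X <= 12) ~ P(Y < 12.5).
approx = NormalDist(mu, sigma).cdf(12.5)

print("conditions met:", conditions_ok)
print(f"exact binomial       = {exact:.4f}")
print(f"normal approximation = {approx:.4f}")
```

Moving the cutoff from 12 to 12.5 covers the full bar for X = 12 in the binomial bar chart, which is the purpose of the correction.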

Testing for Normality
  • Normality can be visually checked through histograms or QQ plots.

  • For instance:

    • A bell-shaped histogram suggests normality (as examined with Set A).

    • A QQ plot whose points fall along a 45° line suggests a normal distribution.

  • The Z-scores for a QQ plot are computed from each observation's rank: the rank is converted to an area (percentile), and the area is converted to a Z-score.
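
The rank-to-area-to-Z-score idea can be sketched as follows. The area formula (rank - 0.5) / n is one common convention and is an assumption here, since the notes do not say which formula the course uses; the sample data and the function name `normal_scores` are hypothetical.

```python
from statistics import NormalDist


def normal_scores(data):
    """Pair each sorted observation with the Z-score expected for its rank."""
    n = len(data)
    std_normal = NormalDist()
    pairs = []
    for rank, x in enumerate(sorted(data), start=1):
        area = (rank - 0.5) / n              # assumed plotting position
        pairs.append((std_normal.inv_cdf(area), x))
    return pairs


sample = [62, 71, 68, 75, 80, 66, 73, 77, 70, 69]   # hypothetical data
for z, x in normal_scores(sample):
    print(f"{z:6.2f}  {x}")
```

Plotting the observations against these Z-scores gives the QQ plot; points that fall close to a straight line are consistent with normality.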