Comprehensive Guide to Continuous and Discrete Probability Distributions
Standardization of the Normal Density Function
- Difficulty of Direct Integration: The normal density function has no closed-form antiderivative, so probabilities cannot be found by integrating it directly by hand. The standard procedure is instead to convert a probability statement about a random variable x into a statement about a z-score.
- The Z-Transformation Formula: To standardize a value, use the formula:
z = (x − μ) / σ
- x: The specific value of the random variable.
- μ: The population mean.
- σ: The population standard deviation.
- Parameters of the Standard Normal Distribution: Once converted to the z-scale, the resulting graph (the z-graph) always has fixed parameters:
- The mean (μ) is always equal to 0.
- The standard deviation (σ) is always equal to 1.
- Preservation of Area: The shaded region (representing probability) has the same area under the x curve as under the standardized z curve, because standardizing only shifts and rescales the horizontal axis. The area under the density function represents the probability.
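The standardization step and the preservation of area can be checked numerically. The following is a minimal sketch using Python's standard library; the values μ = 100, σ = 15, and x = 120 are hypothetical and do not come from the lecture.

```python
from statistics import NormalDist

# Hypothetical example values (assumptions, not from the lecture).
mu, sigma, x = 100.0, 15.0, 120.0

# Standardize: z = (x - mu) / sigma
z = (x - mu) / sigma

# Area to the left of x under the original normal curve.
area_x = NormalDist(mu, sigma).cdf(x)

# Area to the left of z under the standard normal curve (mean 0, sd 1).
area_z = NormalDist(0.0, 1.0).cdf(z)

print(round(z, 2))                    # z rounded to two decimals, as on the table
print(abs(area_x - area_z) < 1e-12)   # True: the two areas agree
```

Both calls describe the same shaded region, which is why a single standard normal table is enough for every normal distribution.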
Probability Calculations Using the Standard Normal Table
- Table Format Requirements: To use the normal distribution tables provided, the probability statement must be expressed in a "less than" format (cumulative from the left).
- Precision and Rounding Rules:
- The z-score on the table typically extends to two decimal places.
- If a calculation results in a z-score with more than two decimal places, it must be rounded to two decimal places to match the table's level of precision.
- Specific Calculation Scenarios:
- Area to the Left (Less Than): Used directly from the table format.
- Area to the Right (Greater Than): Calculated as 1 - P(Z < z).
- Range Probabilities (Between Two Values): To find the probability that Z lies between z_1 and z_2 (that is, z_1 < Z < z_2), subtract the cumulative probability at the lower bound from the cumulative probability at the upper bound:
P(z_1 < Z < z_2) = P(Z < z_2) - P(Z < z_1)
- Self-Verification and Common Sense:
- Students should always double-check their results against their visual graph.
- If the shaded region represents more than half of the curve, the calculated probability must be greater than 0.5.
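The three scenarios above can be sketched in code, with `NormalDist().cdf` standing in for the printed table. The helper names `left`, `right`, and `between` are illustrative choices, not part of the course material, and the z-values are hypothetical.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def left(z):
    """P(Z < z): the form the table gives directly."""
    # Round to two decimals to mirror the table's precision.
    return Z.cdf(round(z, 2))

def right(z):
    """P(Z > z) = 1 - P(Z < z)."""
    return 1.0 - left(z)

def between(z1, z2):
    """P(z1 < Z < z2) = P(Z < z2) - P(Z < z1)."""
    return left(z2) - left(z1)

p_left = left(1.25)          # approximately 0.8944
p_right = right(1.25)        # approximately 0.1056
p_mid = between(-1.0, 1.0)   # approximately 0.6827

# Common-sense check: the region from -1 to 1 covers more than half
# of the curve, so the probability must exceed 0.5.
assert p_mid > 0.5
```

The final assertion plays the role of the visual check against the graph: if it fails, the subtraction was probably done in the wrong order.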
Finding Variable Values from Given Areas (Inverse Normal)
- Direction of Area: When given a probability (area), identify if it is representing the area to the left or the area to the right of the variable.
- If the given area a lies to the right of the value z_a, it does not match the table's "less than" format. In that case, calculate 1 − a and look up that area in the table to find the corresponding z-score.
- Averaging Z-Scores (Tracking): If a specific area value falls exactly between two entries on the normal table, you must take both corresponding z-scores and average them.
- This process results in a z-score that extends to three decimal places (e.g., the midpoint between 1.96 and 1.97 would be 1.965).
- Solving for the Random Variable x: Once the appropriate z-score is found, use the known mean (μ) and standard deviation (σ) to solve for x using a rearrangement of the standardization formula:
x = μ + (z × σ)
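The inverse procedure can be sketched as follows. Here `NormalDist().inv_cdf` replaces the reverse table lookup, `average_z` is a hypothetical helper that mimics the averaging rule, and μ = 100 and σ = 15 are assumed values chosen only for illustration.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Area of 0.975 lies to the LEFT of z: usable directly.
z_left = Z.inv_cdf(0.975)        # approximately 1.96

# Area of 0.025 lies to the RIGHT of z: convert to the left area 1 - a first.
z_right = Z.inv_cdf(1 - 0.025)   # same z as above

def average_z(z_low, z_high):
    """Average two adjacent table z-scores, giving three decimal places."""
    return round((z_low + z_high) / 2, 3)

# The example from the notes: midway between 1.96 and 1.97.
z_avg = average_z(1.96, 1.97)    # 1.965

# Convert back to the original scale: x = mu + z * sigma.
mu, sigma = 100.0, 15.0          # hypothetical parameters
x = mu + z_avg * sigma           # 100 + 1.965 * 15 = 129.475
```

A software inverse does not need the averaging step, since it is not limited to two decimal places; the helper is included only to mirror the table-based method.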
Classification of Density Functions
- Continuous Density Functions:
- Identification: You will know a problem is continuous if the term "density function" is mentioned.
- Individual Point Probabilities: For a continuous density function, the probability that the variable equals any single specific value, P(X = x), is always zero. Probabilities are meaningful only over intervals. This is a fundamental characteristic of continuous data.
- Types Observed: Normal, Exponential (calculated using formulas for "surface learning"), and Uniform (also called Rectangular).
- Types to Ignore: For the purposes of current testing, ignore distribution types such as Chi-squared and F-distributions (sections 4, 5, and 6).
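One way to see why a single point carries zero probability is to shrink an interval around it and watch the area fall toward zero. This is an illustrative sketch only; the standard normal distribution, the point 1.0, and the interval widths are arbitrary assumptions.

```python
from statistics import NormalDist

X = NormalDist(0.0, 1.0)   # any continuous distribution would do
point = 1.0                # arbitrary point of interest

# Probability of landing in a window centered on the point,
# for narrower and narrower windows.
widths = [0.1, 0.01, 0.001, 0.0001]
probs = [X.cdf(point + w / 2) - X.cdf(point - w / 2) for w in widths]

for w, p in zip(widths, probs):
    print(f"width {w:<7} probability {p:.6f}")

# A window of width zero has no area at all.
point_prob = X.cdf(point) - X.cdf(point)   # exactly 0.0
```

A practical consequence is that P(X < x) and P(X ≤ x) are equal for continuous variables, so the strictness of the inequality does not change the answer.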
The Discrete Distribution Pathway
- Identification: If a problem does not state that the distribution is normal, exponential, or uniform (rectangular), it is likely a discrete distribution problem.
- Sample Space Creation: If events within the problem are dependent, you must create your own sample space.
- Identifying Given Data Types:
- Marginals: If the given values are marginal probabilities, construct a probability tree.
- Intersections: If the given values are intersections, construct a cross-classification table.
- Creating a Probability Distribution: Use the sample space to list all possible values of the random variable x and their associated probabilities.
- Required Calculations:
- Expected Value (Mean): Calculation of E(x) is required.
- Variance and Standard Deviation: While these can be calculated, they are not required for exams because they are time-consuming to compute manually.
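The expected value calculation can be sketched from a probability distribution listed as value and probability pairs. The distribution below (number of heads in two fair coin flips) is a hypothetical example, not one taken from the lecture. Variance and standard deviation are included only to show how they follow from the same table.

```python
# Hypothetical distribution: x = number of heads in two fair coin flips.
distribution = {0: 0.25, 1: 0.50, 2: 0.25}

# A valid probability distribution must sum to 1.
assert abs(sum(distribution.values()) - 1.0) < 1e-9

# Expected value: E(x) = sum of x * P(x).
expected = sum(x * p for x, p in distribution.items())   # 1.0

# Variance: sum of (x - E(x))^2 * P(x); standard deviation is its square root.
variance = sum((x - expected) ** 2 * p for x, p in distribution.items())  # 0.5
std_dev = variance ** 0.5

print(expected, variance, round(std_dev, 4))
```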
Specialized Discrete Models (Binomial and Poisson)
- Binomial Distribution: Use this if the specific properties of a binomial experiment are met. The distribution is defined by the formula:
P(x) = C(n, x) × p^x × (1 − p)^(n − x), where C(n, x) = n! / (x! (n − x)!)
- Table access: Binomial tables can be used as a shortcut for these calculations.
- Poisson Distribution: Use this for a one-parameter experiment where μ (the average rate of success) is given over a specific interval. The formula is:
P(x) = (e^(−μ) × μ^x) / x!
- Table access: Poisson tables are available to simplify calculations.
- Other Models: Hypergeometric and geometric distributions exist, but they will not be included on the test.
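Both formulas translate directly into code, which can serve as a check on values read from the tables. This is a minimal sketch; the function names `binomial_pmf` and `poisson_pmf` and the input values are assumptions chosen for illustration.

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    """P(x) = C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x, mu):
    """P(x) = e^(-mu) * mu^x / x!."""
    return exp(-mu) * mu**x / factorial(x)

# Hypothetical inputs.
b = binomial_pmf(2, 5, 0.5)    # C(5, 2) / 2^5 = 10 / 32 = 0.3125
q = poisson_pmf(0, 2.0)        # e^(-2), approximately 0.1353

# A "less than or equal" probability is a sum of individual terms.
b_cumulative = sum(binomial_pmf(k, 5, 0.5) for k in range(3))   # P(X <= 2) = 0.5
```

Printed tables may list either individual or cumulative probabilities, so it is worth confirming which kind a given table uses before comparing it with these functions.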
Exam Requirements and Problem-Solving Strategy
- Step 1: High-Level Classification: Determine if the variable is continuous or discrete. This dictates the entire path for the solution.
- Step 2: Documentation:
- Probability statements must accompany all answers.
- Full work must be shown.
- Graphs are mandatory and must be accurate to the problem.
- Step 3: Execution: Use formulas or tables depending on whether properties of specific distributions (like Binomial or Poisson) are met, rather than taking the "hard route" of manually creating sample spaces for independent events.
Questions & Discussion
- Student/Audience Interaction:
- Instructor: "What are you gonna have to do?"
- Audience: "Track."
- Instructor: "Yeah. You find the z and you average the z. Okay?"