Entire Exam 2 notes
Chapter Four: Introductory Probability
Introduction to Probability
Probability pertains to the likelihood of an event occurring based on repeated experiments under identical conditions.
The speaker expresses a passion for probability, highlighting the contrast with students' general discomfort with the topic.
The Basics of Probability
Probability Notation:
Notated as P(E), where "E" is any event (e.g., P(friends getting married), P(dying on the way home)).
Probability Range:
Probability values must fall between 0 and 1:
0 = impossible event
1 = certain event
Improper probabilities: Values below 0 or above 1 are invalid (e.g., -15% is not a probability).
Mutually Exclusive Events
Definition: Events that cannot happen simultaneously.
Example: Flipping a coin results in either heads or tails - not both.
In contexts like sports, if one team wins, the other team can't win simultaneously (e.g., Eagles vs Chiefs in the Super Bowl).
Probability Calculation:
For mutually exclusive events that together cover every possible outcome (collectively exhaustive), the probabilities sum to 1.
Example: Coin flip: P(heads) + P(tails) = 0.5 + 0.5 = 1.0.
Other example: If P(Chiefs winning) = 60%, then P(Eagles winning) = 40%.
Independent Events
Definition: The occurrence of one event does not affect the occurrence of another event.
Example: Someone eating at Chick-fil-A doesn't affect someone else eating at McDonald's.
Probability Calculation:
For independent events, P(E1 and E2) = P(E1) * P(E2).
Example: If P(E1) = 0.65 and P(E2) = 0.45, then P(E1 and E2) = 0.65 * 0.45 = 0.2925.
Calculating Probability for Events
Probability of Either Event:
Calculation formula: P(E1 or E2) = P(E1) + P(E2) - P(E1 and E2).
Problem-solving involves defining probabilities and accounting for overlaps.
Example: For the probability of heads on the first or the second of two coin flips, P = 0.5 + 0.5 - 0.25 = 0.75; subtracting P(both heads) keeps the outcome HH from being counted twice.
Sample Space
Definition: The set of all possible outcomes of an experiment.
Example: Flipping a coin twice results in these outcomes: HH, HT, TH, TT.
To count the possibilities, multiply together the number of outcomes for each flip (e.g., 2 outcomes for the first flip x 2 for the second = 4 total outcomes).
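As a rough illustration of the sample space idea, the sketch below lists every ordered outcome for a given number of coin flips. The function name `sample_space` is made up for this example and is not from the lecture.

```python
from itertools import product

def sample_space(n_flips):
    """Return every ordered outcome of n_flips coin flips, e.g. 'HT'."""
    # product("HT", repeat=n) pairs each flip's 2 outcomes with every other flip's
    return ["".join(flips) for flips in product("HT", repeat=n_flips)]

outcomes = sample_space(2)
print(outcomes)       # ['HH', 'HT', 'TH', 'TT']
print(len(outcomes))  # 4, i.e. 2 x 2
```

Each extra flip doubles the count, so three flips give 2 x 2 x 2 = 8 outcomes.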
Relative Frequency Assessment
Definition: An empirical way to estimate probabilities based on past occurrences.
Example: A Starbucks manager might say the probability of a customer ordering a caffeinated drink is 1570 out of 2250 total sales (1570/2250 ≈ 0.698).
Umbrella issue: Insufficient models may miss key data affecting probabilities, such as time of day for drink sales.
Subjective Probability Assessment
Definition: A personal estimate of the chance of an event occurring, often without statistical backing, like predicting weather or game outcomes based on gut feeling.
Practical Example of Drawing Cards
Problem statement: What is the probability of drawing a red card or an ace from a standard deck of 52 cards?
Count of red cards = 26 (13 hearts, 13 diamonds).
Count of aces = 4 (one from each suit).
Adjust for double counting the 2 red aces: Use P(Red) + P(Ace) - P(Red Ace).
Calculation: 26/52 + 4/52 - 2/52 = 28/52 ≈ 0.538 (53.8%).
Conclusion
The chapter covers foundational aspects of probability essential for further statistical understanding.
The speaker encourages relatable examples to reinforce concepts while preparing for the exam.
Introduction
Class discussion focuses on probability and relevant formulas for upcoming test.
Independent Events
Probability of two independent events:
Given probabilities: P(E1) = 0.6, P(E2) = 0.45.
Formula for both events occurring: P(E1 and E2) = P(E1) * P(E2).
Example calculation: 0.6 * 0.45 = 0.27.
Next question involves finding the probability of either event occurring:
Formula: P(E1 or E2) = P(E1) + P(E2) - P(E1 and E2).
Calculation example:
P(E1 or E2) = 0.6 + 0.45 - 0.27 = 0.78.
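The two formulas above can be checked with a short script. This is only a sketch; the helper names `prob_and_independent` and `prob_or` are invented here for illustration.

```python
def prob_and_independent(p1, p2):
    """P(E1 and E2) when the two events are independent."""
    return p1 * p2

def prob_or(p1, p2, p_both):
    """P(E1 or E2): add both probabilities, then subtract the overlap once."""
    return p1 + p2 - p_both

p_both = prob_and_independent(0.6, 0.45)
p_either = prob_or(0.6, 0.45, p_both)
print(round(p_both, 4))    # 0.27
print(round(p_either, 4))  # 0.78
```

For mutually exclusive events the same `prob_or` applies with `p_both` set to 0.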
Mutually Exclusive vs Independent Events
For mutually exclusive events, P(E1 and E2) = 0 because they cannot occur together.
Venn diagram comparison:
Mutually exclusive: no overlap between events.
Independent: overlap is possible; hence overlap must be subtracted to avoid double counting when calculating probabilities.
Example: Drawing Cards
Probability focused on drawing cards:
P(Ace) = 4/52 = 7.7%.
P(Red card) = 26/52 = 50%.
P(Red card and Ace): Only two possible cards (Ace of Diamonds and Ace of Hearts) = 2/52 = 3.8%.
Calculation of P(Red card or Ace): (26 + 4 - 2)/52 = 28/52 = 53.8%.
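One way to double-check the card example is to build the deck and count directly. The sketch below is an illustration only; the `prob` helper and the way cards are represented as (rank, suit) pairs are choices made for this example.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
SUITS = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(RANKS, SUITS))  # 52 (rank, suit) pairs

def is_red(card):
    return card[1] in ("hearts", "diamonds")

def is_ace(card):
    return card[0] == "A"

def prob(event):
    """Share of cards in the deck for which event(card) is true."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))

p_red = prob(is_red)                                     # 26/52
p_ace = prob(is_ace)                                     # 4/52
p_red_ace = prob(lambda c: is_red(c) and is_ace(c))      # 2/52
p_red_or_ace = prob(lambda c: is_red(c) or is_ace(c))    # counted directly

# Direct count agrees with the addition rule
assert p_red_or_ace == p_red + p_ace - p_red_ace
print(p_red_or_ace, float(p_red_or_ace))  # 7/13, about 0.538
```

`Fraction` reduces 28/52 to 7/13, which is the same 53.8%.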
Conditional Probability
Definition: Probability of an event given another event has occurred (dependent events).
Relevant for situations like sports outcomes where one event affects the probability of another:
Example: Calculating probability that Chiefs win Super Bowl given they win AFC Championship.
Formula: P(E1|E2) = P(E1 and E2) / P(E2).
Example Calculation
Example: Given Aces and Red Cards from a deck:
Event E1: Drawing Ace of Diamonds.
Event E2: Drawing a Red Card.
Calculation: P(E1|E2) = P(E1 and E2) / P(E2) = (1/52) / (26/52) = 1/26 ≈ 3.8%. The events are dependent: knowing the card is red narrows the possibilities from 52 cards to 26.
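A small sketch of the conditional probability formula applied to E1 (Ace of Diamonds) and E2 (red card). The deck representation is an assumption made for this example, not something from the lecture.

```python
from fractions import Fraction
from itertools import product

ranks = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))

red_cards = [card for card in deck if card[1] in ("hearts", "diamonds")]

# P(E2): the card is red
p_e2 = Fraction(len(red_cards), len(deck))
# P(E1 and E2): the card is the Ace of Diamonds (which is always red)
p_e1_and_e2 = Fraction(sum(1 for c in red_cards if c == ("A", "diamonds")), len(deck))

# P(E1|E2) = P(E1 and E2) / P(E2)
p_e1_given_e2 = p_e1_and_e2 / p_e2
print(p_e1_given_e2)  # 1/26
```

Without the condition, P(E1) would be 1/52, so learning that the card is red doubles the probability.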
Test Preparation
Important elements to focus on for the test:
Probability questions involving independent and mutually exclusive events.
Conditional probability and practical card drawing scenarios.
Sampling with/without Replacement
Highlight concept differences:
Sampling with replacement: probabilities remain unchanged from draw to draw.
Sampling without replacement: Probability changes depending on previous outcomes.
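To make the difference concrete, the sketch below compares the probability of drawing two aces in a row under each scheme. The two-aces scenario and the function name `p_two_aces` are illustrative choices, not examples taken from the lecture.

```python
from fractions import Fraction

def p_two_aces(with_replacement):
    """Probability that two consecutive draws from a 52-card deck are both aces."""
    aces, cards = 4, 52
    first = Fraction(aces, cards)
    if with_replacement:
        # The first card goes back, so the deck is unchanged for the second draw
        second = Fraction(aces, cards)
    else:
        # One ace and one card are gone after the first draw
        second = Fraction(aces - 1, cards - 1)
    return first * second

print(p_two_aces(True))   # 1/169
print(p_two_aces(False))  # 1/221
```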
Conclusion
Key concepts of probability relevant for the test include:
Independent vs mutually exclusive events, conditional probability, and understanding sample space.
Practice calculations and probabilities with provided examples.
Final Comments
Reminder about the significance of understanding when sampling with or without replacement impacts probability calculations.
Introduction to Probability Distributions
Types of Probability Distributions:
Binomial Distribution
Poisson Distribution
Hypergeometric Distribution
Focus Today: Binomial Probability Distribution
Discrete Variables
Definition of Discrete Variables:
Countable number of unique values (e.g., 0, 1, 2, 3, etc.)
Cannot take on fractional values (e.g., 4.75 is not valid)
Continuous Variables:
Can take any value within a range, including fractional values, so the possible values are uncountable (discussed in Chapter 6)
Binomial Probability Distribution
Example: Coin Flipping Experiment
Experiment: Flip a coin 3 times
Sample Space:
Grouped by number of heads: 3 heads, 2 heads, 1 head, 0 heads (8 equally likely ordered outcomes in total, from HHH to TTT)
Probability Distribution for Heads on 3 Flips:
Exactly 0 heads: 1 way (TTT) = 1/8 = 0.125 (12.5%)
Exactly 1 head: 3 ways (HTT, THT, TTH) = 3/8 = 0.375 (37.5%)
Exactly 2 heads: 3 ways (HHT, HTH, THH) = 3/8 = 0.375 (37.5%)
Exactly 3 heads: 1 way (HHH) = 1/8 = 0.125 (12.5%)
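The table above can be reproduced by listing all eight outcomes and counting heads in each. This is a minimal sketch; the variable names are arbitrary.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

outcomes = list(product("HT", repeat=3))  # 8 ordered outcomes, HHH ... TTT

# How many outcomes contain exactly 0, 1, 2, or 3 heads
ways = Counter(flips.count("H") for flips in outcomes)

distribution = {heads: Fraction(ways[heads], len(outcomes)) for heads in sorted(ways)}
for heads, p in distribution.items():
    print(heads, ways[heads], p, float(p))
# 0 1 1/8 0.125
# 1 3 3/8 0.375
# 2 3 3/8 0.375
# 3 1 1/8 0.125
```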
Importance of Probability Calculations
Highlighted Material: Important for the in-class portion of the exam
Graphing Probability Distribution
Creating Graphs in Excel:
Insert recommended charts, choose vertical bar chart
Adjust axes labels to match probabilities of 0, 1, 2, 3 heads
Set y-axis to show probabilities from 0 to 1 for clarity
Expected Value of a Discrete Variable
Expected Value (Mean/Average):
Formula: E(X) = Σ (x_i * P(x_i))
Calculation Example for Coin Flips:
Expected values calculated from probabilities for 0, 1, 2, 3 heads
Result: Expected Value = 1.5
Variance and Standard Deviation
Variance Calculation
Formula: Variance = Σ [(x_i - E(X))^2 * P(x_i)]
Calculation steps for 0, 1, 2, 3 heads using probabilities from earlier
Standard Deviation
Determination: Standard Deviation = √(Variance)
Example Result: √0.75 = 0.866
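The expected value, variance, and standard deviation formulas can be applied to the three-flip distribution in a few lines. This is a sketch using the probabilities from the coin example.

```python
from math import sqrt

# P(x heads) for 3 coin flips
distribution = {0: 0.125, 1: 0.375, 2: 0.375, 3: 0.125}

# E(X) = sum of x * P(x)
mean = sum(x * p for x, p in distribution.items())

# Variance = sum of (x - E(X))^2 * P(x)
variance = sum((x - mean) ** 2 * p for x, p in distribution.items())

std_dev = sqrt(variance)

print(mean)               # 1.5
print(variance)           # 0.75
print(round(std_dev, 3))  # 0.866
```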
Characteristics of the Binomial Distribution
Properties Required for Binomial Distribution:
n identical trials
Two mutually exclusive outcomes (success or failure)
Trials are independent
Probability of success (p) is constant
Probability of Failure (q): q = 1 - p
Binomial Probability Formula
Formula for Probability:
P(X = x) = (n! / (x! (n - x)!)) * p^x * q^(n-x)
Example: Calculate probabilities for set n and x values, using factorials
Combinations Formula
Combination Formula: C(n, x) = n! / (x!(n - x)!)
Example: Determine combinations for choosing successes in trials (e.g., choosing 2 successes from 3 trials)
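A minimal sketch of the binomial formula, using Python's built-in `math.comb` for the combinations term. The function name `binomial_pmf` is chosen for this example.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for n independent trials with success probability p."""
    q = 1 - p  # probability of failure
    return comb(n, x) * p**x * q**(n - x)

# Choosing 2 successes from 3 trials: 3! / (2! * 1!) = 3 ways
print(comb(3, 2))               # 3

# Exactly 2 heads in 3 fair coin flips
print(binomial_pmf(2, 3, 0.5))  # 0.375
```

The result matches the 3/8 found by listing outcomes for the coin example.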
Practical Application Using Excel
Utilizing Excel Functions:
Use BINOM.DIST for exact probabilities
Enter the number of successes, the number of trials, and the probability of success
Cumulative Probabilities
Definition: Probability that X is less than or equal to a certain value
Calculation in Excel:
BINOM.DIST function with cumulative set to TRUE
Probability Greater Than or Less Than
Greater Than: P(X > x) = 1 - P(X ≤ x)
Less Than: P(X < x) = P(X ≤ x-1)
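The cumulative, greater-than, and less-than rules can be sketched as follows, reusing the three-flip coin example. The helper names are invented for illustration; in class these values come from Excel.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial variable."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binomial_cdf(x, n, p):
    """P(X <= x): add up the exact probabilities for 0, 1, ..., x."""
    return sum(binomial_pmf(k, n, p) for k in range(x + 1))

n, p, x = 3, 0.5, 2
p_at_most = binomial_cdf(x, n, p)         # P(X <= 2)
p_greater = 1 - binomial_cdf(x, n, p)     # P(X > 2) = 1 - P(X <= 2)
p_less = binomial_cdf(x - 1, n, p)        # P(X < 2) = P(X <= 1)

print(p_at_most, p_greater, p_less)  # 0.875 0.125 0.5
```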
Binomial Distribution Graphs
Effect of p (Probability of Success):
As p approaches 0.5, distribution becomes more symmetric regardless of n (number of trials)
Visual Representation: Graphs illustrating probability distributions based on varying p values and n
Summary and Next Steps
Preparation for Next Class: Introduction to Poisson Distribution
Note for Test Preparation: Formulas and values to be provided
Reminder: Watch videos for further assistance with upcoming homework assignments.
Discrete Probability Distributions
Overview
Focus on Poisson and Hypergeometric distributions.
Assignments and solutions available on Canvas; classwork assistance provided.
Poisson Distribution
Definition: Used for modeling count data (e.g., number of successes in a time segment).
Examples:
Scholarship offers for high school athletes from universities.
Number of emergency calls in an hour at a hospital.
Potholes in a specified highway segment.
Key Characteristics
Count data consists of whole numbers and cannot be fractional (e.g., 1 or 2 scholarships).
Assumption: The average number of successes in one segment (denoted by lambda, λ) is known and constant.
Trials are independent, with consistent probability of success across segments.
Application Example
Scenario: Average bank customers arriving is 16 per hour (λ = 16).
Calculating Probabilities:
Use formula: P(x) = (λ * t)^x * e^(-λ * t) / x!
Where:
λ = average number of successes
t = time segment
x = number of successes
Practical Calculation Using Excel
Setup: Input average, time, and number of successes in designated cells.
Formula for Exact Probability:
=POISSON.DIST(x, λ*t, FALSE) for exact occurrences
=POISSON.DIST(x, λ*t, TRUE) for cumulative probability
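For anyone who wants to check the Excel results by hand, here is a sketch of the Poisson formula in Python. The function names are made up, and the value x = 12 is a hypothetical question about the bank example, not a figure from class.

```python
from math import exp, factorial

def poisson_pmf(x, lam, t=1.0):
    """P(exactly x successes) when the average is lam per segment over t segments."""
    mean = lam * t
    return mean**x * exp(-mean) / factorial(x)

def poisson_cdf(x, lam, t=1.0):
    """P(at most x successes): sum of the exact probabilities for 0..x."""
    return sum(poisson_pmf(k, lam, t) for k in range(x + 1))

lam = 16  # average customers per hour, from the bank example
x = 12    # hypothetical number of arrivals to ask about

p_exact = poisson_pmf(x, lam)    # plays the role of POISSON.DIST(x, λ*t, FALSE)
p_at_most = poisson_cdf(x, lam)  # plays the role of POISSON.DIST(x, λ*t, TRUE)
print(p_exact, p_at_most)
```

Passing `t=0.5` would answer the same question for a half-hour window, since the mean becomes λ * t = 8.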
Assumptions and Limitations
Both expected value and variance in Poisson distribution are equal to λ * t.
The model assumes the mean and variance are equal, which often doesn’t hold in real data.
When the variance differs from the mean (typically by exceeding it), Poisson fits poorly; consider using the negative binomial distribution instead.
Hypergeometric Distribution
Usage: Best for dependent trials where the probability of success changes from trial to trial.
Overview:
Used for scenarios where items are chosen without replacement.
Important in binary outcomes but with dependent trials.
Hypergeometric Formula
Formula: P(X = k) = [C(N - K, n - k) * C(K, k)] / C(N, n)
Where:
N = population size
K = total number of successes in the population
n = sample size
k = number of observed successes in the sample
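The hypergeometric formula can be sketched directly with `math.comb`. The function name is invented, and the figure K = 12 used below is purely hypothetical, chosen only so the code has something to run on.

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    """P(X = k): k successes in a sample of n drawn without replacement
    from a population of N that contains K successes."""
    return comb(N - K, n - k) * comb(K, k) / comb(N, n)

# Small check that can be verified by hand:
# N=5 items, K=2 successes, sample n=2, want k=1 -> C(3,1)*C(2,1)/C(5,2) = 6/10
print(hypergeom_pmf(1, 5, 2, 2))  # 0.6

# Hypothetical numbers: population of 30, 12 successes, sample of 10, 8 observed
p_eight = hypergeom_pmf(8, N=30, K=12, n=10)
print(p_eight)
```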
Example Scenario
Context: Firm downsizing with a set population of employees (30 total).
Layoffs randomly select 10 employees; success defined as female staff members.
Calculate probabilities for different outcomes (e.g., eight women laid off).
Excel Calculations for Hypergeometric Distribution
Set required population values and sample sizes in Excel for easy probability calculations.
Use formula in Excel to streamline calculations without needing to compute manually.
Comparison of Distributions
Binomial Distribution: Applicable for independent trials with binary data.
Poisson Distribution: For independent trials with count data.
Hypergeometric Distribution: For dependent trials with either count or binary data.
Distinguish use cases based on the independence/dependency of trials and nature of the data.
Introduction
Objective: To cover Concepts of Continuous Probability Distributions
Schedule: Finish Chapter 6 and go through exam review this week.
Overview of Probability Distributions
Discrete Probability Distributions reviewed in Chapter 5: Whole number variables (e.g., binomial, Poisson)
Continuous Probability Distributions (Chapter 6): Three key types to discuss:
Normal Distribution
Uniform Distribution
Exponential Distribution
Importance of Normal Distribution
The normal distribution is the most commonly used distribution in statistics
Variables that are normally distributed allow for various statistical techniques
Examples of continuous variables:
Age can be measured in decimal, e.g., 42.331507 years.
Income can also take decimal values.
Understanding Continuous Variables
Continuous variables can take any real number within a range.
Probability of observing any single exact value (e.g., an age of exactly 42.331507 years) is zero.
Typically, we look for probabilities within a range (e.g., the probability of being between 42 and 43 years old).
Normal Distribution Properties
Use of the normal distribution formula:
\[ f(x) = \frac{1}{\sigma \sqrt{2 \pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}} \]
(Not required for calculations, better for graphing)
Standard properties:
Mean = Median = Mode at the peak,
The curve is symmetric about the mean.
50% of data points fall below the mean.
Standard Deviation and Variance
Examples of how the standard deviation affects the normal curve:
Smaller standard deviations result in a narrower, taller curve.
Larger standard deviations lead to a wider, flatter curve.
Sample mean is an unbiased estimate of the population mean.
Variance of the sample mean: \(\sigma^2/n\), where \(n\) = sample size.
Larger sample sizes yield less variable sampling distributions.
Z-Scores
Calculation of Z-scores to standardize any normal distribution variable:
\[ Z = \frac{X - \mu}{\sigma} \]
An example with response times in a medical emergency was shown.
Z-score interpretation in context of response times:
Z-score helps determine how many standard deviations away from the mean a particular response time is.
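A tiny sketch of the Z-score calculation. The response-time figures below (mean 8 minutes, standard deviation 2, observed time 11) are hypothetical; the numbers used in class are not recorded in these notes.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

# Hypothetical response-time numbers, in minutes
print(z_score(11, mu=8, sigma=2))  # 1.5
print(z_score(6, mu=8, sigma=2))   # -1.0
```

A positive Z means a slower-than-average response; a negative Z means a faster one.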
Normal Distribution and Excel
Instructions for using Excel to calculate probabilities:
Use the NORM.DIST() function for cumulative probabilities.
The example showed how to calculate areas under the curve using Excel.
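Outside Excel, the same cumulative areas can be found with Python's standard-library `statistics.NormalDist`. This is a sketch; the mean of 8 and standard deviation of 2 are hypothetical values, not the ones from the class example.

```python
from statistics import NormalDist

# Hypothetical normal variable: mean 8, standard deviation 2
dist = NormalDist(mu=8, sigma=2)

p_below = dist.cdf(11)                   # area to the left of 11
p_above = 1 - dist.cdf(11)               # area to the right of 11
p_between = dist.cdf(10) - dist.cdf(6)   # area between 6 and 10

print(round(p_below, 4))    # 0.9332
print(round(p_above, 4))    # 0.0668
print(round(p_between, 4))  # 0.6827
```

`dist.cdf(x)` plays the same role as a cumulative NORM.DIST lookup: it returns the area to the left of x.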
Empirical Rule Recap
68% of data falls within one standard deviation,
95% within two, and
99.7% within three standard deviations around the mean.
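The empirical rule percentages can be confirmed from the standard normal curve. A short sketch using `statistics.NormalDist`:

```python
from statistics import NormalDist

standard = NormalDist()  # mean 0, standard deviation 1

for k in (1, 2, 3):
    # Area between -k and +k standard deviations
    share = standard.cdf(k) - standard.cdf(-k)
    print(k, round(share, 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```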
Uniform Distribution
Definition: Continuous probability where all outcomes are equally likely within a specified range.
Formula: \( f(x) = \frac{1}{b-a} \) if \( x \) is between \( a \) and \( b \) (inclusive)
Example applied to pine tree diameter growth between 1 and 4 inches.
Cumulative Probability for Uniform Distribution
To find the probability that \( x \) is greater than a certain value, use cumulative probabilities:
\( P(X > x) = 1 - P(X \leq x) \)
Example with pine tree growth assessments explained.
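A sketch of the uniform cumulative probability using the pine tree range of 1 to 4 inches. The threshold of 3 inches is a hypothetical question chosen for illustration, and `uniform_cdf` is a made-up helper name.

```python
def uniform_cdf(x, a, b):
    """P(X <= x) for a uniform distribution on [a, b]."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    # The density is flat at 1/(b - a), so the area is width times height
    return (x - a) / (b - a)

a, b = 1, 4  # growth range in inches
p_more_than_3 = 1 - uniform_cdf(3, a, b)  # P(X > 3) = 1 - P(X <= 3)
print(round(p_more_than_3, 4))  # 0.3333
```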
Exponential Distribution
Used to measure time between events (e.g., time until the next customer arrives).
Defining Characteristics:
Mean \( \mu = \frac{1}{\lambda} \)
Standard Deviation equals Mean: also \( \sigma = \frac{1}{\lambda} \)
Visual representation noted for exponential decay.
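A sketch of the exponential distribution using the standard cumulative formula P(X ≤ x) = 1 - e^(-λx). Reusing the bank rate of 16 customers per hour and asking about a 5-minute wait are assumptions made for this example.

```python
from math import exp

def exponential_cdf(x, lam):
    """P(X <= x): probability the next event occurs within time x."""
    if x <= 0:
        return 0.0
    return 1 - exp(-lam * x)

lam = 16            # assumed rate: 16 arrivals per hour
mean_wait = 1 / lam # mean (and standard deviation) of the wait, in hours
print(mean_wait)    # 0.0625 hours, i.e. 3.75 minutes

# Hypothetical question: chance the next customer arrives within 5 minutes
p_within_5_min = exponential_cdf(5 / 60, lam)
print(p_within_5_min)
```

The time unit of x must match the unit of λ, which is why 5 minutes is entered as 5/60 of an hour.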
Summary and Review
Emphasized the importance of understanding different probability distributions before diving into applying them, including how they relate to real-world situations.
Overview of Exam Instructions
The exam includes a take-home portion that consists of multiple-choice and short-answer questions.
The in-class portion must be completed on Tuesday.
Students are required to submit the take-home exam during the in-class portion.
Exam Format
Take-Home Portion
Starts with question 20 and includes multiple-choice questions
Ends with short-answer questions
Utilizes chapters 5 and 6 materials primarily
In-Class Portion
Consists mainly of multiple-choice questions
Related to chapters 4, 5, and 6 assignments
Students should bring calculators
Exam Study Material
Excel Worksheets
Essential for both sections of the exam, especially chapters 5 and 6
Utilize provided worksheets on Canvas for calculations
Worksheets contain information necessary for problems related to binomial, Poisson, and normal distributions
Key Chapters and Content
Chapter 5
Reference the binomial and hypergeometric distributions
Important to understand how to manipulate given Excel sheets for problem-solving
Chapter 6
Focus on the normal distribution and how to calculate probabilities
Key to understand cumulative probability formulas for exponential distribution
Key Formulas and Concepts
Binomial Distribution
Formula for Expected Value: E(X) = n * p
Variance Formula: Var(X) = n * p * (1 - p)
Poisson and Exponential Distributions
Key Characteristics: Mean and variance are equal for Poisson; for exponential, the mean and standard deviation are equal.
Mean (Expected Value) of Exponential Distribution: 1/lambda
Probability Problems
Typical Question Formats
Students will be expected to calculate probabilities based on given statistics using the appropriate distribution formulas
Be able to differentiate between mutually exclusive and independent events.
Example Question
Given n = 8 and p = 0.37 in a binomial distribution scenario:
Calculate expected value and variance
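A quick sketch of how the example question works out with the binomial expected value and variance formulas:

```python
n, p = 8, 0.37

expected_value = n * p      # E(X) = n * p
variance = n * p * (1 - p)  # Var(X) = n * p * (1 - p)

print(round(expected_value, 4))  # 2.96
print(round(variance, 4))        # 1.8648
```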
Usage of Technology
Students can use Excel or Google Sheets for calculations
Do not get answers from other people
Preparing for the Exam
Complete highlighted problems from assignments on Canvas
Review textbook examples pertaining to the take-home exam
Additional Exam Tips
Bring necessary materials including calculators and answer sheets
Stay organized with answer sheets and questions
Revisit class notes and Excel worksheets before the exam
Questions and Clarifications
Open for Q&A at the end of the review session
Make sure to obtain a copy of the take-home exam, either in class or via email if watching online.
Prepare any questions in advance to maximize the effectiveness of the Q&A session.