Probability, Samples, and Research Issues in Statistics
Probability and Statistics: Overview
Chapters 8 & 9
Using Probability in Everyday Life and at Casinos
Key Concepts in Probability
Definition of Probability
Inferential Statistics: Built on notions of probabilities.
Probability: The proportion (or fraction) of times an event is likely to occur.
In this class, always present probabilities as proportions rounded to two decimal places.
Characteristics of Probability
Probability is a number between 0 and 1.
1 = the event is certain to occur (think 100% of possible outcomes); 0 = the event cannot occur.
Symbol: p is used for probability.
Probability is usually written as a decimal in statistics, but it can also be expressed as a fraction or a percentage.
Example of p-value in psychology:
If p < 0.05, the probability of the result occurring by chance is less than 0.05 (i.e., less than 5%).
Calculating Probability
General Steps to Calculate Probability
Determine the number of possible successful outcomes (i.e., what you’re looking for).
Determine the number of all possible outcomes.
Divide the number of possible successful outcomes by the number of all possible outcomes.
Examples of Probability Using Cards and Dice
Card Examples
Example 1: Probability of choosing an ace.
Number of Aces: 4
Total Cards in Deck: 52
Probability: P(Ace) = \frac{4}{52} = 0.08
Example 2: Probability of choosing a spade.
Number of Spades: 13
Total Cards: 52
Probability: P(Spade) = \frac{13}{52} = 0.25
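The steps above can be checked with a minimal Python sketch (the helper name probability is just illustrative):
```python
# Divide the number of successful outcomes by the number of all possible outcomes,
# then report the result as a proportion rounded to two decimal places.
from fractions import Fraction

def probability(successful, total):
    return Fraction(successful, total)

p_ace = probability(4, 52)      # 4 aces in a 52-card deck
p_spade = probability(13, 52)   # 13 spades in a 52-card deck

print(f"{float(p_ace):.2f}")    # 0.08
print(f"{float(p_spade):.2f}")  # 0.25
```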
Dice Probability
Dice probabilities are calculated from the 36 equally likely combinations of outcomes when rolling two dice.
A chart can list the probability of each possible sum (2 through 12) or of other events associated with rolling two dice (see the sketch below).
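As a sketch of how such a chart can be built (assuming standard six-sided dice):
```python
# Count the 36 equally likely rolls of two dice and report the probability
# of each possible sum from 2 to 12 as a proportion rounded to two decimals.
from collections import Counter
from itertools import product

counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
for total in range(2, 13):
    print(total, f"{counts[total] / 36:.2f}")   # e.g., a sum of 7: 6/36 = 0.17
```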
Addition Rule in Probability
General Addition (OR) Rule
Mutually Exclusive Outcomes: You can add the probabilities when you have mutually exclusive outcomes.
Formula:
Pr(A \text{ or } B) = Pr(A) + Pr(B)
Requirements for Addition Rule
Events are Mutually Exclusive: They cannot occur together.
Example: A single person’s birth month cannot be both January and February.
Addition Rule: Probability of being born in either January OR February: P(January) + P(February)
Examples of Addition Rule
Example 1:
Pr(A \text{ or } B) = Pr(A) + Pr(B)
P(Heads \text{ or } Tails) = P(Heads) + P(Tails) = 0.5 + 0.5 = 1.00
Example 2:
P(Born \text{ in November or December}) = P(Nov.) + P(Dec.) = \frac{1}{12} + \frac{1}{12} = 0.17
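A quick Python sketch of both examples, assuming the events are mutually exclusive as stated:
```python
# Addition (OR) rule for mutually exclusive events: add the probabilities.
p_heads, p_tails = 0.5, 0.5
print(p_heads + p_tails)        # 1.0

p_nov, p_dec = 1 / 12, 1 / 12
print(f"{p_nov + p_dec:.2f}")   # 0.17
```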
Deck of Cards Distribution
Features:
52 Cards total.
26 Red and 26 Black.
Each suit (hearts, diamonds, spades, clubs) contains one ace, king, queen, and jack, as well as cards numbered 2-10.
Non-Mutually Exclusive Events
Overlap Adjustment:
Pr(Queen \text{ or } Red) = Pr(Queen) + Pr(Red) - Pr(Queen \text{ and } Red)
Example Calculation:
Pr(Queen) = \frac{4}{52}
Pr(Red) = \frac{26}{52}
Pr(Queen \text{ and } Red) = \frac{2}{52}
Result: \frac{4}{52} + \frac{26}{52} - \frac{2}{52} = \frac{28}{52} = 0.54
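A minimal sketch of the overlap adjustment for the queen-or-red example:
```python
# General addition rule: add the probabilities, then subtract the overlap
# (the two red queens) so it is not counted twice.
p_queen = 4 / 52
p_red = 26 / 52
p_red_queen = 2 / 52

print(f"{p_queen + p_red - p_red_queen:.2f}")   # 0.54
```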
Multiplication (AND) Rule in Probability
General Multiplication Rule
When you have independent outcomes: Multiply the probabilities of individual outcomes to find the probability of them occurring together.
Independent Events: The occurrence of one event does not affect the other.
Formula:
Pr(A \text{ and } B) = [Pr(A)] [Pr(B)]
Examples of Multiplication Rule
Probability of flipping 2 heads in 2 flips of a coin:
P(Heads \text{ and } Heads) = \frac{1}{2} \times \frac{1}{2} = \frac{1}{4} = 0.25
Probability of flipping 3 heads in 3 flips:
P(Heads \text{ and } Heads \text{ and } Heads) = \frac{1}{2} \times \frac{1}{2} \times \frac{1}{2} = \frac{1}{8} = 0.13
Probability that a husband and wife are both born in July:
P(Husband \text{ July and } Wife \text{ July}) = \frac{1}{12} \times \frac{1}{12} = \frac{1}{144} = 0.01
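A short sketch reproducing the three independent-event examples above:
```python
# Multiplication (AND) rule for independent events: multiply the probabilities.
p_two_heads = 0.5 * 0.5
p_three_heads = 0.5 ** 3
p_both_july = (1 / 12) * (1 / 12)

print(p_two_heads)            # 0.25
print(p_three_heads)          # 0.125, reported as 0.13 in the notes
print(f"{p_both_july:.2f}")   # 0.01
```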
Conditional Probability Rule
Dependent Events: The occurrence of one event affects the probability of the other.
Example: Finding the probability of being a senior AND a psychology major.
Given values:
P(Senior) = 0.20 (20% of students are seniors)
P(Psych) = 0.70 (70% of students are psych majors)
P(Psych, given Senior) = 0.50 (50% of seniors are psych majors)
Calculation:
P(Senior \text{ and } Psych) = P(Senior) \times P(Psych, given Senior) = 0.20 \times 0.50 = 0.10
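A sketch of the dependent-events calculation, with the given values as plain variables:
```python
# Conditional probability rule: P(A and B) = P(A) * P(B | A).
p_senior = 0.20              # P(Senior)
p_psych_given_senior = 0.50  # P(Psych | Senior)

print(f"{p_senior * p_psych_given_senior:.2f}")   # 0.10
```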
Replacement vs. Without Replacement in Probability
With Replacement:
Probability of drawing a king and then another king:
P(King) \times P(King) = \frac{4}{52} \times \frac{4}{52} = 0.006
Without Replacement:
Probability of drawing a king and then another king:
P(King) \times P(King, given King) = \frac{4}{52} \times \frac{3}{51} = 0.005
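The contrast can be sketched directly (three decimal places are kept here, matching the results above):
```python
# With replacement the deck is restored before the second draw;
# without replacement one king and one card are gone from the deck.
with_replacement = (4 / 52) * (4 / 52)
without_replacement = (4 / 52) * (3 / 51)

print(f"{with_replacement:.3f}")      # 0.006
print(f"{without_replacement:.3f}")   # 0.005
```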
Understanding Probability and Normal Distribution
Role of Probability in Statistics
View any outcome in the context of potential outcomes that could occur by chance.
High probability of occurrence = considered common (e.g., picking a red card).
Low probability of occurrence = considered rare (e.g., picking the ace of spades).
Distribution: Normal Distribution
The normal distribution, as a probability distribution, can be viewed as a theoretical space of possible values.
The percentage of scores between two Z-scores equals the probability of selecting a score within those two Z-scores.
Illustrated on the standard normal curve with Z-scores marked from -2 to +2.
Standard Normal Curve
Classifies outcomes into common versus rare based on statistical significance thresholds.
Probability assessments of outcomes categorized into common (p > 0.05) and rare (p < 0.05).
Common Outcomes
An outcome that occurs with high probability; no evidence of special circumstances (e.g., an IQ score of 100).
Lack of statistical significance: p-value > 0.05.
Rare Outcomes
An outcome that occurs with low probability; likely indicative of a significant effect, such as the impact of a treatment or manipulation.
Example: Observed differences yielding a p < 0.05, indicating statistical significance.
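A sketch of this common-versus-rare classification, assuming a two-tailed criterion at α = 0.05 and using SciPy's normal distribution (the function name classify is illustrative):
```python
# Convert a Z-score into a two-tailed probability on the standard normal curve,
# then label the outcome common (p > 0.05) or rare (p < 0.05).
from scipy.stats import norm

def classify(z, alpha=0.05):
    p = 2 * norm.sf(abs(z))   # probability of a score at least this extreme
    return p, "rare" if p < alpha else "common"

print(classify(0.0))   # (1.0, 'common') -- e.g., an IQ score right at the mean
print(classify(2.5))   # p is about 0.01 -> 'rare'
```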
Sampling Distribution of the Mean and Confidence Intervals
Review of Inferential Statistics
Enable inferences and hypothesis testing about entire populations based on samples, estimating population mean μ using sample mean \bar{x}.
Assuming a representative sample, the sample should mirror the characteristics of the population.
Definition of Sampling Distribution of the Mean
A probability distribution of the means of all possible random samples of a given size drawn from a population (imagine taking many random samples).
Central Limit Theorem
Regardless of population shape, the mean's sampling distribution approximates a normal curve when sample size is satisfactory (usually n > 20).
The larger the sample size, the closer the sample averages align with a normal distribution.
Sampling Error
Multiple samples of the same size from the same population yield varying results; their means may deviate from the population mean due to random chance.
This is not a sampling mistake; the inherent variation is referred to as sampling error (see the simulation sketch below).
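A small simulation sketch of these two ideas (the skewed population, sample size of 25, and 1,000 repetitions are arbitrary choices for illustration):
```python
# Means of repeated random samples from a non-normal population cluster around
# the population mean and take on a roughly normal shape (Central Limit Theorem);
# their spread is the sampling error.
import numpy as np

rng = np.random.default_rng(0)
population = rng.exponential(scale=10, size=100_000)   # a skewed population

sample_means = [rng.choice(population, size=25).mean() for _ in range(1_000)]

print(round(population.mean(), 2))       # population mean, about 10
print(round(np.mean(sample_means), 2))   # mean of the sample means is close to it
print(round(np.std(sample_means), 2))    # variability of the means (sampling error)
```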
Standard Error of the Mean
Represents how much sample means deviate around the mean of the sampling distribution.
Formula: \text{Standard Error} = \frac{\sigma}{\sqrt{n}}
Where:
σ is the population standard deviation.
n is the number of observations in the sample.
If σ is unknown, it can often be estimated using the sample standard deviation s.
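A one-line sketch of the formula (the helper name standard_error is illustrative):
```python
# Standard error of the mean: the standard deviation divided by the square
# root of the sample size (use s in place of sigma when sigma is unknown).
import math

def standard_error(sd, n):
    return sd / math.sqrt(n)

print(round(standard_error(14, 61), 2))   # 1.79 -- reused in the CI example below
```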
Properties of the Sampling Distribution of the Mean
The mean of the sampling distribution equals the population mean: μ_\bar{x} = μ.
Approximates a normal distribution when sample size is sufficiently large (Central Limit Theorem).
Symbols to Know
Population Mean: μ
Sample Mean: \bar{x}
Population Standard Deviation: σ
Sample Standard Deviation: s
Standard Error of the Mean: σ_\bar{x}
Uses of Sampling Distributions
Estimate population mean.
Determine if a sample mean significantly differs from the known population mean.
Compare differences between sample means to ascertain chance versus experimental treatment effects.
Calculate the probability of obtaining a particular sample mean based on the sampling distribution.
Confidence Intervals
When sampling from a population, the sample mean \bar{x} estimates the population mean μ.
With a normal sampling distribution and a sufficient sample size, \bar{x} will tend to fall close to μ.
Each sample mean can provide a range to estimate the population mean based on the sampling distribution.
Building Confidence Intervals
Use the normal shape of the sampling distribution of the mean to convert a confidence level (1 - \alpha) of 90%, 95%, or 99% into the corresponding Z-scores.
For a 95% confidence interval, the Z-score cutoffs are approximately -1.96 and +1.96.
Confidence Interval Examples
Example 1:
Scenario: Stage IV sleep was recorded for a sample of 61 Alzheimer's patients; mean = 48 minutes, σ = 14 minutes.
Confidence Interval:
= \bar{x} \pm (Z^*) (\frac{σ}{\sqrt{n}})
Calculate:
= 48 \pm (1.96)(\frac{14}{\sqrt{61}})
Result: 44.49 to 51.51 minutes of stage IV sleep.
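A sketch of this 95% interval, looking up Z* from the standard normal distribution with SciPy (an assumed tool; any Z table gives the same 1.96):
```python
# 95% confidence interval: mean +/- Z* times the standard error (sigma / sqrt(n)).
import math
from scipy.stats import norm

mean, sigma, n = 48, 14, 61
z_star = norm.ppf(0.975)                  # about 1.96 for a 95% interval
margin = z_star * sigma / math.sqrt(n)

print(round(mean - margin, 2), round(mean + margin, 2))   # about 44.49 51.51
```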
Example 2:
Testing student knowledge regarding international events with a sample of 30, mean = 58, standard error = 3.2.
National reported mean = 65.
Compute a 99% confidence interval:
Using Z* = 2.58.
Calculation:
= 58 \pm (2.58)(3.2)
Result: 49.74 to 66.26; because the national mean of 65 falls within this interval, students' knowledge does not differ significantly from the national mean at the 0.01 level.
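A sketch of the 99% interval; the standard error of 3.2 is used directly, so σ and n are not needed here:
```python
# 99% confidence interval with Z* = 2.58 and a given standard error of 3.2.
mean, se, z_star = 58, 3.2, 2.58

lower, upper = mean - z_star * se, mean + z_star * se
print(round(lower, 2), round(upper, 2))   # 49.74 66.26 (the national mean, 65, is inside)
```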