Probability, Samples, and Research Issues in Statistics
Probability and Statistics: Overview
Chapters 8 & 9
Using Probability in Everyday Life and at Casinos
Key Concepts in Probability
Definition of Probability
Inferential Statistics: Built on notions of probabilities.
Probability: The proportion (or fraction) of times an event is likely to occur.
In this class, always present probabilities as proportions rounded to two decimal places.
Characteristics of Probability
Probability is a number between 0 and 1.
1 = the event is certain to occur (think 100% of possible outcomes); 0 = the event cannot occur.
Symbol: p is used for probability.
Probability is usually written as a decimal in statistics, but it can also be expressed as a fraction or a percentage.
Example of p-value in psychology:
If p < 0.05, the probability of the result occurring by chance is less than 0.05 (i.e., less than 5%).
Calculating Probability
General Steps to Calculate Probability
Determine the number of possible successful outcomes (i.e., what you’re looking for).
Determine the number of all possible outcomes.
Divide the number of possible successful outcomes by the number of all possible outcomes.
Examples of Probability Using Cards and Dice
Card Examples
Example 1: Probability of choosing an ace.
Number of Aces: 4
Total Cards in Deck: 52
Probability: P(Ace) = \frac{4}{52} = 0.08
Example 2: Probability of choosing a spade.
Number of Spades: 13
Total Cards: 52
Probability: P(Spade) = \frac{13}{52} = 0.25
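The steps above can be checked with a minimal Python sketch (the helper name probability is just illustrative):
```python
# Divide the number of successful outcomes by the number of all possible outcomes,
# then report the result as a proportion rounded to two decimal places.
from fractions import Fraction

def probability(successful, total):
    return Fraction(successful, total)

p_ace = probability(4, 52)      # 4 aces in a 52-card deck
p_spade = probability(13, 52)   # 13 spades in a 52-card deck

print(f"{float(p_ace):.2f}")    # 0.08
print(f"{float(p_spade):.2f}")  # 0.25
```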
Dice Probability
Dice probabilities are calculated from the 36 equally likely combinations of outcomes when rolling two dice.
A chart can list the probability of each possible sum (2 through 12) or of other events associated with rolling two dice (see the sketch below).
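As a sketch of how such a chart can be built (assuming standard six-sided dice):
```python
# Count the 36 equally likely rolls of two dice and report the probability
# of each possible sum from 2 to 12 as a proportion rounded to two decimals.
from collections import Counter
from itertools import product

counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
for total in range(2, 13):
    print(total, f"{counts[total] / 36:.2f}")   # e.g., a sum of 7: 6/36 = 0.17
```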
Addition Rule in Probability
General Addition (OR) Rule
Mutually Exclusive Outcomes: You can add the probabilities when you have mutually exclusive outcomes.
Formula:
Pr(A \text{ or } B) = Pr(A) + Pr(B)
Requirements for Addition Rule
Events are Mutually Exclusive: They cannot occur together.
Example: A single person’s birth month cannot be both January and February.
Addition Rule: Probability of being born in either January OR February: P(January) + P(February)
Examples of Addition Rule
Example 1:
Pr(A \text{ or } B) = Pr(A) + Pr(B)
P(Heads \text{ or } Tails) = P(Heads) + P(Tails) = 0.5 + 0.5 = 1.00
Example 2:
P(Born \text{ in November or December}) = P(Nov.) + P(Dec.) = \frac{1}{12} + \frac{1}{12} = 0.17
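A quick Python sketch of both examples, assuming the events are mutually exclusive as stated:
```python
# Addition (OR) rule for mutually exclusive events: add the probabilities.
p_heads, p_tails = 0.5, 0.5
print(p_heads + p_tails)        # 1.0

p_nov, p_dec = 1 / 12, 1 / 12
print(f"{p_nov + p_dec:.2f}")   # 0.17
```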
Deck of Cards Distribution
Features:
52 Cards total.
26 Red and 26 Black.
Each suit (hearts, diamonds, spades, clubs) contains one ace, king, queen, and jack, as well as cards numbered 2-10.
Non-Mutually Exclusive Events
Overlap Adjustment:
Pr(Queen \text{ or } Red) = Pr(Queen) + Pr(Red) - Pr(Queen \text{ and } Red)
Example Calculation:
Pr(Queen) = \frac{4}{52}
Pr(Red) = \frac{26}{52}
Pr(Queen \text{ and } Red) = \frac{2}{52}
Result: \frac{4}{52} + \frac{26}{52} - \frac{2}{52} = \frac{28}{52} = 0.54
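A minimal sketch of the overlap adjustment for the queen-or-red example:
```python
# General addition rule: add the probabilities, then subtract the overlap
# (the two red queens) so it is not counted twice.
p_queen = 4 / 52
p_red = 26 / 52
p_red_queen = 2 / 52

print(f"{p_queen + p_red - p_red_queen:.2f}")   # 0.54
```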
Multiplication (AND) Rule in Probability
General Multiplication Rule
When you have independent outcomes: Multiply the probabilities of individual outcomes to find the probability of them occurring together.
Independent Events: The occurrence of one event does not affect the other.
Formula:
Pr(A \text{ and } B) = [Pr(A)] [Pr(B)]
Examples of Multiplication Rule
Probability of flipping 2 heads in 2 flips of a coin:
P(Heads \text{ and } Heads) = \frac{1}{2} \times \frac{1}{2} = \frac{1}{4} = 0.25
Probability of flipping 3 heads in 3 flips:
P(Heads \text{ and } Heads \text{ and } Heads) = \frac{1}{2} \times \frac{1}{2} \times \frac{1}{2} = \frac{1}{8} = 0.13
Probability that a husband and wife are both born in July:
P(Husband \text{ July and } Wife \text{ July}) = \frac{1}{12} \times \frac{1}{12} = \frac{1}{144} = 0.01
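A short sketch reproducing the three independent-event examples above:
```python
# Multiplication (AND) rule for independent events: multiply the probabilities.
p_two_heads = 0.5 * 0.5
p_three_heads = 0.5 ** 3
p_both_july = (1 / 12) * (1 / 12)

print(p_two_heads)            # 0.25
print(p_three_heads)          # 0.125, reported as 0.13 in the notes
print(f"{p_both_july:.2f}")   # 0.01
```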
Conditional Probability Rule
Dependent Events: The occurrence of one event affects the probability of the other.
Example: Finding the probability of being a senior AND a psychology major.
Given values:
P(Senior) = 0.20 (20% of students are seniors)
P(Psych) = 0.70 (70% of students are psych majors)
P(Psych, given Senior) = 0.50 (50% of seniors are psych majors)
Calculation:
P(Senior \text{ and } Psych) = P(Senior) \times P(Psych, given Senior) = 0.20 \times 0.50 = 0.10
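A sketch of the dependent-events calculation, with the given values as plain variables:
```python
# Conditional probability rule: P(A and B) = P(A) * P(B | A).
p_senior = 0.20              # P(Senior)
p_psych_given_senior = 0.50  # P(Psych | Senior)

print(f"{p_senior * p_psych_given_senior:.2f}")   # 0.10
```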
Replacement vs. Without Replacement in Probability
With Replacement:
Probability of drawing a king and then another king:
P(King) \times P(King) = \frac{4}{52} \times \frac{4}{52} = 0.006
Without Replacement:
Probability of drawing a king and then another king:
P(King) \times P(King, given King) = \frac{4}{52} \times \frac{3}{51} = 0.005
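The contrast can be sketched directly (three decimal places are kept here, matching the results above):
```python
# With replacement the deck is restored before the second draw;
# without replacement one king and one card are gone from the deck.
with_replacement = (4 / 52) * (4 / 52)
without_replacement = (4 / 52) * (3 / 51)

print(f"{with_replacement:.3f}")      # 0.006
print(f"{without_replacement:.3f}")   # 0.005
```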
Understanding Probability and Normal Distribution
Role of Probability in Statistics
View any outcome in the context of potential outcomes that could occur by chance.
High probability of occurrence = considered common (e.g., picking a red card).
Low probability of occurrence = considered rare (e.g., picking the ace of spades).
Distribution: Normal Distribution
The normal distribution, as a probability distribution, can be viewed as a theoretical space of possible values.
The percentage of scores between two Z-scores equals the probability of selecting a score within those two Z-scores.
Illustrated on the standard normal curve with Z-scores marked from -2 to +2.
Standard Normal Curve
Classifies outcomes into common versus rare based on statistical significance thresholds.
Probability assessments of outcomes categorized into common (p > 0.05) and rare (p < 0.05).
Common Outcomes
An outcome that occurs with high probability; no evidence of special circumstances (e.g., an IQ score of 100).
Lack of statistical significance: p-value > 0.05.
Rare Outcomes
An outcome that occurs with low probability; likely indicative of a significant effect, such as the impact of a treatment or manipulation.
Example: Observed differences yielding a p < 0.05, indicating statistical significance.
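A sketch of this common-versus-rare classification, assuming a two-tailed criterion at α = 0.05 and using SciPy's normal distribution (the function name classify is illustrative):
```python
# Convert a Z-score into a two-tailed probability on the standard normal curve,
# then label the outcome common (p > 0.05) or rare (p < 0.05).
from scipy.stats import norm

def classify(z, alpha=0.05):
    p = 2 * norm.sf(abs(z))   # probability of a score at least this extreme
    return p, "rare" if p < alpha else "common"

print(classify(0.0))   # (1.0, 'common') -- e.g., an IQ score right at the mean
print(classify(2.5))   # p is about 0.01 -> 'rare'
```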
Sampling Distribution of the Mean and Confidence Intervals
Review of Inferential Statistics
Enable inferences and hypothesis testing about entire populations based on samples, estimating population mean μ using sample mean \bar{x}.
Assuming a representative sample, the sample should mirror the characteristics of the population.
Definition of Sampling Distribution of the Mean
A probability distribution of the means of all possible random samples of a given size drawn from a population (imagine taking many random samples).
Central Limit Theorem
Regardless of population shape, the mean's sampling distribution approximates a normal curve when sample size is satisfactory (usually n > 20).
The larger the sample size, the closer the sample averages align with a normal distribution.
Sampling Error
Multiple samples of the same size from the same population yield varying results; their means may deviate from the population mean due to random chance.
This is not a sampling mistake; the inherent variation is referred to as sampling error (see the simulation sketch below).
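A small simulation sketch of these two ideas (the skewed population, sample size of 25, and 1,000 repetitions are arbitrary choices for illustration):
```python
# Means of repeated random samples from a non-normal population cluster around
# the population mean and take on a roughly normal shape (Central Limit Theorem);
# their spread is the sampling error.
import numpy as np

rng = np.random.default_rng(0)
population = rng.exponential(scale=10, size=100_000)   # a skewed population

sample_means = [rng.choice(population, size=25).mean() for _ in range(1_000)]

print(round(population.mean(), 2))       # population mean, about 10
print(round(np.mean(sample_means), 2))   # mean of the sample means is close to it
print(round(np.std(sample_means), 2))    # variability of the means (sampling error)
```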
Standard Error of the Mean
Represents how much sample means deviate around the mean of the sampling distribution.
Formula: \text{Standard Error} = \frac{\sigma}{\sqrt{n}}
Where:
σ is the population standard deviation.
n is the number of observations in the sample.
If σ is unknown, it can often be estimated using the sample standard deviation s.
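A one-line sketch of the formula (the helper name standard_error is illustrative):
```python
# Standard error of the mean: the standard deviation divided by the square
# root of the sample size (use s in place of sigma when sigma is unknown).
import math

def standard_error(sd, n):
    return sd / math.sqrt(n)

print(round(standard_error(14, 61), 2))   # 1.79 -- reused in the CI example below
```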
Properties of the Sampling Distribution of the Mean
The mean of the sampling distribution equals the population mean: μ_\bar{x} = μ.
Approximates a normal distribution when sample size is sufficiently large (Central Limit Theorem).
Symbols to Know
Population Mean: μ
Sample Mean: \bar{x}
Population Standard Deviation: σ
Sample Standard Deviation: s
Standard Error of the Mean: σ_\bar{x}
Uses of Sampling Distributions
Estimate population mean.
Determine if a sample mean significantly differs from the known population mean.
Compare differences between sample means to ascertain chance versus experimental treatment effects.
Calculate the probability of obtaining a particular sample mean based on the sampling distribution.
Confidence Intervals
When sampling from a population, the sample mean \bar{x} estimates the population mean μ.
With a normal sampling distribution and a sufficient sample size, \bar{x} will tend to fall close to μ.
Each sample mean can provide a range to estimate the population mean based on the sampling distribution.
Building Confidence Intervals
Use the normal shape of the sampling distribution of the mean to convert a confidence level (1 - \alpha) of 90%, 95%, or 99% into the corresponding Z-scores.
For a 95% confidence interval, the Z-score cutoffs are approximately -1.96 and +1.96.
Confidence Interval Examples
Example 1:
Scenario: Stage IV sleep was recorded for a sample of 61 Alzheimer's patients; mean = 48 minutes, σ = 14 minutes.
Confidence Interval:
= \bar{x} \pm (Z^*) (\frac{σ}{\sqrt{n}})
Calculate:
= 48 \pm (1.96)(\frac{14}{\sqrt{61}})
Result: 44.49 to 51.51 minutes of stage IV sleep.
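A sketch of this 95% interval, looking up Z* from the standard normal distribution with SciPy (an assumed tool; any Z table gives the same 1.96):
```python
# 95% confidence interval: mean +/- Z* times the standard error (sigma / sqrt(n)).
import math
from scipy.stats import norm

mean, sigma, n = 48, 14, 61
z_star = norm.ppf(0.975)                  # about 1.96 for a 95% interval
margin = z_star * sigma / math.sqrt(n)

print(round(mean - margin, 2), round(mean + margin, 2))   # about 44.49 51.51
```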
Example 2:
Testing student knowledge regarding international events with a sample of 30, mean = 58, standard error = 3.2.
National reported mean = 65.
Compute a 99% confidence interval:
Using Z* = 2.58.
Calculation:
= 58 \pm (2.58)(3.2)
Result: 49.74 to 66.26; because the national mean of 65 falls within this interval, students' knowledge does not differ significantly from the national mean at the 0.01 level.
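A sketch of the 99% interval; the standard error of 3.2 is used directly, so σ and n are not needed here:
```python
# 99% confidence interval with Z* = 2.58 and a given standard error of 3.2.
mean, se, z_star = 58, 3.2, 2.58

lower, upper = mean - z_star * se, mean + z_star * se
print(round(lower, 2), round(upper, 2))   # 49.74 66.26 (the national mean, 65, is inside)
```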