Probability and Random Sampling

Probability

Definition: When all outcomes are equally likely, the probability of an event is the ratio of the number of outcomes that count as that event to the number of all possible outcomes.
- Expressed as: P(A) = \frac{\text{number of outcomes classified as A}}{\text{number of all possible outcomes}}
Example:
- A jar contains 50 marbles: 10 red and 40 blue.
- The probability of drawing a red marble is: P(red) = \frac{10}{50} = 0.2 = 20\%.
- Probability is equivalent to proportion: 20% of the marbles in the jar are red, so the probability of drawing a red marble is 20%.
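The marble example can be checked with a few lines of Python. This is only a minimal sketch: the helper name `probability` is made up for illustration, and it assumes every marble is equally likely to be drawn.

```python
from fractions import Fraction

def probability(favorable: int, total: int) -> Fraction:
    """Ratio of favorable outcomes to all possible outcomes.

    Assumes every outcome is equally likely (hypothetical helper).
    """
    if total <= 0 or not 0 <= favorable <= total:
        raise ValueError("need total > 0 and 0 <= favorable <= total")
    return Fraction(favorable, total)

# Jar with 50 marbles, 10 of them red.
p_red = probability(10, 50)
print(p_red)         # 1/5
print(float(p_red))  # 0.2
```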

Probability Range

Probability values range from 0% to 100% (equivalently, from 0 to 1 when written as a proportion).
- 100% Probability: The event will definitely occur.
- 0% Probability: The event will definitely not occur.

Random Sample

Definition: A sample selected from a population where every individual has an equal and independent chance of being chosen.
Two Important Principles:
- Every individual has an equal chance of being selected.
- The probability of selecting an individual remains constant, regardless of who was previously selected.
  - This implies independence between selections.

Random Sampling & Replacement

Random Sampling with Replacement:
- Each selected individual is returned to the population before the next selection.
- This ensures the population size and probabilities remain constant across selections.
- Example: In a population of 16, the probability of selecting a specific person is \frac{1}{16}. After selecting and replacing them, the probability remains \frac{1}{16} for the next draw.
Random Sampling without Replacement:
- Each selected individual is removed from the population.
- This changes the population size and probabilities for subsequent selections, so the constant-probability principle above no longer holds exactly.
- Example: In a population of 16, if one person is selected and removed, the probability of selecting any specific remaining person in the next draw becomes \frac{1}{15}.
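The difference between the two sampling schemes can be sketched with Python's standard `random` module. The population of 16 numbered individuals, the seed, and the helper `chance_of_specific_person` are assumptions made for illustration only.

```python
import random
from fractions import Fraction

population = list(range(1, 17))  # 16 hypothetical individuals labelled 1..16
rng = random.Random(0)           # seeded only so the sketch is repeatable

# With replacement: the same individual may be drawn more than once.
with_replacement = rng.choices(population, k=3)

# Without replacement: each individual can appear at most once.
without_replacement = rng.sample(population, k=3)

def chance_of_specific_person(remaining: int) -> Fraction:
    """Chance that one particular remaining individual is drawn next."""
    return Fraction(1, remaining)

p_first = chance_of_specific_person(16)           # 1/16 on the first draw
p_second_with = chance_of_specific_person(16)     # still 1/16 after replacing
p_second_without = chance_of_specific_person(15)  # 1/15 after removing one

print(with_replacement, without_replacement)
print(p_first, p_second_with, p_second_without)
```

Only sampling with replacement keeps the selection probability fixed from draw to draw, which is the second principle listed for a random sample.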

Normal Distribution

Characteristics:
- Symmetrical, bell-shaped distribution.
- Mean, median, and mode are equal and located at the center.
Empirical Rule (68-95-99.7 Rule):
- Approximately 68% of the scores fall within ±1 standard deviation (SD) of the mean.
- Approximately 95% of the scores fall within ±2 SDs of the mean.
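- Approximately 99.7% of the scores fall within ±3 SDs of the mean.
- Segment by segment, on each side of the mean: 34.13% of scores lie between the mean and 1 SD, 13.59% between 1 and 2 SDs, 2.15% between 2 and 3 SDs, and 0.13% beyond 3 SDs.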
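The percentages in the empirical rule are rounded areas under the standard normal curve. As a rough check, they can be recomputed with `statistics.NormalDist` from the Python standard library (Python 3.8+); the function name `proportion_within` is an illustrative choice.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

def proportion_within(k: float) -> float:
    """Area under the standard normal curve between -k and +k SDs."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within ±{k} SD: {proportion_within(k):.4f}")
# within ±1 SD: 0.6827
# within ±2 SD: 0.9545
# within ±3 SD: 0.9973
```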

Probability and Normal Distribution

Example Problem 1:
- IQ scores are normally distributed with a mean of 100 and a standard deviation of 15.
- What is the probability of randomly selecting someone with an IQ of 100 or greater?
- Since 100 is the mean, the probability is 50% (as half the distribution lies above the mean in a normal distribution).
Example Problem 2:
- What is the probability of selecting someone with an IQ of 130 or greater?
- 130 is two standard deviations above the mean (100 + 2*15 = 130).
- The probability is 2.15% + 0.13% = 2.28% (the area in the tail beyond +2 SDs).
Example Problem 3:
- What is the probability of selecting someone with an IQ of 85 or less?
- 85 is one standard deviation below the mean (100 - 15 = 85).
- The probability is 13.59% + 2.15% + 0.13% = 15.87% (the area in the tail below -1 SD).
Example Problem 4:
- What is the probability of selecting someone with an IQ score between 70 and 115?
- 70 is two standard deviations below the mean and 115 is one standard deviation above the mean.
- The probability is 34.13% + 34.13% + 13.59% = 81.85%.
Example Problem 5:
- What is the probability of selecting someone with a z-score of 1 or greater?
- This is the area in the tail beyond +1 SD, which is 13.59% + 2.15% + 0.13% = 15.87%.
Example Problem 6:
- What is the probability of selecting someone with a z-score between -3 and -1?
- This is the area between 1 and 3 SDs below the mean: 13.59% + 2.15% = 15.74%.
Example Problem 7:
- What is the probability of selecting someone with a raw score of 111 or greater?
- Assuming the same IQ distribution, 111 is not a whole number of SDs from the mean (z = \frac{111 - 100}{15} ≈ 0.73), so the 68-95-99.7 rule is not precise enough and a z-table is needed.
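The example problems above can be cross-checked numerically. The sketch below uses `statistics.NormalDist` and assumes IQ is exactly normal with mean 100 and SD 15; it also assumes the raw score of 111 in Problem 7 is on that same IQ scale. Exact areas can differ from the summed segment percentages in the last decimal place because those percentages are rounded.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)  # assumed IQ distribution
z = NormalDist()                   # standard normal, for the z-score problems

answers = {
    "P1: IQ >= 100":       1 - iq.cdf(100),
    "P2: IQ >= 130":       1 - iq.cdf(130),
    "P3: IQ <= 85":        iq.cdf(85),
    "P4: 70 <= IQ <= 115": iq.cdf(115) - iq.cdf(70),
    "P5: z >= 1":          1 - z.cdf(1),
    "P6: -3 <= z <= -1":   z.cdf(-1) - z.cdf(-3),
    "P7: IQ >= 111":       1 - iq.cdf(111),  # assumes the IQ scale
}

for label, p in answers.items():
    print(f"{label}: {p:.4f}")
```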

Z-Tables

Purpose: Z-tables provide the proportion of area under the normal curve for different z-scores.
Structure:
- Column A: z-score.
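- Column B: Proportion in the body (the larger portion; for a positive z-score, the area to its left).
- Column C: Proportion in the tail (the smaller portion; for a positive z-score, the area to its right).
- Column D: Proportion between the mean and the z-score.
Symmetry: The normal distribution is symmetrical, so the table lists only positive z-scores. For a negative z-score, look up its absolute value; the proportions are the same, but the body lies to the right and the tail to the left.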
Example:
- What is the probability of selecting someone with a z-score of 1.17 or greater?
- Using the z-table, the proportion in the tail for z = 1.17 is 0.1210, or 12.10%.
Example:
- What is the probability of selecting someone with a z-score of -2.59 or less?
- By symmetry, the area below z = -2.59 equals the area above z = 2.59. Using the z-table, the proportion in the tail for z = 2.59 is 0.0048, or 0.48%.
Example:
- What is the probability of selecting someone with a z-score between the mean and 0.44?
- Using the z-table, the proportion between the mean and z = 0.44 is 0.1700, or 17%.
Example:
- What is the probability of selecting someone with a z-score between 1.55 and 1.70?
- Proportion between mean and 1.70 is 0.4554.
- Proportion between mean and 1.55 is 0.4394.
- The probability is 0.4554 - 0.4394 = 0.016, or 1.6%.
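A z-table row can be reproduced from the normal cumulative distribution function. The following is a sketch rather than a replacement for the printed table: `z_table_row` is a hypothetical helper whose keys mirror columns A to D described above.

```python
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist()

def z_table_row(z: float) -> dict:
    """Return the four z-table columns for |z| (hypothetical helper).

    The table lists only positive z-scores, so a negative z is looked up
    by its absolute value; the proportions are the same, mirrored.
    """
    z = abs(z)
    body = _STANDARD_NORMAL.cdf(z)   # Column B: larger portion
    return {
        "z": z,                      # Column A
        "body": body,                # Column B
        "tail": 1 - body,            # Column C: smaller portion
        "mean_to_z": body - 0.5,     # Column D
    }

print(f"{z_table_row(1.17)['tail']:.4f}")       # 0.1210
print(f"{z_table_row(-2.59)['tail']:.4f}")      # 0.0048
print(f"{z_table_row(0.44)['mean_to_z']:.4f}")  # 0.1700
between = z_table_row(1.70)["mean_to_z"] - z_table_row(1.55)["mean_to_z"]
print(f"{between:.4f}")                         # 0.0160
```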

Finding Z-Scores from Proportions

Example 1:
- What z-score separates the top 25% of the distribution?
- We are looking for the z-score that has 25% in the tail.
- Looking at the z-table for a tail proportion of approximately 0.25, the z-score is ~0.675.
Example 2:
- What z-score separates the bottom 5% of the distribution?
- We are looking for the z-score that has 5% in the left tail.
- Looking at the z-table for a tail proportion of approximately 0.05, the z-score is ~ -1.645.
Example 3:
- What z-scores separate the middle 40% of the distribution?
- This means 20% of the distribution is between the mean and the z-score on each side.
- Looking at the z-table for a proportion between mean and z of 0.20, the z-scores are approximately -0.525 and 0.525.
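Reading the table "backwards" corresponds to the inverse of the cumulative distribution function. As a sketch, `NormalDist.inv_cdf` takes the proportion to the left of the cut point, so each example has to be restated as a left-hand area first; the variable names are illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist()

# Example 1: top 25% means 75% of the distribution lies to the left.
z_top_25 = standard_normal.inv_cdf(0.75)

# Example 2: bottom 5% means 5% lies to the left.
z_bottom_5 = standard_normal.inv_cdf(0.05)

# Example 3: middle 40% leaves 30% in each tail, so the upper boundary
# has 70% to its left; the lower boundary is its mirror image.
z_upper = standard_normal.inv_cdf(0.70)
z_lower = -z_upper

print(f"{z_top_25:.3f}")                # 0.674
print(f"{z_bottom_5:.3f}")              # -1.645
print(f"{z_lower:.3f}, {z_upper:.3f}")  # -0.524, 0.524
```

The small differences from the table-based answers (0.675, ±0.525) come from interpolating between adjacent table entries.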

Using Raw Scores

Transforming Raw Scores to Proportions:
- Convert raw scores to z-scores using the formula: z = \frac{X - \mu}{\sigma}, where X is the raw score, μ is the population mean, and σ is the population standard deviation.
- Use the z-score to find corresponding proportions from the z-table.
Transforming Proportions to Raw Scores:
- Convert proportions to z-scores using the z-table.
- Then transform z-scores into raw scores using the formula: X = \mu + z\sigma
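Both directions of the transformation can be sketched as two small functions. The names `z_from_raw` and `raw_from_z` are illustrative, and the IQ parameters (mean 100, SD 15) are reused from the earlier examples as an assumption.

```python
from statistics import NormalDist

def z_from_raw(x: float, mu: float, sigma: float) -> float:
    """z = (X - mu) / sigma"""
    return (x - mu) / sigma

def raw_from_z(z: float, mu: float, sigma: float) -> float:
    """X = mu + z * sigma"""
    return mu + z * sigma

MU, SIGMA = 100, 15  # assumed IQ parameters
standard_normal = NormalDist()

# Raw score -> z-score -> proportion (area above an IQ of 130).
z_130 = z_from_raw(130, MU, SIGMA)
p_above_130 = 1 - standard_normal.cdf(z_130)

# Proportion -> z-score -> raw score (IQ that cuts off the top 25%).
z_cut = standard_normal.inv_cdf(0.75)
iq_cut = raw_from_z(z_cut, MU, SIGMA)

print(z_130)                  # 2.0
print(f"{p_above_130:.4f}")   # 0.0228
print(f"{iq_cut:.1f}")        # 110.1
```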

Inferential Statistics

Definition: Using sample statistics to draw conclusions about population parameters.
Example Research Question:
- Does a social anxiety treatment significantly change individuals’ social anxiety levels?
- Social anxiety is measured before and after treatment.
Need for an Objective Rule:
- To determine whether observed changes are due to the treatment or to chance (sampling error, measurement error, etc.)
- Establish a criterion for statistical significance.
Objective Rule:
- If a result this extreme would occur less than 5% of the time when the sample comes from the original (untreated) population, we conclude that the treatment has a significant effect.
- This corresponds to a significance level of α = 0.05.
- For a two-tailed test, the 5% is split into 2.5% in each tail, so if the resulting z-score is greater than 1.96 or less than -1.96, we can conclude that the social anxiety treatment significantly changed participants’ social anxiety levels.
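The ±1.96 cutoff follows from splitting α = 0.05 across the two tails. Below is a minimal sketch of the decision rule, assuming a two-tailed test on a z-score; `critical_z` and `is_significant` are hypothetical helper names, not part of any statistics package.

```python
from statistics import NormalDist

def critical_z(alpha: float = 0.05) -> float:
    """Two-tailed critical value: alpha / 2 of the area lies in each tail."""
    return NormalDist().inv_cdf(1 - alpha / 2)

def is_significant(z: float, alpha: float = 0.05) -> bool:
    """True when z falls beyond the critical value in either tail."""
    return abs(z) > critical_z(alpha)

print(f"{critical_z():.2f}")   # 1.96
print(is_significant(2.10))    # True
print(is_significant(-1.50))   # False
```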