Probability and Statistics Exam Notes

Probability & Venn Diagrams

Introduction & Exam Information

Today's session covers makeup time from last week, concluding slides 4.2 and beginning 4.3.
An exam is scheduled for Wednesday at 04:15 PM, covering Chapters 1 to 3.
Students with accommodations have been contacted regarding arrangements.
A specific calculator, available from 'red random shots' for approximately 15, is required.

Venn Diagrams: Visualization of Probability

Venn diagrams are used to visualize probability scenarios, starting from slide 32 of 4.2.
Sample Space (\mathcal{E}): Represents all possible outcomes. Other notations include S, U (capital U with top line), or \Omega. It's crucial to be aware of these different symbols when encountering external resources.
Events (A, B): Any outcome or set of outcomes within the sample space.
Intersection (A \cap B): Denotes the event where both A and B happen simultaneously (read as "A and B").
Union (A \cup B): Denotes the event where A happens, or B happens, or both happen (read as "A or B").
Independent Events: Two events A and B are independent if the occurrence of one does not affect the probability of the other. The probability of their intersection is the product of their individual probabilities: P(A \cap B) = P(A) \times P(B).
Mutually Exclusive Events: Two events A and B are mutually exclusive if they cannot happen at the same time.
- The probability of their union is the sum of their individual probabilities: P(A \cup B) = P(A) + P(B).
- Their intersection is the empty set (\emptyset) or has a probability of zero: P(A \cap B) = 0.
A without B (A \setminus B): Represents the outcomes in A that are not in B, which is the same set as A \cap B'. This is read as "A less B" or "A without B".
Complement of A (A', A^c, \bar{A}): Represents all outcomes in the sample space where event A does not occur. The probability of A' is P(A') = 1 - P(A).
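The set operations above map directly onto Python's built-in set type. The sketch below is illustrative only: the die-roll sample space and the events A and B are hypothetical examples, not taken from the slides, and the helper name prob is an assumption.

```python
from fractions import Fraction

# Hypothetical sample space (not from the notes): one roll of a fair die.
sample_space = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}  # event A: the roll is even
B = {4, 5, 6}  # event B: the roll is greater than 3


def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))


a_and_b = A & B                  # intersection, "A and B"
a_or_b = A | B                   # union, "A or B"
a_without_b = A - B              # A \ B, "A without B"
a_complement = sample_space - A  # A', everything in the sample space outside A

print(prob(a_and_b), prob(a_or_b), prob(a_without_b), prob(a_complement))
```

Using Fraction keeps the probabilities exact, so identities such as P(A') = 1 - P(A) can be checked without rounding error.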

Deck of Cards Example

A standard deck has 52 cards, comprising four suits (diamonds, spades, hearts, clubs), with 13 cards per suit (Ace, 2-10, Jack, Queen, King).
The probability of picking any single particular card is P(\text{card}) = \frac{1}{52}.
Example: Picking an Ace (A) or a Diamond (D)
- First, determine the number of elements in each section of the Venn diagram (n(\cdot)).
  - n(A \cap D) (Ace of Diamonds): 1
  - n(A \setminus D) (Aces that are not Diamonds): 3
  - n(D \setminus A) (Diamonds that are not Aces): 12
  - n(\text{Neither A nor D}): 52 - (3 + 1 + 12) = 36
- Probabilities using the diagram and the equally likely outcomes principle (favorable outcomes / total outcomes):
  - P(A \cap D) = \frac{1}{52}
  - P(A \cup D) = \frac{n(A \cup D)}{n(\mathcal{E})} = \frac{3 + 1 + 12}{52} = \frac{16}{52}
  - P(A') = \frac{n(A')}{n(\mathcal{E})} = \frac{12 + 36}{52} = \frac{48}{52} (Alternatively, 1 - P(A) = 1 - \frac{4}{52} = \frac{48}{52})
  - P(D \setminus A) = \frac{12}{52}
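The counts in this example can be reproduced by building the deck explicitly. This is a minimal sketch, assuming a card is represented as a (rank, suit) tuple; the variable and function names are illustrative choices.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["diamonds", "spades", "hearts", "clubs"]
deck = set(product(ranks, suits))  # 52 (rank, suit) pairs

aces = {card for card in deck if card[0] == "A"}
diamonds = {card for card in deck if card[1] == "diamonds"}


def prob(event):
    """Favorable outcomes divided by total outcomes (equally likely cards)."""
    return Fraction(len(event), len(deck))


# Number of cards in each region of the Venn diagram.
counts = {
    "A and D": len(aces & diamonds),
    "A without D": len(aces - diamonds),
    "D without A": len(diamonds - aces),
    "neither": len(deck - (aces | diamonds)),
}

print(counts)
print(prob(aces | diamonds), prob(deck - aces), prob(diamonds - aces))
```

Fraction reduces automatically, so 16/52 prints as 4/13 and 48/52 as 12/13; the values are the same as in the worked example.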

Conditional Probability

Conditional Probability (P(A|B)): The probability of event A occurring, given that event B has already occurred.
- Formula: P(A|B) = \frac{P(A \cap B)}{P(B)}, defined only when P(B) > 0.
- From a Venn diagram, this means restricting the sample space to only the outcomes in B and then finding the proportion of A within that reduced space. When all outcomes are equally likely, this becomes P(A|B) = \frac{n(A \cap B)}{n(B)}.
Multiplication Rule (derived from conditional probability): P(A \cap B) = P(A|B) \times P(B) or P(A \cap B) = P(B|A) \times P(A).
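As a quick check of these formulas, the sketch below applies them to the Ace/Diamond counts from the deck example. The conditional probabilities themselves were not worked in the notes, so treat this as an illustrative extension; the variable names are assumptions.

```python
from fractions import Fraction

# Counts from the deck example: 52 cards, 4 aces, 13 diamonds, 1 ace of diamonds.
n_total, n_ace, n_diamond, n_ace_and_diamond = 52, 4, 13, 1

p_ace = Fraction(n_ace, n_total)
p_diamond = Fraction(n_diamond, n_total)
p_both = Fraction(n_ace_and_diamond, n_total)

# Conditional probability via the formula P(A|D) = P(A and D) / P(D).
p_ace_given_diamond = p_both / p_diamond

# Same quantity via the reduced sample space: n(A and D) / n(D).
p_ace_given_diamond_counts = Fraction(n_ace_and_diamond, n_diamond)

# Conditioning the other way round: P(D|A) = P(A and D) / P(A).
p_diamond_given_ace = p_both / p_ace

print(p_ace_given_diamond, p_diamond_given_ace)
```

Both routes give P(A|D) = 1/13, and multiplying either conditional probability by the probability of its given event recovers P(A \cap D) = 1/52, which is the multiplication rule.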

Practice Problems & Concepts

General Probabilities with Venn Diagrams

Always start by filling in the intersection when constructing a Venn diagram with probabilities to simplify calculations.
Example 1: Probabilities (P(A)=0.5, P(B)=0.2, P(A \cap B)=0.1)
- Venn Diagram sections: P(A \cap B) = 0.1, P(A \setminus B) = P(A) - P(A \cap B) = 0.5 - 0.1 = 0.4, P(B \setminus A) = P(B) - P(A \cap B) = 0.2 - 0.1 = 0.1, P(\text{Neither}) = 1 - (0.4 + 0.1 + 0.1) = 0.4.
- P(A \cup B) = P(A \setminus B) + P(B \setminus A) + P(A \cap B) = 0.4 + 0.1 + 0.1 = 0.6 (or using the addition rule: P(A) + P(B) - P(A \cap B) = 0.5 + 0.2 - 0.1 = 0.6).
- P(B') = P(A \setminus B) + P(\text{Neither}) = 0.4 + 0.4 = 0.8.
- P(A \cap B') = P(A \setminus B) = 0.4.
- P(A \cup B') = P(A \setminus B) + P(A \cap B) + P(\text{Neither}) = 0.4 + 0.1 + 0.4 = 0.9.
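The region-by-region bookkeeping in Example 1 can be sketched as a small function. The function name venn_regions and the dictionary keys are illustrative assumptions; Fraction is used so the decimals stay exact.

```python
from fractions import Fraction


def venn_regions(p_a, p_b, p_a_and_b):
    """Split the sample space into the four regions of a two-event Venn diagram."""
    only_a = p_a - p_a_and_b  # P(A \ B)
    only_b = p_b - p_a_and_b  # P(B \ A)
    neither = 1 - (only_a + only_b + p_a_and_b)
    return {
        "A and B": p_a_and_b,
        "A only": only_a,
        "B only": only_b,
        "neither": neither,
    }


# Values from Example 1: P(A) = 0.5, P(B) = 0.2, P(A and B) = 0.1.
regions = venn_regions(Fraction("0.5"), Fraction("0.2"), Fraction("0.1"))

p_a_or_b = regions["A only"] + regions["B only"] + regions["A and B"]
p_not_b = regions["A only"] + regions["neither"]
p_a_or_not_b = regions["A only"] + regions["A and B"] + regions["neither"]

print(float(p_a_or_b), float(p_not_b), float(p_a_or_not_b))
```

Starting from the intersection, as the notes recommend, means every other region is a single subtraction.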

Conditional Probabilities with Venn Diagrams

When solving conditional probability problems using Venn diagrams, first reduce your sample space to only the 'given' event. Then, find the probability of the 'wanted' event within that reduced space.
Example 1: (P(A)=0.55, P(B)=0.4, P(A \cap B)=0.15)
- Venn diagram sections: P(A \cap B) = 0.15, P(A \setminus B) = 0.4, P(B \setminus A) = 0.25, P(\text{Neither}) = 0.2.
- P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{0.15}{0.4} = \frac{3}{8}. (Diagram: 0.15 / (0.15 + 0.25), restricted to B circle).
- P(B|A \cup B) = \frac{P(B \cap (A \cup B))}{P(A \cup B)} = \frac{P(B)}{P(A \cup B)} = \frac{0.4}{0.8} = \frac{1}{2} (Diagram: P(B) / (P(A \setminus B) + P(A \cap B) + P(B \setminus A)) = (0.15+0.25)/(0.4+0.15+0.25) ).
- P(A'|B') = \frac{P(A' \cap B')}{P(B')} = \frac{P(\text{Neither})}{P(A \setminus B) + P(\text{Neither})} = \frac{0.2}{0.4 + 0.2} = \frac{0.2}{0.6} = \frac{1}{3}. (Diagram: Restricted to B' (0.4 + 0.2), then A' within that (0.2)).
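The three conditional probabilities above can be verified from the four region values. This is a minimal sketch; the variable names are illustrative, and the region probabilities are the ones listed in the example.

```python
from fractions import Fraction

# Region probabilities from the example: P(A)=0.55, P(B)=0.4, P(A and B)=0.15.
both = Fraction("0.15")     # P(A and B)
only_a = Fraction("0.40")   # P(A \ B)
only_b = Fraction("0.25")   # P(B \ A)
neither = Fraction("0.20")  # P(neither A nor B)

# Reduced sample spaces: the "given" events.
p_b = both + only_b
p_a_or_b = only_a + both + only_b
p_not_b = only_a + neither

# "Wanted" region inside each reduced sample space.
p_a_given_b = both / p_b                 # A within B
p_b_given_a_or_b = p_b / p_a_or_b        # B lies entirely inside A or B
p_not_a_given_not_b = neither / p_not_b  # A' within B'

print(p_a_given_b, p_b_given_a_or_b, p_not_a_given_not_b)
```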

Independent vs. Mutually Exclusive Check Example

Independence: Check if P(X|Y) = P(X) (or, equivalently, P(X \cap Y) = P(X)P(Y)). If the probability of X changes when Y is given, the events are dependent.
Mutually Exclusive: Check if P(X \cap Y) = 0. If there's an intersection (P(X \cap Y) > 0), they are not mutually exclusive.
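Both checks can be sketched as one helper and applied to the two sets of probabilities used in the practice examples earlier in these notes. The function name classify is an assumption; exact fractions are used so the equality test is not disturbed by floating-point rounding.

```python
from fractions import Fraction


def classify(p_x, p_y, p_x_and_y):
    """Apply the independence and mutual-exclusivity checks to two events."""
    return {
        "independent": p_x_and_y == p_x * p_y,
        "mutually_exclusive": p_x_and_y == 0,
    }


# P(A)=0.5, P(B)=0.2, P(A and B)=0.1: the product 0.5 * 0.2 equals 0.1.
first = classify(Fraction("0.5"), Fraction("0.2"), Fraction("0.1"))

# P(A)=0.55, P(B)=0.4, P(A and B)=0.15: the product 0.22 differs from 0.15.
second = classify(Fraction("0.55"), Fraction("0.4"), Fraction("0.15"))

print(first)
print(second)
```

Neither pair is mutually exclusive, since both intersections are non-zero. Two events with positive probabilities cannot be both independent and mutually exclusive at the same time.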

Rearranging Probability Formulas

If events are dependent or mutual exclusivity is unknown, use the general addition rule: P(A \cup B) = P(A) + P(B) - P(A \cap B).
This formula can be rearranged to find an unknown component, e.g., P(A \cap B) = P(A) + P(B) - P(A \cup B).
- Example: Given P(A) = 0.6, P(B) = 0.7, and P(A \cup B) = 0.9:
  - P(A \cap B) = 0.6 + 0.7 - 0.9 = 0.4.
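The rearrangement can be sketched as a small function that also rejects inconsistent inputs. The function name and the consistency check are assumptions added for illustration: an intersection can never be negative or larger than either event.

```python
from fractions import Fraction


def intersection_from_union(p_a, p_b, p_a_or_b):
    """Rearranged addition rule: P(A and B) = P(A) + P(B) - P(A or B)."""
    p_a_and_b = p_a + p_b - p_a_or_b
    # Sanity check (an added assumption): 0 <= P(A and B) <= min(P(A), P(B)).
    if not 0 <= p_a_and_b <= min(p_a, p_b):
        raise ValueError("inputs are not consistent probabilities")
    return p_a_and_b


result = intersection_from_union(Fraction("0.6"), Fraction("0.7"), Fraction("0.9"))
print(float(result))
```

Plain floats would also work here, but 0.6 + 0.7 - 0.9 does not evaluate to exactly 0.4 in binary floating point, which is why the sketch uses Fraction.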

Exam Revision: Chapters 1-3

Types of Data

Nominal: Categorical data with no inherent order (e.g., gender, colors, names). No numerical operations are meaningful.
Ordinal: Categorical data with a meaningful order, but the intervals between categories are not uniform or meaningful (e.g., student grades, restaurant star ratings). You can rank, but differences aren't equal.
Interval: Numerical data where differences between values are meaningful, but there is no true or meaningful zero point, so ratios are not meaningful (e.g., temperature in Celsius or Fahrenheit, where 0 degrees doesn't mean absence of temperature, and negative values exist).
Ratio: Numerical data with a meaningful zero point, allowing for meaningful ratios between values (e.g., height, weight, age, counts). A value of 0 represents the complete absence of the measured quantity.
- Example: Average temperatures are interval data because 0 degrees Celsius does not represent an absence of temperature, and negative values are possible, making ratios meaningless.

Descriptive vs. Inferential Statistics

Descriptive Statistics: Summarizes and describes the features of a dataset (e.g., mean, median, mode, histograms). It focuses on presenting what is observed.
- Example: Creating a histogram to describe a sample is descriptive.
Inferential Statistics: Uses data from a sample to make predictions or inferences about a larger population. This typically involves hypothesis testing and confidence intervals.

Sampling Methods

Stratified Sampling: Dividing the population into homogeneous subgroups (strata) based on shared characteristics (e.g., age, major) and then sampling from each stratum.
Systematic Sampling: Selecting subjects from a list at a regular interval (e.g., every n^{th} person after a random start).
- Example: Accessing a university register and selecting every n^{th} student is systematic sampling.
Convenience Sampling: Selecting individuals who are easily accessible or readily available. This method is often biased and not representative.
Multi-stage Sampling: A complex sampling method that involves multiple stages of sampling, often used in large-scale surveys. It might involve sampling regions, then communities within regions, then households within communities.
Cluster Sampling: Dividing the population into naturally occurring groups (clusters) and then randomly selecting some clusters. All members within the chosen clusters are typically included in the sample.
Simple Random Sample: Every member of the population has an equal chance of being selected. This requires a complete list of the population (sampling frame).
- Example: Randomly choosing 30 names from each state's electoral register is not a simple random sample of US adults because adults in smaller population states have a higher probability of being selected compared to larger population states.
Sampling Frame: The actual list or population from which the sample is drawn (e.g., electoral registers).
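Three of these methods can be sketched with Python's random module. Everything concrete here is hypothetical: the 100-student sampling frame, the year-group strata, the function names, and the two state register sizes are made-up values for illustration only.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical sampling frame: 100 students, each tagged with a year group 1-4.
frame = [{"id": i, "year": 1 + i % 4} for i in range(100)]


def simple_random_sample(frame, n):
    """Every subset of size n is equally likely to be chosen."""
    return rng.sample(frame, n)


def systematic_sample(frame, k):
    """Random start within the first k units, then every k-th unit."""
    start = rng.randrange(k)
    return frame[start::k]


def stratified_sample(frame, key, n_per_stratum):
    """Group units into strata by `key`, then sample randomly inside each one."""
    strata = {}
    for unit in frame:
        strata.setdefault(unit[key], []).append(unit)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample


srs = simple_random_sample(frame, 10)
systematic = systematic_sample(frame, 10)
stratified = stratified_sample(frame, "year", 3)

# Why "30 names per state" is not a simple random sample: the chance of being
# picked is 30 / (register size), which differs between states.
registers = {"small state": 600_000, "large state": 30_000_000}  # made-up sizes
inclusion = {state: 30 / size for state, size in registers.items()}
print(inclusion)
```

With these made-up sizes, a person on the smaller register is 50 times more likely to be selected than a person on the larger one, which is the unequal-probability problem described in the example above.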

Lottery Probability

In a lottery where players choose 6 numbers from 1 to 42, every combination of 6 numbers has an equal chance of being chosen. Therefore, choosing specific numbers or choosing numbers randomly yields the same probability of winning.
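The size of the sample space can be counted directly. This sketch assumes the usual rules, which the notes do not spell out: the order of the six numbers does not matter, no number repeats, and winning means matching all six.

```python
import math
from fractions import Fraction

# Number of ways to choose 6 distinct numbers from 42 when order is irrelevant.
n_combinations = math.comb(42, 6)

# Every combination is equally likely, so any single ticket has this chance.
p_win = Fraction(1, n_combinations)

print(n_combinations)  # 5245786
print(float(p_win))
```

The ticket 1, 2, 3, 4, 5, 6 is one of these 5,245,786 combinations, so it is exactly as likely to win as any randomly chosen set.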

Data Visualization & Measures

Frequency Table Components:
- Class Boundaries: Define the inclusive range for each class. If classes like 10-20, 20-30 are used, the boundaries are the same as the limits. If classes are like 10-19, 20-29, boundaries are halfway points (e.g., 19.5, 29.5).
- Midpoint: The average of the lower and upper limits of a class (\frac{\text{Lower + Upper}}{2}).
- Cumulative Frequency: A running total of frequencies, showing the number of observations up to and including a particular class.
- Relative Frequency: The proportion of observations in each class (\frac{\text{Class Frequency}}{\text{Total Frequency}}).
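The four components can be computed together in one pass. The data values and class limits below are hypothetical, and the boundaries assume integer-valued classes of the 10-19, 20-29 kind, so each boundary sits 0.5 beyond the class limit.

```python
# Hypothetical data and class limits (not from the notes).
data = [12, 15, 17, 21, 22, 25, 28, 29, 31, 38]
class_limits = [(10, 19), (20, 29), (30, 39)]

table = []
cumulative = 0
for lower, upper in class_limits:
    frequency = sum(lower <= x <= upper for x in data)
    cumulative += frequency  # running total up to and including this class
    table.append({
        "limits": (lower, upper),
        "boundaries": (lower - 0.5, upper + 0.5),  # halfway to neighbouring class
        "midpoint": (lower + upper) / 2,
        "frequency": frequency,
        "cumulative": cumulative,
        "relative": frequency / len(data),
    })

for row in table:
    print(row)
```

The final cumulative frequency equals the number of observations, and the relative frequencies sum to 1; both are useful checks when filling in a table by hand.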
Ogive: A line graph that displays the cumulative frequency or cumulative relative frequency against the upper class boundaries. It helps visualize the shape of the cumulative distribution.
- Skew from Ogive: For a right-skewed (positive skew) distribution, most observations are low values, so the ogive rises steeply at first and then flattens out over a long upper tail. For a left-skewed (negative skew) distribution, most observations are high values, so the ogive rises slowly at first and then steeply near the top. A symmetric distribution forms an S-shape.
Limitations of Data Tables: While providing detail, large tables can lead to information overload and make it difficult to visualize patterns. Having too few classes in a frequency table can obscure the true shape of the data.
Advantages/Disadvantages of Tables vs. Charts:
- Tables: Offer more detailed information but can be hard to visualize patterns.
- Charts: Make patterns easy to see but may lose some specific detail.
Pie Chart vs. Bar Chart: A pie chart is preferable for visualizing parts of a whole, specifically percentages or proportions of a total. Bar charts are better for comparing different categories or displaying exact frequencies/values.

Measures of Central Tendency & Spread for Sample Data (1, 4, 4, 9, 10, 13, 15, 20, 21, 30)

Sample Mean (\bar{x}) (n=10): Sum of all values divided by the number of values.
- \bar{x} = \frac{1 + 4 + 4 + 9 + 10 + 13 + 15 + 20 + 21 + 30}{10} = \frac{127}{10} = 12.7.
- Note: The lecturer used a hypothetical symmetrically spaced dataset (e.g., 1, 3, 5, 7, 9) to show that the mean (here 5) can be read off without calculation, because the values are symmetric around the center.
Mode: The value that appears most frequently in the dataset.
- For the given data, the mode is 4.
- Note: If all values appear with the same frequency (e.g., all unique), there is no mode.
Median (Q_2): The middle value when the data is ordered. For an even number of data points, it's the average of the two middle values.
- Ordered: 1, 4, 4, 9, \mathbf{10}, \mathbf{13}, 15, 20, 21, 30.
- Median = \frac{10 + 13}{2} = 11.5.
Range: The difference between the highest and lowest values.
- Range = 30 - 1 = 29.
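These four measures can be checked with Python's standard statistics module. This is a quick verification sketch using the sample data from this section; the variable names are illustrative.

```python
import statistics

data = [1, 4, 4, 9, 10, 13, 15, 20, 21, 30]

mean = statistics.mean(data)        # sum of values divided by n
mode = statistics.mode(data)        # most frequent value
median = statistics.median(data)    # average of the two middle values (n is even)
data_range = max(data) - min(data)  # highest minus lowest

print(mean, mode, median, data_range)
```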
Interquartile Range (IQR): The range of the middle 50\% of the data, calculated as the difference between the third quartile (Q3) and the first quartile (Q1).
- First half (for Q1): 1, 4, 4, 9, 10. Q1 = 4.
- Second half (for Q3): 13, 15, 20, 21, 30. Q3 = 20.
- IQR = Q3 - Q1 = 20 - 4 = 16.
IQR vs. Range: The IQR is generally a better measure of spread than the range because it is less affected by outliers (extremely large or small values) and focuses on the spread of the typical, central values.
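The quartile method used above (the median of each half) and the effect of an outlier can be sketched as follows. The function name quartiles is an assumption, and so is the handling of odd-length data, where this sketch leaves the middle value out of both halves; textbooks and software differ on that convention. The outlier value 300 is hypothetical.

```python
import statistics


def quartiles(values):
    """Q1 and Q3 as the medians of the lower and upper halves of the data."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]
    upper = ordered[-half:]  # for odd n, the middle value is in neither half
    return statistics.median(lower), statistics.median(upper)


data = [1, 4, 4, 9, 10, 13, 15, 20, 21, 30]
q1, q3 = quartiles(data)
iqr = q3 - q1

# Replace the largest value with a hypothetical outlier and compare the spreads.
with_outlier = data[:-1] + [300]
q1_out, q3_out = quartiles(with_outlier)
iqr_out = q3_out - q1_out
range_out = max(with_outlier) - min(with_outlier)

print(q1, q3, iqr)
print(iqr_out, range_out)
```

With the outlier, the range jumps from 29 to 299 while the IQR stays at 16. Note that statistics.quantiles in the standard library uses a different interpolation method and may not reproduce these hand-calculated quartiles.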
Sample Variance (s^2): Measures spread as the sum of the squared differences from the mean divided by n - 1, which is roughly the average squared difference.
- Formula: s^2 = \frac{\sum x^2 - n\bar{x}^2}{n-1} (for sample variance).
- Units of Variance: If the original data units are in meters, the variance will be in meters squared (m^2), which is not intuitive.
Standard Deviation (s): The square root of the variance (s = \sqrt{s^2}). It is generally preferred over variance because it has the same units as the original data, making it easier to interpret.
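The notes give the formula but no worked value, so the sketch below applies it to the sample data from this section and cross-checks the result against the standard library. The variable names are illustrative.

```python
import math
import statistics

data = [1, 4, 4, 9, 10, 13, 15, 20, 21, 30]
n = len(data)
mean = sum(data) / n

# Shortcut formula: s^2 = (sum of x^2 - n * mean^2) / (n - 1).
sum_of_squares = sum(x * x for x in data)
variance = (sum_of_squares - n * mean**2) / (n - 1)
std_dev = math.sqrt(variance)

print(sum_of_squares)      # 2349
print(round(variance, 2))  # 81.79
print(round(std_dev, 2))   # 9.04

# statistics.variance and statistics.stdev also use the n - 1 denominator.
print(statistics.variance(data), statistics.stdev(data))
```

Here \sum x^2 = 2349 and n\bar{x}^2 = 10 \times 12.7^2 = 1612.9, so s^2 = 736.1 / 9, about 81.79, and s is about 9.04 in the original units of the data.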