Lecture 6: Probability: Normal Distribution, Distribution of Sample vs. Sampling Distribution of a Statistic, and Intro to Statistical Inference (2.10.25)

14 Terms

1

Normal Distribution

  • Model for continuous variable

  • The probability of any single exact value is 0, i.e. P(X = x) = 0; only intervals have positive probability

  • Completely characterized by its mean and standard deviation

  • Mean = Median = Mode. P(x > μ) = P(x < μ) = 0.5

  • P(a < x < b) = the area under the normal density curve from a to b. 68%-95%-99.7% rule: about 68%, 95%, and 99.7% of values fall within 1, 2, and 3 standard deviations of the mean.

  • For any probability distribution, total area under the density curve is 1
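
The area-under-the-curve idea can be checked numerically. The following is a minimal sketch using `statistics.NormalDist` from the Python standard library; the helper name `prob_between` is an assumption made for illustration and is not part of the lecture.

```python
from statistics import NormalDist

def prob_between(a, b, mu=0.0, sigma=1.0):
    """P(a < X < b) for X ~ Normal(mu, sigma): the area under the density from a to b."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(b) - dist.cdf(a)

# Area within 1, 2, and 3 standard deviations of the mean (standard normal).
for k in (1, 2, 3):
    print(k, round(prob_between(-k, k), 4))

# Symmetry about the mean: P(X > mu) = 0.5.
print(1 - NormalDist(10, 2).cdf(10))
```

The three printed areas are roughly 0.68, 0.95, and 0.997, which is where the rule gets its name.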

2

Computing probabilities about normal distributions:

  • First standardize, i.e. convert a problem about a normal variable X into a problem about the standard normal variable Z, using Z = (X − μ) / 𝜎

  • Then use the Z-table to compute the desired probability
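
The two steps above can be sketched in code, with the standard normal CDF standing in for the Z-table. The function names and the example numbers (μ = 100, 𝜎 = 15, x = 130) are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Standardize x: how many standard deviations it lies from the mean."""
    return (x - mu) / sigma

def prob_less_than(x, mu, sigma):
    """P(X < x) for X ~ Normal(mu, sigma), computed via the standard normal Z."""
    z = z_score(x, mu, sigma)
    return NormalDist().cdf(z)  # plays the role of the Z-table lookup

# Hypothetical example: X ~ Normal(100, 15), find P(X < 130).
z = z_score(130, 100, 15)
print(z)                                        # z = 2.0
print(round(prob_less_than(130, 100, 15), 4))   # area to the left of z = 2
```

For an upper-tail probability P(X > x), subtract the result from 1.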

3

Sampling Distributions

  • Statistical inference involves making inferences or generalizations about population parameters based on observed sample statistics

  • The probability distribution of a statistic produced by repeatedly selecting samples of the same size and computing the desired statistic, e.g., sampling distribution of the sample mean

  • The collection of all possible sample means is called the sampling distribution of the sample means

  • Compare the mean and standard deviation of the sample means with the population mean and standard deviation

4

Statistical inference involves making inferences or generalizations about population parameters based on observed sample statistics

  • The mean/proportion of a representative sample is a very good estimate of the unknown population mean/proportion (accuracy)

  • When we make estimates about population parameters based on sample statistics, it is extremely important to quantify the precision in our estimates

  • To quantify the precision of the estimation, recall the context of replicated or repeated study/sampling

5
  • Compare the mean and standard deviation of the sample means with the population mean and standard deviation

  1. On average, the sample mean is equal to the population mean

  2. Variability in the sample means is much smaller than the variability in the population
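
Both observations can be seen in a small simulation of repeated sampling. This is a sketch under stated assumptions: the population (the integers 1 to 100), the sample size of 25, the 5000 replications, and the helper `sample_means` are all hypothetical choices, not values from the lecture.

```python
import random
import statistics

def sample_means(population, n, reps, seed=0):
    """Draw `reps` simple random samples of size n (with replacement)
    and return the mean of each sample."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.choices(population, k=n)) for _ in range(reps)]

# Hypothetical population: the integers 1..100 (mean 50.5).
population = list(range(1, 101))
means = sample_means(population, n=25, reps=5000)

pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)
mean_of_means = statistics.fmean(means)
sd_of_means = statistics.pstdev(means)

print(f"population:   mean={pop_mean:.2f}, sd={pop_sd:.2f}")
print(f"sample means: mean={mean_of_means:.2f}, sd={sd_of_means:.2f}")
```

The mean of the simulated sample means should land close to the population mean, while their standard deviation should be near the population standard deviation divided by √25 = 5.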

6

Central Limit Theorem

  • Suppose we have a population with known mean μ and standard deviation 𝜎

  • Theoretically, if we take simple random samples of size n with replacement, then for sufficiently large n (usually n ≥ 30), the sampling distribution of the sample means is approximately normal

  • Central Limit Theorem: For a random sample of size n from a population having mean μ and standard deviation 𝜎, as the sample size increases, the sampling distribution of the sample mean approaches a normal distribution with mean μx̄ = μ and standard deviation 𝜎x̄ = 𝜎 / √n

  • Non-normal population

  • Take samples of size n, as long as n is sufficiently large

  • The distribution of the sample mean is approximately normal, so Z-scores can be used to compute probabilities about it

  • We can then quantify the variability/uncertainty in the sample mean (estimate of the population mean)

  • The usual rule of thumb is n ≥ 30, but apply it with care: strongly skewed populations may need larger samples
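
The Central Limit Theorem can be illustrated with a clearly non-normal population. The sketch below assumes an exponential population with mean 1 and standard deviation 1, samples of size n = 50, and 4000 replications; these values and the helper name are illustrative assumptions only.

```python
import math
import random
import statistics

def exponential_sample_means(n, reps, seed=0):
    """Means of `reps` samples of size n from a right-skewed
    exponential population (mean 1, standard deviation 1)."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(reps)
    ]

mu, sigma, n = 1.0, 1.0, 50
means = exponential_sample_means(n, reps=4000)

se = sigma / math.sqrt(n)  # standard deviation of the sample mean per the CLT
# If the sample means are roughly normal, about 95% fall within 1.96 SE of mu.
within = sum(abs(m - mu) <= 1.96 * se for m in means) / len(means)

print(f"mean of sample means: {statistics.fmean(means):.3f} (mu = {mu})")
print(f"sd of sample means:   {statistics.pstdev(means):.3f} (sigma/sqrt(n) = {se:.3f})")
print(f"share within 1.96 SE: {within:.3f}")
```

Rerunning with a small n (say 5) should show the normal approximation getting worse for this skewed population, which is the reason for caution with the rule of thumb.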

7
  • Theoretically, if we take simple random samples of size n with replacement, then for sufficiently large n (usually n ≥ 30), the sampling distribution of the sample means is approximately normal

  • This will hold true regardless of whether the source population is normal or skewed or even Binomial

  • In practice, with a finite population, samples are typically drawn without replacement

8

The distribution of sample means is approximately…

Normal for large n

9

The mean of the sample means will always…

Be equal to the population mean: μx̄ = μ

10

The standard deviation of the sample means

Defined as 𝜎x̄ = 𝜎 / √n and is also called the standard error

11

The standard error decreases as…

The sample size increases
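
A quick calculation shows how fast the standard error 𝜎 / √n shrinks. This is a minimal sketch; the function name `standard_error` and the value 𝜎 = 10 are hypothetical.

```python
import math

def standard_error(sigma, n):
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    if n <= 0:
        raise ValueError("sample size n must be positive")
    return sigma / math.sqrt(n)

# Hypothetical population standard deviation of 10.
for n in (25, 100, 400):
    print(n, standard_error(10.0, n))
# prints: 25 2.0, then 100 1.0, then 400 0.5
```

Because of the square root, the sample size has to be quadrupled to cut the standard error in half.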

12

The standard error decreases as the sample size increases

Variability in the sample means is smaller for larger sample sizes

13

Variability in the sample means is smaller for larger sample sizes

This is intuitively sensible as extreme values will have less impact in samples of larger size

14
  • The usual rule of thumb is n ≥ 30, but apply it with care: strongly skewed populations may need larger samples

  • Note that if the source population is normal, the result holds for samples of any size

  • If the outcome in the population is dichotomous (Binomial distribution), then another rule of thumb: min[np, n(1-p)] > 5, where n is the sample size and p is the probability of success in the population
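
The rule of thumb for a dichotomous outcome is easy to check in code. The sketch below assumes the threshold of 5 stated above; the function name `normal_approx_ok` and the example inputs are hypothetical.

```python
def normal_approx_ok(n, p, threshold=5):
    """Rule-of-thumb check: min[np, n(1-p)] > threshold."""
    if not 0 <= p <= 1:
        raise ValueError("p must be a probability between 0 and 1")
    return min(n * p, n * (1 - p)) > threshold

# Hypothetical examples with success probability p = 0.1.
print(normal_approx_ok(100, 0.1))  # min(10, 90) = 10 > 5
print(normal_approx_ok(20, 0.1))   # min(2, 18) = 2, which is not > 5
```

Some textbooks use a stricter threshold such as 10, which can be passed through the `threshold` argument.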