Lecture 6: Probability: Normal Distribution, Distribution of Sample vs. Sampling Distribution of a Statistic, and Intro to Statistical Inference (2.10.25)

Normal Distribution

Model for continuous variable
The probability of any single exact value is 0, i.e., P(X = x) = 0; only intervals have nonzero probability
Completely characterized by its mean and standard deviation
Mean = Median = Mode. P(x > μ) = P(x < μ) = 0.5
P(a < x < b) = the area under the normal density curve from a to b. 68%-95%-99.7% rule: about 68%, 95%, and 99.7% of values fall within 1, 2, and 3 standard deviations of the mean.
For any probability distribution, total area under the density curve is 1
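As a rough check of the area interpretation and the 68%-95%-99.7% rule, the sketch below uses `NormalDist` from Python's standard library. The mean of 100 and standard deviation of 15 are hypothetical values chosen only for illustration.

```python
from statistics import NormalDist

# Hypothetical example values (not from the lecture): mean 100, SD 15.
dist = NormalDist(mu=100, sigma=15)

def prob_between(d, a, b):
    """P(a < X < b): the area under the density curve from a to b."""
    return d.cdf(b) - d.cdf(a)

within_1sd = prob_between(dist, 85, 115)   # mean +/- 1 SD
within_2sd = prob_between(dist, 70, 130)   # mean +/- 2 SD
within_3sd = prob_between(dist, 55, 145)   # mean +/- 3 SD
upper_half = 1 - dist.cdf(100)             # P(X > mean)

print(round(within_1sd, 4), round(within_2sd, 4), round(within_3sd, 4), upper_half)
```

The three areas come out near 0.68, 0.95, and 0.997, and the area above the mean is 0.5, matching the symmetry property.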


Computing probabilities about normal distributions:

First standardize, i.e., convert a problem about a normal distribution X into a problem about the standard normal distribution Z using Z = (X - μ) / 𝜎
Then use the Z-table to compute the desired probability
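The two steps above can be sketched as follows. The numbers (μ = 70, 𝜎 = 10, x = 85) are hypothetical, and the standard normal CDF from `statistics.NormalDist` stands in for a printed Z-table.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Standardize x: how many standard deviations x lies from the mean."""
    return (x - mu) / sigma

# Hypothetical example values, chosen only for illustration.
mu, sigma, x = 70, 10, 85

z = z_score(x, mu, sigma)              # (85 - 70) / 10 = 1.5
standard_normal = NormalDist()         # mean 0, SD 1 (plays the role of the Z-table)
p_below = standard_normal.cdf(z)       # P(Z < 1.5), the same as P(X < 85)

# Sanity check: working directly with X gives the same probability.
p_direct = NormalDist(mu, sigma).cdf(x)

print(z, round(p_below, 4))
```

Note the parentheses in `(x - mu) / sigma`: the subtraction has to happen before the division.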


Sampling Distributions

Statistical inference involves making inferences or generalizations about population parameters based on observed sample statistics
A sampling distribution is the probability distribution of a statistic, obtained by repeatedly selecting samples of the same size and computing the statistic each time, e.g., the sampling distribution of the sample mean
The distribution of all possible sample means (for samples of a given size) is called the sampling distribution of the sample mean
Compare the mean and standard deviation of the sample means with those of the population


Statistical inference involves making inferences or generalizations about population parameters based on observed sample statistics

The mean/proportion of a representative sample is a very good estimate of the unknown population mean/proportion (accuracy)
When we make estimates about population parameters based on sample statistics, it is extremely important to quantify the precision in our estimates
To quantify the precision of an estimate, consider how the statistic would vary if the study or sampling were repeated many times


Compare the mean and standard deviation of the sample means with those of the population

On average, the sample mean is equal to the population mean
Variability in the sample means is much smaller than the variability in the population
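Both points can be illustrated with a small simulation. This is only a sketch: the uniform population, the sample size of 25, and the 2,000 repetitions are arbitrary assumptions, and `random.sample` draws without replacement.

```python
import random
from statistics import mean, pstdev

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 values spread uniformly over 0..100.
population = [rng.uniform(0, 100) for _ in range(10_000)]
pop_mean = mean(population)
pop_sd = pstdev(population)

# Repeatedly draw samples of the same size and record each sample mean.
n = 25
sample_means = [mean(rng.sample(population, n)) for _ in range(2_000)]

mean_of_means = mean(sample_means)   # should sit close to pop_mean
sd_of_means = pstdev(sample_means)   # should be much smaller than pop_sd

print(round(pop_mean, 2), round(mean_of_means, 2))
print(round(pop_sd, 2), round(sd_of_means, 2))
```

The first printed pair should be nearly equal, while the second shows the spread of the sample means to be a fraction of the population spread.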


Central Limit Theorem

Suppose we have a population with known mean μ and standard deviation 𝜎
Theoretically, if we take simple random samples of size n with replacement, then for sufficiently large n (usually n ≥ 30), the sampling distribution of the sample means is approximately normal
Central Limit Theorem: For a random sample of size n from a population with mean μ and standard deviation 𝜎, as the sample size increases, the sampling distribution of the sample mean x̄ approaches a normal distribution with mean μ_x̄ = μ and standard deviation 𝜎_x̄ = 𝜎 / √n
Non-normal population
Take samples of size n, as long as n is sufficiently large
The distribution of the sample mean is then approximately normal, so Z-scores can be used to compute probabilities
We can then quantify the variability/uncertainty in the sample mean (estimate of the population mean)
A common rule of thumb is n ≥ 30, but apply it with care
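A simulation sketch of the theorem for a non-normal population: samples are drawn from an exponential distribution (strongly right-skewed, with mean 1 and standard deviation 1 at rate 1), and the code checks how often the sample mean lands within one standard error of μ. If the sample mean were exactly normal, that fraction would be about 0.68. The choice of population, n = 50, and 4,000 repetitions are assumptions made for illustration.

```python
import random
from math import sqrt
from statistics import mean

rng = random.Random(1)  # fixed seed so the run is repeatable

# Exponential population with rate 1: mean 1, standard deviation 1.
mu = 1.0
sigma = 1.0

def fraction_within_one_se(n, reps=4000):
    """Share of simulated sample means within one standard error of mu."""
    se = sigma / sqrt(n)
    means = [mean([rng.expovariate(1.0) for _ in range(n)]) for _ in range(reps)]
    return sum(abs(m - mu) <= se for m in means) / reps

frac_n50 = fraction_within_one_se(50)
print(round(frac_n50, 3))
```

A result close to 0.68 suggests the normal approximation is reasonable at this sample size even though the population is skewed.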


Theoretically, if we take simple random samples of size n with replacement, then for sufficiently large n (usually n ≥ 30), the sampling distribution of the sample means is approximately normal

This will hold true regardless of whether the source population is normal or skewed or even Binomial
In practice, when the population is finite, samples are typically drawn without replacement


The distribution of sample means is approximately…

Normal for large n


The mean of the sample means will always…

Be equal to the population mean: μ_x̄ = μ


The standard deviation of the sample means…

Is defined as 𝜎_x̄ = 𝜎 / √n and is also called the standard error (of the mean)


The standard error decreases as…

The sample size increases
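A short numeric sketch of the standard error formula 𝜎 / √n, using a hypothetical population standard deviation of 20:

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard deviation of the sample mean for samples of size n."""
    return sigma / sqrt(n)

sigma = 20  # hypothetical population standard deviation
ses = {n: standard_error(sigma, n) for n in (25, 100, 400)}
print(ses)  # {25: 4.0, 100: 2.0, 400: 1.0}
```

Because n sits under a square root, quadrupling the sample size only halves the standard error.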


The standard error decreases as the sample size increases

Variability in the sample means is smaller for larger sample sizes


Variability in the sample means is smaller for larger sample sizes

This is intuitively sensible as extreme values will have less impact in samples of larger size


A common rule of thumb is n ≥ 30, but apply it with care

Note that if the population itself is normal, the sampling distribution of the sample mean is normal for samples of any size
If the outcome in the population is dichotomous (Binomial distribution), then another rule of thumb: min[np, n(1-p)] > 5, where n is the sample size and p is the probability of success in the population
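The dichotomous rule of thumb can be written as a small check. The function name `normal_approx_ok` and the example values of n and p are hypothetical; the threshold of 5 is the one stated above.

```python
def normal_approx_ok(n, p, threshold=5):
    """Rule of thumb: min[np, n(1-p)] must exceed the threshold."""
    return min(n * p, n * (1 - p)) > threshold

# Hypothetical examples with a success probability of 0.1.
ok_large = normal_approx_ok(100, 0.1)   # min(10, 90) = 10 -> True
ok_small = normal_approx_ok(30, 0.1)    # min(3, 27) = 3  -> False

print(ok_large, ok_small)
```

The second case shows why n ≥ 30 alone is not enough: with a rare outcome, a sample of 30 still fails the check.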