Normal Distribution
Model for a continuous variable
The probability of any single exact value is 0, i.e. P(X = x) = 0; only intervals of values have positive probability
Completely characterized by its mean and standard deviation
Mean = Median = Mode. P(x > μ) = P(x < μ) = 0.5
P(a < X < b) = the area under the normal density curve from a to b. 68%-95%-99.7% rule: about 68%, 95%, and 99.7% of values fall within 1, 2, and 3 standard deviations of the mean
For any probability distribution, total area under the density curve is 1
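The 68%-95%-99.7% rule can be checked numerically. The sketch below uses Python's standard-library NormalDist; the helper name prob_within is chosen here for illustration and is not part of the notes.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
Z = NormalDist(mu=0.0, sigma=1.0)

def prob_within(k: float) -> float:
    """P(-k < Z < k): area under the density curve within k standard deviations."""
    return Z.cdf(k) - Z.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} sd: {prob_within(k):.4f}")
```

The three areas come out close to 0.68, 0.95, and 0.997, which is where the rule gets its name.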
Computing probabilities about normal distributions:
First standardize, i.e. convert a problem about a normal distribution X into a problem about the standard normal distribution Z, using Z = (X - μ) / 𝜎
Then use the Z-table to compute the desired probability
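The two steps above can be sketched in code, with the standard normal CDF standing in for the Z-table. The numbers (mean 100, standard deviation 15) are made up for illustration, and z_score and prob_between are hypothetical helper names.

```python
from statistics import NormalDist

def z_score(x: float, mu: float, sigma: float) -> float:
    """Standardize x: how many standard deviations it lies from the mean."""
    return (x - mu) / sigma

def prob_between(a: float, b: float, mu: float, sigma: float) -> float:
    """P(a < X < b) for X normal with mean mu and standard deviation sigma."""
    std_normal = NormalDist()  # plays the role of the Z-table
    return std_normal.cdf(z_score(b, mu, sigma)) - std_normal.cdf(z_score(a, mu, sigma))

# Hypothetical example: X normal with mean 100 and standard deviation 15.
print(z_score(130, 100, 15))             # Z value for X = 130
print(prob_between(85, 115, 100, 15))    # area within one sd of the mean
```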
Sampling Distributions
Statistical inference involves making inferences or generalizations about population parameters based on observed sample statistics
A sampling distribution is the probability distribution of a statistic, obtained by repeatedly selecting samples of the same size and computing the statistic each time, e.g., the sampling distribution of the sample mean
The collection of all possible sample means is called the sampling distribution of the sample means
The mean/proportion of a representative sample is a very good estimate of the unknown population mean/proportion (accuracy)
When we make estimates about population parameters based on sample statistics, it is extremely important to quantify the precision in our estimates
To quantify the precision of the estimation, recall the context of replicated or repeated study/sampling
Compare the mean and standard deviation of the sample means with the population mean and standard deviation:
On average, the sample mean is equal to the population mean
Variability in the sample means is much smaller than the variability in the population
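A small simulation makes the comparison concrete. This is only a sketch under assumed values: the population (10,000 uniform values between 0 and 100), the sample size 25, and the 2,000 repetitions are arbitrary choices, and sample_means is a hypothetical helper.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 values spread uniformly between 0 and 100.
population = [rng.uniform(0, 100) for _ in range(10_000)]
pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)

def sample_means(pop, n, reps, rng):
    """Draw `reps` samples of size n (with replacement) and return their means."""
    return [statistics.fmean(rng.choices(pop, k=n)) for _ in range(reps)]

means = sample_means(population, n=25, reps=2_000, rng=rng)
mean_of_means = statistics.fmean(means)
sd_of_means = statistics.stdev(means)

print(f"population:   mean={pop_mean:.2f}  sd={pop_sd:.2f}")
print(f"sample means: mean={mean_of_means:.2f}  sd={sd_of_means:.2f}")
```

The two means should nearly coincide, while the standard deviation of the sample means should be only a fraction of the population standard deviation.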
Central Limit Theorem
Suppose we have a population with known mean μ and standard deviation 𝜎
Theoretically, if we take simple random samples of size n with replacement, then for sufficiently large n (usually n ≥ 30), the sampling distribution of the sample means is approximately normal
Central Limit Theorem: For a random sample of size n from a population with mean μ and standard deviation 𝜎, as the sample size increases, the sampling distribution of the sample mean x̄ approaches a normal distribution with mean μx̄ = μ and standard deviation 𝜎x̄ = 𝜎 / √n
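As a worked illustration of the theorem, the sketch below computes a probability about a sample mean. The values μ = 50, 𝜎 = 12, n = 36 and the cutoff 53 are hypothetical and were chosen so that the arithmetic is easy to follow.

```python
import math
from statistics import NormalDist

# Hypothetical population values and sample size (not from the notes).
mu, sigma, n = 50.0, 12.0, 36

# Standard deviation of the sample mean: sigma / sqrt(n).
se = sigma / math.sqrt(n)

# By the Central Limit Theorem, the sample mean is approximately
# normal with mean mu and standard deviation se.
sampling_dist = NormalDist(mu=mu, sigma=se)
p_above_53 = 1.0 - sampling_dist.cdf(53.0)

print(f"standard deviation of the sample mean = {se:.2f}")
print(f"P(sample mean > 53) = {p_above_53:.4f}")
```

Here 𝜎x̄ = 12 / 6 = 2, so 53 is 1.5 standard deviations above μ, and the probability is about 0.067.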
Non-normal population
Take samples of size n, as long as n is sufficiently large
The distribution of the sample mean is then approximately normal, so we can use Z-scores to compute probabilities
We can then quantify the variability/uncertainty in the sample mean (estimate of the population mean)
The usual rule of thumb is n ≥ 30, but be careful: strongly skewed or heavy-tailed populations may need larger samples
The approximate normality of the sample means holds regardless of whether the source population is normal, skewed, or even Binomial
In practice, with a finite population, samples are typically drawn without replacement
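The behavior for a non-normal population can be explored by simulation. The sketch below assumes an exponential population with rate 1 (right-skewed, mean 1, standard deviation 1), sample size 40, and 5,000 repetitions. These are all illustrative choices, and the independent draws correspond to sampling with replacement.

```python
import math
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable

n = 40        # sample size, above the usual n >= 30 rule of thumb
reps = 5_000  # number of repeated samples

# Assumed population: exponential with rate 1 (mean 1, standard deviation 1).
mu, sigma = 1.0, 1.0

# Mean of each repeated sample.
means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(reps)
]

se = sigma / math.sqrt(n)  # theoretical standard deviation of the sample mean

# For a normal distribution, about 68% and 95% of values lie within
# 1 and 2 standard deviations of the mean.
within_1 = sum(abs(m - mu) < se for m in means) / reps
within_2 = sum(abs(m - mu) < 2 * se for m in means) / reps

print(f"mean of sample means = {statistics.fmean(means):.3f} (population mean {mu})")
print(f"sd of sample means   = {statistics.stdev(means):.3f} (sigma / sqrt(n) = {se:.3f})")
print(f"share within 1 and 2 standard errors: {within_1:.3f}, {within_2:.3f}")
```

Even though individual values are strongly skewed, the shares within one and two standard errors should land near the 68% and 95% expected for a normal distribution.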
The distribution of sample means is approximately…
Normal for large n
The mean of the sample means will always…
Be equal to the population mean: μx̄ = μ
The standard deviation of the sample means…
Defined as 𝜎x̄ = 𝜎 / √n and is also called the standard error
The standard error decreases as…
The sample size increases
The standard error decreases as the sample size increases
Variability in the sample means is smaller for larger sample sizes
This is intuitively sensible as extreme values will have less impact in samples of larger size
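The effect of sample size on the standard error 𝜎 / √n can be tabulated directly. In this sketch 𝜎 = 10 is an arbitrary illustrative value and standard_error is a hypothetical helper name.

```python
import math

def standard_error(sigma: float, n: int) -> float:
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    if n <= 0:
        raise ValueError("sample size n must be positive")
    return sigma / math.sqrt(n)

sigma = 10.0  # hypothetical population standard deviation
for n in (4, 25, 100, 400):
    print(f"n={n:>3}  SE={standard_error(sigma, n):.2f}")
```

Because of the square root, halving the standard error requires a sample four times as large.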
Note that if the population itself is normal, the sampling distribution of the sample mean is normal for samples of any size; the n ≥ 30 rule of thumb is only needed for non-normal populations
If the outcome in the population is dichotomous (Binomial distribution), another rule of thumb applies: min[np, n(1-p)] > 5, where n is the sample size and p is the probability of success in the population
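The dichotomous-outcome rule of thumb is easy to express as a check. The function name normal_approx_ok below is a hypothetical label for this sketch, and the example values of n and p are made up.

```python
def normal_approx_ok(n: int, p: float) -> bool:
    """Rule of thumb from the notes: min[np, n(1-p)] > 5."""
    if n <= 0 or not 0.0 <= p <= 1.0:
        raise ValueError("need n > 0 and 0 <= p <= 1")
    return min(n * p, n * (1 - p)) > 5

print(normal_approx_ok(100, 0.5))   # np = 50, n(1-p) = 50
print(normal_approx_ok(20, 0.1))    # np = 2, too few expected successes
print(normal_approx_ok(200, 0.98))  # n(1-p) = 4, too few expected failures
```

The check fails whenever either the expected number of successes or the expected number of failures is small, even if n itself is above 30.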