Sample statistics (sample mean/ stdev) are…
Random variables
Take on certain values depending on the distribution of data in the sample
Sample variation
Concept that some samples may represent the population better than others
Inevitable
Sample means tend to be close to the population mean as we increase our sample size → taking more from the population = covering more area
Sampling distribution of the sample mean
Probability distribution of all possible sample means from all possible samples
This distribution is approximately normal when the sample size is large enough → allows use of the empirical rule to describe what percentage of all sample means fall within a given range
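To make "all possible sample means from all possible samples" concrete, here is a minimal sketch that lists every sample from a tiny population. The population values and the sample size are made up for illustration, and sampling is assumed to be with replacement. With only n = 2 the result is not yet bell-shaped; the point is only that the sample means center on the population mean.

```python
from itertools import product
from statistics import mean, pstdev

# Hypothetical tiny population, chosen only so every sample can be listed.
population = [2, 4, 6, 8]
n = 2  # sample size

# Every possible ordered sample of size n, drawn with replacement.
samples = list(product(population, repeat=n))
sample_means = [mean(s) for s in samples]

print(len(samples))          # number of possible samples: 4**2 = 16
print(mean(population))      # population mean
print(mean(sample_means))    # mean of all possible sample means
print(pstdev(sample_means))  # spread of the sampling distribution
print(pstdev(population) / n ** 0.5)  # population sd divided by sqrt(n)
```

The last two printed values should agree, which is the relationship the standard error of the mean describes.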
Central Limit Theorem (CLT)
For any population, regardless of its distribution (binomial, uniform, normal, etc.), as we increase the size of our samples, the distribution of the sample means approaches a normal distribution
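A small simulation can illustrate the CLT. This is a sketch under assumptions, not a proof: the population is assumed to be exponential with rate 1 (strongly skewed, mean 1, standard deviation 1), and the seed, sample size, and number of samples are arbitrary choices. It checks how many sample means land within 1 and 2 standard errors (population standard deviation divided by √n) of the population mean.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

n = 50              # observations per sample
num_samples = 5000  # how many samples to draw

# Assumed skewed population: exponential with rate 1, so mean = 1 and sd = 1.
pop_mean, pop_sd = 1.0, 1.0

sample_means = [
    sum(rng.expovariate(1.0) for _ in range(n)) / n
    for _ in range(num_samples)
]

se = pop_sd / n ** 0.5  # standard error of the mean


def share_within(k):
    """Fraction of sample means within k standard errors of the population mean."""
    return sum(abs(m - pop_mean) <= k * se for m in sample_means) / num_samples


within_1 = share_within(1)
within_2 = share_within(2)
print(f"within 1 SE: {within_1:.3f}")  # empirical rule suggests roughly 0.68
print(f"within 2 SE: {within_2:.3f}")  # empirical rule suggests roughly 0.95
```

Even though individual exponential values are far from normal, the shares should come out close to the empirical-rule percentages.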
Using CLT to estimate
As long as we have a sufficiently large sample (roughly 30+), we can treat the sample mean as approximately normally distributed around the unknown population mean, with standard deviation equal to the standard error of the mean
Allows us to define an interval where the sample means are expected to fall when n is sufficiently large
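One way such an interval might be computed from a single sample is sketched below. The function name `mean_interval` and the data are hypothetical, and the sample standard deviation is used as a stand-in for the unknown population standard deviation, which is an assumption that is only reasonable for larger samples.

```python
from statistics import NormalDist, mean, stdev


def mean_interval(sample, level=0.95):
    """Interval of the form: sample mean ± z * (standard error)."""
    n = len(sample)
    center = mean(sample)
    # Assumption: sample sd stands in for the unknown population sd.
    se = stdev(sample) / n ** 0.5
    # z value that leaves (1 - level) / 2 in each tail of the normal curve.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return center - z * se, center + z * se


# Hypothetical data: 36 made-up measurements (n above the rough 30+ guideline).
sample = [48, 52, 50, 47, 53, 51, 49, 50, 55, 45, 52, 48] * 3

low, high = mean_interval(sample)
print(f"sample mean: {mean(sample)}")
print(f"95% interval: ({low:.2f}, {high:.2f})")
```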
Sampling error
The difference between a sample statistic (sample mean, etc.) and the population parameter it estimates (population mean, etc.) → its typical size can be described by a standard deviation
Standard error of the mean
Population standard deviation divided by the square root of the number of observations: σ / √n
Measure of uncertainty
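A worked example of the formula, as a minimal sketch. The function name `standard_error` and the population standard deviation of 15 are made up for illustration.

```python
from math import sqrt


def standard_error(sigma, n):
    """Standard error of the mean: population sd divided by sqrt(n)."""
    return sigma / sqrt(n)


# Hypothetical population standard deviation of 15.
print(standard_error(15, 25))   # 15 / 5  = 3.0
print(standard_error(15, 100))  # 15 / 10 = 1.5
```

Quadrupling the sample size halves the standard error, so the uncertainty shrinks more slowly than the sample grows.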
Difference between standard error and standard deviation
Error: assess how far a sample statistic likely falls from the population parameter
quantify variability between samples drawn from the same population
Deviation: assess how far a particular data point is likely to fall from the mean
quantify variability of values in a dataset
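The contrast can be seen by computing both quantities on a small and a large dataset. This is a sketch with simulated data: the normal population (mean 100, standard deviation 15), the seed, and the helper name `sd_and_se` are assumptions, and the standard error is estimated from the sample standard deviation.

```python
import random
from statistics import stdev

rng = random.Random(1)  # fixed seed so the run is repeatable


def sd_and_se(values):
    """Return (sample standard deviation, estimated standard error of the mean)."""
    s = stdev(values)
    return s, s / len(values) ** 0.5


# Simulated data from an assumed normal population (mean 100, sd 15).
small = [rng.gauss(100, 15) for _ in range(50)]
large = [rng.gauss(100, 15) for _ in range(5000)]

sd_small, se_small = sd_and_se(small)
sd_large, se_large = sd_and_se(large)

print(f"n=50:   sd={sd_small:.2f}  se={se_small:.2f}")
print(f"n=5000: sd={sd_large:.2f}  se={se_large:.2f}")
```

The standard deviation stays near the population value at both sizes because it describes individual data points, while the standard error drops because it describes how precisely the mean is estimated.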
When to use proportions?
When our random variable is nominal and has 2 mutually exclusive groups, we look at the proportion rather than the mean
Proportion
Fraction, ratio, or percentage that indicates what part of the sample/ population has a particular trait of interest
p = x/n, where x is the number of observations with the trait and n is the sample size
You can apply the CLT and say the distribution of a sample proportion is normal, given the sample size is sufficiently large. You can determine if the sample size is large enough by:
nπ ≥ 5 and n(1-π) ≥ 5
π here is the proportion of the population with the trait, not pi as in 3.14
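The proportion formula and the sample-size check can be sketched as below. The function names and the numbers are hypothetical, and the check uses the common form of the rule, nπ ≥ 5 and n(1 − π) ≥ 5, meaning both the expected count with the trait and the expected count without it must reach 5.

```python
def sample_proportion(x, n):
    """Sample proportion: count with the trait divided by sample size."""
    return x / n


def normal_approx_ok(n, pi, threshold=5):
    """True when both expected counts, n*pi and n*(1 - pi), reach the threshold."""
    return n * pi >= threshold and n * (1 - pi) >= threshold


# Hypothetical sample: 30 of 120 respondents have the trait.
print(sample_proportion(30, 120))  # 0.25

# Assumed population proportion of 0.1 for the check.
print(normal_approx_ok(40, 0.1))   # False: expected count with the trait is only 4
print(normal_approx_ok(100, 0.1))  # True: expected counts are 10 and 90
```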