Estimating the population mean requires
different formulas than proportions since the mean is calculated from a quantitative variable, whereas the proportion is calculated from a categorical variable, but the underlying principles are the same
The population (true) mean is called mu (μ); we estimate it based on
the sample average (y-bar).
For a mean, instead of a normal / z curve we use
a t-curve
The collection of y-bars from many different samples is called
the sampling distribution of y-bar; once y-bar is standardized using the sample standard deviation s, it follows a t-distribution with n - 1 degrees of freedom
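The standard deviation of this sampling distribution (the standard error, SE) can be found as
s / sqrt(n), where s is the sample standard deviation and n is the sample size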
How does the t-curve differ from the normal curve?
It’s also bell shaped, but has longer tails than the z-curve. This is due to the uncertainty in not knowing the true Std. Dev, and results in wider confidence intervals.
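The heavier tails can be checked numerically. This is a minimal sketch in R; the degrees of freedom listed are arbitrary values chosen for illustration, not values from these notes. It compares the upper 2.5% cutoff of the t-curve with that of the z-curve.

```r
# Upper 2.5% cutoffs: t-curve at several degrees of freedom vs. the z-curve.
dfs <- c(5, 10, 30, 100)            # illustrative degrees of freedom (assumed)
t_cutoffs <- qt(0.975, df = dfs)    # t cutoffs, one per df
z_cutoff <- qnorm(0.975)            # z cutoff for comparison

print(data.frame(df = dfs, t_cutoff = round(t_cutoffs, 3)))
print(round(z_cutoff, 3))
```

Every t cutoff is larger than the z cutoff, and the gap shrinks as the degrees of freedom grow, which is why t-based intervals are wider for small samples.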
Confidence intervals when estimating the population mean
use the critical value t* with df = n - 1: y-bar ± t* * s / sqrt(n)
In R, we find the cutoff value for 95% confidence using
qt(0.025, df) or qt(0.975, df); the first returns -t* and the second returns +t*
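As a rough sketch, a 95% confidence interval can be built by hand from summary statistics. The values of `ybar`, `s`, and `n` below are hypothetical numbers made up for illustration; they are not data from these notes.

```r
# Hypothetical summary statistics (assumed for illustration only).
ybar <- 27.8   # sample mean
s    <- 7.2    # sample standard deviation
n    <- 50     # sample size

dof    <- n - 1               # degrees of freedom
t_star <- qt(0.975, dof)      # critical value for 95% confidence
se     <- s / sqrt(n)         # standard error of the sample mean

# Interval: sample mean plus or minus t* times the standard error.
ci <- c(lower = ybar - t_star * se, upper = ybar + t_star * se)
print(ci)
```

For a different confidence level, only the probability passed to `qt` changes, for example `qt(0.95, dof)` for 90% confidence.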
How do we interpret confidence intervals here?
We are 95% confident that the true average delivery time for all deliveries is between 25.75 and 29.85 minutes.
95% confidence here means
if we took a large number of random samples of size 50 under similar conditions, about 95% of these samples would produce confidence intervals that contain the true average delivery time.
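This repeated-sampling idea can be illustrated with a small simulation. It is only a sketch: the population mean, standard deviation, and number of repetitions are assumptions made up for the example, and the population is assumed to be Normal.

```r
set.seed(1)                      # for reproducibility
true_mean <- 28                  # assumed population mean (hypothetical)
true_sd   <- 7                   # assumed population standard deviation (hypothetical)
n         <- 50                  # sample size
reps      <- 2000                # number of simulated samples

# For each simulated sample, build a 95% CI and record whether it
# contains the true mean.
covered <- replicate(reps, {
  y  <- rnorm(n, mean = true_mean, sd = true_sd)
  ci <- t.test(y, conf.level = 0.95)$conf.int
  ci[1] <= true_mean && true_mean <= ci[2]
})

coverage <- mean(covered)        # proportion of intervals containing the true mean
print(coverage)
```

The proportion should land close to 0.95, though it varies a little from run to run with a different seed.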
For this test, we use H0: mu = hypothesized value. What are the possible Ha's?
Ha: mu > hypothesized value
Ha: mu < hypothesized value
Ha: mu ≠ hypothesized value
The test statistic is given by:
(sample statistic - null hypothesis value (H0)) / SE, that is, t = (y-bar - mu0) / (s / sqrt(n)), where mu0 is the hypothesized value
How to compute the p-value in R
Use pt instead of pnorm, with df = n - 1, either one-sided or two-sided depending on Ha.
> and < in Ha mean the t-test has how many sides?
One-sided. Use pt (not pnorm) with the df: for Ha with <, the p-value is pt(TS, df); for Ha with >, it is pt(-TS, df), which equals 1 - pt(TS, df).
If we are doing a two-sided t-test, what do we do?
Use pt, enter the negative of the absolute value of the test statistic together with the df, then multiply by 2.
2 * pt(-abs(TS), df) gives you the p-value.
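The three p-value calculations can be sketched side by side. The summary statistics and the hypothesized value `mu0` are hypothetical numbers chosen for illustration, not results from these notes.

```r
# Hypothetical summary statistics and null value (assumed for illustration).
ybar <- 27.8   # sample mean
s    <- 7.2    # sample standard deviation
n    <- 50     # sample size
mu0  <- 30     # hypothesized mean under H0

dof <- n - 1
ts  <- (ybar - mu0) / (s / sqrt(n))   # test statistic

p_less    <- pt(ts, dof)                       # Ha: mu < mu0
p_greater <- pt(ts, dof, lower.tail = FALSE)   # Ha: mu > mu0 (same as pt(-ts, dof))
p_two     <- 2 * pt(-abs(ts), dof)             # Ha: mu != mu0

print(c(ts = ts, p_less = p_less, p_greater = p_greater, p_two = p_two))
```

Using `-abs(ts)` in the two-sided case keeps the result correct whether the test statistic is positive or negative.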
For tests of the mean, the sample size should be
n >= 30. Skewed data is okay if n >= 30. If the data follow a Normal distribution, the sample size can be less than 30.
The sample size condition can be met in one of two ways (only one has to be true). Either:
The sample size must be n >= 30, OR
The original data must follow a Normal distribution
If this condition is not met, the sampling distribution won’t be a t-curve and the p-value/CI will be wrong.
THIS REPLACES THE CONDITION NP>= 10 FOR PROPORTIONS. THERE IS NO P FOR AVERAGES
R function for t test
t.test(data$variable, alternative = …, mu = …, conf.level = …)
mu is
the hypothesized value of the mean (the H0 value being tested)
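A minimal usage sketch of `t.test` follows. The data frame `deliveries`, its column `minutes`, and the null value 30 are all hypothetical, with the data simulated purely for illustration.

```r
set.seed(42)
# Hypothetical data: 50 simulated delivery times (assumed, not real data).
deliveries <- data.frame(minutes = rnorm(50, mean = 28, sd = 7))

# Two-sided test of H0: mu = 30, with a 95% confidence interval.
result <- t.test(deliveries$minutes,
                 alternative = "two.sided",   # or "less" / "greater"
                 mu = 30,
                 conf.level = 0.95)
print(result)

# The same test statistic computed by hand, for comparison.
manual_ts <- (mean(deliveries$minutes) - 30) /
  (sd(deliveries$minutes) / sqrt(nrow(deliveries)))
```

The pieces of the output can be pulled out individually with `result$statistic`, `result$parameter` (the df), `result$p.value`, and `result$conf.int`.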