p̂
the sample proportion
p̂= x/n (#successes/total # in sample)
What assumptions must hold for the sampling distribution of the sample proportion to be normally distributed?
np≥15 AND n(1-p)≥15
If our assumptions are met, what is the mean of the sampling distribution of the sample proportion?
p (the population proportion)
T/F: The sample proportion and sample mean are random variables.
True.
How can you distinguish a question about the sampling distribution of the sample proportion from a question about the sampling distribution of the sample mean?
A proportion question will often include the word "proportion" or a percentage, and it will deal with categorical values.
If you know you are dealing with a sampling distribution question and you see that the standard deviation is given, what can you assume?
You can assume that you are dealing with a question about the sampling distribution of the sample mean and NOT the sampling distribution of the sample proportion.
Describe the sampling distribution of the sample proportion, assuming np≥15 and n(1-p)≥15.
p̂~N(p, √(p(1-p)/n))
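As a rough illustration of the card above, this Python sketch computes the mean and standard error of p̂ and checks the np≥15 and n(1-p)≥15 conditions. The function name and the example values (p = 0.4, n = 100) are made up for illustration.

```python
import math

def sample_proportion_distribution(p, n):
    """Return (mean, standard_error, normal_ok) for the sampling distribution of p-hat."""
    # Normal approximation is assumed only when both expected counts are at least 15.
    normal_ok = n * p >= 15 and n * (1 - p) >= 15
    mean = p                                     # centered at the population proportion
    standard_error = math.sqrt(p * (1 - p) / n)  # sqrt(p(1-p)/n)
    return mean, standard_error, normal_ok

# Hypothetical example: population proportion 0.4, sample size 100.
mean, se, ok = sample_proportion_distribution(p=0.4, n=100)
print(mean, round(se, 4), ok)
```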
T/F: If np<15 or n(1-p)<15, you can add 2 successes and 2 failures to approximate the sampling distribution of the sample proportion.
FALSE. This "trick" ONLY works with confidence intervals. If np<15 or n(1-p)<15, you cannot work with the sampling distribution of the sample proportion because you cannot assume that it is normal.
x̄
The sample mean, which is the average of the observations in our sample.
What are the assumptions that must hold true for the sampling distribution of the sample mean to be normally distributed?
n≥30 or the original population was normally distributed
If the sampling distribution of the sample mean is normally distributed, what is its mean?
The population mean (μ)
Describe the sampling distribution of the sample mean, assuming n≥30 or the original population was normally distributed.
x̄~N(μ, σ/√n)
Central Limit Theorem
Even when the population's probability distribution is not bell-shaped, the sampling distribution of the sample mean is approximately bell-shaped when the sample size is large enough (n≥30).
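The Central Limit Theorem can be checked informally with a small simulation. The sketch below assumes a right-skewed exponential population (mean 1, standard deviation 1); the seed, sample size, and number of repetitions are arbitrary choices, not values from these cards.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

n = 30       # size of each sample
reps = 2000  # number of samples drawn

# Exponential population with rate 1: mean 1, standard deviation 1, strongly skewed.
sample_means = [
    statistics.mean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(reps)
]

observed_center = statistics.mean(sample_means)   # should be close to mu = 1
observed_spread = statistics.stdev(sample_means)  # should be close to sigma / sqrt(n)
theoretical_se = 1 / math.sqrt(n)

print(round(observed_center, 3), round(observed_spread, 3), round(theoretical_se, 3))
```

If the theorem holds here, the sample means should center near 1 with a spread near σ/√n, even though the population itself is far from bell-shaped.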
Population Distribution
The distribution for the overall population, which is based on parameters such as the population mean and the population standard deviation.
Data Distribution
The distribution of the data we collect in practice. We collect samples in order to create a data distribution so that we can estimate the population distribution. The larger the sample size, the better the data distribution approximates the population distribution.
The data distribution looks more like the population distribution when the sample size is (low/high).
high
Sample means tend to cluster more around the population mean when the sample size is (low/high).
high
A _____ is a numerical summary of the data in a population, and a _____ is a numerical summary of the data in a sample.
parameter; statistic
Point Estimates
• Statistics that are used as "best guesses" for parameters
• Sample mean point estimate for population mean
• Sample stdev. estimate for population stdev.
• Sample proportion point estimate for population proportion
Interval Estimates
Describe a range over which we can be reasonably confident that the population parameter lies.
Bias vs. Variability
Biased: sampling distribution not centered at the parameter of interest
Variability: sampling distribution has a large stderror and is thus highly spread out
Inferential Statistics
The branch of statistics that uses sample statistics to estimate population parameters.
Confidence Interval
Estimator ± Margin of Error
Estimator
x̄ (sample mean) to estimate μ (population mean)
p̂ (sample prop) to estimate p (population prop)
Confidence Interval for a Population Proportion
p̂ ± z√(p̂(1-p̂)/n)
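A minimal sketch of this interval in Python, assuming a hypothetical sample of 120 successes out of 300 and the 95% z-score of 1.96. The function name is illustrative only.

```python
import math

def proportion_ci(successes, n, z=1.96):
    """Return (low, high) for p-hat +/- z * sqrt(p-hat(1 - p-hat)/n)."""
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical data: 120 successes in a sample of 300.
low, high = proportion_ci(120, 300)
print(round(low, 4), round(high, 4))
```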
Confidence Interval for a Population Mean
x̄ ± t(s/√n)
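The interval for a mean can be sketched the same way. This example assumes ten made-up observations and takes the t-score from a t table (2.262 for df = 9 at 95% confidence), since the Python standard library has no t distribution.

```python
import math
import statistics

def mean_ci(data, t_score):
    """Return (low, high) for x-bar +/- t * s / sqrt(n)."""
    n = len(data)
    x_bar = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation (divides by n - 1)
    margin = t_score * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical sample of n = 10, so df = 9; t = 2.262 is the 95% table value.
data = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]
low, high = mean_ci(data, t_score=2.262)
print(round(low, 3), round(high, 3))
```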
What is the margin of error in a confidence interval for a population proportion?
z√(p̂(1-p̂)/n)
What is the margin of error in a confidence interval for a population mean?
t(s/√n)
A sampling distribution is a probability distribution for a (parameter/statistic).
statistic
If our assumptions are met, what is the standard error of the sampling distribution of the sample proportion?
stderror= √(p(1-p)/n)
If the sampling distribution of the sample mean is normally distributed, what is its standard error?
stderror= σ/√n
What z-score is associated with a confidence level of 90%?
z=1.645
What z-score is associated with a confidence level of 95%?
z=1.96
What z-score is associated with a confidence level of 99%?
z=2.576
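These z-scores can be recovered from the standard normal distribution rather than memorized. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `z_for_confidence` is an assumption for illustration.

```python
from statistics import NormalDist

def z_for_confidence(confidence_level):
    """Return the z-score that leaves (1 - CL)/2 in each tail of the standard normal."""
    tail = (1 - confidence_level) / 2
    return NormalDist().inv_cdf(1 - tail)

for cl in (0.90, 0.95, 0.99):
    print(cl, round(z_for_confidence(cl), 3))
```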
T/F: If you are constructing a CI for a pop. prop. and you find that np<15 or n(1-p)<15, you can estimate the confidence interval by adding two successes and two failures.
True. This is the ONLY situation where this "trick" will work (does not work for CI for a pop. mean).
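One way to read the "add two successes and two failures" trick is to recompute p̂ from the adjusted counts and then build the usual interval. The sketch below follows that reading; the function name and the data (3 successes in 20 trials) are hypothetical.

```python
import math

def plus_four_ci(successes, n, z=1.96):
    """Return (p_tilde, low, high) after adding 2 successes and 2 failures."""
    n_adj = n + 4                    # two extra successes plus two extra failures
    p_tilde = (successes + 2) / n_adj
    margin = z * math.sqrt(p_tilde * (1 - p_tilde) / n_adj)
    return p_tilde, p_tilde - margin, p_tilde + margin

# Hypothetical small sample: 3 successes in 20 trials (fewer than 15 successes).
p_tilde, low, high = plus_four_ci(3, 20)
print(round(p_tilde, 4), round(low, 4), round(high, 4))
```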
Student's t distribution
A distribution used to construct confidence intervals for the pop. mean when n is small and the pop. stdev. is unknown; similar to a normal dist. but wider and has fatter tails
Degrees of Freedom (df)
df= n-1
At large sample sizes, the t distribution...
approximates a normal dist., so we can use z-scores rather than t-scores
The t table gives you probabilities in the (left/right) tail of the distribution.
right
What are the three assumptions that must be met before we can construct a confidence interval for a population proportion?
1. The data comes from a simple random sample
2. np≥15
3. n(1-p)≥15
The margin of error for a CI _____ as the sample size INCREASES and _____ as the confidence level (CL) INCREASES.
decreases; increases
What are the assumptions that must be met before we can construct a confidence interval for a population mean?
1. The data comes from a simple random sample
2. n≥30 OR the original population distribution was normal
T/F: Confidence intervals for means are highly susceptible to the effects of outliers.
True. This is why it's a good idea to visualize the data using boxplots or dotplots if we are dealing with a small sample size. We have to ensure that there are no outliers or any other reason to think that the original population was not normally distributed.
T/F: Probability applies to statistics.
True.
T/F: Probability applies to parameters.
False. Probability applies to statistics, not parameters.
T/F: A 95% confidence interval suggests that, if we use it over and over again for various samples, we will make correct inferences 95% of the time in the long-run.
True.
T/F: Confidence intervals can be used to assess the probability of individual outcomes.
False. Confidence intervals are constructed to describe POPULATION means, not sample means or individual observations.
T/F: Confidence intervals can be used to assess the likelihood of getting a certain sample mean.
False. Confidence intervals are constructed to describe POPULATION means, not sample means or individual observations.
Minimum Sample Size to Achieve a Margin of Error m at a Given Confidence Level When Estimating a Population Proportion
n= z^2 • p̂(1-p̂)/m^2
Minimum Sample Size to Achieve a Margin of Error m at a Given Confidence Level When Estimating a Population Mean
n= (zs/m)^2
Suppose you are calculating the min. sample size for a study intended to estimate the pop. prop. and you have no idea what value you should use for the sample proportion. What value should you use?
p̂=0.5
When calculating the min. sample size needed for estimating a proportion or mean, you always round (down/up).
up
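The two sample-size formulas, together with the round-up rule, can be sketched as follows. The function names and the example inputs (margin of error 0.03 for a proportion; s = 15 and margin of error 2 for a mean) are assumptions chosen for illustration.

```python
import math

def min_n_for_proportion(z, m, p_hat=0.5):
    """n = z^2 * p-hat(1 - p-hat) / m^2, rounded up; p-hat defaults to 0.5."""
    return math.ceil(z ** 2 * p_hat * (1 - p_hat) / m ** 2)

def min_n_for_mean(z, s, m):
    """n = (z * s / m)^2, rounded up."""
    return math.ceil((z * s / m) ** 2)

# Hypothetical targets at 95% confidence (z = 1.96).
n_prop = min_n_for_proportion(z=1.96, m=0.03)
n_mean = min_n_for_mean(z=1.96, s=15, m=2)
print(n_prop, n_mean)
```

Rounding up matters because rounding down would give a margin of error slightly larger than the target.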
Significance Tests
With significance tests, we begin with a preconceived notion or a claim about the value of a parameter. We take a sample and use the results to determine whether the results support or do not support the claim.
What assumptions must be met when we are conducting a hypothesis test for a proportion?
1. We are dealing with a question about proportions, meaning we are working with categorical, yes/no data.
2. We have a simple random sample (SRS).
3. np0≥15 AND n(1-p0)≥15
H0
The null hypothesis, a statement that the parameter takes on a particular, assumed value.
Ha
The alternative hypothesis, a hypothesis that contradicts the null hypothesis in some way.
Test Statistic
z= (observed value - value assumed in H0)/stderror
p0
The proportion assumed to be true in the null hypothesis
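For a proportion, the test statistic can be sketched as below. It assumes the standard error is computed from p0 (the value under H0), which is the usual convention for a one-proportion z-test. The function name and the data are hypothetical.

```python
import math

def proportion_z_statistic(successes, n, p0):
    """z = (p-hat - p0) / sqrt(p0(1 - p0)/n), with the standard error based on p0."""
    p_hat = successes / n
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error

# Hypothetical test of H0: p = 0.5 with 60 successes in 100 trials.
z = proportion_z_statistic(60, 100, p0=0.5)
print(round(z, 2))
```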
T/F: In hypothesis testing, H0 is always followed by an "equals" sign (=).
True.
T/F: In hypothesis testing, Ha is always followed by an "equals" sign (=).
False. In fact, it never is. The alternative hypothesis can be followed by <,>, or ≠
A hypothesis statement is a statement about a (statistic/parameter).
parameter
One-Sided Hypothesis Test
A hypothesis test in which the values for which we can reject the null (H0) are located entirely in one tail of the probability distribution.
The alternative hypothesis (Ha) has a < or > symbol.
Two-Sided Hypothesis Test
A hypothesis test in which the values for which we can reject the null (H0) are located in both tails of the probability distribution.
The alternative hypothesis (Ha) has a ≠ symbol.
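If the p-value is between 0 and 0.01, we say that there is ______ evidence against the null hypothesis and for the alternate hypothesis.
very strong
If the p-value is between 0.01 and 0.05, we say that there is ______ evidence against the null hypothesis and for the alternate hypothesis.
strong
If the p-value is between 0.05 and 0.1, we say that there is ______ evidence against the null hypothesis and for the alternate hypothesis.
some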
If the p-value is >0.1, we say that there is ______ evidence against the null hypothesis and for the alternate hypothesis.
no statistically significant
p-value
Represents the probability that the observed sample statistic value or an even more extreme sample statistic value would occur, assuming H0 is true.
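Given a z test statistic, the p-value is a tail area under the standard normal curve. The sketch below assumes a z-test and uses `statistics.NormalDist`; the function name and the `alternative` labels are illustrative, not standard notation.

```python
from statistics import NormalDist

def p_value_from_z(z, alternative):
    """Return the p-value for a z statistic; alternative is 'less', 'greater', or 'two-sided'."""
    std_normal = NormalDist()
    if alternative == "less":       # Ha uses <
        return std_normal.cdf(z)
    if alternative == "greater":    # Ha uses >
        return 1 - std_normal.cdf(z)
    if alternative == "two-sided":  # Ha uses ≠, so both tails count
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")

# Hypothetical test statistic z = 2.0.
p_one_sided = p_value_from_z(2.0, "greater")
p_two_sided = p_value_from_z(2.0, "two-sided")
print(round(p_one_sided, 4), round(p_two_sided, 4))
```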
What does a low p-value mean?
A low p-value means that it would be extremely unlikely for us to get the sample statistic we did if H0 were true. It is more likely that H0 is not true, so we should reject it in favor of the alternative hypothesis.
A (low/high) p-value means that there is evidence against H0 and for Ha.
low
T/F: If the p-value is high in a hypothesis test, we accept the null hypothesis.
False. We never accept the null hypothesis. We can only reject or fail to reject H0.
If the p-value ≤ α, we...
reject the H0 at α level of significance
If the p-value > α, we...
fail to reject H0 at α level of significance
The results of a hypothesis test are significant when
p-value ≤ α
In hypothesis testing, what level of α corresponds with a 90% confidence level?
α= 0.10
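The decision rule from the preceding cards, along with the link between α and the confidence level, can be sketched in a few lines. The function names and the example p-value are hypothetical.

```python
def alpha_for_confidence(confidence_level):
    """alpha = 1 - confidence level (rounded to avoid floating-point noise)."""
    return round(1 - confidence_level, 10)

def decide(p_value, alpha):
    """Reject H0 when p-value <= alpha; otherwise fail to reject (never 'accept')."""
    if p_value <= alpha:
        return "reject H0"
    return "fail to reject H0"

alpha_90 = alpha_for_confidence(0.90)
print(alpha_90)
print(decide(0.0455, alpha=0.05))
print(decide(0.0455, alpha=0.01))
```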