Exam 3 - Chapters
Chapter 6
6.1 - Identifying and Estimating the Target Parameter
What is the goal of the chapter?
estimate the value of an unknown population parameter, such as a population mean or a proportion from a binomial population
True or false?
different techniques are used for estimating a mean or proportion, depending on whether a sample contains a large or small number of measurements
What is the target parameter?
the population parameter of interest - unknown mean or proportion
denoted by theta - \theta
Determining the Target Parameter
With quantitative data, you are likely to be estimating the mean or variance of the data
With qualitative data, specifically with two outcomes, the binomial proportion of successes is likely to be the parameter of interest.
What is the point estimator?
formula definition: a rule or formula that tells us how to use the sample data to calculate a single number that can be used as an estimate of the target parameter
a single number calculated from the sample that estimates a target population parameter
ex: using the sample mean, x̄, to estimate the population mean \mu, thus x̄ is the point estimator
What can be used to attach a measure of reliability to an estimate?
obtaining an interval estimator - a range of numbers that contains the target parameter with a high degree of confidence
formula definition: an interval estimator - a formula that tells us how to use the sample data to calculate an interval that estimates the target parameter
What is another term for interval estimator?
a confidence interval
6.2 - Confidence Interval for a Population Mean: Normal (z) Statistics
According to the CLT, the sampling distribution of the sample mean is approximately normal for large samples
x̄\pm1.96\sigma_{x̄}=x̄\pm1.96\left(\frac{\sigma}{\sqrt{n}}\right)
If n\ge30, CLT and the normal (z) statistics can be used to determine the form of the sampling distribution of x̄
We cannot be certain that the true mean \mu lies in any single interval, but we can be 95% confident that it lies in the interval x̄\pm1.96\sigma_{x̄}=x̄\pm1.96\left(\frac{\sigma}{\sqrt{n}}\right)
95% sure that the interval contains \mu
confidence coefficient: .95
confidence level: 95%
What is the confidence coefficient?
the probability that a randomly selected confidence interval encloses the population parameter
What is the confidence level?
confidence coefficient as a percentage
If our confidence level is 95%, then in the long run, 95% of our confidence intervals will contain \mu and 5% will not
If you choose a confidence coefficient other than .95, such as .99, then 1-\alpha=.99 and \alpha=.01. The area \alpha is split equally between the two tails, \frac{\alpha}{2} in each.
The 100(1-\alpha)% confidence interval is x̄\pm Z_{\frac{\alpha}{2}}\sigma_{x̄}=x̄\pm Z_{\frac{\alpha}{2}}\left(\frac{\sigma}{\sqrt{n}}\right)

The value \alpha=P\left(Z>Z_{\alpha}\right)
Z_{\alpha} is the value of the standard normal random variable - z such that the area \alpha will lie to its right
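The interval formula above can be sketched in Python using only the standard library. The function name and the numbers in the example (n = 100, x̄ = 50, \sigma = 10) are hypothetical and chosen only for illustration; the sketch assumes \sigma is known and n is large.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(xbar, sigma, n, confidence=0.95):
    """Return (lower, upper) for a large-sample z interval for the mean."""
    alpha = 1 - confidence
    # z_{alpha/2}: the value with area alpha/2 to its right
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * sigma / sqrt(n)
    return xbar - margin, xbar + margin

# Hypothetical data: n = 100, sample mean 50, sigma = 10
low, high = z_confidence_interval(50, 10, 100, confidence=0.95)
print(round(low, 2), round(high, 2))
```

Raising the confidence level widens the interval, because Z_{\frac{\alpha}{2}} grows as \alpha shrinks.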
Commonly Used Values
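The commonly used values of Z_{\frac{\alpha}{2}} can be recomputed rather than memorized. This is a small sketch with a hypothetical helper name, using Python's standard library; printed tables may round the last digit slightly differently.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Return z_{alpha/2} for a given confidence coefficient."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {z_critical(level):.3f}")
```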

Confidence intervals are preferred to point estimators because they come with a measure of reliability (the confidence level)
6.3 - Confidence Interval for a Population Mean: Student’s t-statistic
What are the two issues with usage of a small sample in making an inference about \mu ?
The shape of the sampling distribution is no longer assumed normal by CLT
solution: as long as the sampled population is normal, the sampling distribution is as well
The population standard deviation \sigma is almost always unknown.
so, instead of using z-score, we use the t-statistic
t=\frac{x̄-\mu}{\frac{s}{\sqrt{n}}}
The actual amount of variability in the sampling distribution of t depends on the sample size n, so to express this we say t-statistic has (n-1) degrees of freedom (df).
the smaller the number of degrees of freedom associated with the t-statistic, the more variable will be its sampling distribution
For a fixed tail area, the critical t-value increases as the df decreases


What are the conditions required for a valid small-sample confidence interval for \mu?
A random sample is selected from the target population
The population has a relative frequency distribution that is approximately normal
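A small-sample interval x̄\pm t_{\frac{\alpha}{2}}\left(\frac{s}{\sqrt{n}}\right) can be sketched as follows. Python's standard library has no t-distribution, so this sketch assumes the critical value is looked up in a t-table and passed in. The data are hypothetical; 2.776 is the table value of t_{.025} with 4 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

def t_confidence_interval(sample, t_crit):
    """Return (lower, upper) using a t critical value with n-1 df."""
    n = len(sample)
    xbar = mean(sample)
    s = stdev(sample)              # sample standard deviation (n-1 divisor)
    margin = t_crit * s / sqrt(n)
    return xbar - margin, xbar + margin

# Hypothetical sample of n = 5 measurements, so df = 4
data = [10, 12, 9, 11, 13]
low, high = t_confidence_interval(data, t_crit=2.776)
print(round(low, 3), round(high, 3))
```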
6.4 - Large-Sample Confidence Interval for a Population Proportion
What are the properties of the sampling distribution of \hat{p} ?
The mean of the sampling distribution is p
The standard deviation of \hat{p} is \sqrt{\frac{pq}{n}}
\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}}
q = 1-p
For large samples, the sampling distribution of \hat{p} is approximately normal, as long as n\hat{p}\ge15 and n\hat{q}\ge15
Formula for Large-Sample CI for p?
\hat{p}\pm Z_{\frac{\alpha}{2}}\sqrt{\frac{pq}{n}}\approx\hat{p}\pm Z_{\frac{\alpha}{2}}\sqrt{\frac{\hat{p}\hat{q}}{n}}, where \hat{q}=1-\hat{p}
What conditions are required for a valid large-sample CI for p?
a random sample is selected from the target population
the sample size n is large
n\hat{p}\ge15
n\hat{q}\ge15
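The large-sample interval for p and its validity conditions can be sketched together. The function name and the example counts (60 successes in 200 trials) are hypothetical. The sketch substitutes the sample proportion for p in the standard error, which is the usual approximation.

```python
from math import sqrt
from statistics import NormalDist

def proportion_confidence_interval(successes, n, confidence=0.95):
    """Return (lower, upper) for a large-sample CI for a proportion."""
    p_hat = successes / n
    q_hat = 1 - p_hat
    # Large-sample condition: at least 15 successes and 15 failures
    if n * p_hat < 15 or n * q_hat < 15:
        raise ValueError("sample too small for the normal approximation")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * q_hat / n)
    return p_hat - margin, p_hat + margin

# Hypothetical data: 60 successes in 200 trials
low, high = proportion_confidence_interval(60, 200)
print(round(low, 4), round(high, 4))
```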
What is the Wilson CI for a population proportion?
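p̃\pm z_{\frac{\alpha}{2}}\sqrt{\frac{p̃\left(1-p̃\right)}{n+4}}, where p̃=\frac{x+2}{n+4} and x is the number of successes
use when p is near 0 or 1, where the large-sample conditions would hold only with an extremely large sample size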
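A sketch of the Wilson adjusted interval, assuming the adjusted estimate is p̃ = (x+2)/(n+4) with x the number of successes. The function name and the counts are hypothetical: 3 successes in 200 trials is a case where the ordinary large-sample condition fails.

```python
from math import sqrt
from statistics import NormalDist

def wilson_adjusted_interval(successes, n, confidence=0.95):
    """Return (lower, upper) for the Wilson adjusted CI for p."""
    p_tilde = (successes + 2) / (n + 4)   # adjusted point estimate
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_tilde * (1 - p_tilde) / (n + 4))
    return p_tilde - margin, p_tilde + margin

# Hypothetical data: only 3 successes in 200 trials
low, high = wilson_adjusted_interval(3, 200)
print(round(low, 4), round(high, 4))
```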
6.5 - Determining the Sample Size
Estimating a Population Mean
To estimate a population mean, z_{\frac{\alpha}{2}}\left(\frac{\sigma}{\sqrt{n}}\right)=SE
then, n = \left\lbrack\frac{\left(z_{\frac{\alpha}{2}}\right)\sigma}{SE}\right\rbrack^2
Since \sigma is usually unknown, approximate it with s from a prior sample or with R/4, where R is the range of the data
Estimating a Population Proportion
To find the sampling error,
z_{\frac{\alpha}{2}}\sqrt{\frac{pq}{n}}=SE
then, n = \frac{\left(z_{\frac{\alpha}{2}}\right)^2\left(pq\right)}{\left(SE\right)^2}
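Both sample-size formulas in this section can be sketched in a few lines. The function names and example inputs are hypothetical. Rounding up to the next whole observation is an assumption of this sketch (the usual convention, so the sampling error is not exceeded).

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, se, confidence=0.95):
    """n = (z_{alpha/2} * sigma / SE)^2, rounded up."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / se) ** 2)

def sample_size_for_proportion(p, se, confidence=0.95):
    """n = z_{alpha/2}^2 * p * q / SE^2, rounded up."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z ** 2 * p * (1 - p) / se ** 2)

# Hypothetical targets: sigma = 10 with SE = 2; p = .5 with SE = .03
print(sample_size_for_mean(10, 2))
print(sample_size_for_proportion(0.5, 0.03))
```

Using p = .5 gives the largest value of pq, so it yields a conservative sample size when no prior estimate of p is available.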
6.6 - Simple Random Sampling
Finite Population Correction for Simple Random Sampling
How are the estimates adjusted for simple random sampling from a finite population of size N?
Estimation of the Population Mean
Estimated standard error:
\sigma_{x̄}=\frac{s}{\sqrt{n}}\sqrt{\left(\frac{N-n}{N}\right)}
approximate 95% CI x̄\pm2\sigma_{x̄}
Estimation of Population Proportion
Estimated standard error:
\sigma_{p̂}=\sqrt{\left(\frac{p̂\left(1-p̂\right)}{n}\right)}\sqrt{\frac{N-n}{N}}
approximate 95% CI p̂\pm2\sigma_{p̂}
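The finite population correction can be sketched for both the mean and the proportion. The function names and the example values (N = 1000, n = 100) are hypothetical.

```python
from math import sqrt

def fpc_interval_for_mean(xbar, s, n, N):
    """Approximate 95% CI for the mean with finite population correction."""
    se = (s / sqrt(n)) * sqrt((N - n) / N)
    return xbar - 2 * se, xbar + 2 * se

def fpc_interval_for_proportion(p_hat, n, N):
    """Approximate 95% CI for a proportion with finite population correction."""
    se = sqrt(p_hat * (1 - p_hat) / n) * sqrt((N - n) / N)
    return p_hat - 2 * se, p_hat + 2 * se

# Hypothetical: sample of 100 from a population of 1000
print(fpc_interval_for_mean(50, 20, 100, 1000))
print(fpc_interval_for_proportion(0.4, 100, 1000))
```

The correction factor \sqrt{\frac{N-n}{N}} is always below 1, so the corrected interval is narrower than the uncorrected one.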
6.7 - Confidence Interval for a Population Variance
What is the 100(1-\alpha)% Confidence Interval for \sigma^2?
\frac{\left(n-1\right)s^2}{\chi^2_{\frac{\alpha}{2}}}\le\sigma^2\le\frac{\left(n-1\right)s^2}{\chi^2_{\left(1-\frac{\alpha}{2}\right)}}
What are the conditions required for a valid CI for \sigma^2?
a random sample is selected from the target population
the population of interest has a relative frequency distribution that is approximately normal
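A sketch of the variance interval, assuming the two chi-square critical values are read from a table and passed in (the standard library has no chi-square distribution). The example is hypothetical: n = 10 and s^2 = 4, with table values 19.0228 and 2.70039 for 9 degrees of freedom at the 95% level.

```python
def variance_confidence_interval(s2, n, chi2_upper, chi2_lower):
    """Return (lower, upper) for sigma^2.

    chi2_upper is the critical value with area alpha/2 to its right;
    chi2_lower is the critical value with area 1 - alpha/2 to its right.
    """
    df = n - 1
    return df * s2 / chi2_upper, df * s2 / chi2_lower

# Hypothetical: n = 10, s^2 = 4, 95% confidence, df = 9
low, high = variance_confidence_interval(4, 10, 19.0228, 2.70039)
print(round(low, 3), round(high, 3))
```

The larger critical value goes in the denominator of the lower bound, which is why the interval is not symmetric about s^2.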
Chapter 7
7.1 - The Elements of a Test of Hypothesis
What is a statistical hypothesis?
a statement about the numerical value of a population parameter
What are the two hypotheses?
null hypothesis: represents the status quo to the party performing the sampling experiment
alternative/research hypothesis: which will be accepted only if the data provide convincing evidence of its truth
H_0 is stated with =, \ge, or \le
H_{a} is stated with \ne, <, or >
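The null hypothesis is assumed true until the sample data provide convincing evidence that it is false
The alternative hypothesis is accepted only when the data provide convincing evidence that it is true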
If the sample size is large enough to use the CLT, we compute a test statistic to decide between the hypotheses
Z=\frac{x̄-\mu}{\sigma_{x̄}}
Test statistic: a sample statistic used to decide between the null and alternative hypotheses, that is, whether or not to reject the null hypothesis
What is a Type 1 Error?
the researcher rejects the null hypothesis in favor of the alternative hypothesis when null hypothesis is actually true. Probability of this occurring is denoted by \alpha
What is the rejection region?
the set of possible values of the test statistic for which the researcher will reject the null hypothesis in favor of the alternative hypothesis
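The test statistic and the rejection region can be sketched together for a large-sample test of a mean. The function name, the `tail` argument, and the numbers are hypothetical; the sketch assumes \sigma is known (or well approximated by s).

```python
from math import sqrt
from statistics import NormalDist

def z_test(xbar, mu0, sigma, n, alpha=0.05, tail="upper"):
    """Return (z, reject) for a large-sample test of H0: mu = mu0."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()
    if tail == "upper":        # rejection region: z > z_alpha
        reject = z > std_normal.inv_cdf(1 - alpha)
    elif tail == "lower":      # rejection region: z < -z_alpha
        reject = z < std_normal.inv_cdf(alpha)
    elif tail == "two":        # rejection region: |z| > z_{alpha/2}
        reject = abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    else:
        raise ValueError("tail must be 'upper', 'lower', or 'two'")
    return z, reject

# Hypothetical: H0: mu = 2400, sample mean 2460, sigma = 200, n = 50
z, reject = z_test(2460, 2400, 200, 50, alpha=0.05, tail="upper")
print(round(z, 4), reject)
```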
What is a Type 2 Error?
the researcher fails to reject (accepts) the null hypothesis when it is actually false
Probability of this occurring is denoted by \beta
Conclusions and Consequences for a test of Hypothesis

Be careful not to accept H_0: the measure of reliability for that decision, \beta=P\left(\text{Type II error}\right), is almost always unknown, so we say fail to reject H_0 instead
7.2 - Formulating Hypotheses and Setting up the Rejection Region
Forms of alternative hypothesis
One tailed, upper tailed
H_{a}:\mu>2,400
One tailed, lower tailed
H_{a}:\mu<2,400

Two-tailed
H_{a}:\mu\ne2,400

From now on, the null hypothesis will always be stated with an equals sign
Key Words for upper-tailed:
greater than, larger, above
Key Words for lower-tailed:
less than, smaller, below
Key Words for two-tailed:
not equal to, differs from
7.3 - Observed Significance Levels: p-Values
What is the p-value?
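the probability, assuming the null hypothesis is true, of observing a value of the test statistic that is at least as contradictory to the null hypothesis, and as supportive of the alternative hypothesis, as the value computed from the sample
If the test is upper-tailed, the p-value is the area to the right of the computed test statistic (with a table of areas between 0 and z, take .5 minus the table area)
If the test is lower-tailed, the p-value is the area to the left of the computed test statistic
If the test is two-tailed, the p-value is twice the one-tailed area beyond the computed test statistic (so a two-tailed p-value from a printout is divided by 2 to get the one-tailed p-value)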

If a p-value is less than alpha, we generally reject the null hypothesis
If a p-value is greater than alpha, we generally fail to reject the null hypothesis
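As a sketch of how a p-value is obtained from a z test statistic and then compared with \alpha, the following uses the standard normal distribution from Python's standard library. The function name, the `tail` argument, and the example value z = 2.12 are hypothetical.

```python
from statistics import NormalDist

def p_value(z, tail):
    """Return the p-value for a z test statistic."""
    cdf = NormalDist().cdf
    if tail == "upper":
        return 1 - cdf(z)                 # area to the right of z
    if tail == "lower":
        return cdf(z)                     # area to the left of z
    if tail == "two":
        return 2 * (1 - cdf(abs(z)))      # both tails beyond |z|
    raise ValueError("tail must be 'upper', 'lower', or 'two'")

alpha = 0.05
p = p_value(2.12, "upper")
print(round(p, 4), "reject H0" if p < alpha else "fail to reject H0")
```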
7.4 - Test of Hypothesis About a Population Mean: Normal (z) statistic
The test statistic we use depends on the sample size
If large, n\ge30, use the z-statistic
If small, n<30, use the t-statistic
Or if the value of the population standard deviation is unknown
Large Sample Size
The CLT guarantees that the sampling distribution of x̄ will be approximately normal, so the z-statistic is used.

Conditions Required for a Valid Large-Sample Hypothesis Test
1. A random sample is selected from the target population
2. The sample size n is large (n\ge30), so the CLT guarantees that the sampling distribution of x̄ is approximately normal
What are the possible conclusions for a test of hypothesis?
If the calculated test statistic falls in the rejection region (equivalently, the p-value is less than alpha), reject Ho and conclude that Ha is true.
If the test statistic does not fall in the rejection region, conclude that the sampling experiment does not provide sufficient evidence to reject Ho at the alpha level of significance
7.5 - Test of Hypothesis About a Population Mean: T-statistic
When a sample is small, we use t-statistic:
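Formula: t=\frac{x̄-\mu_0}{\frac{s}{\sqrt{n}}}, with (n-1) degrees of freedom, where \mu_0 is the value of \mu stated in H_0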


What are the conditions required for a valid small-sample hypothesis test?
A random sample is selected from the target population
The population from which the sample is selected has a distribution that is approximately normal
Small-sample inferences typically require more assumptions and provide less information about the population parameter than large-sample inferences.
7.6 - Large-Sample Test of Hypothesis About a Population Proportion
Inferences about population proportions (or percentages) are often made in the context of the probability, p, of “success” for a binomial distribution

What are the conditions required for a valid large-sample hypothesis test for p?
A random sample is selected from a binomial population
The sample size n is large enough so that both np_0\ge15 and nq_0\ge15, where p_0 is the value of p stated in H_0 and q_0=1-p_0
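The notes do not show the test statistic for p, so the following is a sketch under the assumption that it is the usual large-sample form z=\frac{\hat{p}-p_0}{\sqrt{\frac{p_0q_0}{n}}}. The function name and the example counts are hypothetical.

```python
from math import sqrt

def proportion_z_statistic(successes, n, p0):
    """Return z for a large-sample test of H0: p = p0."""
    q0 = 1 - p0
    # Large-sample condition, checked with the hypothesized value p0
    if n * p0 < 15 or n * q0 < 15:
        raise ValueError("sample too small for the z test")
    p_hat = successes / n
    return (p_hat - p0) / sqrt(p0 * q0 / n)

# Hypothetical: 60 successes in 100 trials, H0: p = .5
print(round(proportion_z_statistic(60, 100, 0.5), 4))
```

Note that the standard error uses p_0, not the sample proportion, because the test statistic is computed assuming H_0 is true.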
Small Samples
With small samples, tests for a population proportion based on the z-statistic may not be valid, especially one-tailed tests, so an exact binomial test is used instead
7.7 - Test of Hypothesis About a Population Variance

What are the conditions required for a valid hypothesis test for variance?
A random sample is selected from the target population
The population from which the sample is selected has a distribution that is approximately normal
7.8 - Calculating Type II Error Probabilities
Steps for Calculating Beta for a Large-Sample Test
Calculate the value(s) of x̄ that border the rejection region:
Convert each border value x̄_0 to a z-value, using the mean \mu_a specified by the alternative hypothesis:
Sketch the alternative distribution, shade the area in the nonrejection region, and use the z-statistic and Table II to find \beta (the shaded area)
The power of a test is the probability that the test will correctly lead to the rejection of the null hypothesis.
Power is equal to (1-\beta)
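The steps for \beta can be sketched for an upper-tailed test. This sketch assumes the border of the rejection region is x̄_0=\mu_0+z_{\alpha}\sigma_{x̄} and that \beta is the area of the alternative distribution below x̄_0. The function name and the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def beta_upper_tailed(mu0, mu_a, sigma, n, alpha=0.05):
    """Return beta for an upper-tailed z test when the true mean is mu_a."""
    se = sigma / sqrt(n)
    # Step 1: border of the rejection region on the x-bar scale
    xbar_0 = mu0 + NormalDist().inv_cdf(1 - alpha) * se
    # Step 2: convert the border to a z-value under the alternative mean
    z = (xbar_0 - mu_a) / se
    # Step 3: beta is the area in the nonrejection region (left of the border)
    return NormalDist().cdf(z)

# Hypothetical: H0: mu = 2400, true mean 2450, sigma = 200, n = 50
beta = beta_upper_tailed(2400, 2450, 200, 50)
print(round(beta, 3), round(1 - beta, 3))   # beta and power
```

Moving \mu_a farther from \mu_0, or increasing n, makes \beta smaller and the power larger.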

Summary of Chapter 7:
Key words for identifying the target parameter
μ - mean, average
p - proportion, fraction, percentage, rate, probability
σ² - variance, variability, spread
Elements of a Hypothesis Test
Null Hypothesis (H0): A statement that there is no effect or no difference, used as a starting point for statistical testing.
Alternative Hypothesis (H1 or Ha): The statement that indicates the presence of an effect or a difference, which researchers aim to support.
Significance Level (α): The threshold for rejecting the null hypothesis, commonly set at 0.05 or 0.01.
Test Statistic: A standardized value calculated from sample data during a hypothesis test.
p-value
conclusion
Probabilities in Hypothesis Testing
α = P(Type I Error) = P(reject Ho when Ho is true)
β = P(Type II Error) = P(accept Ho when Ho is false)
1 - β = Power of a Test = P(reject Ho when Ho is false)
Forms of Alternative Hypothesis
Lower-tailed: Ha: μ < μo
Upper-tailed: Ha: μ > μo
Two-tailed: Ha: μ ≠ μo
Using p-Values to Make Conclusions
1. Choose significance level (α)
2. Obtain p-value of the test
3. If α > p-value, reject Ho
Guide to selecting a one-sample hypothesis:
Chapter 8
8.1 - Identifying the Target Parameter
Determining the Target Parameter:
8.2 - Comparing Two Population Means: Independent Sampling
In the large-sample case, we use the z-statistic, while in the small-sample case we use the t-statistic
Properties to Know:
Large, Independent Sample Information:
8.3 - Comparing Two Population Means: Paired Difference
Uses the “one sample” test statistic, applied to the differences between the paired observations

Paired difference experiments are generally more accurate than independent sample experiments
A paired difference experiment is never obtained by pairing the sample observations after the measurements have been acquired
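The notes do not show the paired-difference formula, so this is a sketch under the assumption that it is the one-sample t-statistic computed on the differences, t=\frac{\bar{d}-D_0}{\frac{s_d}{\sqrt{n_d}}}, with (n_d-1) degrees of freedom. The function name and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t_statistic(first, second, d0=0):
    """Return t for paired samples, testing H0: mean difference = d0."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    return (mean(diffs) - d0) / (stdev(diffs) / sqrt(n))

# Hypothetical paired measurements on the same four units
after = [12, 14, 10, 14]
before = [10, 12, 9, 11]
print(round(paired_t_statistic(after, before), 3))
```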
8.4 - Comparing Two Population Proportions: Independent Sampling
8.5 - Determining the Required Size
To estimate (\mu_1-\mu_2), we use the equal sample size case:
To estimate (p_1-p_2), we use this version of the equal sample size case:
8.6 - F-tests
Used to compare the variances of two populations

What conditions are required?
Both sampled populations are normally distributed
The samples are randomly and independently selected
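The notes do not show the F-statistic, so this is a sketch under the assumption that, for a two-tailed test, it is the ratio of the larger sample variance to the smaller one. The function name and the data are hypothetical; the critical value would still come from an F-table.

```python
from statistics import variance

def f_statistic(sample_1, sample_2):
    """Return F = larger sample variance / smaller sample variance."""
    v1 = variance(sample_1)   # sample variance with n-1 divisor
    v2 = variance(sample_2)
    return max(v1, v2) / min(v1, v2)

# Hypothetical samples from two populations
print(f_statistic([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))
```

Putting the larger variance in the numerator keeps F at or above 1, so only the upper tail of the F-table is needed.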













