Exam 3 - Chapters

Chapter 6

6.1 - Identifying and Estimating the Target Parameter
  • What is the goal of chapter?

    • estimate the value of an unknown population parameter, such as a population mean or a proportion from a binomial population

  • True or false?

    • different techniques are used for estimating a mean or proportion, depending on whether a sample contains a large or small number of measurements

  • What is the target parameter?

    • the population parameter of interest - unknown mean or proportion

    • denoted by theta - \theta

  • Determining the Target Parameter

  • With quantitative data, you are likely to be estimating the mean or variance of the data

  • With qualitative data, specifically with two outcomes, the binomial proportion of successes is likely to be the parameter of interest.

  • What is the point estimator?

    • formula definition: a rule or formula that tells us how to use the sample data to calculate a single number that can be used as an estimate of the target parameter

    • a single number calculated from the sample that estimates a target population parameter

      • ex: using the sample mean, x̄, to estimate the population mean \mu, thus x̄ is the point estimator

  • What can be used to attach a measure of reliability to an estimate?

    • obtaining an interval estimator - a range of numbers that contains the target parameter with a high degree of confidence

    • formula definition: an interval estimator - a formula that tells us how to use the sample data to calculate an interval that estimates the target parameter

  • What is another term for interval estimator?

    • a confidence interval

6.2 - Confidence Interval for a Population Mean: Normal (z) Statistics
  • According to the CLT, the sampling distribution of the sample mean is approximately normal for large samples

    • x̄\pm1.96\sigma_{x̄}=x̄\pm1.96\left(\frac{\sigma}{\sqrt{n}}\right)

  • If n\ge30, the CLT and the normal (z) statistic can be used to determine the form of the sampling distribution of x̄

  • We cannot be certain that the true mean \mu lies in any particular interval, but we can be confident that it lies in the interval x̄\pm1.96\sigma_{x̄}=x̄\pm1.96\left(\frac{\sigma}{\sqrt{n}}\right)

    • 95% sure that the interval contains \mu

    • confidence coefficient: .95

    • confidence level: 95%

  • What is the confidence coefficient?

    • the probability that a randomly selected confidence interval encloses the population parameter

  • What is the confidence level?

    • confidence coefficient as a percentage

  • If our confidence level is 95%, then in the long run, 95% of our confidence intervals will contain \mu and 5% will not

  • If you choose a confidence coefficient other than .95, such as .99, then \alpha = 1 - .99 = .01. This \alpha is split equally between the two tails of the distribution, leaving \frac{\alpha}{2} in each tail

  • 100(1-\alpha)% confidence interval: x̄\pm Z_{\frac{\alpha}{2}}\sigma_{x̄}

  • The value \alpha=P\left(Z>Z_{\alpha}\right)

    • Z_{\alpha} is the value of the standard normal random variable - z such that the area \alpha will lie to its right

  • Commonly Used Values

  • Confidence intervals are preferred to point estimators because they come with a measure of reliability
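
The z-values behind common confidence levels can be computed instead of read from a table. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the function names and the example numbers (x̄ = 50, \sigma = 10, n = 100) are illustrative assumptions, not values from the course.

```python
from math import sqrt
from statistics import NormalDist


def z_critical(confidence):
    """Return z_{alpha/2} for a confidence coefficient such as 0.95."""
    alpha = 1 - confidence
    # Area alpha/2 lies to the right of z_{alpha/2}
    return NormalDist().inv_cdf(1 - alpha / 2)


def z_interval(xbar, sigma, n, confidence=0.95):
    """Large-sample CI for mu: xbar +/- z_{alpha/2} * sigma / sqrt(n)."""
    margin = z_critical(confidence) * sigma / sqrt(n)
    return xbar - margin, xbar + margin


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {z_critical(level):.3f}")

# Hypothetical sample summary
low, high = z_interval(xbar=50.0, sigma=10.0, n=100)
print(f"95% CI for mu: ({low:.2f}, {high:.2f})")
```

With these assumed inputs the margin is about 1.96 standard errors, so the interval is roughly 50 ± 1.96.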

6.3 - Confidence Interval for a Population Mean: Student’s t-statistic
  • What are the two issues with usage of a small sample in making an inference about \mu ?

    • The shape of the sampling distribution is no longer assumed normal by CLT

      • solution: as long as the sampled population is normal, the sampling distribution is as well

    • The population standard deviation \sigma is almost always unknown.

      • so, instead of using z-score, we use the t-statistic

      • t=\frac{x̄-\mu}{\frac{s}{\sqrt{n}}}

  • The actual amount of variability in the sampling distribution of t depends on the sample size n, so to express this we say t-statistic has (n-1) degrees of freedom (df).

    • the smaller the number of degrees of freedom associated with the t-statistic, the more variable will be its sampling distribution

  • For a fixed \alpha, the critical t-value increases as the df decreases

  • What are the conditions required for a valid small-sample confidence interval for \mu?

    • A random sample is selected from the target population

    • The population has a relative frequency distribution that is approximately normal
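
A small-sample interval follows the same pattern as the z interval, with t_{\alpha/2} and s in place of z_{\alpha/2} and \sigma. The sketch below assumes the critical value is looked up in a t-table and passed in, since Python's standard library has no t-distribution; the data and the helper name `t_interval` are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def t_interval(sample, t_crit):
    """Small-sample CI for mu: xbar +/- t_{alpha/2} * s / sqrt(n).

    t_crit is t_{alpha/2} with n - 1 degrees of freedom, taken from a t-table.
    """
    n = len(sample)
    xbar = mean(sample)
    s = stdev(sample)  # sample standard deviation (divides by n - 1)
    margin = t_crit * s / sqrt(n)
    return xbar - margin, xbar + margin


# Hypothetical data: n = 5, so df = 4 and the table value t_.025 is 2.776
data = [10.0, 12.0, 14.0, 16.0, 18.0]
low, high = t_interval(data, t_crit=2.776)
print(f"95% CI for mu: ({low:.3f}, {high:.3f})")
```

The interval is only valid if the sampled population is approximately normal, as the conditions above state.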

6.4 - Large-Sample Confidence Interval for a Population Proportion
  • What are the properties of the sampling distribution of p̂ ?

    • The mean of the sampling distribution is p

    • The standard deviation of p̂ is \sqrt{\frac{pq}{n}}

      • \sigma_{p̂} = \sqrt{\frac{pq}{n}}

      • q = 1-p

    • For large samples, the sampling distribution of p̂ is approximately normal, as long as np̂\ge15 and nq̂\ge15

  • Formula for Large-Sample CI for p?

    • p̂\pm Z_{\frac{\alpha}{2}}\sqrt{\frac{pq}{n}}\approx p̂\pm Z_{\frac{\alpha}{2}}\sqrt{\frac{p̂q̂}{n}}, since p and q are unknown and are estimated by p̂ and q̂ = 1-p̂

  • What conditions are required for a valid large-sample CI for p?

    • a random sample is selected from the target population

    • the sample size n is large

      • np̂ \ge 15

      • nq̂ \ge 15

  • What is the Wilson CI for a population proportion?

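    • p̃\pm z_{\frac{\alpha}{2}}\sqrt{\frac{p̃\left(1-p̃\right)}{n+4}}, where p̃=\frac{x+2}{n+4} is the adjusted sample proportion and x is the number of successes

      • use when p is near 0 or 1, where the standard large-sample interval would require an extremely large sample size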
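
The two proportion intervals in this section can be compared side by side. This is a sketch under assumptions: the function names and the counts (40 of 100, 3 of 200) are made up for illustration, and the adjusted proportion is taken to be (x + 2)/(n + 4), which matches the n + 4 in the formula above.

```python
from math import sqrt
from statistics import NormalDist


def _z_half_alpha(confidence):
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def proportion_interval(x, n, confidence=0.95):
    """Large-sample CI for p; requires n*p_hat >= 15 and n*q_hat >= 15."""
    p_hat = x / n
    q_hat = 1 - p_hat
    if n * p_hat < 15 or n * q_hat < 15:
        raise ValueError("sample too small for the large-sample interval")
    margin = _z_half_alpha(confidence) * sqrt(p_hat * q_hat / n)
    return p_hat - margin, p_hat + margin


def wilson_adjusted_interval(x, n, confidence=0.95):
    """Adjusted CI for p, centered on p_tilde = (x + 2) / (n + 4)."""
    p_tilde = (x + 2) / (n + 4)
    margin = _z_half_alpha(confidence) * sqrt(p_tilde * (1 - p_tilde) / (n + 4))
    return p_tilde - margin, p_tilde + margin


print(proportion_interval(40, 100))      # conditions met: 40 >= 15 and 60 >= 15
print(wilson_adjusted_interval(3, 200))  # p near 0, so use the adjusted form
```

Calling `proportion_interval(3, 200)` raises `ValueError` because only 3 successes were observed, which is the situation the adjusted interval is meant for.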

6.5 - Determining the Sample Size
  • Estimating a Population Mean

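    • To estimate a population mean, set z_{\frac{\alpha}{2}}\left(\frac{\sigma}{\sqrt{n}}\right)=SE, where SE is the desired sampling error

      • then, n = \left\lbrack\frac{\left(z_{\frac{\alpha}{2}}\right)\sigma}{SE}\right\rbrack^2

    • If \sigma is unknown, estimate it with s from a previous sample or with R/4, where R is the range of the observations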

  • Estimating a Population Proportion

    • To find the sampling error,

      • z_{\frac{\alpha}{2}}\sqrt{\frac{pq}{n}}=SE

      • then, n = \frac{\left(z_{\frac{\alpha}{2}}\right)^2\left(pq\right)}{\left(SE\right)^2}
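
Both sample-size formulas can be evaluated directly; the result is rounded up because n must be a whole number. This is a sketch with hypothetical inputs (\sigma = 10, SE = 2 for the mean; p = 0.5, SE = 0.03 for the proportion), and the function names are my own.

```python
from math import ceil
from statistics import NormalDist


def _z_half_alpha(confidence):
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def sample_size_for_mean(sigma, se, confidence=0.95):
    """n = (z_{alpha/2} * sigma / SE)^2, rounded up."""
    z = _z_half_alpha(confidence)
    return ceil((z * sigma / se) ** 2)


def sample_size_for_proportion(p, se, confidence=0.95):
    """n = z_{alpha/2}^2 * p * q / SE^2, rounded up."""
    z = _z_half_alpha(confidence)
    return ceil(z ** 2 * p * (1 - p) / se ** 2)


# sigma estimated as R/4 from an assumed range R = 40
print(sample_size_for_mean(sigma=40 / 4, se=2))   # 97
print(sample_size_for_proportion(p=0.5, se=0.03)) # 1068
```

Using p = 0.5 gives the largest product pq, so it yields a conservative (largest) sample size when no prior estimate of p is available.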

6.6 - Simple Random Sampling
  •  Finite Population Correction for Simple Random Sampling

    • What changes under simple random sampling from a finite population of size N?

      • Estimation of the Population Mean

      • Estimated standard error:

        • \sigma_{x̄}=\frac{s}{\sqrt{n}}\sqrt{\frac{N-n}{N}}

          • approximate 95% CI x̄\pm2\sigma_{x̄}

      • Estimation of Population Proportion

      • Estimated standard error:

        • \sigma_{p̂}=\sqrt{\left(\frac{p̂\left(1-p̂\right)}{n}\right)}\sqrt{\frac{N-n}{N}}

          • approximate 95% CI p̂\pm2\sigma_{p̂}
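
The finite population correction shrinks the standard error by the factor \sqrt{(N-n)/N}. A minimal sketch of both approximate 95% intervals follows; the function names and the numbers (n = 100 drawn from N = 400) are assumptions for illustration.

```python
from math import sqrt


def fpc_mean_interval(xbar, s, n, N):
    """Approximate 95% CI for mu with the finite population correction."""
    se = (s / sqrt(n)) * sqrt((N - n) / N)
    return xbar - 2 * se, xbar + 2 * se


def fpc_proportion_interval(p_hat, n, N):
    """Approximate 95% CI for p with the finite population correction."""
    se = sqrt(p_hat * (1 - p_hat) / n) * sqrt((N - n) / N)
    return p_hat - 2 * se, p_hat + 2 * se


# Hypothetical survey: 100 units sampled from a population of 400
print(fpc_mean_interval(xbar=50.0, s=10.0, n=100, N=400))
print(fpc_proportion_interval(p_hat=0.5, n=100, N=400))
```

Here a quarter of the population is sampled, so the correction factor is \sqrt{0.75}, about 0.866, and both intervals are noticeably narrower than the uncorrected versions.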

6.7 - Confidence Interval for a Population Variance
  • What is the 100(1-\alpha)% Confidence Interval for \sigma^2?

    • \frac{\left(n-1\right)s^2}{\chi^2_{\frac{\alpha}{2}}}\le\sigma^2\le\frac{\left(n-1\right)s^2}{\chi^2_{\left(1-\frac{\alpha}{2}\right)}}, where the \chi^2 values have (n-1) degrees of freedom

  • What are the conditions required for a valid CI for \sigma^2?

    • a random sample is selected from the target population

    • the population of interest has a relative frequency distribution that is approximately normal
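
The variance interval divides (n-1)s^2 by two chi-square table values, with the larger value producing the lower endpoint. The sketch below assumes the critical values are read from a chi-square table and passed in; the sample summary (n = 10, s^2 = 4) and the function name are hypothetical.

```python
def variance_interval(s_sq, n, chi2_upper, chi2_lower):
    """CI for sigma^2.

    chi2_upper is chi^2_{alpha/2} and chi2_lower is chi^2_{1 - alpha/2},
    both with n - 1 degrees of freedom, taken from a chi-square table.
    """
    lower = (n - 1) * s_sq / chi2_upper
    upper = (n - 1) * s_sq / chi2_lower
    return lower, upper


# 95% interval with df = 9: table values 19.0228 and 2.70039
low, high = variance_interval(s_sq=4.0, n=10, chi2_upper=19.0228, chi2_lower=2.70039)
print(f"95% CI for sigma^2: ({low:.3f}, {high:.3f})")
```

The interval is not symmetric around s^2 because the chi-square distribution is skewed.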

Chapter 7

7.1 - The Elements of a Test of Hypothesis
  • What is a statistical hypothesis?

    • a statement about the numerical value of a population parameter

  • What are the two hypotheses?

    • null hypothesis: represents the status quo to the party performing the sampling experiment

    • alternative/research hypothesis: accepted only if the data provide convincing evidence of its truth

  • H_0 is written with =, \ge, or \le

  • H_{a} is written with \ne, <, or >

  • The null hypothesis is assumed true until the data show convincing evidence that it is false

  • The alternative hypothesis is accepted only when the data provide convincing evidence that it is true

  • If the sample size is large enough to use the CLT, we compute a z test statistic to decide between the hypotheses

    • Z=\frac{x̄-\mu_0}{\sigma_{x̄}}, where \mu_0 is the value of \mu stated in H_0

  • Test statistic: a sample statistic used to decide whether to reject the null hypothesis

  • What is a Type 1 Error?

    • the researcher rejects the null hypothesis in favor of the alternative hypothesis when null hypothesis is actually true. Probability of this occurring is denoted by \alpha

  • What is the rejection region?

    • the set of possible values of the test statistic for which the researcher will reject the null hypothesis in favor of the alternative hypothesis

  • What is a Type 2 Error?

    • the researcher fails to reject (accepts) the null hypothesis when it is actually false

    • denoted by \beta

  • Conclusions and Consequences for a test of Hypothesis

  • Be careful not to accept H_0: the measure of reliability for that decision, \beta=P\left(\text{Type II error}\right), is almost always unknown, so we say "fail to reject H_0" instead

7.2 - Formulating Hypotheses and Setting up the Rejection Region
  • Forms of alternative hypothesis

    • One tailed, upper tailed 

      • H_{a}:\mu>2,400

    • One tailed, lower tailed 

      • H_{a}:\mu<2,400

    • Two-tailed

      • H_{a}:\mu\ne2,400

  • From now on, the null hypothesis will always be stated with an equal sign

  • Key Words for upper-tailed: 

    • greater than, larger, above 

  • Key Words for lower-tailed: 

    • less than, smaller, below 

  • Key Words for two-tailed: 

    • not equal to, differs from 

7.3 - Observed Significance Levels: p-Values
  • What is the p-value? 

    • the probability, assuming the null hypothesis is true, of observing a value of the test statistic that is at least as contradictory to the null hypothesis, and as supportive of the alternative hypothesis, as the value computed from the sample data

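  • If the test is upper-tailed, the p-value is the area to the right of the observed test statistic

  • If the test is lower-tailed, the p-value is the area to the left of the observed test statistic

  • If the test is two-tailed, the p-value is twice the tail area beyond the observed test statistic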

  • If a p-value is less than alpha, we generally reject the null hypothesis 

  • If a p-value is greater than alpha, we generally fail to reject the null hypothesis 
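
For a z test statistic, the p-value is a tail area of the standard normal curve, and which tail depends on the form of the alternative hypothesis. The sketch below uses the standard-library `statistics.NormalDist`; the function name `p_value` and the observed z = 2.0 are illustrative assumptions.

```python
from statistics import NormalDist


def p_value(z, tail):
    """p-value for an observed z statistic.

    tail is "upper" (Ha: >), "lower" (Ha: <), or "two" (Ha: not equal).
    """
    cdf = NormalDist().cdf
    if tail == "upper":
        return 1 - cdf(z)             # area to the right of z
    if tail == "lower":
        return cdf(z)                 # area to the left of z
    if tail == "two":
        return 2 * (1 - cdf(abs(z)))  # both tails beyond |z|
    raise ValueError("tail must be 'upper', 'lower', or 'two'")


alpha = 0.05
for tail in ("upper", "two"):
    p = p_value(2.0, tail)
    decision = "reject H0" if p < alpha else "fail to reject H0"
    print(f"{tail}-tailed: p = {p:.4f}, {decision}")
```

The two-tailed p-value is exactly double the one-tailed value for the same z, so a result can be significant in a one-tailed test and not in a two-tailed test at the same \alpha.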

7.4 - Test of Hypothesis About a Population Mean: Normal (z) statistic

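  • The test statistic we use depends on the sample size

    • If large, n \ge 30, use the z-statistic

    • If small, n < 30, use the t-statistic

    • If the value of the population standard deviation \sigma is unknown, the sample standard deviation s is used in its place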

  • Large Sample Size 

    • The CLT guarantees that the sampling distribution of x̄ will be approximately normal, so the z-statistic is used.

    • Conditions Required for a Valid Large-Sample Hypothesis Test

      1. A random sample is selected from the target population

      2. The sample size n is large (n \ge 30), so the CLT guarantees the sampling distribution is approximately normal

  • What are the possible conclusions for a test of hypothesis? 

    • If the calculated test statistic falls in the rejection region (equivalently, \alpha > p-value), reject Ho and conclude Ha is true.

    • If the test statistic does not fall in the rejection region, conclude that the sampling experiment does not provide sufficient evidence to reject Ho at the \alpha level of significance
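
The rejection-region decision can be sketched as a comparison between the z statistic and a critical value chosen by \alpha and the tail of the test. The numbers below (x̄ = 2350, s = 200, n = 64 against \mu_0 = 2400) are hypothetical, the function name is my own, and s stands in for \sigma as is usual with a large sample.

```python
from math import sqrt
from statistics import NormalDist


def z_test(xbar, mu0, s, n, alpha=0.05, tail="two"):
    """Return (z, reject) for a large-sample test of H0: mu = mu0."""
    z = (xbar - mu0) / (s / sqrt(n))
    std_normal = NormalDist()
    if tail == "upper":
        reject = z > std_normal.inv_cdf(1 - alpha)
    elif tail == "lower":
        reject = z < -std_normal.inv_cdf(1 - alpha)
    elif tail == "two":
        reject = abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    else:
        raise ValueError("tail must be 'upper', 'lower', or 'two'")
    return z, reject


# Ha: mu < 2400 at alpha = .05 (rejection region: z < -1.645)
z, reject = z_test(xbar=2350, mu0=2400, s=200, n=64, alpha=0.05, tail="lower")
print(f"z = {z:.2f}, reject H0: {reject}")
```

With the same data, a two-tailed test at \alpha = .01 would not reject, since |z| = 2 is below the critical value of about 2.576.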

7.5 - Test of Hypothesis About a Population Mean: T-statistic 

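  • When a sample is small, we use the t-statistic:

    • Formula: t=\frac{x̄-\mu_0}{\frac{s}{\sqrt{n}}}, with (n-1) degrees of freedom, where \mu_0 is the value of \mu stated in H_0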

  • What are the conditions required for a valid small-sample hypothesis test? 

    • A random sample is selected from the target population 

    • The population from which the sample is selected has a distribution that is normal 

  • Small-sample inferences typically require more assumptions and provide less information about the population parameter than large-sample inferences. 
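
A small-sample test computes the t-statistic from the data and compares it with a table value that has n - 1 degrees of freedom. This sketch assumes a two-tailed test with the critical value supplied by the user; the data, \mu_0 = 10, and the function name are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def t_test_two_tailed(sample, mu0, t_crit):
    """Return (t, reject) for H0: mu = mu0 against Ha: mu != mu0.

    t_crit is t_{alpha/2} with n - 1 degrees of freedom, from a t-table.
    """
    n = len(sample)
    t = (mean(sample) - mu0) / (stdev(sample) / sqrt(n))
    return t, abs(t) > t_crit


# n = 5, df = 4, alpha = .05, so the table value t_.025 is 2.776
data = [10.0, 12.0, 14.0, 16.0, 18.0]
t, reject = t_test_two_tailed(data, mu0=10.0, t_crit=2.776)
print(f"t = {t:.3f}, reject H0: {reject}")
```

The statistic here is about 2.83, only slightly beyond 2.776, which shows how wide the rejection threshold becomes with so few degrees of freedom.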

7.6 - Large-Sample Test of Hypothesis About a Population Proportion

  • Inferences about population proportions (or percentages) are often made in the context of the probability, p, of “success” for a binomial distribution 

  • What are the conditions required for a valid large-sample hypothesis test for p? 

    • A random sample is selected from a binomial population 

    • The sample size n is large enough so that both np_0 \ge 15 and nq_0 \ge 15, where p_0 is the hypothesized value of p and q_0 = 1-p_0

  • Small Samples

    • With small samples, tests for a population proportion based on the z-statistic may not be valid, especially one-tailed tests, which is why exact binomial tests are used instead
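
The large-sample statistic and the small-sample exact alternative can be sketched together. The z formula used here, z = (p̂ - p_0)/\sqrt{p_0 q_0/n}, is the standard one and is an assumption in the sense that these notes do not write it out; the counts and function names are hypothetical, and the exact test is shown only for an upper-tailed alternative.

```python
from math import comb, sqrt


def proportion_z(x, n, p0):
    """Large-sample test statistic for H0: p = p0."""
    p_hat = x / n
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)


def exact_binomial_upper_p(x, n, p0):
    """Exact upper-tailed p-value: P(X >= x) when X ~ Binomial(n, p0)."""
    return sum(comb(n, k) * p0 ** k * (1 - p0) ** (n - k) for k in range(x, n + 1))


# Large sample: 60 successes in 100 trials, H0: p = .5
print(f"z = {proportion_z(60, 100, 0.5):.2f}")

# Small sample: 8 successes in 10 trials, H0: p = .5, Ha: p > .5
print(f"exact p-value = {exact_binomial_upper_p(8, 10, 0.5):.4f}")
```

In the small-sample case np_0 = 5, well under 15, so the normal approximation is not trusted and the binomial probabilities are summed directly.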

7.7 - Test of Hypothesis About a Population Variance 

  • What are the conditions required for a valid hypothesis test for variance? 

    • A random sample is selected from the target population 

    • The population from which the sample is selected has a distribution that is approximately normal 
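
The notes list only the conditions for this test. As a sketch, the usual test statistic is \chi^2 = (n-1)s^2/\sigma_0^2 with n - 1 degrees of freedom, which mirrors the quantity in the Section 6.7 interval; treat that formula, the function name, and the numbers below as assumptions rather than course material.

```python
def chi_square_statistic(s_sq, n, sigma0_sq):
    """Test statistic for H0: sigma^2 = sigma0_sq, with n - 1 df."""
    return (n - 1) * s_sq / sigma0_sq


# Hypothetical: n = 10, s^2 = 4, H0: sigma^2 = 2, Ha: sigma^2 > 2
chi2 = chi_square_statistic(s_sq=4.0, n=10, sigma0_sq=2.0)
chi2_crit = 16.919  # table value chi^2_.05 with 9 df
print(f"chi-square = {chi2:.1f}, reject H0: {chi2 > chi2_crit}")
```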

7.8 - Calculating Type II Error Probabilities 

  • Steps for Calculating Beta for a Large-Sample Test 

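    • Calculate the value(s) of x̄ on the border of the rejection region, e.g. x̄_0=\mu_0+z_{\alpha}\sigma_{x̄} for an upper-tailed test

    • Convert x̄_0 to a z-value using the alternative distribution with mean \mu_a: z=\frac{x̄_0-\mu_a}{\sigma_{x̄}}

    • Sketch the alternative distribution, shade the area in the nonrejection region, and use the z-statistic and Table II to find \beta (the shaded area)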

  • The power of a test is the probability that the test will correctly lead to the rejection of the null hypothesis. 

    • Power is equal to (1-\beta)
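
The steps for \beta can be sketched for an upper-tailed test: find the border x̄_0 of the rejection region, standardize it under the alternative mean \mu_a, and take the area below it. The function name and the values (\mu_0 = 100, \mu_a = 105, \sigma = 20, n = 64) are hypothetical, and the lower-tailed and two-tailed cases would need their own borders.

```python
from math import sqrt
from statistics import NormalDist


def beta_upper_tailed(mu0, mu_a, sigma, n, alpha=0.05):
    """P(Type II error) for H0: mu = mu0 vs Ha: mu > mu0 when mu is really mu_a."""
    std_normal = NormalDist()
    se = sigma / sqrt(n)
    xbar0 = mu0 + std_normal.inv_cdf(1 - alpha) * se  # border of rejection region
    z = (xbar0 - mu_a) / se                           # standardize under mu_a
    return std_normal.cdf(z)                          # area in nonrejection region


beta = beta_upper_tailed(mu0=100, mu_a=105, sigma=20, n=64)
print(f"beta = {beta:.3f}, power = {1 - beta:.3f}")
```

As a sanity check, setting \mu_a equal to \mu_0 returns 1 - \alpha, because then "failing to reject" is simply the complement of a Type I error.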

Summary of Chapter 7: 

  • Key words for identifying the target parameter 

    • μ - mean, average 

    • p - proportion, fraction, percentage, rate, probability 

    • σ² - variance, variability, spread 

  • Elements of a Hypothesis Test 

    • Null Hypothesis (H0): A statement that there is no effect or no difference, used as a starting point for statistical testing.

    • Alternative Hypothesis (H1 or Ha): The statement that indicates the presence of an effect or a difference, which researchers aim to support.

    • Significance Level (α): The threshold for rejecting the null hypothesis, commonly set at 0.05 or 0.01.

    • Test Statistic: A standardized value calculated from sample data during a hypothesis test.

    • p-value 

    • conclusion 

  • Probabilities in Hypothesis Testing 

    • α = P(Type I Error) = P(reject Ho when Ho is true)

    • β = P(Type II Error) = P(accept Ho when Ho is false)

    • 1 - β = Power of a Test = P(reject Ho when Ho is false)

  • Forms of Alternative Hypothesis 

    • Lower-tailed: Ha: μ < μo 

    • Upper-tailed: Ha: μ > μo 

    • Two-tailed: Ha: μ ≠ μo

  • Using p-Values to Make Conclusions 

    • 1. Choose significance level (α)

    • 2. Obtain p-value of the test

    • 3. If α > p-value, reject Ho; otherwise, fail to reject Ho

  • Guide to selecting a one-sample hypothesis: 

Chapter 8 

8.1 - Identifying the Target Parameter 

  • Determining the Target Parameter: 

8.2 - Comparing Two Population Means: Independent Sampling 

  • In the large-sample case, we use the z-statistic, while in the small-sample case we use the t-statistic 

  • Properties to Know: 

  • Large, Independent Sample Information: 

8.3 - Comparing Two Population Means: Paired Difference

  • Uses the one-sample test statistic, applied to the differences within each pair

  • Paired difference experiments are generally more accurate than independent sample experiments 

  • A paired difference experiment is never obtained by pairing the sample observations after the measurements have been acquired 
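
A paired difference analysis reduces two samples to one list of differences and then applies the one-sample t-statistic to that list. The sketch below uses made-up paired measurements and a hypothetical function name; it returns only the statistic, which would be compared with a t-table value for n_d - 1 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(sample1, sample2, d0=0.0):
    """t statistic for H0: mu_d = d0, where d = sample1 - sample2 within each pair."""
    diffs = [a - b for a, b in zip(sample1, sample2)]
    n_d = len(diffs)
    return (mean(diffs) - d0) / (stdev(diffs) / sqrt(n_d))


# Hypothetical paired measurements on the same five units
first = [12, 15, 11, 18, 14]
second = [10, 11, 5, 10, 4]
print(f"t = {paired_t_statistic(first, second):.3f}")  # df = 4
```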

8.4 - Comparing Two Population Proportions: Independent Sampling

8.5 - Determining the Required Sample Size

  • To estimate (\mu_1-\mu_2), we use the equal sample size case:

  • To estimate (p_1-p_2), we use this version of the equal sample size case:
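
The formulas for this section are missing from the notes. As a sketch, the standard equal-sample-size results are n_1 = n_2 = z_{\alpha/2}^2(\sigma_1^2+\sigma_2^2)/SE^2 for means and n_1 = n_2 = z_{\alpha/2}^2(p_1q_1+p_2q_2)/SE^2 for proportions; these are assumed here, along with the function names and example inputs.

```python
from math import ceil
from statistics import NormalDist


def _z_half_alpha(confidence):
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def n_per_group_for_means(sigma1, sigma2, se, confidence=0.95):
    """Equal sample size per group to estimate mu1 - mu2 within SE."""
    z = _z_half_alpha(confidence)
    return ceil(z ** 2 * (sigma1 ** 2 + sigma2 ** 2) / se ** 2)


def n_per_group_for_proportions(p1, p2, se, confidence=0.95):
    """Equal sample size per group to estimate p1 - p2 within SE."""
    z = _z_half_alpha(confidence)
    return ceil(z ** 2 * (p1 * (1 - p1) + p2 * (1 - p2)) / se ** 2)


print(n_per_group_for_means(sigma1=10, sigma2=10, se=3))     # 86
print(n_per_group_for_proportions(p1=0.5, p2=0.5, se=0.05))  # 769
```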

8.6 - F-tests 

  • Used to compare the variances of two populations

  • What conditions are required?

    • Both sampled populations are normally distributed

    • The samples are random and independent
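
The notes do not write out the F statistic. A common convention, assumed in this sketch, is to put the larger sample variance in the numerator so that F is at least 1, and to read the critical value from an F-table using the matching numerator and denominator degrees of freedom. The function name and sample summaries are hypothetical.

```python
def f_statistic(s1_sq, n1, s2_sq, n2):
    """Return (F, (numerator df, denominator df)) with the larger variance on top."""
    if s1_sq >= s2_sq:
        return s1_sq / s2_sq, (n1 - 1, n2 - 1)
    return s2_sq / s1_sq, (n2 - 1, n1 - 1)


# Hypothetical summaries: s1^2 = 9 from n1 = 10, s2^2 = 4 from n2 = 16
f, dfs = f_statistic(s1_sq=9.0, n1=10, s2_sq=4.0, n2=16)
print(f"F = {f:.2f} with df = {dfs}")
```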