June 5 unit 6: confidence intervals


1

statistical inference

2

statistical inference

  • provides methods for drawing conclusions about a population from sample data

  • we can never be certain that our sample data fairly represent the population

  • statistical inference uses the language of probability to quantify this uncertainty

  • as with probability, inference rests on predictable long-run behaviour

  • by taking “good” samples (e.g. a simple random sample, SRS), we can draw conclusions with a high probability of being correct

    • e.g. we can use the sample mean X-bar as an estimator of the population mean μ, but how good an estimator is X-bar?

3

statistical inference pt. 2

  • (BIG STAR) the probability that X-bar = μ is 0, because X-bar is a continuous random variable

  • on its own, the sample mean tells us nothing about how accurate our estimate is

  • instead, we would like to use the sample mean to construct an interval of values to estimate the population mean μ

4

confidence intervals

  • we know that the sample mean will vary from sample to sample. suppose we were to take many samples of the same size, n.

  • we would like to construct the interval in such a way that it contains the population mean μ for most of these samples

  • that is, we would like to be confident that the interval we construct contains the value of the parameter (the mean μ) we are trying to estimate

5
  • if x-bar is within 1.96 standard deviations of μ, then μ is also within 1.96 standard deviations of x-bar (here the standard deviation is that of X-bar, σ/√n)

  • in other words, in 95% of all samples, the population mean μ lies within x-bar ± 1.96(σ/√n)

    • estimate ± margin of error
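
a minimal Python sketch of the "estimate ± margin of error" form above, assuming the population standard deviation is known. the function name z_interval_95 and all numbers are hypothetical, chosen only for illustration:

```python
from math import sqrt

def z_interval_95(x_bar, sigma, n):
    """95% confidence interval for the population mean when sigma is known."""
    margin = 1.96 * sigma / sqrt(n)  # margin of error
    return x_bar - margin, x_bar + margin

# Hypothetical numbers, for illustration only.
low, high = z_interval_95(x_bar=50.0, sigma=10.0, n=25)
print(f"({low:.2f}, {high:.2f})")
```

here the margin of error is 1.96 × 10/√25 = 3.92, so the interval is roughly (46.08, 53.92).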

6

interpretation of the 95% confidence interval for μ (the true mean of the population):

“If we repeatedly took simple random samples of the same size from the same population and constructed the interval in a similar manner, 95% of all such intervals would contain the true mean μ of the population”
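
a rough simulation sketch of this interpretation: draw many simple random samples from a normal population whose mean is known (something we never have in practice), build the interval each time, and count how often it contains the true mean. the population values (mean 20, standard deviation 5), n = 40, and the function name coverage are assumptions made up for the demo:

```python
import random
from math import sqrt

def coverage(mu, sigma, n, z_star=1.96, trials=10_000, seed=0):
    """Proportion of simulated z-intervals that contain the true mean mu."""
    rng = random.Random(seed)
    margin = z_star * sigma / sqrt(n)  # same margin for every sample (sigma known)
    hits = 0
    for _ in range(trials):
        # One simple random sample of size n from a normal population.
        x_bar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / trials

rate = coverage(mu=20.0, sigma=5.0, n=40)
print(rate)
```

the printed proportion should land close to 0.95, though it changes slightly with the seed.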

7

each confidence interval has an associated confidence level C, which gives the probability that the procedure produces an interval containing the true value of the population mean μ

  • e.g. a 95% confidence interval has a confidence level of 95%

we choose the confidence level ourselves

  • since our goal is typically to estimate a parameter with a high probability of being correct, we always use a high confidence level (usually 90% or higher)

the form of a general level C confidence interval for the population mean μ is x-bar ± z*(σ/√n), where:

  • z* is the value of Z such that:

    • P(-z* <= Z <= z*) = C

  • C% of observations from a normal distribution fall within z* standard deviations of its mean μ

  • C% of values of X-bar fall within z*(σ/√n) of μ

  • C% of constructed intervals contain μ
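
one way to get z* without a table, sketched with Python's standard-library statistics.NormalDist: since P(-z* <= Z <= z*) = C leaves area (1-C)/2 in each tail, z* is the point with area (1+C)/2 to its left. critical_value is a hypothetical helper name:

```python
from statistics import NormalDist

def critical_value(confidence):
    """z* such that P(-z* <= Z <= z*) = confidence for a standard normal Z."""
    return NormalDist().inv_cdf((1 + confidence) / 2)

for c in (0.90, 0.95, 0.99):
    print(c, round(critical_value(c), 3))
```

for C = 0.90, 0.95 and 0.99 this gives approximately 1.645, 1.960 and 2.576.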

8

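probability (area under the standard normal curve) between -z* and z* = C

  • probability (area) in each tail, below -z* or above z* = (1-C)/2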
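
a quick numeric check of these two areas for z* = 1.96 (so C should come out near 0.95), using the standard library's NormalDist. the variable names are just for illustration:

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1
z_star = 1.96

central = Z.cdf(z_star) - Z.cdf(-z_star)  # area between -z* and z*
lower_tail = Z.cdf(-z_star)               # area below -z*
upper_tail = 1 - Z.cdf(z_star)            # area above z*

print(round(central, 3), round(lower_tail, 3), round(upper_tail, 3))
```

the central area is about 0.95 and each tail about 0.025, which is (1 - 0.95)/2.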

9

we can find z* for any confidence level C from Table 1

  • the values of z* for the most common confidence levels are given in the last row of Table 2

  • the values z* that mark off a specific area under the standard normal curve are called the critical values of the distribution

10

when confidence level increases, the margin of error also increases

  • thus if we increase the confidence level, we must sacrifice our precision of estimation!

  • if we want to be more sure that our interval contains the population mean μ, then we have to widen the interval!
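
a short sketch of this trade-off: holding the population standard deviation and n fixed (hypothetical values 10 and 25), the margin of error grows as the confidence level rises. margin_of_error is an illustrative helper, not part of the course material:

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(confidence, sigma, n):
    """Margin of error z* * sigma / sqrt(n) for a given confidence level."""
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    return z_star * sigma / sqrt(n)

levels = (0.90, 0.95, 0.99)
margins = [margin_of_error(c, sigma=10.0, n=25) for c in levels]
for c, m in zip(levels, margins):
    print(f"C = {c:.2f}: margin of error = {m:.2f}")
```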

11

interpretation of the 95% confidence interval (18.26, 25.22):

“if we were to take repeated samples of 40 employees and compute the interval in a similar manner, then 95% of such intervals would contain the true mean hourly wage”

12

central limit theorem

X-bar is approximately N(μ, σ/√n); this applies when n >= 30 (unit 5)

  • if we are not told the shape of the population distribution and the sample size n is less than 30, we can't calculate the interval.
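
a small simulation sketch of the central limit theorem: sample means of n = 30 draws from a strongly skewed population (exponential with mean 1 and standard deviation 1, an assumption chosen for the demo) should have a mean close to the population mean and a standard deviation close to the population standard deviation divided by √n:

```python
import random
from math import sqrt
from statistics import mean, stdev

rng = random.Random(1)
n = 30          # sample size
trials = 5_000  # number of samples drawn

# Exponential population with rate 1: mean 1, standard deviation 1 (skewed, not normal).
sample_means = [
    mean(rng.expovariate(1.0) for _ in range(n))
    for _ in range(trials)
]

print("mean of sample means:", round(mean(sample_means), 3))
print("sd of sample means:  ", round(stdev(sample_means), 3))
print("sigma / sqrt(n):     ", round(1 / sqrt(n), 3))
```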

13

how can we reduce the length of the interval without lowering the confidence level?

  • increase the sample size!!!!

  • this lowers the margin of error!!!!

    • increasing the sample size by a factor of k reduces the margin of error by a factor of √k.

    • m(new) = z*σ/√(kn) = (1/√k)(z*σ/√n) = m(old)/√k
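
a numeric check of this scaling, with hypothetical values (z* = 1.96, standard deviation 10, n = 25, k = 4). margin_of_error is an illustrative helper name:

```python
from math import sqrt

def margin_of_error(z_star, sigma, n):
    """Margin of error z* * sigma / sqrt(n)."""
    return z_star * sigma / sqrt(n)

z_star, sigma, n, k = 1.96, 10.0, 25, 4  # hypothetical values
m_old = margin_of_error(z_star, sigma, n)
m_new = margin_of_error(z_star, sigma, k * n)  # sample size multiplied by k

print(m_old, m_new, m_old / sqrt(k))
```

with k = 4 the margin of error is halved, since √4 = 2.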

14

population mean = μ

  • is a fixed value

  • if μ is inside an interval we have already computed, the probability that the interval contains μ is 100%

  • if μ is not inside it, the probability is 0%

  • analogy: asking “raining or not raining?” when it's already raining

confidence interval interpretation template:

  • if we repeatedly took samples of the same size from the same population and constructed the intervals in a similar manner, then C% of such intervals would contain the population mean μ. (write it in the context of the question)

15

sample mean = x-bar

16

know the difference between sample size and population size

  • the confidence interval formula uses only the sample size n; the population size does not appear in it

17

when collecting a sample, always consider the purpose of our data collection

  • often, we would like to achieve a certain precision of estimation (i.e. a particular margin of error m)

  • to accomplish this, we need to find out how large our sample size needs to be

  • n = (z*σ/m)², always rounded up to the next whole number
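
a minimal sketch of this sample size calculation. required_sample_size is a hypothetical helper, and the inputs (95% confidence, standard deviation 12, desired margin of error 3) are made up for illustration:

```python
from math import ceil, sqrt
from statistics import NormalDist

def required_sample_size(confidence, sigma, margin):
    """Smallest n whose margin of error z* * sigma / sqrt(n) is at most `margin`."""
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    return ceil((z_star * sigma / margin) ** 2)  # always round up

n = required_sample_size(confidence=0.95, sigma=12.0, margin=3.0)
print(n)
```

the formula gives about 61.46, so we round up to n = 62; rounding down to 61 would leave the margin of error slightly above 3.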

18

n = (z*σ/m)²

  • k!!!!!

19

k

  • sample size

  • (STAR) if we want to divide the margin of error by k, we need a sample size that is k² times the original sample size

  • (STAR) to reduce the margin of error to one third of its original value (i.e. reduce it by a factor of 3), we need 9 times as many individuals in our sample

20

our formula for the confidence interval holds only if the data were collected using an SRS. good formulas cannot rescue us from poor sampling methods

  • since the sample mean is strongly influenced by outliers, the confidence interval is also strongly influenced by outliers
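
a small illustration of the outlier point, with made-up data: replacing a single observation with an extreme value drags the sample mean, and the whole interval with it. the data, the population standard deviation of 2 and the helper z_interval are all hypothetical:

```python
from math import sqrt
from statistics import mean

def z_interval(data, sigma, z_star=1.96):
    """z-interval for the population mean, treating sigma as known."""
    x_bar = mean(data)
    margin = z_star * sigma / sqrt(len(data))
    return x_bar - margin, x_bar + margin

clean = [20, 21, 19, 22, 20, 21, 19, 20, 22, 21]
with_outlier = clean[:-1] + [95]  # same data with one value replaced by an outlier

low_clean, high_clean = z_interval(clean, sigma=2.0)
low_out, high_out = z_interval(with_outlier, sigma=2.0)

print(f"clean:        ({low_clean:.2f}, {high_clean:.2f})")
print(f"with outlier: ({low_out:.2f}, {high_out:.2f})")
```

the width is the same in both cases (the standard deviation is treated as known), but the centre moves from 20.5 to 27.9, so the two intervals do not even overlap.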

21

we are using the true population standard deviation σ in our calculations

  • in practice, this is not a realistic assumption

  • we are making this unreasonable assumption now to establish the framework for building confidence intervals

22

the margin of error covers only random sampling errors

  • it does not reflect any degree of undercoverage, nonresponse, or other forms of bias

  • i.e. “error” reflects only the inherent variation in the population (quantified by σ); it does not mean that we made a mistake!

23

slides over
