Stats Exam #2 (Ch 5-8)

0.0(0)
studied byStudied by 0 people
0.0(0)
full-widthCall Kai
learnLearn
examPractice Test
spaced repetitionSpaced Repetition
heart puzzleMatch
flashcardsFlashcards
GameKnowt Play
Card Sorting

1/72

encourage image

There's no tags or description

Looks like no tags are added yet.

Study Analytics
Name
Mastery
Learn
Test
Matching
Spaced

No study sessions yet.

73 Terms

1
New cards

random variable

a numerical description of the outcome of an experiment

2
New cards

discrete random variable

may assume either a finite number of values or an infinite sequence of values

3
New cards

continuous random variable

may assume any numerical value in an interval or collection of intervals

4
New cards

probability distribution

for a random variable, describes how probabilities are distributed over the values of the random variable

  • can describe a discrete one w a table, graph, or formula

5
New cards

two types of discrete probability distributions

  1. uses the rules of assigning probabilities to experimental outcomes to determine probabilities for each value of the random variable

  2. uses a special math formula to compute the probabilities for each value of the random variable

6
New cards

probability function (discrete prob dist)

provides the probability for each value of the random variable

  • f(x) > or = 0

  • sum of f(x) = 1

7
New cards

several discrete probability distributions specified by formulas

  • discrete-uniform

  • binomial

  • poisson

  • hypergeometric

  • in addition to tables and graphs, a formula that gives the probability function, f(x), for every value of x is often used to describe the probability distributions

8
New cards

discrete uniform probability distribution

  • the simplest example of a discrete probability distribution given by a formula

  • f(x) = 1/n

  • n is the number of values the random variable may assume (the values of the random variable are equally likely)

9
New cards

expected value

  • mean of a random variable, is a measure of its central location

  • a weighted average of the values the random variable may assume. the weights are the probabilities

  • does not have to be a value the random variable can assume

10
New cards

variance

summarizes the variability in the values of the random variable

  • sigma squared

  • a weighted average of the squared deviations of a random variable from its mean

  • the weights are the probabilities

11
New cards

standard deviation

  • sigma

  • the positive square root of the variance

12
New cards

four properties of a binomial experiment

  1. the experiment consists of a sequence of n identical trials

  2. two outcomes, success and failure, are possible on each trial

  3. the probability of a success, denoted by p, does not change from trial to trial (stationarity assumption)

  4. the trials are independent

13
New cards

binomial probability distribution

  • our interest is in the # of successes occurring in n trials

  • we let x denote the # of successes occurring in n trials

  • =binom.dist

14
New cards

expected value (binomial dist)

E(x) = u = np

15
New cards

variance (binomial dist)

np (1-p)

16
New cards

poisson probability distribution

  • random variable is often useful in estimating the # of occurrences over a specified interval of time or space

  • it is a discrete random variable that may assume an infinite sequence of values (x = 0,1,2,….)

  • ex:

    • # of knotholes in 14 linear feet of pine board

    • # of vehicles arriving at a toll booth in one hour

  • specified interval of time or space. infinite sequence of values

17
New cards

two properties of a poisson experiment

  1. the probability of an occurrence is the same for any two intervals of equal length

  2. the occurrence or nonoccurrence in any interval is independent of the occurrence or nonoccurrence in any other interval 

  • mean and variance are equal

18
New cards

poission probability function

  • since there’s no stated upper limit for the # of occurrences, the probability function f(x) is applicable for values x= 0,1,2… w out limit

  • in practical applications, x will eventually become large enough so that f(x) is approximately zero and the probability of any larger values of x becomes negligible

  • = poisson.dist

19
New cards

types of continuous probability distributions

  • uniform probability distribution

  • normal probability distribution

  • exponential probability distribution

<ul><li><p>uniform probability distribution</p></li><li><p>normal probability distribution</p></li><li><p>exponential probability distribution</p></li></ul><p></p>
20
New cards

continuous random variable

  • can assume any value in an interval on the real line or in a collection of intervals

  • it’s not possible to talk about the probability of the random variable assuming a particular value

  • instead, we talk about the probability of the random variable assuming a value w in a given interval

  • the probability of the random variable assuming a value w in some given interval from x1 to x2 is defined to be the area under the graph of the probability density function btw x1 and x2

<ul><li><p>can assume any value in an interval on the real line or in a collection of intervals</p></li><li><p>it’s not possible to talk about the probability of the random variable assuming a particular value</p></li><li><p>instead, we talk about the probability of the random variable assuming a value w in a given interval </p></li></ul><p></p><ul><li><p>the probability of the random variable assuming a value w in some given interval from x1 to x2 is defined to be the area under the graph of the probability density function btw x1 and x2</p></li></ul><p></p><p></p>
21
New cards

uniform probability distribution

  • a random variable is uniformly distributed whenever the probability is proportional to the interval’s length

  • probability density function: 

    • f(x) 1/(b-a)

    • a is the smallest value the variable can assume

    • b is the largest value the variable can assume

<ul><li><p>a random variable is uniformly distributed whenever the probability is proportional to the interval’s length</p></li><li><p>probability density function:&nbsp;</p><ul><li><p>f(x) 1/(b-a) </p></li><li><p>a is the smallest value the variable can assume</p></li><li><p>b is the largest value the variable can assume</p></li></ul></li></ul><p></p>
22
New cards

expected value & variance of x for uniform probability distribution

  • E(x) = (a+b)/2

  • Var(x) = (b-a)²/12

23
New cards

area as a measure of probability

  • the area under the graph of f(x) and probability are identical

  • this is valid for all continuous random variables

  • the probability that x takes on a value btw some lower value x1 and some higher value x2 can be found by computing the area under the graph of f(x) over the interval from x1 to x2

24
New cards

normal probability distribution

  • the most important distribution for describing a continuous random variable

  • widely used in statistical inference

  • ex:

    • heights of ppl

    • rainfall amounts

    • test scores

    • scientific measurements

  • doctrine of chances Abraham de Moivre

<ul><li><p>the most important distribution for describing a continuous random variable</p></li><li><p>widely used in statistical inference</p></li><li><p>ex:</p><ul><li><p>heights of ppl</p></li><li><p>rainfall amounts</p></li><li><p>test scores</p></li><li><p>scientific measurements</p></li></ul></li><li><p>doctrine of chances Abraham de Moivre</p></li></ul><p></p>
25
New cards

normal probability distribution characteristics

  • the distribution is symmetric; its skewness measure is zero 

  • the entire family of normal probability distributions is defined by its mean u and its standard deviation sigma

  • the highest point on the normal curve is at the mean, which is also the median and mode

  • the mean can be any numerical value: negative, zero, or positive 

  • the standard deviation determines the width of the curve: larger values result in wider, flatter curves

  • probabilities for the normal random variable are given by areas under the curve. the total area under the curve is 1 (0.5 to the left of the mean and 0.5 to the right)

  • empirical rule:

    • 68.26% of values ±1

    • 95.44% of values ±2

    • 99.72% of values ±3

26
New cards

standard normal probability distribution characteristics

  • a random variable having a normal distribution w a mean of 0 and a st dev of 1 is said to have a standard normal probability distribution

  • the letter z is used to designate the st normal random variable

  • we can think of z as a measure of the # of st dev away from mean

27
New cards

z-score

  • standardized value

  • it denotes the number of st dev a data value xi from the mean

  • =standardize

28
New cards

excel for normal distributions

  • norm.s.dist is used to compute the cumulative probability given a z value

  • norm.s.inv is used to compute the z value given a cumulative probability

  • the s reminds us that they relate to the standard normal probability distribution

  • norm.dist is used to compute the cumulative probability given an x value

  • norm.inv is used to compute the x value given a cumulative probability

29
New cards

exponential probability distribution

  • useful in describing the time it takes to complete a task

  • ex:

    • time btw vehicle arrivals at a toll booth

    • time required to complete a questionnaire

    • distance btw major defects in a highway

  • in waiting line applications, the exponential distribution is often used for service times

  • mean and st dev equal

  • skewed right w a measure of 2

<ul><li><p>useful in describing the time it takes to complete a task</p></li><li><p>ex:</p><ul><li><p>time btw vehicle arrivals at a toll booth </p></li><li><p>time required to complete a questionnaire</p></li><li><p>distance btw major defects in a highway</p></li></ul></li><li><p>in waiting line applications, the exponential distribution is often used for service times</p></li><li><p>mean and st dev equal</p></li><li><p>skewed right w a measure of 2</p></li></ul><p></p>
30
New cards

excel exponential probabilities

  • expon.dist function can be used to compute exponential probabilities

  • 3 arguments: 

    • 1st : the value of the random variable x

    • 2nd: 1/u (the inverse of the mean # of occurrences in an interval)

    • 3rd: true or false (always true)

31
New cards

relationship btw the poisson and exponential distributions

  • the poisson distribution provides an appropriate description of the # of occurrences per interval

  • the exponential distribution provides an appropriate description of the length of the interval btw occurrences

<ul><li><p>the poisson distribution provides an appropriate description of the # of occurrences per interval</p></li><li><p>the exponential distribution provides an appropriate description of the length of the interval btw occurrences</p></li></ul><p></p>
32
New cards

discrete probability distributions summary

knowt flashcard image
33
New cards

continuous probability distributions summary

knowt flashcard image
34
New cards

element

the entity on which data are collected

35
New cards

population

a collection of all the elements of interest

36
New cards

sample

subset of the population

37
New cards

sampled population

the population from which the sample is drawn

38
New cards

frame

a list of the elements that the sample will be selected from

39
New cards

the reason we select a sample:

  • to collect data to answer a research question about a population

  • the sample results provide only estimates of the values of the population characteristics

  • the reason is simply that the sample contains only a portion of the population

  • with proper sampling methods, the sample results can provide “good” estimates of the population characteristics

40
New cards

finite populations

  • defined by lists such as:

    • organization membership roster

    • credit card account numbers

    • inventory product numbers

  • a simple random sample of size n from a finite population of size N is a sample selected such that each possible sample of size n has the same probability of being selected

41
New cards

sampling with replacement

replacing each sampled element before selecting subsequent elements

42
New cards

sampling without replacement

the procedure used most often

43
New cards

in larger sampling projects…

computer-generated random numbers are often used to automate the sample selection process

44
New cards

sampling from an infinite population

  • sometimes we want to select a sample, but find it’s not possible to obtain a list of all elements in the population

  • as a result, we can’t construct a frame for the population

  • hence, we cannot use the random number selection procedure

45
New cards

on-going process

  • populations are often generated by an ongoing process where there is no upper limit on the # of units that can be generated

  • some examples of on-going processes, with infinite populations, are:

    • parts being manufactured on a production line

    • transactions occurring at a bank

    • telephone calls arriving at a technical help desk

    • customers entering a store

46
New cards

a random sample from an infinite population

  • is a sample selected such that the following conditions are satisfied

    • each element selected comes from the population of interest

    • each element is selected independently

  • in the case of an infinite population, we must select a random sample in order to make valid statistical inferences about the population from which the sample is taken

47
New cards

point estimation

  • a form of statistical inference

  • we use the data from the sample to compute a value of a sample statistic that serves as an estimate of a population parameter

  • point estimator

48
New cards

target population

the population we want to make inferences about

49
New cards

sampled population

the population from which the sample is actually taken

  • whenever a sample is used to make inferences about a population, we should make sure that the targeted population and the sampled population are in close agreement

50
New cards

process of statistical inference

knowt flashcard image
51
New cards

sampling distribution of x (sample mean)

  • the probability distribution of all possible values of the sample mean

  • expected value of sample mean = population mean

  • when the expected value of the point estimator equals the population parameter, we say the point estimator is unbiased

52
New cards

sampling distribution of sample mean

  • when the population has a normal distribution, the sampling distribution of x is normally distributed for any sample size

  • in most applications, the sampling distribution of x can be approximated by a normal distribution whenever the sample is size 30 or more

  • in cases where the population is highly skewed or outliers are present, samples of size 50 may be needed

  • the sampling distribution of x can be used to provide probability info about how close the sample mean x is to the population mean u 

53
New cards

central limit theorem

  • when the population from which we are selecting a random sample does not have a normal distribution, this theorem is helping in identifying the shape of the sampling distribution of x 

  • in selecting random samples of size n from a population, the sampling distribution of the sample mean x can be approximated by a normal distribution as the sample size becomes large

54
New cards

other sampling methods

  • stratified random sampling

  • cluster sampling

  • systematic sampling

  • convenience sampling

  • judgment sampling

55
New cards

stratified random sampling

  • the population is first divided into groups of elements called strata

  • each element in the population belongs to one and only one stratum

  • best results are obtained when the elements within each stratum are as much alike as possible (ex: a homogeneous group)

  • a simple random sample is taken from each stratum

  • formulas are available for combining the stratum sample results into one population parameter estimate

  • advantage: if strata are homogenous, this method is as “precise” as simple random sampling but w a smaller total sample size

  • ex: the basis for forming the strata might be department, location, age, industry type, and so on

56
New cards

cluster sampling

  • the population is first divided into separate groups of elements called clusters

  • ideally each cluster is a representative small-scale version of the population (ex: heterogeneous group)

  • a simple random sample of the clusters is taken

  • all elements within each sampled (chosen) cluster form the sample

  • ex: a primary application is area sampling, where clusters are city blocks or other well-defined areas

  • advantage: the close proximity of elements can be cost effective (ex: many sample observations can be obtained in a short time)

  • disadvantage: this method generally requires a larger total sample size than simple or stratified random sampling

57
New cards

systematic sampling

  • if a sample size of n is desired from a population containing N elements, we might sample one element for every n/N elements in the population

  • we randomly select one of the first n/N elements from the population list

  • we then select every n/Nth element that follows in the population list

  • this method has the properties of a simple random sample, especially if the list of the population elements is a random ordering

  • advantage: the sample usually will be easier to identify than it would be if simple random sampling were used

  • example: selecting every 100th listing in a telephone book after the first randomly selected listing

58
New cards

convenience sampling

  • it is a nonprobability sampling technique. items are included in the sample w out known probabilities of being selected

  • the sample is identified primarily by convenience

  • ex: a professor conducting research might use student volunteers to constitute a sample

  • advantage: sample selection and data collection are relatively easy

  • disadvantage: it’s impossible to determine how representative of the population the sample is

59
New cards

judgment sampling

  • the person most knowledgeable on the subject of the study selects elements of the population that he or she feels are most representative of the population

  • it is a nonprobability sampling technique

  • ex: a reporter might sample 3 or 4 senators, judging them as reflecting the general opinion of the senate

  • advantage: it’s a relatively easy way of selecting a sample

  • disadvantage: the quality of the sample results depends on the judgment of the person selecting the sample

60
New cards

recommendation

  • it is recommended that probability sampling methods (simple random, stratified, cluster, or systematic) be used

  • for these methods, formulas are available for evaluating the “goodness” of the sample results in terms of the closeness of the results to the population parameters being estimated

  • an evaluation of the goodness cannot be made w non-probability (convenience or judgment) sampling methods

61
New cards

a point estimator cannot…

  • be expected to provide the exact value of the population parameter

  • an interval estimate can be computed by adding and subtracting a margin of error to the point estimate (point estimate ± margin or error)

  • the purpose of an interval estimate is to provide info point how close the point estimate is to the value of the parameter

62
New cards

the general form of an interval estimate of a population mean is

sample mean ± margin of error

63
New cards

interval estimate of a population mean: sigma known

  • in order to develop an interval estimate of a population mean, the margin of error must be computed using either:

    • the population standard deviation sigma, or

    • the sample standard deviation s

  • sigma is rarely known exactly, but often a good estimate can be obtained based on historical data or other info

  • we refer to such cases as the sigma known case

  • there is a 1-a probability that the value of a sample mean will provide a margin of error or less

<ul><li><p>in order to develop an interval estimate of a population mean, the margin of error must be computed using either:</p><ul><li><p>the population standard deviation sigma, or</p></li><li><p>the sample standard deviation s</p></li></ul></li><li><p>sigma is rarely known exactly, but often a good estimate can be obtained based on historical data or other info</p></li><li><p>we refer to such cases as the sigma known case</p></li><li><p>there is a 1-a probability that the value of a sample mean will provide a margin of error or less</p></li></ul><p></p>
64
New cards

meaning of confidence

<p></p><p></p>
65
New cards

adequate sample size when sigma is known

  • in most applications, a sample size of n=30 is adequate

  • if the population distribution is highly skewed or contains outliers, a sample size of 50 or more is recommended

  • if the population is not normally distributed but is roughly symmetric, a sample size as small of 15 will suffice

  • if the population is believed to be at least approximately normal, a sample size of less than 15 can be used

66
New cards

interval estimate of a population mean: sigma unknown

  • if an estimate of the population standard deviation sigma cannot be developed prior to sampling, we use the sample standard deviation s to estimate sigma

  • this is the sigma unknown case

  • in this case, the interval estimate for u is based on the t distribution

67
New cards

t distribution

  • william gosset, writing under the name “student”, is the founder

    • oxford grad in math and worked for Guinness

    • developed it while working on small-scale materials and temperature experiments

  • the t distribution is a family of similar probability distributions

  • a specific t distribution depends on a parameter known as the degrees of freedom

  • a t distribution w more degrees of freedom has less dispersion

  • as df increases, the diff btw the t dist and the standard normal probability distribution becomes smaller and smaller

  • for more than 100 degrees of freedom, the standard normal z value provides a good approximation to the t value

  • the standard normal z values can be found in the infinite degrees row of the t distribution table

<ul><li><p>william gosset, writing under the name&nbsp;“student”, is the founder</p><ul><li><p>oxford grad in math and worked for Guinness</p></li><li><p>developed it while working on small-scale materials and temperature experiments</p></li></ul></li><li><p>the t distribution is a family of similar probability distributions</p></li><li><p>a specific t distribution depends on a parameter known as the degrees of freedom</p></li><li><p>a t distribution w more degrees of freedom has less dispersion</p></li><li><p>as df increases, the diff btw the t dist and the standard normal probability distribution becomes smaller and smaller</p></li><li><p>for more than 100 degrees of freedom, the standard normal z value provides a good approximation to the t value</p></li><li><p>the standard normal z values can be found in the infinite degrees row of the t distribution table</p></li></ul><p></p>
68
New cards

degrees of freedom

refer to the # of independent pieces of info that go into the computation of s

69
New cards

sigma unknown adequate sample size

  • usually a sample size of n=30 is adequate to develop an interval estimate of a population mean

  • if the population distribution is highly skewed or contains outliers, a sample size of 50 or more is recommended

  • if the population is not normally dist but is roughly symmetric, a sample size as small as 15 will suffice

  • if the population is believed to be at least approximately normal, a sample size of less than 15 can be used

70
New cards

summary of interval estimation procedures for a population mean

knowt flashcard image
71
New cards

sample size for an interval estimate of a population mean

  • let E = the desired margin of error

  • E is the amount added to and subtracted from the point estimate to obtain an interval estimate

  • if a desired margin of error is selected prior to sampling, the sample size necessary to satisfy the margin of error can be determined

72
New cards

planning value

  • the Necessary Sample Size equation requires a value for the population st dev

  • if sigma is known, a preliminary or planning value for sigma can be used in the equation

    • use the estimate of the population st dev computed in a previous study

    • use a pilot study to select a preliminary study and use the sample st dev from the study

    • use judgment or a best guess for the value of sigma

73
New cards

Explore top flashcards