Regression line- A straight line that summarizes the linear relationship between two numeric variables
Coefficient of determination (r²)- The proportion of the variation in y that is explained by the regression line (the predicted values ŷ).
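A minimal sketch of the two definitions above, assuming the regression line is the usual least-squares line ŷ = a + bx. The function names `fit_line` and `r_squared` and the sample data are hypothetical, chosen only for illustration.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys):
    """Proportion of the variation in y explained by the fitted line."""
    slope, intercept = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_resid = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return 1 - ss_resid / ss_total


# Made-up data that falls exactly on the line y = 1 + 2x
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
print(fit_line(xs, ys), r_squared(xs, ys))
```

When every point lies exactly on the line, none of the variation in y is left unexplained, so r² equals 1. Scattered data give a value between 0 and 1.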
Confounded variables (Cannot have a single confounded variable)- Two or more variables whose effects on the response cannot be distinguished from one another
Lurking Variables (Can have a single variable lurking)- Variables not included in the study that still affect the response; each is listed individually (ex. diet, lifestyle)
Sampling design- Describes how a sample is obtained from the pop.
Statistical inference- Using samples to draw conclusions about the pop.
Simple Random Sampling (SRS)- A method in which every sample of size n has an equal chance of being drawn
Stratified Sampling- Separates the pop into groups that share a common characteristic (called strata), then takes an SRS from each stratum.
Strata- Subgroups within a pop that are separated based on a common characteristic.
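A rough sketch of an SRS and a stratified sample, using Python's standard `random` module. The function names, the `stratum_of` helper, and the toy population are hypothetical; the stratified version assumes the usual second step of drawing an SRS of the same size inside every stratum.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n individuals so that every sample of size n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)


def stratified_sample(population, stratum_of, n_per_stratum, seed=None):
    """Group individuals into strata, then take an SRS within each stratum."""
    rng = random.Random(seed)
    strata = {}
    for individual in population:
        strata.setdefault(stratum_of(individual), []).append(individual)
    return {label: rng.sample(members, n_per_stratum)
            for label, members in strata.items()}


# Toy population: the numbers 0-19, with "even" and "odd" as the strata
population = list(range(20))
print(simple_random_sample(population, 5, seed=1))
print(stratified_sample(population,
                        lambda i: "even" if i % 2 == 0 else "odd",
                        3, seed=1))
```

The stratified version guarantees that every stratum shows up in the sample, which a plain SRS does not.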
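Cluster Sampling- Separates the pop into groups in some natural way (clusters), randomly selects some of the clusters, and samples every individual in the selected clusters
Clusters- Subgroups within a pop that are separated in some natural way (ex. by location)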
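A small sketch of cluster sampling, assuming the common version in which whole clusters are chosen at random and every individual in a chosen cluster is included. The function name `cluster_sample` and the classroom data are made up for illustration.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """clusters maps a cluster label to the list of individuals in it.

    Randomly picks n_clusters labels and returns everyone in those clusters.
    """
    rng = random.Random(seed)
    chosen = rng.sample(list(clusters), n_clusters)
    return [person for label in chosen for person in clusters[label]]


# Hypothetical clusters: students grouped by classroom
classrooms = {
    "Room A": ["Ana", "Ben"],
    "Room B": ["Cal", "Dee"],
    "Room C": ["Eli", "Fay"],
}
print(cluster_sample(classrooms, 2, seed=0))
```

Unlike stratified sampling, only some of the groups end up in the sample, but each chosen group is taken whole.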
Multistage Sampling- Takes an SRS from within another SRS (ex. an SRS of clusters, then an SRS of individuals within each chosen cluster)
Systematic Sampling- Lines individuals up in some way, picks a random starting point, and then selects every kth individual after it
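A short sketch of systematic sampling, assuming the random starting point is chosen among the first k individuals in line. The function name `systematic_sample` and the numbered population are hypothetical.

```python
import random


def systematic_sample(ordered_population, k, seed=None):
    """Pick a random start among the first k individuals, then every kth after."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # random index from 0 to k - 1
    return ordered_population[start::k]


# Hypothetical line of 100 individuals numbered 0-99, taking every 10th
line = list(range(100))
print(systematic_sample(line, 10, seed=3))
```

Only the starting point is random; once it is chosen, the rest of the sample is fixed by the ordering.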
Poor Sampling Designs- Convenience Sampling and Voluntary Response Sampling
Convenience Sampling- Takes individuals who are easily accessible (no clear method)
Voluntary Response Sampling- Allows individuals to choose whether or not they participate
Normal Distribution- All normal curves have the same general (bell) shape. However, two values, μ and σ, define the actual shape.
μ (mean)- Determines the location (center) of the curve, smaller in blue curve
σ (standard deviation)- Determines the spread/flatness, smaller in blue curve because more of the data is close to the mean (less spread)
Proportion or probability- Represented by the area under the curve; the total area under the curve is 1
Empirical Rule- 68%, 95%, 99.7% (approximate, for normal distributions)
68%- Data are within 1 s.d. of the mean
95%- Data are within 2 s.d. of the mean
99.7%- Data are within 3 s.d. of the mean
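The empirical rule percentages can be checked as areas under a normal curve. The sketch below uses `math.erf` from the standard library to get the area to the left of a value; the helper names `normal_cdf` and `prob_within` are hypothetical. The third value comes out closer to 99.7% than to 99%.

```python
import math


def normal_cdf(x, mu=0.0, sigma=1.0):
    """Area under the normal curve to the left of x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def prob_within(k, mu=0.0, sigma=1.0):
    """Area under the curve within k standard deviations of the mean."""
    return (normal_cdf(mu + k * sigma, mu, sigma)
            - normal_cdf(mu - k * sigma, mu, sigma))


for k in (1, 2, 3):
    print(k, round(prob_within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```

The result does not depend on which μ and σ are used, since the rule is stated in terms of standard deviations from the mean.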
Impossible- A prob of 0
Guaranteed- A prob of 1