Stats
Descriptive statistics is organizing and summarizing data, mostly through graphs and tables.
Inferential statistics uses methods that take a result from a sample, extend it to the population, and measure the reliability of the result
A population is a set of individuals that we are interested in studying
A parameter is the numerical summary of a population
A variable is a characteristic or property of an individual in the population. Ex. model of a car
A sample is a subset of the individuals of a population; it consists of the individuals actually being studied.
A statistical inference is an estimate, prediction, or some other generalization about a population based on information contained in a sample
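A statistic is the numerical summary of a sample, so statistic goes with sample.
A parameter goes with population: it summarizes the whole population, while a statistic summarizes only the sample.
A measure of reliability is a statement about the degree of uncertainty associated with a statistical inference.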
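The parameter/statistic distinction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population (the whole numbers 1 to 100) is made up, the sample is drawn at random with a fixed seed, and the standard error of the mean is used as one possible measure of reliability, which these notes do not name.

```python
import math
import random
import statistics

# Made-up population for illustration: the whole numbers 1 to 100.
population = list(range(1, 101))

# Parameter: a numerical summary of the entire population.
parameter = statistics.mean(population)

# Sample: a subset of the population, chosen at random here.
rng = random.Random(0)  # fixed seed so reruns pick the same sample
sample = rng.sample(population, 10)

# Statistic: the same kind of summary, computed from the sample only.
statistic = statistics.mean(sample)

# Assumed measure of reliability: standard error of the sample mean.
standard_error = statistics.stdev(sample) / math.sqrt(len(sample))

print(f"parameter (population mean): {parameter}")
print(f"statistic (sample mean):     {statistic}")
print(f"standard error of the mean:  {standard_error:.2f}")
```

Using the sample mean to estimate the population mean is a statistical inference; the standard error attached to it plays the role of the measure of reliability.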
Quantitative data are numerical measures of individuals; they can be added or subtracted, and provide meaningful results
Qualitative/categorical data cannot be measured on a numerical scale; they can only be classified into groups or categories.
Continuous variables can take any number in an interval (weight, age)
A discrete variable can take only a limited, countable set of possible values, such as whole numbers (counts).
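The data types above determine which summaries make sense. A minimal sketch, using made-up values and hypothetical variable names: qualitative data are summarized by counting categories, while quantitative data (discrete or continuous) can be averaged.

```python
from collections import Counter
from statistics import mean

# Made-up data for illustration only.
hair_color = ["brown", "black", "brown", "blond", "brown", "black"]  # qualitative
siblings = [0, 2, 1, 1, 3, 2]                                        # quantitative, discrete
weight_kg = [61.5, 72.0, 58.3, 80.1, 66.7, 69.4]                     # quantitative, continuous

# Qualitative data: arithmetic is meaningless, so build a frequency table.
color_counts = Counter(hair_color)

# Quantitative data: adding and averaging give meaningful results.
mean_siblings = mean(siblings)
mean_weight = mean(weight_kg)

for color, count in color_counts.most_common():
    print(f"{color}: {count}")
print(f"mean number of siblings: {mean_siblings}")
print(f"mean weight (kg): {mean_weight:.1f}")
```

Note that the mean of a discrete variable does not have to be one of its possible values.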
Nominal: the values of the variable name, label, or categorize the individuals. Ex. hair color. No order, no zero.
Ordinal has the same properties as nominal, but the naming scheme allows for the values of the variable to be ranked in a certain order. Ex: class rank.
Interval has the same properties as ordinal, and the distance between values has meaning. Addition and subtraction are meaningful, but there is no true zero. Ex. temperature in degrees Celsius or Fahrenheit.
Ratio has the same properties as interval, and the ratio of two values has meaning. A value of zero means the absence of the attribute. Ex: age, weight. Addition, subtraction, multiplication, and division are all meaningful.
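The difference between interval and ratio scales can be checked with a little arithmetic. The sketch below uses made-up temperatures and weights, and the helper name `c_to_f` is hypothetical. On an interval scale the ratio of two values changes when the unit changes, so it carries no meaning; on a ratio scale it stays the same.

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

# Interval scale: temperature has no true zero.
cold_c, warm_c = 10.0, 20.0
ratio_c = warm_c / cold_c                   # 2.0 in Celsius
ratio_f = c_to_f(warm_c) / c_to_f(cold_c)   # 68 / 50 = 1.36 in Fahrenheit

# Ratio scale: weight has a true zero (no weight at all).
KG_TO_LB = 2.20462
light_kg, heavy_kg = 40.0, 80.0
ratio_kg = heavy_kg / light_kg
ratio_lb = (heavy_kg * KG_TO_LB) / (light_kg * KG_TO_LB)

print(f"temperature ratio: {ratio_c} (Celsius) vs {ratio_f} (Fahrenheit)")
print(f"weight ratio: {ratio_kg} (kg) vs {ratio_lb:.1f} (lb)")
```

Because the two temperature ratios disagree, "20 degrees is twice as hot as 10 degrees" is not a meaningful statement, whereas "80 kg is twice as heavy as 40 kg" holds in any unit.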
An observational study is a data collection method where the individuals are observed in their natural setting, without the researcher trying to influence them. Ex: survey. Can be retrospective or prospective.
A cross-sectional study takes a snapshot of information at a particular point in time.
A case-control study is retrospective in nature. Researchers compare two groups of individuals (those with a particular outcome and those without it) and look back in time to determine what differences may have led to the different outcomes.
A cohort study is a prospective study in which researchers select a group of individuals (the cohort) and observe them over a long period of time.
A designed experiment is a data collection method where the researcher exercises full control over the characteristics of the individuals sampled. The researcher can specify the treatment being used and control factors that might create bias in the data.
Confounding in a study occurs when the effects of two or more explanatory variables are not separated. This means that any apparent relationship between explanatory variable and response variable may be due to other variables in or out of the study
A lurking variable is an explanatory variable that was not considered in the study but affects the value of the response variable. Lurking variables are typically related to explanatory variables in the study.
A confounding variable is an explanatory variable that was considered in a study whose effects cannot be distinguished from a second explanatory variable in the study.
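A small simulation can make the lurking-variable idea concrete. Everything below is made up for illustration: a hypothetical study records ice cream sales and a swimming-accident index but not temperature, which drives both. The `pearson` helper is written out by hand, and the coefficients, noise levels, and arbitrary units are assumptions.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(1)  # fixed seed for repeatable output
n = 5000

# Lurking variable: daily temperature, not recorded in the hypothetical study.
temperature = [rng.uniform(0, 35) for _ in range(n)]

# Both study variables depend on temperature plus independent noise.
ice_cream = [2.0 * t + rng.gauss(0, 5) for t in temperature]
accidents = [0.5 * t + rng.gauss(0, 2) for t in temperature]

# Across all days the two variables look strongly related.
overall = pearson(ice_cream, accidents)

# Hold the lurking variable roughly fixed: only days between 20 and 22 degrees.
band = [(i, a) for t, i, a in zip(temperature, ice_cream, accidents)
        if 20 <= t <= 22]
within_band = pearson([i for i, _ in band], [a for _, a in band])

print(f"correlation over all days:       {overall:.2f}")
print(f"correlation at fixed temperature: {within_band:.2f}")
```

The overall correlation is strongly positive even though neither variable affects the other in the simulation; once temperature is held nearly constant, the apparent relationship largely disappears.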