two types of studies

observational and experimental

observational study

observing variables without manipulating them; can show correlations but not cause and effect

experimental study

making observations while deliberately manipulating a variable; investigates cause and effect

controlled experiment

an experiment that changes one variable at a time while all other conditions are kept the same

independent / explanatory variable

variable being manipulated

dependent variable / response variable

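variable that is measured as the outcome; it responds to changes in the independent variable and produces the data needed to support or refute the hypothesis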

constants

aspects of the experiment kept the same

confounding variable

a variable related to both the independent and dependent variables that could be affecting the outcome, so its effect cannot be separated from the effect of the independent variable

experimental units

objects being experimented on; human units are referred to as subjects

four principles of experimental design

comparison, random assignment, controls / constants, replication

comparison

the experiment compares two or more treatments

random assignment

using chance to assign experimental units to treatments
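
A minimal sketch of random assignment, assuming ten hypothetical subjects and two treatment labels ("treatment" and "placebo") that are not from the notes. The helper name randomly_assign is made up for illustration.

```python
import random

def randomly_assign(units, treatments, seed=None):
    """Shuffle the units by chance, then deal them out to the treatments in turn."""
    rng = random.Random(seed)          # seed only makes the example repeatable
    shuffled = list(units)
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

# Hypothetical subjects and treatment names
subjects = [f"subject_{i}" for i in range(1, 11)]
groups = randomly_assign(subjects, ["treatment", "placebo"], seed=1)
```

Dealing the shuffled units out in turn keeps the group sizes as equal as possible while leaving group membership up to chance.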

controls / constants

variables kept the same in an experiment

replication

using enough experimental units in an experiment

placebo effect

any response to a dummy (inactive) treatment

double-blind experiment

neither the researchers nor the subjects know which treatment each subject is receiving

population

entire group we want information about

census

collects data from every individual in the population

sample

subset of individuals from the population from which we actually collect data

convenience sample

choosing easy-to-reach individuals from the population

voluntary response surveys

people decide for themselves whether to join the sample; creates bias because people with strong opinions are the most likely to participate

simple random sampling

every group of n individuals in the population has an equal chance of being selected (hat method)
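
A short sketch of the hat method, using Python's standard random.sample. The population of 50 numbered individuals and the helper name simple_random_sample are assumptions for illustration.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n individuals without replacement; every group of n is equally likely."""
    rng = random.Random(seed)          # seed only makes the example repeatable
    return rng.sample(list(population), n)

# Hypothetical population: individuals numbered 1 to 50
population = range(1, 51)
srs = simple_random_sample(population, 5, seed=7)
```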

stratified random sampling

classify the population into homogeneous groups (strata) and take an SRS from each stratum
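
A rough sketch of stratified random sampling. The student list, the grade-level strata, and the helper name stratified_sample are hypothetical; taking the same number from each stratum is one simple choice, not the only one.

```python
import random
from collections import defaultdict

def stratified_sample(population, stratum_of, n_per_stratum, seed=None):
    """Group individuals into strata, then take an SRS from every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for individual in population:
        strata[stratum_of(individual)].append(individual)
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

# Hypothetical students as (name, grade level) pairs
students = [(f"student_{i}", "freshman" if i % 2 else "senior") for i in range(20)]
by_grade = stratified_sample(students, lambda s: s[1], n_per_stratum=2, seed=3)
```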

cluster sampling

divide the population into heterogeneous groups (clusters), often geographically, and take an SRS of the clusters; every individual in a chosen cluster must be used
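
A minimal sketch of cluster sampling. The three city blocks and their residents are made-up data, and cluster_sample is a hypothetical helper name.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Take an SRS of whole clusters, then keep every individual in the chosen ones."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = [person for name in chosen for person in clusters[name]]
    return chosen, sample

# Hypothetical clusters: residents grouped by city block
blocks = {
    "block_A": ["a1", "a2", "a3"],
    "block_B": ["b1", "b2", "b3"],
    "block_C": ["c1", "c2", "c3"],
}
chosen_blocks, residents = cluster_sample(blocks, n_clusters=2, seed=5)
```

The chance step happens at the cluster level only; no further sampling is done inside a chosen cluster.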

systematic sampling

randomly choose a starting point in the population and select every kth member
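
A small sketch of picking every kth member, assuming the random starting point is chosen among the first k members (a common convention that the notes do not state). The population of 100 numbered individuals is hypothetical.

```python
import random

def systematic_sample(population, k, seed=None):
    """Pick a random start among the first k members, then take every kth member."""
    rng = random.Random(seed)
    start = rng.randrange(k)           # index 0 to k-1
    return population[start::k]

# Hypothetical population: individuals numbered 1 to 100
people = list(range(1, 101))
every_tenth = systematic_sample(people, k=10, seed=2)
```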

undercoverage

occurs when some members of the population cannot be chosen for the sample

nonresponse

when an individual chosen for the sample cannot be reached or refuses to participate

response bias

a systematic pattern of inaccurate responses; for example, the person asking the questions could influence the answers

wording of question

the way a question is worded could affect the answer

qualitative (categorical) data

shown in bar graphs, frequency / relative frequency tables, pie charts

quantitative data

shown in histograms, stem plots, dot plots, box-and-whisker plots

two-way table

two categorical variables organized according to a row and column variable

marginal distribution

the distribution of one variable in a two-way table, found from the row or column totals in the "margins" of the table
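
A sketch of computing marginal distributions from a two-way table. The counts (grade level by a yes/no answer) are invented for illustration, and marginal_distributions is a hypothetical helper name.

```python
def marginal_distributions(table):
    """Return row and column totals as proportions of the grand total."""
    grand_total = sum(sum(cells.values()) for cells in table.values())
    row_marginal = {row: sum(cells.values()) / grand_total
                    for row, cells in table.items()}
    col_totals = {}
    for cells in table.values():
        for col, count in cells.items():
            col_totals[col] = col_totals.get(col, 0) + count
    col_marginal = {col: total / grand_total for col, total in col_totals.items()}
    return row_marginal, col_marginal

# Hypothetical counts: row variable = grade level, column variable = answer
table = {
    "freshman": {"yes": 30, "no": 20},
    "senior":   {"yes": 10, "no": 40},
}
rows, cols = marginal_distributions(table)
# rows: freshman 0.5, senior 0.5; cols: yes 0.4, no 0.6
```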

mean

average of the data (x̄ for the sample mean and μ for the population mean)

median

midpoint of distribution when data is arranged smallest to largest

interquartile range (IQR)

range of the middle half of the data (Q3 - Q1)

five number summary

minimum, Q1, median, Q3, maximum
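
A sketch of the five number summary and IQR. Quartile conventions differ between textbooks and software; this version assumes Q1 and Q3 are the medians of the lower and upper halves, leaving out the overall median when n is odd. The data values are made up.

```python
import statistics

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) using the median-of-halves rule."""
    values = sorted(data)
    n = len(values)
    half = n // 2
    lower = values[:half]
    # Skip the middle value when n is odd so it belongs to neither half
    upper = values[half + 1:] if n % 2 else values[half:]
    return (values[0], statistics.median(lower), statistics.median(values),
            statistics.median(upper), values[-1])

data = [1, 3, 4, 6, 7, 8, 10, 12, 15]   # hypothetical data
minimum, q1, median, q3, maximum = five_number_summary(data)
iqr = q3 - q1
# summary is (1, 3.5, 7, 11.0, 15), so IQR = 7.5
```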

two ways to measure spread

IQR, based on the quartiles (paired with the median); standard deviation, based on the mean

standard deviation

measures the typical distance of observations from the mean; calculated by averaging the squared distances from the mean and taking the square root (s for a sample, which divides by n - 1, and σ for a population, which divides by N)
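
A worked sketch of the standard deviation calculation on made-up data. The sample version divides by n - 1 and the population version by n; the helper names are hypothetical, and the results are checked against Python's statistics.stdev and statistics.pstdev.

```python
import math
import statistics

def sample_sd(data):
    """s: square root of the sum of squared deviations divided by n - 1."""
    mean = sum(data) / len(data)
    squared = sum((x - mean) ** 2 for x in data)
    return math.sqrt(squared / (len(data) - 1))

def population_sd(data):
    """sigma: square root of the sum of squared deviations divided by n."""
    mean = sum(data) / len(data)
    squared = sum((x - mean) ** 2 for x in data)
    return math.sqrt(squared / len(data))

data = [2, 4, 4, 4, 5, 5, 7, 9]         # hypothetical data, mean = 5
s = sample_sd(data)                     # sqrt(32 / 7), about 2.14
sigma = population_sd(data)             # sqrt(32 / 8) = 2.0
```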

skewed right

when data values are concentrated on the left and fewer values are on the right; mean is greater than median (dinosaur tail points right)

skewed left

when data values are concentrated on the right and fewer values are on the left; mean is less than median (dinosaur tail points left)

symmetric

when data values are balanced around the center; mean and median are approximately the same
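
A rough sketch of the mean versus median comparison from the three cards above. This is only a rule of thumb, not a formal test of skewness; the data sets and the helper name describe_shape are made up, and real data would usually need a tolerance before being called symmetric.

```python
import statistics

def describe_shape(data):
    """Guess the shape of a distribution by comparing its mean and median."""
    mean = statistics.mean(data)
    median = statistics.median(data)
    if mean > median:
        return "skewed right"
    if mean < median:
        return "skewed left"
    return "symmetric"

right_tail = [1, 2, 2, 3, 3, 3, 4, 20]         # mean 4.75, median 3
left_tail = [1, 17, 18, 18, 19, 19, 19, 20]    # mean 16.375, median 18.5
balanced = [1, 2, 3, 4, 5]                     # mean 3, median 3

shapes = [describe_shape(d) for d in (right_tail, left_tail, balanced)]
```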

