Biostatistics, Chapters I & II
Sampling
Population: complete collection of all measurements or data that are being considered.
Sample: sub-collection of members selected from a population
Simple Random Sample: each member of the population has the same chance of being included, and samples are chosen independently
Cluster Sampling: divide the population into groups (clusters) by a category, then randomly select a group. All of the individuals within the selected group are the sample.
Stratified Random Sampling: divide the population into groups (strata) based on one or more classification criteria. Then perform a simple random sample within each stratum
Sampling Bias: some members of the population have a higher chance of being selected than others.
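The three sampling schemes above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function names, the `seed` parameter, and the dictionary layout (group label mapped to a list of members) are assumptions made for the example, and the simple random sample here is drawn without replacement.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n members so that every member has the same chance of inclusion."""
    rng = random.Random(seed)
    # Assumption: sampling without replacement (no member is picked twice).
    return rng.sample(population, n)


def cluster_sample(clusters, n_clusters=1, seed=None):
    """Pick whole clusters at random; everyone in a chosen cluster is sampled.

    clusters: dict mapping a cluster label to the list of its members.
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for label in chosen for member in clusters[label]]


def stratified_sample(strata, n_per_stratum, seed=None):
    """Take a simple random sample of fixed size inside every stratum.

    strata: dict mapping a stratum label to the list of its members.
    """
    rng = random.Random(seed)
    sample = []
    for label in sorted(strata):
        sample.extend(rng.sample(strata[label], n_per_stratum))
    return sample


# Hypothetical patient IDs grouped by clinic, for illustration only.
by_clinic = {"A": [1, 2, 3], "B": [4, 5], "C": [6, 7, 8, 9]}
one_clinic = cluster_sample(by_clinic, n_clusters=1, seed=0)
two_per_clinic = stratified_sample(by_clinic, n_per_stratum=2, seed=0)
```

Note the difference in what is randomized: cluster sampling randomizes which group is taken, while stratified sampling takes every group and randomizes the members within it.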
Variables
Categorical Variables: two+ categories, but no intrinsic ordering (ex: blood type)
Ordinal Variable: a categorical variable with a clear ordering (ex: small/medium/large)
Numeric Variables
Discrete Variables: a numeric variable for which we can list the possible values (think: integers)
Continuous Variable: a numeric variable that is measured on a continuous scale (temperature, height)
Bar Charts: frequency distribution for categorical variables
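Histograms: frequency distribution for numeric variables; values are grouped into adjacent intervals (bins), so there are no spaces between the bars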
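The counts behind a bar chart and a histogram can be sketched as below. The helper names, the equal-width bins, and the `start` parameter are assumptions made for illustration; the notes do not prescribe a binning rule.

```python
from collections import Counter


def category_counts(values):
    """Frequency of each category, as plotted in a bar chart."""
    return Counter(values)


def histogram_counts(values, bin_width, start=0):
    """Frequency of values in adjacent bins [left, left + bin_width).

    Returns a dict mapping each bin's left edge to its count.
    Assumption: equal-width bins beginning at `start`.
    """
    counts = Counter()
    for v in values:
        k = int((v - start) // bin_width)  # index of the bin containing v
        counts[start + k * bin_width] += 1
    return dict(sorted(counts.items()))


# Hypothetical data, for illustration only.
blood_types = ["A", "O", "O", "B", "AB", "O"]
ages = [1, 2, 11, 15, 25]
type_freq = category_counts(blood_types)
age_freq = histogram_counts(ages, bin_width=10)
```

Because each bin starts where the previous one ends, the histogram bars touch, whereas bar chart categories are separate and can be drawn with gaps.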
Frequency Variables
Box Plots
Quartiles
Fences
Drawing a Box Plot
Central box from Q1 to Q3
Line in the middle is Q2
Whiskers extend to the data points CLOSEST to the lower fence (LF) & upper fence (UF) that still lie inside the fences (not to the actual values of the fences)
Outliers (points outside the fences) are marked by small circles
Label y axis
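The numbers needed to draw a box plot by the steps above can be computed as in this sketch. Two things here are assumptions, since the notes do not define them: the fences are placed 1.5 × IQR beyond Q1 and Q3 (a common convention), and the quartiles use the "inclusive" method of Python's `statistics.quantiles`; a textbook may use a different quartile rule and get slightly different values. The function name and the returned keys are hypothetical.

```python
import statistics


def box_plot_stats(data):
    """Quartiles, fences, whisker ends, and outliers for one box plot."""
    # Assumption: "inclusive" quartile method; other textbooks may differ.
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumption: fences at 1.5 * IQR beyond the box (common convention).
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [x for x in data if lower_fence <= x <= upper_fence]
    return {
        "q1": q1,
        "q2": q2,
        "q3": q3,
        "lower_fence": lower_fence,
        "upper_fence": upper_fence,
        # Whiskers stop at actual data points, not at the fence values.
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": sorted(x for x in data if x < lower_fence or x > upper_fence),
    }


# Hypothetical measurements, for illustration only.
stats = box_plot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 50])
```

In this example the value 50 lies beyond the upper fence, so it would be drawn as a small circle, and the upper whisker stops at 9, the largest observation still inside the fences.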
Variance
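A minimal sketch of the sample variance, assuming the usual definition with an n − 1 denominator (the notes give no formula here, so this choice is an assumption). The function name is hypothetical; the result is compared against the standard library's `statistics.variance`, which uses the same definition.

```python
import statistics


def sample_variance(data):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)


# Hypothetical measurements, for illustration only.
values = [2, 4, 4, 4, 5, 5, 7, 9]
by_hand = sample_variance(values)
by_library = statistics.variance(values)
```

For these values the mean is 5 and the squared deviations sum to 32, so the sample variance is 32 / 7.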