Analytic plan by study approach
Studies w/ no comparison group (case series + cross-sectional surveys): simple statistics like counts (frequencies), proportions and averages are sufficient
Studies comparing 2+ populations (case-control, cohort, experimental): description must be completed first, THEN comparative statistics are calculated
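A minimal sketch of the "simple statistics" step for a study w/ no comparison group, using only the Python standard library. The `responses` list is made-up illustration data, not from any real survey.

```python
from collections import Counter

# Hypothetical survey answers (illustration only).
responses = ["yes", "no", "yes", "yes", "unsure", "no", "yes", "no"]

# Counts (frequencies) per answer.
counts = Counter(responses)

# Proportions: each count divided by the total number of responses.
total = len(responses)
proportions = {answer: n / total for answer, n in counts.items()}

for answer, n in counts.most_common():
    print(f"{answer}: count={n}, proportion={proportions[answer]:.3f}")
```

For a comparative study, the same description would be run for each group first, before any comparative statistic is calculated.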
Variable definition
A characteristic recorded for subjects in a study that can take more than one value: not constant or following a fixed pattern, liable to change.
Independent: 1 thing you change in an experiment
Dependent: change that happens because of independent variable
Controlled: everything that remains constant/unchanged
Quantitative vs Qualitative
Qualitative: measures or observations of qualities, types or characteristics often not numbers but in text form.
Quantitative: values/counts expressed as numbers and can be compared on numeric scale.
Categorical data
Nominal and ordinal.
Variable that represents characteristics (e.g. gender, language, ethnicity) and can be represented in a dataset by numbers
Categorical: a variable is categorical if each observation belongs to 1 of a set of categories
Nominal: Unordered data representing discrete units + used to label a variable w/ no quantitative value, which also means that it has no natural value/rank (e.g., if you change order of data, meaning of data doesn’t change)
Ordinal: discrete units or categories w/ natural order/ranking to it (e.g. elementary > middle school > high school).
Numerical data
Numbers w/ real mathematical meaning
Discrete: data that can only take on certain separate values, counted in whole units (e.g. how many people did you text? cannot respond with 1.5 people)
Continuous: opposite of discrete; can take any value w/in a range and is measured w/ some kind of instrument rather than counted (e.g., how heavy am I? requires scale to record data)
This kind of data can also be described in interval and ratio data:
Interval: values ordered on a scale w/ the difference b/w any 2 values having meaning, but no true zero (e.g., 0°F does not mean that there is no temperature or that the scale stops).
Ratio: holds all properties of an interval variable (numbers ordered on a scale w/ meaningful differences) and also has a TRUE ZERO (e.g., if variable is 0, there is none of that variable).
Quantitative vs Categorical
Key features of quantitative = center, aka. central tendency (mean, median, mode) + spread (variability), as descriptive statistics are often used to describe the average value of a variable in a population.
Mean = sum of all values divided by number of values (least robust)
Median = value in middle when all values are arranged in ascending order; w/ an even number of values, take the average of the 2 middle values. In b/w mean and mode in robustness.
Mode = value appearing most often (most robust)
Key features of categorical = % of observations in each category
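A short sketch of the three measures of center, using Python's standard `statistics` module. The `ages` values and the added outlier are made up for illustration; the point is that one extreme value moves the mean a lot while the median and mode barely move.

```python
import statistics

# Hypothetical sample of ages (illustration only).
ages = [21, 22, 22, 23, 24, 25]
ages_with_outlier = ages + [90]  # same sample plus one extreme value


def center(values):
    """Return mean, median and mode of a list of numbers."""
    return {
        "mean": statistics.mean(values),
        # For an even count, median() averages the 2 middle values.
        "median": statistics.median(values),
        "mode": statistics.mode(values),
    }


base = center(ages)
shifted = center(ages_with_outlier)

print("without outlier:", base)
print("with outlier:   ", shifted)
```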
Measures of spread
How far data w/in a set are spread out/scattered about the center (aka variability, dispersion, scatter). Further distance of data values from center = greater spread, opposite being “small spread”. Used to describe variability and distribution of data:
minimum/maximum
range
quartiles
deciles
IQR
Can be shown in following graphs:
histogram
boxplot
bar chart
pie chart
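The spread measures listed above can be sketched w/ the standard `statistics` module. The `scores` list is made-up data, and the "inclusive" quartile method is just one convention chosen here as an assumption; other software may use a different rule and report slightly different quartiles.

```python
import statistics

# Hypothetical test scores (illustration only).
scores = [2, 4, 4, 5, 7, 9, 10, 12]

minimum, maximum = min(scores), max(scores)
value_range = maximum - minimum

# Quartiles: 3 cut points splitting the sorted data into 4 parts.
q1, q2, q3 = statistics.quantiles(scores, n=4, method="inclusive")

# IQR: width of the middle 50% of the data.
iqr = q3 - q1

# Deciles: 9 cut points splitting the sorted data into 10 parts.
deciles = statistics.quantiles(scores, n=10, method="inclusive")

print(f"min={minimum}, max={maximum}, range={value_range}")
print(f"Q1={q1}, median={q2}, Q3={q3}, IQR={iqr}")
```

A boxplot draws exactly these pieces: the box spans Q1 to Q3 w/ a line at the median.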
Normal vs skewed curves, variance and standard deviation
Normal = Histogram showing normal distribution (aka Gaussian distribution) or approx normal distribution of responses/data will have bell shaped curve w/ 1 peak in middle.
Skewed = tail extends farther from peak on either the left or right side of histogram
Three ways of quantifying narrowness or wideness of distribution
variance
standard deviation
Describes narrowness or wideness of the range of responses for variables w/ relatively normal distribution. Z-scores indicate how many SDs away from the sample mean an individual’s response is (e.g., age exactly @ mean age = z-score of 0; 1 SD above mean = 1; 2 SDs below mean = -2).
standard error
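A rough sketch of variance, standard deviation, z-scores and standard error using the standard library. The `ages` list is made up, the sample (n - 1) formulas are assumed, and standard error is taken here as the standard error of the mean (SD divided by the square root of n).

```python
import math
import statistics

# Hypothetical sample of ages (illustration only).
ages = [20, 22, 24, 26, 28]

mean_age = statistics.mean(ages)

# Sample variance: sum of squared deviations from the mean / (n - 1).
variance = statistics.variance(ages)

# Standard deviation: square root of the variance.
sd = statistics.stdev(ages)

# Standard error of the mean: SD / sqrt(n).
se = sd / math.sqrt(len(ages))


def z_score(value):
    """Number of SDs that value lies above (+) or below (-) the sample mean."""
    return (value - mean_age) / sd


print(f"mean={mean_age}, variance={variance}, sd={sd:.3f}, se={se:.3f}")
print(f"z for age 24: {z_score(24):.2f}")
print(f"z for age 30: {z_score(30):.2f}")
```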
Reporting descriptive stats
For ratio + interval variables w/ normal distribution = both mean and SD are typically reported
Ordinal variables = median and IQR (interquartile range) are often reported
categorical variables = proportions of participants w/ responses used to describe populations
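One way to picture the reporting rules is a small helper that picks the summary by variable type. `describe` and its `level` labels are hypothetical names invented for this sketch. It assumes ordinal categories are already coded as ranked numbers, and it does not check whether the data are actually normally distributed.

```python
import statistics
from collections import Counter


def describe(values, level):
    """Return the summary typically reported for a given variable type."""
    if level in ("interval", "ratio"):
        # Assumes roughly normal data: report mean and SD.
        return {"mean": statistics.mean(values), "sd": statistics.stdev(values)}
    if level == "ordinal":
        # Assumes categories are coded as ranked numbers: report median and IQR.
        q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
        return {"median": statistics.median(values), "iqr": q3 - q1}
    if level == "nominal":
        # Report the proportion of participants giving each response.
        total = len(values)
        return {category: n / total for category, n in Counter(values).items()}
    raise ValueError(f"unknown level: {level}")


# Made-up example data (illustration only).
likert = [1, 2, 2, 3, 3, 3, 4, 5, 5]
blood_type = ["A", "B", "A", "A"]

ordinal_summary = describe(likert, "ordinal")
nominal_summary = describe(blood_type, "nominal")
print(ordinal_summary)
print(nominal_summary)
```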