Data
Collections of observations (measurements, survey responses, etc)
Statistics
The science of planning studies, obtaining data, and then analyzing and interpreting those data to draw conclusions
Population
Complete collection of all measurements/data being considered (we make inferences about this)
Census
the collection of data from every member of the population
sample
a subset of the population
statistical significance
A statement of how unlikely an obtained result would be if chance alone were at work; a statistically significant result is one that probably did NOT occur by chance
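As a rough illustration of "how likely by chance", the sketch below computes the probability of seeing at least a given number of heads in 100 fair coin flips. The coin-flip scenario, the helper name `prob_at_least`, and the counts 55 and 65 are assumptions made for this example, not part of the card.

```python
from math import comb

def prob_at_least(k: int, n: int, p: float = 0.5) -> float:
    """P(X >= k) for X ~ Binomial(n, p): the chance of k or more successes."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Hypothetical example: number of heads in 100 fair coin flips.
likely_by_chance = prob_at_least(55, 100)    # fairly large: chance explains it easily
unlikely_by_chance = prob_at_least(65, 100)  # very small: hard to attribute to chance
```

A small probability suggests the result is statistically significant; where to draw the cutoff is a separate decision.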
practical significance
Real-world importance: whether an effect is large enough to matter in practice (e.g., 52/100 is too close to an even split to be practically significant)
Correlation does not equal
causation
voluntary response sample
A sample in which the respondents themselves decide whether to participate (flawed, because those who opt in may not represent the population)
major pitfalls in data analysis
misleading conclusions, self-reported data that may be untruthful, loaded questions, order of questions, nonresponse, misleading percentages
parameter
numerical measurement describing a characteristic of a population
statistic
a numerical measurement describing some characteristic of a sample
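The difference between a parameter and a statistic can be sketched with a made-up population. The ages below are randomly generated for illustration only, and the variable names are hypothetical.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: ages of 1,000 people.
population = [random.randint(18, 90) for _ in range(1000)]
sample = random.sample(population, 50)  # a subset of the population

parameter = mean(population)  # describes the whole population
statistic = mean(sample)      # describes only the sample; used to estimate the parameter
```

In practice the parameter is usually unknown, which is why the statistic is computed in the first place.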
big data
a broad term for datasets so large or complex that traditional data processing applications are inadequate.
Data Science
An interdisciplinary field involving the design and use of techniques to process very large amounts of data from a variety of sources and to provide knowledge based on the data.
missing data
1) at random (the missingness is unrelated to the values themselves)
2) not at random (there's a cause related to the missing values)
correcting for missing data
1) delete cases
2) impute missing values
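Both corrections can be sketched on a small made-up list. The ages are hypothetical, and filling with the mean is just one simple imputation choice assumed here.

```python
from statistics import mean

# Hypothetical survey ages; None marks a missing value.
ages = [23, None, 31, 45, None, 38]

# 1) Delete cases: drop every record with a missing value.
complete_cases = [a for a in ages if a is not None]

# 2) Impute: substitute the mean of the observed values for each missing one.
fill_value = mean(complete_cases)
imputed = [a if a is not None else fill_value for a in ages]
```

Deleting cases shrinks the sample, while imputing keeps its size but introduces values that were never observed.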
quantitative data
Numbers representing counts or measurements; units of measurement matter (e.g., age in years)
categorical data
Data that consists of names, labels, or other nonnumerical values (gender, religion)
discrete data
Numerical data values that can be COUNTED
continuous data
Numerical data values that are MEASURED; they can take any value on a continuous scale, so there are infinitely many possible values with no gaps (e.g., volume of a liquid)
ratio level of measurement (quantitative)
A measurement of a variable in which the numbers indicating a variable's values represent fixed measuring units and an absolute zero point (height, length, distance)
interval level of measurement
A measurement of a variable in which the numbers indicating a variable's values represent fixed measurement units but have no absolute, or fixed, zero point (body temperature in °F or °C, calendar years)
ordinal level of measurement
classifies data into categories that can be ranked; however, the differences between ranks cannot be measured precisely or are meaningless (letter grades, satisfaction ratings)
nominal level of measurement
characterized by data that consist of names, labels, or categories only, and the data cannot be arranged in an ordering scheme (such as low to high)
Gold Standard
Random assignment of subjects to a placebo group and a treatment group; called the gold standard because it is so effective
good design for experiments
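1) replication - repeat the experiment on enough subjects (a large enough sample) to see the effect of the treatment
2) blinding - subject doesn't know whether they are receiving the placebo or the real treatment
3) randomization - subjects are assigned to groups by random selection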
simple random sample
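a sample of n subjects chosen so that every possible sample of the same size n has the same chance of being selected (so every member of the population has an equal chance of selection)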
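A minimal sketch of drawing a simple random sample with Python's standard library. The population of 100 numbered members and the sample size of 10 are assumptions for illustration.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable

population = list(range(1, 101))  # hypothetical member IDs 1..100

# random.sample draws without replacement; every group of 10 is equally likely.
srs = random.sample(population, k=10)
```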
systematic sample
obtained by choosing a starting point and then selecting every kth individual from the population (e.g., every 50th)
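Selecting every kth individual can be sketched with list slicing. The helper name `systematic_sample` is hypothetical, and choosing the starting point at random among the first k members is a common convention assumed here rather than something the card states.

```python
import random

def systematic_sample(population, k, rng=random):
    """Pick a random start among the first k members, then take every kth member."""
    start = rng.randrange(k)
    return population[start::k]

population = list(range(1, 101))  # hypothetical member IDs 1..100
sample = systematic_sample(population, k=10)
```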
convenience sample
only members of the population who are easily accessible are selected
stratified sample
a sample drawn in such a way that known subgroups within a population are represented in proportion to their numbers in the general population
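A proportional stratified sample can be sketched by drawing the same fraction at random from each subgroup. The helper `stratified_sample`, the two class-year strata, and the 10% fraction are all assumptions for illustration.

```python
import random

def stratified_sample(strata, fraction, rng=random):
    """Draw the same fraction at random from every subgroup (stratum)."""
    sample = []
    for members in strata.values():
        n = round(len(members) * fraction)  # keeps subgroup proportions intact
        sample.extend(rng.sample(members, n))
    return sample

# Hypothetical population split into two known subgroups.
strata = {
    "freshman": [f"F{i}" for i in range(60)],
    "senior": [f"S{i}" for i in range(40)],
}
sample = stratified_sample(strata, fraction=0.10)
```

With 60 freshmen and 40 seniors, a 10% fraction yields 6 freshmen and 4 seniors, matching their shares of the population.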
cluster sample
obtained by dividing the population into groups (clusters), randomly selecting some of those clusters, and including every individual in the selected clusters
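Cluster sampling can be sketched as picking whole groups at random and keeping everyone in them. The helper `cluster_sample` and the eight classrooms of five students are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, rng=random):
    """Randomly choose whole clusters, then include every member of each one."""
    chosen = rng.sample(list(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

# Hypothetical population: 8 classrooms with 5 students each.
clusters = {
    f"classroom_{i}": [f"c{i}_s{j}" for j in range(5)] for i in range(8)
}
sample = cluster_sample(clusters, n_clusters=2)
```

Unlike a stratified sample, which takes some members from every group, this takes all members from only some groups.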