Statistical investigation process
Ask a statistical question
Collect data
Analyze data
Draw conclusions
Anecdotal evidence
Using personal observations, hearsay, or unusual examples to draw conclusions
Population
The set of individuals or things we wish to know about
Parameter
A numeric characteristic of the population
Data
Facts or pieces of information that can be used to gain knowledge or make decisions
Variable
A characteristic that takes on different values across individuals or objects
Census
Data are collected from EVERY member of the population
Quantitative
Numerical variable
Qualitative
Categorical variable
Distribution
The pattern of a variable's outcomes: which values it takes and how often it takes them
Exploratory data analysis
The process of using summaries of the data to identify patterns, trends, and unexpected outcomes. Also known as “data summary”
Describing center
The most common summaries are the mean and median
Describing ‘spread’ (variability)
The most common summaries are the variance and standard deviation
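A minimal sketch of the center and spread summaries, using Python's standard `statistics` module. The `scores` list is made-up data for illustration only; note that `variance` and `stdev` compute the sample versions (dividing by n - 1).

```python
import statistics

# Hypothetical sample (made up for illustration).
scores = [2, 4, 4, 6, 9]

# Center
mean = statistics.mean(scores)      # sum / count = 25 / 5
median = statistics.median(scores)  # middle value of the sorted data

# Spread (sample versions, which divide by n - 1)
sample_var = statistics.variance(scores)
sample_sd = statistics.stdev(scores)  # square root of the sample variance

print(mean, median, sample_var, sample_sd)
```

For population versions (dividing by n), the same module offers `pvariance` and `pstdev`.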
Percentiles
Values that split the data by percentage: p% of the data fall at or below the pth percentile. The median is the 50th percentile
Quantiles
Percentiles expressed as decimals (for example, the 0.25 quantile is the 25th percentile)
Quartiles
The 25th, 50th, and 75th percentiles (Q1, Q2, Q3), which split the data into four equal parts
Interquartile range (IQR)
Q3 - Q1, the range of the middle 50% of the data
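A short sketch of quartiles and the IQR with the standard library. The data are made up, and the `method="inclusive"` setting is an assumption: textbooks and software use several slightly different quartile conventions, so other tools may report different values for the same data.

```python
import statistics

# Hypothetical data (made up for illustration).
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# n=4 asks for the three cut points that split the data into quarters.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

iqr = q3 - q1  # spread of the middle 50% of the data

print(q1, q2, q3, iqr)
```

Here Q2 matches the median, consistent with the median being the 50th percentile.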
Observational study
Observe subjects and measure variables of interest but do not intervene. Cannot show that a change in one variable causes a change in the other.
Association
Does NOT imply causation
Confounding variable
Associated with both the explanatory and response variables and can explain the relationship between them
Positive relationship
As one variable increases, so does the other
Negative relationship
As one variable increases, the other decreases
Strong relationship
Given a value of one variable, it is possible to make a good estimate of the value of the other
Weak relationship
Given a value of one variable, we still can’t say much about the value of the other
Correlation
A number that describes the strength and direction of a linear relationship
Correlation coefficient
Denoted “r”. -1 ≤ r ≤ 1.
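As an illustration, r can be computed directly from deviations about the means. This is a sketch of the usual Pearson formula; the function name `pearson_r` and both data sets are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data: a moderately strong positive relationship.
r = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])

# Made-up data lying exactly on a downward-sloping line.
r_perfect_negative = pearson_r([1, 2, 3], [6, 4, 2])

print(r, r_perfect_negative)
```

The sign of r gives the direction of the relationship and its magnitude gives the strength; points falling exactly on a line give r equal to 1 or -1.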
Linear regression
A line that summarizes bivariate data. The line can be used to estimate values of the dependent (response) variable from values of the independent (explanatory) variable.
Residual
The vertical distance between an observed (actual) data point and the value predicted by a statistical model: observed value minus predicted value.
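A minimal sketch tying the regression line and residuals together. It assumes the line is fit by least squares (the most common choice, though the cards do not name a method), and the data are made up for illustration.

```python
# Hypothetical bivariate data (made up for illustration).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Least-squares slope and intercept (assumed fitting method).
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Predicted value for each x, then residual = observed - predicted.
predicted = [intercept + slope * x for x in xs]
residuals = [y - y_hat for y, y_hat in zip(ys, predicted)]

print(slope, intercept)
print(residuals)
```

A positive residual means the point lies above the line and a negative residual means it lies below. For a least-squares line with an intercept, the residuals sum to zero up to rounding.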
Population mean symbol
μ
Population proportion symbol
p
Population standard deviation symbol
σ
Population variance symbol
σ²
Sample proportion symbol
p̂ (“p hat”)
Sample standard deviation symbol
s
Sample variance symbol
s²
Sample mean symbol
x̄ (“x bar”)