The Statistical Investigation Process
Identify a problem, Pose a statistical question, Collect or obtain data, Analyse Data, Communicate Results
Univariate Data
Data which involves one variable
Bivariate Data
Data which involves two variables. Bivariate data analysis looks at whether there is a relationship between two variables.
Association
A general term used to describe the relationship between two (or more) variables.
Correlation
Often used interchangeably with the term association, but usually reserved for describing the strength of a linear relationship between two numerical variables.
Explanatory Variable (EV)
Variable used to explain or predict a difference in the RV.
Response Variable (RV)
What happens in response to the EV.
Example of EV and RV
When investigating the relationship between the temperature of a loaf of bread and the time it has spent in the oven, the temperature is the response variable and time is the explanatory variable.
Displaying Categorical Bivariate Data
Two-way table, side-by-side column graph, segmented column graph
Displaying Numerical Bivariate Data
Scattergraphs
Does the EV go on the x-axis or the y-axis?
x-axis
Does the RV go on the x-axis or the y-axis?
y-axis
When do you use a column percentage table?
If the explanatory variable is at the top of the frequency table.
When do you use a row percentage table?
If the explanatory variable is on the side of the frequency table.
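As a rough illustration of a column percentage table, the sketch below converts counts to percentages within each EV column. The table, its categories and the name `column_percentages` are all hypothetical, made up for this example.

```python
# Hypothetical two-way table: the EV (year level) is across the top,
# so each EV category is a column; the RV (preferred sport) is down the side.
counts = {
    "Year 11": {"Soccer": 30, "Netball": 20},
    "Year 12": {"Soccer": 10, "Netball": 40},
}


def column_percentages(table):
    """Express each count as a percentage of its EV column total."""
    result = {}
    for ev_category, column in table.items():
        total = sum(column.values())
        result[ev_category] = {rv: 100 * n / total for rv, n in column.items()}
    return result


percentages = column_percentages(counts)
for ev_category, column in percentages.items():
    print(ev_category, column)
```

Each column adds to 100%, so the EV categories can be compared even when their totals differ. A row percentage table works the same way with the EV categories as rows.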
Words to use when commenting on trend
In general, tend to, seems to be
Rule for line of best fit
A straight line that passes through or close to as many points as possible, leaving roughly the same number of points on either side of the line.
Reliability of prediction can be affected by…
The form of the scattergraph, the number of points on the scattergraph, how closely the points follow a straight line, and whether the prediction is made between or beyond the given EV values.
Interpolation
Predicting within the range of the given EV values (generally reliable)
Extrapolation
Predicting beyond the range of the given EV values (generally unreliable)
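A minimal sketch of how a prediction could be labelled, assuming the only test is whether the EV value falls inside the range of the observed EV values. The function name `prediction_type` and the oven times are hypothetical.

```python
def prediction_type(ev_values, x):
    """Label a prediction at EV value x as interpolation or extrapolation."""
    if min(ev_values) <= x <= max(ev_values):
        return "interpolation"
    return "extrapolation"


# Hypothetical observed oven times in minutes (the EV).
times = [5, 10, 15, 20, 25]

print(prediction_type(times, 12))  # 12 lies between 5 and 25
print(prediction_type(times, 40))  # 40 lies beyond 25
```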
When interpreting a graph comment on
Form, direction, strength of association
Form
Linear, non-linear, no relationship/random
Direction (if linear)
Positive, negative
Positive direction
As the explanatory variable increases, the response variable increases.
Negative direction
As the explanatory variable increases, the response variable decreases.
Strength
Strong relationship, moderate relationship, weak relationship
Equation of least squares regression line
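ŷ = ax + b, where ŷ is the predicted value of the RV, x is the value of the EV, a is the gradient and b is the y-intercept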
Interpreting the gradient (a)
On average, the RV increases (if a is positive) or decreases (if a is negative) by a units for every one-unit increase in the EV.
Interpreting the y-intercept (b)
The predicted RV is b units when the EV is zero (not always meaningful in context).
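The sketch below shows one way to find a and b by hand, using the standard least squares formulas (a is the sum of the products of the deviations divided by the sum of the squared x deviations, and b is the mean of y minus a times the mean of x). The data and the name `least_squares_line` are hypothetical; in practice a calculator or statistics package would normally do this.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least squares regression line y-hat = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared x deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx            # gradient
    b = mean_y - a * mean_x  # y-intercept
    return a, b


# Hypothetical data: xs holds the EV values, ys holds the RV values.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = least_squares_line(xs, ys)
print(f"y-hat = {a:.1f}x + {b:.1f}")

# Prediction at x = 3, which lies inside the observed EV range.
print(f"Predicted RV at x = 3: {a * 3 + b:.1f}")
```

For this made-up data the gradient is about 0.6, so the RV increases by about 0.6 units on average for every one-unit increase in the EV.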
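Weak range of Pearson’s correlation coefficient (r)
0 to 0.35 (ignoring the sign of r)
Moderate range of Pearson’s correlation coefficient (r)
0.35 to 0.75 (ignoring the sign of r)
Strong range of Pearson’s correlation coefficient (r)
0.75 to 1 (ignoring the sign of r)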
Commenting on r
r = xxxx, which suggests a strong/moderate/weak positive/negative correlation. As the EV increases, the RV tends to increase/decrease.
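A minimal sketch of calculating r and turning it into a comment, using the weak, moderate and strong ranges from the cards above. How the boundary values 0.35 and 0.75 are assigned is an assumption made here, since the ranges overlap at those points. The function names and data are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for paired numerical data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def describe_r(r):
    """Comment on strength and direction using the ranges from the cards."""
    if r == 0:
        return "r = 0.00, which suggests no linear correlation."
    size = abs(r)  # strength ignores the sign of r
    if size < 0.35:
        strength = "weak"
    elif size < 0.75:  # assumption: exactly 0.35 counts as moderate
        strength = "moderate"
    else:              # assumption: exactly 0.75 counts as strong
        strength = "strong"
    direction = "positive" if r > 0 else "negative"
    return f"r = {r:.2f}, which suggests a {strength} {direction} correlation."


# Hypothetical data: xs holds the EV values, ys holds the RV values.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

print(describe_r(pearson_r(xs, ys)))
```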
Commenting on causation
Although there is a strong/moderate/weak positive/negative correlation between the variables, correlation does not imply causation. The association might be a coincidence or might be due to a lurking variable such as…
Commenting on coefficient of determination
xx% of the variation in the RV can be explained by the variation in the EV.
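As a short sketch, the coefficient of determination for a linear association is the square of Pearson's r, written as a percentage. The function name `comment_on_r_squared` and the value r = 0.8 are hypothetical.

```python
def comment_on_r_squared(r):
    """Turn Pearson's r into a comment on the coefficient of determination."""
    percent = 100 * r ** 2  # r squared, expressed as a percentage
    return (f"{percent:.0f}% of the variation in the RV can be explained "
            f"by the variation in the EV.")


# Hypothetical value of r.
statement = comment_on_r_squared(0.8)
print(statement)
```

Because r is squared, a negative r gives the same percentage as a positive r of the same size.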