Vocabulary flashcards summarising the principal terms and definitions related to Unit 5 (Correlation) of the Basics of Statistics & Mathematics course.
Correlation analysis
A statistical technique used to measure the strength and direction of the relationship between two quantitative variables.
Coefficient of correlation (r)
A unit-less numerical value (−1 to +1) that quantifies the degree and direction of linear association between two variables.
Strength (of correlation)
Indicates how closely plotted data points cluster around a straight reference line; the nearer to ±1, the stronger the linear relationship.
Direction (of correlation)
Shows whether variables move together (positive) or in opposite directions (negative) when one variable changes.
Positive correlation
An association in which the values of two variables move in the same direction—both increase or both decrease.
Negative correlation
An association in which the values of two variables move in opposite directions—one increases while the other decreases.
Linear correlation
A relationship in which paired values change at a constant or proportional rate, forming a straight-line trend on a graph.
Non-linear (curvilinear) correlation
A relationship where paired values do not change proportionally; plotted points follow a curved pattern rather than a straight line.
Simple correlation
Study of the association between exactly two variables.
Partial correlation
Measurement of the association between two variables while holding the effects of additional influencing variables constant.
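As an illustration of holding a third variable constant, the first-order partial correlation can be computed from the three simple correlations. This is a minimal sketch: the function name `partial_r` and the input values are hypothetical, and it covers only the case of a single controlled variable z.

```python
from math import sqrt

def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical simple correlations, for illustration only
r_xy_given_z = partial_r(r_xy=0.8, r_xz=0.6, r_yz=0.5)
print(round(r_xy_given_z, 4))
```

Here the association between x and y shrinks from 0.8 to about 0.72 once the shared influence of z is removed.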
Multiple correlation
Study of the association between one variable and a set of two or more other variables considered simultaneously.
Scatter diagram
A graph that plots paired values of two variables (x, y) to provide a visual impression of their relationship.
Covariance
A statistic measuring how two variables deviate together from their respective means; basis for Pearson’s r.
Karl Pearson’s correlation coefficient
A quantitative measure of linear association, computed as covariance divided by the product of the standard deviations of x and y.
Coefficient of determination (r²)
The proportion of variance in the dependent variable that is explained by the independent variable; ranges from 0 to 1.
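The covariance, Pearson's r and r² cards can be tied together with a short calculation. This is a sketch using population (divide-by-n) formulas; the helper names and the paired data are hypothetical and chosen only for illustration.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def covariance(x, y):
    """Average product of paired deviations from the two means (divides by n)."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)

def std_dev(values):
    """Population standard deviation (divides by n)."""
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / len(values))

def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    return covariance(x, y) / (std_dev(x) * std_dev(y))

# Hypothetical paired observations
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]

r = pearson_r(hours, score)
r_squared = r ** 2  # coefficient of determination
print(round(r, 4), round(r_squared, 4))
```

For this data r is about 0.77, so roughly 60% of the variance in `score` is accounted for by its linear relationship with `hours`.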
Standard error of correlation (SEr)
An estimate of the sampling variability of r, given for large samples by SEr = (1 − r²)/√n.
Probable error of correlation (PEr)
A measure calculated as 0.6745 × SEr; r ± PEr gives limits within which the population correlation is expected to lie, and r is conventionally judged significant when it exceeds 6 × PEr.
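The standard error and probable error can be worked through numerically. The sketch below assumes the classical large-sample form SEr = (1 − r²)/√n, the factor 0.6745, and the textbook rule of thumb that r is significant when it exceeds six times its probable error; the values of r and n are hypothetical.

```python
from math import sqrt

def standard_error_r(r, n):
    """Classical large-sample standard error of r (assumed form)."""
    return (1 - r ** 2) / sqrt(n)

def probable_error_r(r, n):
    """Probable error: 0.6745 times the standard error."""
    return 0.6745 * standard_error_r(r, n)

# Hypothetical sample correlation and sample size
r, n = 0.8, 64
se = standard_error_r(r, n)
pe = probable_error_r(r, n)

# Rule of thumb: r is judged significant if |r| > 6 * PE
significant = abs(r) > 6 * pe
print(round(se, 4), round(pe, 4), significant)
```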
Spearman’s rank correlation coefficient (Rho)
A non-parametric measure of association for ordinal (ranked) data, defined as ρ = 1 − 6Σd² / [n(n² − 1)], where d is the difference between the two ranks of each pair and n is the number of pairs.
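The rank formula can be applied step by step: rank each variable, take the rank differences d, and substitute Σd² into the formula. This is a sketch that assumes there are no tied values (ties need average ranks, which it does not handle); the judges' scores are hypothetical.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n. Assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman_rho(x, y):
    """Spearman's rank correlation: 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    n = len(x)
    rx, ry = ranks(x), ranks(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Hypothetical scores given by two judges to the same five contestants
judge_a = [86, 70, 92, 65, 78]
judge_b = [80, 72, 95, 75, 60]

rho = spearman_rho(judge_a, judge_b)
print(round(rho, 4))
```

With these scores Σd² = 8 and n = 5, so ρ = 1 − 48/120 = 0.6.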
Ordinal data
Categorical data placed in a meaningful order (rank) without equal or known intervals between categories.
Nominal data
Categorical data consisting of labels or names with no intrinsic order or quantitative value.
Auto-correlation coefficient
A statistic that measures correlation of a variable with itself across different time lags.
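To make the idea of correlating a series with a lagged copy of itself concrete, here is a sketch of one common sample estimator: the sum of lagged deviation products divided by the total sum of squared deviations. The function name and the series are hypothetical, and other textbooks may normalise slightly differently.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation at a given lag (one common estimator)."""
    n = len(series)
    m = sum(series) / n
    denominator = sum((v - m) ** 2 for v in series)
    numerator = sum(
        (series[t] - m) * (series[t + lag] - m) for t in range(n - lag)
    )
    return numerator / denominator

# Hypothetical time series with a steady upward trend
sales = [2, 4, 6, 8, 10, 12]

r_lag0 = autocorrelation(sales, 0)  # a series is perfectly correlated with itself
r_lag1 = autocorrelation(sales, 1)
print(r_lag0, r_lag1)
```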
Lead time / Lag time
The time difference between a cause and its effect when analysing time-series relationships.
Hypothesis testing
A procedure that uses sample information to decide whether a statement about a population parameter should be accepted or rejected.
Null hypothesis (H₀)
The default assumption that no effect or no relationship exists; considered true until evidence suggests otherwise.
Alternative hypothesis (H₁)
The statement accepted if the null hypothesis is rejected; claims that a real effect or relationship exists.
Directional hypothesis
Specifies not only that a relationship exists but also its direction (e.g., positive, negative, greater than, less than).
Non-directional hypothesis
States that a relationship exists without specifying the direction of the effect.
Acceptance region
Range of test-statistic values for which the null hypothesis is not rejected.
Rejection (critical) region
Range of test-statistic values that lead to rejection of the null hypothesis.
Critical value
A tabled threshold that separates the acceptance region from the rejection region in hypothesis testing.
Parametric test
Statistical test that requires interval or ratio-scale data and assumes specific population distributions (e.g., t-test).
Non-parametric test
Statistical test that uses nominal or ordinal data and makes fewer distributional assumptions (e.g., chi-square, rank tests).
Standard error formula for r
SEr = (1 − r²)/√n for large samples. The significance of r is tested with t = r√(n − 2)/√(1 − r²) on n − 2 degrees of freedom, which is r divided by √((1 − r²)/(n − 2)).
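The t statistic t = r√(n − 2)/√(1 − r²) can be combined with the critical-value idea from the hypothesis-testing cards. In this sketch the values of r and n are hypothetical, and the critical value 2.060 is the tabled two-tailed 5% point of the t distribution with 25 degrees of freedom, entered by hand rather than computed.

```python
from math import sqrt

def t_statistic(r, n):
    """t statistic for testing H0: no linear correlation in the population."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)

# Hypothetical sample correlation and sample size
r, n = 0.6, 27
t = t_statistic(r, n)
degrees_of_freedom = n - 2

# Tabled two-tailed critical value for alpha = 0.05 and 25 degrees of freedom
critical_value = 2.060

# Non-directional test: reject H0 if |t| falls in the rejection region
reject_null = abs(t) > critical_value
print(round(t, 2), degrees_of_freedom, reject_null)
```

Because |t| = 3.75 lies beyond the critical value, it falls in the rejection region and the null hypothesis of no correlation is rejected at the 5% level.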