These flashcards cover key vocabulary related to data exploration, statistics, and analysis techniques.
Outlier
Any value that falls more than 1.5 × IQR above Q3 or below Q1, or more than 2 standard deviations from the mean.
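Both rules in this definition can be checked in a few lines. The sketch below is illustrative only: the data set is made up, and the quartiles use the "inclusive" method of Python's `statistics.quantiles`, which is one of several conventions (a textbook or calculator may give slightly different Q1 and Q3).

```python
import statistics

def iqr_outliers(values):
    """Values more than 1.5 * IQR below Q1 or above Q3."""
    # Quartile convention is an assumption; other methods shift Q1/Q3 slightly.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    fence = 1.5 * (q3 - q1)
    return [v for v in values if v < q1 - fence or v > q3 + fence]

def sd_outliers(values):
    """Values more than 2 sample standard deviations from the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs(v - mean) > 2 * sd]

data = [2, 4, 5, 6, 7, 8, 9, 30]  # hypothetical data
print(iqr_outliers(data))  # [30]
print(sd_outliers(data))   # [30]
```

The two rules agree here, but they need not: the 2 SD rule depends on the mean and SD, which the outlier itself inflates.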
Standard Deviation (SD)
A measure of variability that gives the typical distance of the values from the mean.
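To make "typical distance from the mean" concrete, here is a minimal sketch of the sample standard deviation computed step by step. The scores are made up, and dividing by n − 1 (the sample version) is an assumption; dividing by n gives the population version.

```python
import math
import statistics

def sample_sd(values):
    """Sample standard deviation: sqrt of summed squared deviations over n - 1."""
    mean = sum(values) / len(values)
    squared_devs = [(v - mean) ** 2 for v in values]
    return math.sqrt(sum(squared_devs) / (len(values) - 1))

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data, mean is 5
print(round(sample_sd(scores), 3))      # 2.138
print(statistics.pstdev(scores))        # population SD (divide by n): 2.0
```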
IQR
Interquartile Range, the difference between the third (Q3) and first (Q1) quartiles.
SOCV
An acronym for Shape, Outliers, Center, Variability - used to describe distributions.
Scatterplot
A graph used to describe the relationship between two quantitative variables.
Percentile
The percentage of values in a data set that are less than or equal to a particular value.
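The definition above translates directly into a count. This is a small sketch using the "less than or equal to" convention from the card; `percentile_of` and the height data are hypothetical names and values chosen for illustration.

```python
def percentile_of(value, data):
    """Percentage of values in data that are less than or equal to value."""
    count = sum(1 for v in data if v <= value)
    return 100 * count / len(data)

heights = [150, 155, 160, 160, 165, 170, 175, 180, 185, 190]  # hypothetical, in cm
print(percentile_of(165, heights))  # 50.0
```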
Least Squares Regression Line (LSRL)
The line that minimizes the sum of the squared residuals, where each residual is an observed value minus its predicted value.
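A short sketch of how such a line can be computed, using the standard formulas slope = Sxy / Sxx and intercept = ȳ − slope · x̄. The data and function names are made up for illustration; the last lines check that nudging the slope makes the sum of squared residuals larger.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares regression line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

def sum_squared_residuals(xs, ys, slope, intercept):
    """Sum of (observed - predicted)^2 for the line y = intercept + slope * x."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

hours = [1, 2, 3, 4, 5]    # hypothetical explanatory values
scores = [2, 4, 5, 4, 5]   # hypothetical response values
slope, intercept = least_squares_line(hours, scores)
best = sum_squared_residuals(hours, scores, slope, intercept)
nudged = sum_squared_residuals(hours, scores, slope + 0.1, intercept)
print(round(slope, 2), round(intercept, 2))  # 0.6 2.2
print(best < nudged)                         # True
```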
Correlation Coefficient (r)
A measure that quantifies the strength and direction of a linear relationship between two quantitative variables.
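As an illustration, r can be computed from deviations about the means. This is a sketch with made-up data; the sign of the result gives the direction and its closeness to ±1 gives the strength of the linear relationship.

```python
import math

def correlation(xs, ys):
    """Pearson correlation coefficient r for paired quantitative data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data sets
print(round(correlation([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]), 3))  # 0.775 (moderately strong, positive)
print(round(correlation([1, 2, 3], [6, 4, 2]), 3))              # -1.0 (perfect, negative)
```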
High-Leverage Point
A point with a substantially larger or smaller x-value compared to other observations.
Influential Point
A data point that, when removed, substantially changes the slope or intercept of the regression line.
Discrete Variable
A variable that can take on a countable number of values, often finite.
Continuous Variable
A variable that can take on any value within an interval, usually measured rather than counted.
Coefficient of Determination (r²)
The proportion of the variation in the dependent variable that is explained by the regression line, often expressed as a percentage.
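One way to see "variation explained" is to compare the squared residuals left over after fitting the line with the total squared variation about the mean. The sketch below assumes a simple one-variable least squares fit and uses made-up data.

```python
def r_squared(xs, ys):
    """Coefficient of determination for a one-variable least squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    # Unexplained variation: squared residuals (observed - predicted)
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    # Total variation: squared deviations of y from its mean
    sst = sum((y - y_bar) ** 2 for y in ys)
    return 1 - sse / sst

hours = [1, 2, 3, 4, 5]    # hypothetical data
scores = [2, 4, 5, 4, 5]
print(round(r_squared(hours, scores), 2))  # 0.6, i.e. about 60% of the variation is explained
```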
Y-Intercept
The predicted value of the dependent variable when the independent variable is zero.
Residual
The difference between an observed value and the value predicted by a regression model (observed minus predicted).
Skewed Distribution
A distribution that is not symmetrical, with values extending more to one side than the other.
Mean
The average of a data set, calculated by summing all values and dividing by the total number of values.
Median
The middle value of a data set when ordered from least to greatest; with an even number of values, the mean of the two middle values.
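A quick comparison of the mean and median, using Python's `statistics` module on made-up data. It also shows how a single extreme value pulls the mean much more than the median.

```python
import statistics

typical = [1, 2, 3, 4, 5]      # hypothetical data
with_extreme = [1, 2, 3, 4, 100]
even_count = [1, 2, 3, 4]

print(statistics.mean(typical), statistics.median(typical))            # 3 3
print(statistics.mean(with_extreme), statistics.median(with_extreme))  # 22 3
print(statistics.median(even_count))  # 2.5, the mean of the two middle values
```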
Distribution Comparison
The process of analyzing and discussing the similarities and differences in center and variability between two data distributions.
Linear Relationship
A relationship between two variables that can be represented by a straight line on a graph.
Unusual Values
Data points that significantly differ from other observations, often referred to as outliers.