Statistics
the science and art of collecting, analyzing, and drawing conclusions from data
Individual
an object described by a set of data (e.g. people)
Variable
any characteristic of an individual; an attribute that can take different values for different individuals
Categorical Variable
assigns labels that place each individual into a particular group, called a category
Quantitative Variable
takes numerical values that are quantities, for which it makes sense to find an average
Distribution
the distribution of a variable tells us what values the variable takes and how often it takes those values; the pattern of variation of a variable (you should calculate the distribution of the variable that you WANT to predict for each value of the other variable)
Frequency Table
shows the number of individuals having each value
Relative Frequency Table
shows the proportion or percent of individuals having each value
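A minimal sketch of the Frequency Table and Relative Frequency Table cards above, using Python's standard library. The `colors` data is hypothetical and only there for illustration.

```python
from collections import Counter

# Hypothetical categorical data: favorite color of 10 individuals
colors = ["red", "blue", "red", "green", "blue",
          "red", "blue", "red", "green", "red"]

# Frequency table: number of individuals having each value
freq = Counter(colors)

# Relative frequency table: proportion of individuals having each value
total = sum(freq.values())
rel_freq = {category: count / total for category, count in freq.items()}

for category in freq:
    print(f"{category:6s} count={freq[category]:2d} proportion={rel_freq[category]:.2f}")
```

The relative frequencies always add up to 1 (or 100%), which is a quick check that no category was missed.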
Bar Graph
shows each category as a bar (bar heights show the category frequencies or relative frequencies)
Pie Chart
shows each category as a slice of the “pie” (%)
Two-Way Table
a table of counts that summarizes data on the relationship b/w 2 categorical variables for some group of individuals
Marginal Relative Frequency
the percent or proportion of individuals that have a specific value for one categorical variable (only uses one variable out of a two-way table)
A/Total
Joint Relative Frequency
the percent or proportion of individuals that have a specific value for 1 categorical variable and a specific value for another
(A and B)/Total
Conditional Relative Frequency
the percent or proportion of individuals that have a specific value for one categorical variable among individuals who share the same value of another categorical variable (the condition)
(A and B)/B
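A small worked sketch of the three relative frequencies above. The two-way table (grade level vs. having a job) and all of its counts are made up for illustration.

```python
# Hypothetical two-way table of counts:
# rows = grade level, columns = whether the student has a job
table = {
    "junior": {"job": 20, "no job": 30},
    "senior": {"job": 35, "no job": 15},
}

# Total number of individuals in the table
total = sum(sum(row.values()) for row in table.values())

# Marginal relative frequency: seniors out of everyone (one variable only)
seniors = sum(table["senior"].values())
marginal_senior = seniors / total

# Joint relative frequency: seniors AND has a job, out of everyone
joint_senior_job = table["senior"]["job"] / total

# Conditional relative frequency: has a job, AMONG seniors (the condition)
cond_job_given_senior = table["senior"]["job"] / seniors

print(marginal_senior, joint_senior_job, cond_job_given_senior)
```

The only difference between the joint and conditional versions is the denominator: the whole table for joint, the condition's row (or column) total for conditional.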
Side-By-Side Bar Graph
displays the distribution of a categorical variable for each value of another categorical variable (the bars are grouped together based on the values and placed side by side)
Segmented Bar Graph
displays the distribution of a categorical variable as segments of a rectangle, with the area of each segment proportional to the percent of individuals in the category
Association (b/w 2 variables)
occurs when knowing the value of one variable helps us predict the value of the other (association does NOT imply causation [beware of other variables])
Dotplot
shows each data value as a dot above its location on a number line
Symmetric Distribution (dotplot)
when the right side of the graph is approximately the mirror image of the left
Skewed Distribution (dotplot)
when one side of the graph is much longer than the other side (left skewed [long tail on the left, most data on the right side] or right skewed [long tail on the right, most data on the left side])
Variability
how much the data varies (“The data vary from [min value] to [max value].”)
Stemplot
shows each data value separated into 2 parts: a stem, which consists of all but the final digit (10-…s place), and a leaf, the final digit (1s place) (leaves are arranged in increasing order)
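A rough sketch of how a stemplot could be built, assuming non-negative whole-number data. The helper name `stemplot` and the test scores are hypothetical.

```python
from collections import defaultdict

def stemplot(values):
    """Build stemplot rows for non-negative integers.

    Stem = all but the final digit, leaf = the final digit.
    """
    stems = defaultdict(list)
    for v in sorted(values):          # sorting keeps leaves in increasing order
        stems[v // 10].append(v % 10)

    rows = []
    # Include every stem between the smallest and largest, even empty ones,
    # so gaps in the data stay visible
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
        rows.append(f"{stem} | {leaves}")
    return rows

# Hypothetical test scores
scores = [72, 85, 91, 78, 85, 64, 70]
rows = stemplot(scores)
print("\n".join(rows))
```

A real stemplot also needs a key (for example, "7 | 2 means 72") so the reader knows what the stems and leaves represent.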
Histogram
shows each interval of values as a bar (used for larger data sets) (looks like a bar graph but the bars are touching) (instead of one bar being a single value, it is a range of values (e.g. 70-75))
Mean
the average of all the individual data values (x̄ for a sample, μ for a population)
Resistant
a statistical measure is resistant if it is NOT sensitive to extreme values (mean = NOT resistant) (median = resistant)
Median
the midpoint of a distribution
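A quick illustration of the Mean, Resistant, and Median cards above: replacing one value with an extreme value moves the mean a lot but leaves the median alone. The numbers are hypothetical.

```python
from statistics import mean, median

# Hypothetical data set, then the same data with one extreme value
data = [2, 3, 4, 5, 6]
data_with_outlier = [2, 3, 4, 5, 60]

mean_before, median_before = mean(data), median(data)
mean_after, median_after = mean(data_with_outlier), median(data_with_outlier)

# The mean is pulled toward the extreme value; the median is not
print("mean:  ", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)
```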
Range
the distance between the min and max values of a distribution (a single number) (max - min)
Standard Deviation
measures the typical distance of the values in a distribution from the mean
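A sketch of the Range and Standard Deviation cards, computed by hand and checked against the standard library. It assumes the sample standard deviation (dividing by n - 1) is the one intended; the data values are hypothetical.

```python
import math
import statistics

# Hypothetical sample data
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Range: a single number, max - min
data_range = max(data) - min(data)

# Sample standard deviation by hand (ASSUMPTION: divide by n - 1)
n = len(data)
x_bar = sum(data) / n
squared_deviations = [(x - x_bar) ** 2 for x in data]
s_x = math.sqrt(sum(squared_deviations) / (n - 1))

# statistics.stdev uses the same n - 1 formula
print(data_range, round(s_x, 3), round(statistics.stdev(data), 3))
```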
Quartiles
divide the ordered data set into 4 groups, having roughly the same number of values (arrange the data values from smallest to largest, and find the median)
Q1 - the median of the data values to the left of the median in the ordered list
Q3 - same as Q1, but to the right
Interquartile Range (IQR)
the distance between the first and third quartiles of a distribution (Q3 - Q1)
Five-Number Summary
the minimum, Q1, median, Q3, and maximum of a distribution
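A minimal sketch of the quartile method described in the cards above (Q1 and Q3 as the medians of the values to the left and right of the median). The function name `five_number_summary` and the data are hypothetical; note that software packages often use slightly different quartile rules, so their results may not match this method exactly.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max); needs at least 2 values."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]            # values to the left of the median
    upper = data[half + n % 2:]    # values to the right (skips the median itself when n is odd)
    return (min(data), median(lower), median(data), median(upper), max(data))

# Hypothetical data
summary = five_number_summary([13, 1, 7, 9, 3, 11, 5])
minimum, q1, med, q3, maximum = summary
iqr = q3 - q1                      # interquartile range

print(summary, "IQR =", iqr)
```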
Boxplot
a visual representation of the five-number summary
Interpret a Standard Deviation
The [context] typically varies by [Sx] from the mean of [x(mean)].
Interpret the IQR
The range of the middle half of [context] (in the sample) is [IQR].