frequency distribution
a tabular summary of a variable’s values
commonly used in data presentations of all kinds
from survey research and journalistic polls to marketing studies and corporate annual reports
raw frequency
count of individuals at each of the variable’s values
total frequency
total at the bottom of the column
what kind of chart is often used to present a frequency distribution?
bar chart
what do charts help us with?
help us communicate the most important features of data effectively
what kind of charts do we want to avoid?
pie charts
the greatest amount of dispersion occurs when _____
cases are equally spread among all values of the variable
proportion
raw frequency divided by the total frequency
percentage
proportion multiplied by 100
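The cards above (raw frequency, total frequency, proportion, percentage) can be tied together in a short sketch. The survey responses here are made up purely for illustration.

```python
from collections import Counter

# Hypothetical responses to an illustrative survey question.
responses = ["agree", "agree", "neutral", "disagree", "agree",
             "neutral", "disagree", "agree", "neutral", "agree"]

raw_freq = Counter(responses)        # raw frequency: count of cases at each value
total = sum(raw_freq.values())       # total frequency: bottom of the column

# proportion = raw frequency / total frequency
proportions = {value: n / total for value, n in raw_freq.items()}
# percentage = proportion * 100
percentages = {value: p * 100 for value, p in proportions.items()}

for value in raw_freq:
    print(f"{value:10s} {raw_freq[value]:3d} {percentages[value]:6.1f}%")
print(f"{'total':10s} {total:3d}")
```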
how do we organize a frequency distribution table with ordinal values?
the order of the rows in a frequency distribution table, or of the bars in a bar chart, must be consistent with the relative rank of the variable’s values
cumulative percentage
records the percentage of cases at or below any given value of the variable
what can we do with the cumulative percentage?
we can locate the median, the value of the variable that (as closely as possible) divides the cases into two equal-sized groups
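A minimal sketch of locating the median from cumulative percentages, assuming a made-up ordinal variable with three ranked values. The names ORDER and cases are hypothetical.

```python
from collections import Counter

# Hypothetical ordinal variable; ORDER lists the values from lowest to highest rank.
ORDER = ["low", "medium", "high"]
cases = ["low"] * 3 + ["medium"] * 4 + ["high"] * 3

counts = Counter(cases)
total = len(cases)

# Cumulative percentage: percentage of cases at or below each value.
cumulative = {}
running = 0
for value in ORDER:
    running += counts[value]
    cumulative[value] = 100 * running / total

# The median is the first value whose cumulative percentage reaches 50.
median_value = next(v for v in ORDER if cumulative[v] >= 50)

print(cumulative)
print("median:", median_value)
```

The rows must be walked in rank order, otherwise the running total has no meaning.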
percentile
reports the percentage of cases in a distribution that lie below it
serves to locate the position of an individual value relative to all other values
quantile
specific value in a data distribution that divides the data into equal portions or groups
ex: quartiles, which divide the data into four equal groups
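A short sketch of percentiles and quartiles using Python's standard library. The data are made up, and percentile_rank is a hypothetical helper that follows the "percentage of cases below" definition on the percentile card. Note that statistics.quantiles uses one of several interpolation conventions, so other software may report slightly different cut points.

```python
import statistics

# Hypothetical interval-level data.
data = [2, 4, 4, 5, 7, 8, 9, 10, 12, 15, 18]

def percentile_rank(value, values):
    """Percentage of cases in `values` that lie below `value`."""
    below = sum(1 for x in values if x < value)
    return 100 * below / len(values)

# n=4 asks for the three cut points that split the data into four groups.
quartiles = statistics.quantiles(data, n=4)

print("percentile rank of 10:", round(percentile_rank(10, data), 1))
print("quartile cut points:", quartiles)
```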
bimodal distribution
frequency distribution having two different values that are heavily populated with cases
two modes are separated by more than one nonmodal category
we would not want to use a single mode to describe the central tendency of this distribution
non-parametric statistics
analyzing non-numerical data
provide robust and flexible tools for data analysis
work with any type of data, but are less powerful than parametric statistics
work without making strong assumptions about the data and are often used when the data don’t follow specific patterns or distributions
parametric stats
describe and summarize quantitative data
based on assumptions about the distribution of the variable’s values
what columns does a frequency distribution table have?
frequencies, percentages, and cumulative percentages
histograms
another method of graphing the distribution of an interval-level variable with many unique values
shows the percentage or frequency of cases falling into intervals of the variable
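To make the idea of "cases falling into intervals" concrete, here is a rough text-only sketch of histogram binning. The values and the bin width of 10 are arbitrary choices for illustration.

```python
from collections import Counter

# Hypothetical integer-valued interval variable (for example, ages).
values = [3, 7, 12, 15, 18, 21, 25, 29, 34, 38]
BIN_WIDTH = 10  # arbitrary interval width chosen for this example

# Map each value to the lower bound of its interval, then count cases per interval.
bin_counts = Counter((v // BIN_WIDTH) * BIN_WIDTH for v in values)
total = len(values)

for lower in sorted(bin_counts):
    n = bin_counts[lower]
    upper = lower + BIN_WIDTH - 1  # label assumes integer values
    print(f"{lower:3d}-{upper:3d} {'#' * n:5s} {100 * n / total:.0f}%")
```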
density plots
alternative to histograms for visualizing the distribution of an interval-level variable
display a “running average” of observations across the range of observed values
allow researchers to “zoom in” to show greater detail and “zoom out” to reveal general patterns in data
another way to describe the median
50th percentile value; the value that divides observations into two equal-sized groups
name the three ways to describe the dispersion of an interval variable
range, interquartile range, and standard deviation
range
the maximum actual value minus the minimum actual value (largest - smallest)
interquartile range (IQR)
the range of a variable’s values that defines the “middle half” of a distribution—the range between the upper boundary of the lowest quartile (which is the same as the 25th percentile) and the lower boundary of the upper quartile (the 75th percentile)
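A minimal sketch of the range and the interquartile range on made-up data. The quartile cut points come from statistics.quantiles, whose default method is an assumption here; other quartile conventions can give slightly different boundaries.

```python
import statistics

# Hypothetical interval-level data.
data = [2, 4, 4, 5, 7, 8, 9, 10, 12, 15, 18]

# Range: largest actual value minus smallest actual value.
value_range = max(data) - min(data)

# Interquartile range: 75th percentile minus 25th percentile.
q1, _median, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

print("range:", value_range)
print("interquartile range:", iqr)
```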
standard deviation
summarizes the extent to which the cases in an interval-level distribution fall on or close to the mean of the distribution
in gauging variation in interval-level variables, standard deviation is the measure of choice
measures the typical amount of deviation of a variable’s values from its mean value
although it is a more precise measure of dispersion than those applied to nominal and ordinal variables, standard deviation is based on the same general principles
how to calculate the standard deviation
calculate each value’s deviation from the mean
square each deviation
sum the squared deviations
divide the sum of the squared deviations by n - 1 to find the variance
take the square root of the variance
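The five steps above can be sketched directly in code. The data are made up; the final line checks the hand calculation against statistics.stdev, which also divides by n - 1.

```python
import math
import statistics

# Hypothetical interval-level data.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(data) / len(data)

deviations = [x - mean for x in data]       # 1. each value's deviation from the mean
squared = [d ** 2 for d in deviations]      # 2. square each deviation
sum_sq = sum(squared)                       # 3. sum the squared deviations
variance = sum_sq / (len(data) - 1)         # 4. divide by n - 1 to get the variance
std_dev = math.sqrt(variance)               # 5. square root of the variance

print("variance:", round(variance, 3))
print("standard deviation:", round(std_dev, 3))
print("matches statistics.stdev:", math.isclose(std_dev, statistics.stdev(data)))
```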
skewness
measure of symmetry: the more skewed the distribution, the less symmetrical it is
can be a positive or negative number
distributions with a longer, or skinnier, right-hand tail have a positive skew
those with a skinnier left-hand tail have a negative skew
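The cards do not give a formula for skewness, so the sketch below assumes one common definition: the third central moment divided by the second central moment raised to the 3/2 power. Statistical packages may apply a sample-size adjustment and report slightly different numbers, but the sign behaves the same way.

```python
def skewness(data):
    """Moment-based skewness (assumed definition): m3 / m2 ** 1.5."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n   # second central moment
    m3 = sum((x - mean) ** 3 for x in data) / n   # third central moment
    return m3 / m2 ** 1.5

# Hypothetical data sets.
right_tailed = [1, 2, 2, 3, 3, 3, 4, 20]      # long right-hand tail
left_tailed = [-x for x in right_tailed]      # mirror image: long left-hand tail
symmetric = [1, 2, 3, 4, 5]

print("right-tailed:", skewness(right_tailed) > 0)   # positive skew
print("left-tailed:", skewness(left_tailed) < 0)     # negative skew
print("symmetric:", skewness(symmetric))
```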
kurtosis
measures the shape of a distribution, specifically how much it deviates from a bell-curve distribution
provides information about the tails and peaks of a distribution and the number of extreme values observed
always a positive number (excess kurtosis, by contrast, can be negative)
leptokurtic
if a variable’s kurtosis > 3
there are more values in the tails of the distribution, which indicates greater variability and the potential for more rare or extreme events
mesokurtic
if a variable’s kurtosis = 3
distribution closely resembles a bell-shaped curve with a moderate amount of variability
platykurtic
if the variable’s kurtosis < 3
distribution has a relatively flat peak and light tails, suggesting less variability and fewer extreme values
excess kurtosis
kurtosis - 3
makes it easier to classify the distribution as leptokurtic, mesokurtic, or platykurtic
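A rough sketch of the classification described in the kurtosis cards. It assumes the moment-based definition (fourth central moment divided by the squared second central moment), which the cards do not state explicitly. The tolerance around 3 is a hypothetical choice, since real data almost never produce a kurtosis of exactly 3.

```python
def kurtosis(data):
    """Moment-based kurtosis (assumed definition): m4 / m2 ** 2."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n   # second central moment
    m4 = sum((x - mean) ** 4 for x in data) / n   # fourth central moment
    return m4 / m2 ** 2

def classify(kurt, tol=0.1):
    """Label a kurtosis value; `tol` is an illustrative tolerance, not a standard."""
    excess = kurt - 3                 # excess kurtosis
    if excess > tol:
        return "leptokurtic"
    if excess < -tol:
        return "platykurtic"
    return "mesokurtic"

# Hypothetical data sets.
flat = [1, 2, 3, 4, 5]
heavy_tailed = [-10, -1, 0, 0, 0, 0, 0, 1, 10]

for name, data in [("flat", flat), ("heavy_tailed", heavy_tailed)]:
    k = kurtosis(data)
    print(f"{name}: kurtosis={k:.2f}, excess={k - 3:.2f}, {classify(k)}")
```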
box plot
communicates a five-number summary of a variable:
minimum value, lower quartile (25th percentile), median, upper quartile (75th percentile), and maximum value
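A minimal sketch of the five-number summary that a box plot draws, on made-up data. The quartiles again rely on the default method of statistics.quantiles, which is an assumption; plotting libraries may use a different quartile convention.

```python
import statistics

# Hypothetical interval-level data.
data = [2, 4, 4, 5, 7, 8, 9, 10, 12, 15, 18]

q1, median, q3 = statistics.quantiles(data, n=4)

five_number_summary = {
    "minimum": min(data),
    "lower quartile": q1,   # 25th percentile
    "median": median,       # 50th percentile
    "upper quartile": q3,   # 75th percentile
    "maximum": max(data),
}

for label, value in five_number_summary.items():
    print(f"{label:15s} {value}")
```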
resistant measure of central tendency
extreme values may have an obvious effect on the mean, but they have little effect on the median
the median is impervious to the amount of variation in a variable
the median reports the value that divides the respondents into equal-sized groups, unfazed by the distribution’s skew
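A small demonstration of why the median is called resistant, using made-up values: replacing one case with an extreme value moves the mean a long way but leaves the median where it was.

```python
import statistics

# Hypothetical values (for example, incomes in thousands).
typical = [30, 35, 40, 45, 50]
with_outlier = [30, 35, 40, 45, 500]   # same data, one extreme value

mean_before = statistics.mean(typical)
mean_after = statistics.mean(with_outlier)
median_before = statistics.median(typical)
median_after = statistics.median(with_outlier)

print("mean:  ", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)
```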