individuals
unique people, animals, or things in data set
variable
characteristic that changes from one individual to another
categorical variable
takes on values that are category names or group labels
quantitative variable
takes on numerical values for a measured or counted quantity
categorical data
values of a categorical variable in a data set
quantitative data
values of a quantitative variable in a data set
Not all variables with # values are…
quantitative (ex: zip code)
frequency table
gives # of individuals (cases) in each category
A frequency table helps represent
distribution of categorical variable in tabular form
relative frequency table
gives proportion or % of individuals (cases) in each category
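A minimal Python sketch of both kinds of table, assuming a small made-up list of favorite colors (the data and the function names are hypothetical, not from the course):

```python
from collections import Counter

def frequency_table(values):
    """Count how many individuals fall in each category."""
    return dict(Counter(values))

def relative_frequency_table(values):
    """Proportion of individuals in each category (proportions sum to 1)."""
    n = len(values)
    return {category: count / n for category, count in Counter(values).items()}

# Hypothetical data: favorite color of 8 individuals
colors = ["red", "blue", "red", "green", "blue", "red", "red", "blue"]
freq = frequency_table(colors)          # {'red': 4, 'blue': 3, 'green': 1}
rel = relative_frequency_table(colors)  # red 0.5, blue 0.375, green 0.125
```

Multiply each proportion by 100 to get a %; here "red" is exactly 50%, so it is not a majority (a majority needs more than 50%).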
majority
more than 50%
Bar charts for categorical data
Var name on horizontal axis; freq on vertical
Scale axes (start at 0) + equal increments
bars should be equal in width and LEAVE GAPS between them
Different sample sizes in 2 groups make it hard to compare their distributions, but comparison is easier if you calculate…
relative frequencies within each group (answer choices will involve %s)
What are the 2 types of quantitative variables?
discrete variable
continuous variable
discrete variable
can take on a countable # of values (with gaps, counted)
continuous variable
can take on infinitely many values, but they cannot be counted (no gaps, measured)
Histogram
Split the values into equal-width intervals, count how many values fall in each interval, and draw a bar for each count
a value that falls on an interval endpoint goes into the bar to its right
easier to make for larger data sets
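The counting step behind a histogram can be sketched in Python as follows. The scores, the starting value of 50, and the interval width of 10 are made-up assumptions; the endpoint rule (a value on an endpoint goes into the bar to its right) is the one from the card above.

```python
def histogram_counts(values, start, width, n_bins):
    """Count values per interval [start + i*width, start + (i+1)*width).

    A value sitting exactly on an endpoint is counted in the interval
    to its right, because each interval includes its left endpoint only.
    """
    counts = [0] * n_bins
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts

# Hypothetical test scores; intervals are 50-60, 60-70, 70-80, 80-90, 90-100
scores = [50, 55, 60, 60, 72, 79, 80, 95]
counts = histogram_counts(scores, start=50, width=10, n_bins=5)
# counts == [2, 2, 2, 1, 1]: both 60s land in the 60-70 bar, 80 in the 80-90 bar
```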
How to describe a distribution of quantitative data?
Shape
Center
Variability (Spread)
Unusual features
Shape
Symmetric: both sides are mirror images of each other
Skewed left: the left side has a “tail”
Skewed right: the right side has a “tail”
Unimodal: there is a singular peak
Bimodal: there are two peaks
Uniform: all values occur about equally often (bars are roughly the same height)
Unusual features
Outliers
Gaps + Clusters
first quartile (Q1)
the median of the lower half of the ordered data set
third quartile (Q3)
the median of the upper half of the ordered data set
DO NOT include the median in the Q1 or Q3 set of values when trying to figure out the…
Q1 or Q3
Interquartile range
Q3-Q1
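A short Python sketch of the quartile method described in these cards: sort the data, leave the median out of both halves when the count is odd, and take the median of each half. The data set and the function name are made up for illustration; calculators and software sometimes use slightly different quartile rules.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, median, Q3), excluding the median from both halves."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # With an odd count, skip the middle value so it is in neither half
    upper = data[half + 1:] if n % 2 else data[half:]
    return median(lower), median(data), median(upper)

# Hypothetical data set with 9 values
data = [1, 3, 4, 6, 7, 8, 10, 12, 15]
q1, med, q3 = quartiles(data)  # 3.5, 7, 11.0
iqr = q3 - q1                  # 7.5
```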
Standard deviation
typical distance that each value is away from the mean
Equation of SD!
Subtract the mean from each value and square each difference, add up the squares, divide by (# of values - 1), then take the square root of the result: s = √( Σ(x − mean)² / (n − 1) )
Square of SD is…
variance
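The SD steps can be checked with a few lines of Python. This is a sketch on a made-up data set; `sample_sd` is a hypothetical helper name, and the standard library's `statistics.stdev` is used only to confirm the result.

```python
from math import sqrt
from statistics import stdev

def sample_sd(values):
    """Sample standard deviation: sqrt of sum of squared deviations / (n - 1)."""
    n = len(values)
    mean = sum(values) / n
    squared_devs = [(x - mean) ** 2 for x in values]
    return sqrt(sum(squared_devs) / (n - 1))

# Hypothetical data: mean is 5, squared deviations add up to 32
data = [2, 4, 4, 4, 5, 5, 7, 9]
s = sample_sd(data)   # sqrt(32 / 7), roughly 2.14
variance = s ** 2     # 32 / 7, roughly 4.57

# The built-in sample SD should agree with the manual calculation
matches_builtin = abs(s - stdev(data)) < 1e-9
```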
Interpreting IQR
the middle 50% of the values for _____ has a range of ____(IQR)
Interpreting SD
the _______ from each sample typically varies by about ______ (SD) from the mean of _____ (Mean)
Position
Q1 and Q3
Method 1 for Outliers
Low Outlier < Q1 - 1.5IQR
High Outlier > Q3 + 1.5IQR
Method 2 for Outliers
Low Outlier < mean - 2SD
High Outlier > mean + 2SD
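Both outlier rules can be sketched in Python. The data set is made up and the function names are hypothetical; quartiles are found by leaving the median out of both halves, as in the Q1/Q3 cards.

```python
from statistics import mean, median, stdev

def iqr_outliers(values):
    """Method 1: values below Q1 - 1.5*IQR or above Q3 + 1.5*IQR."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    q1 = median(data[:half])
    q3 = median(data[half + 1:] if n % 2 else data[half:])
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in data if x < low or x > high]

def sd_outliers(values):
    """Method 2: values more than 2 SDs away from the mean."""
    m, s = mean(values), stdev(values)
    return [x for x in values if x < m - 2 * s or x > m + 2 * s]

# Hypothetical data: Q1 = 12.5, Q3 = 17, IQR = 4.5, so the fences are 5.75 and 23.75
data = [10, 12, 13, 14, 15, 16, 18, 40]
by_iqr = iqr_outliers(data)  # [40]
by_sd = sd_outliers(data)    # [40]
```

The two methods agree here, but they do not have to: the 2SD rule uses the mean and SD, which the outlier itself pulls on.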
Non-resistant stats
Mean, SD, and range change a lot when outliers are added or removed
Resistant stats
Median and IQR are not greatly affected by outliers
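A quick Python check of the resistant vs. non-resistant idea, assuming a made-up data set with one high outlier (40):

```python
from statistics import mean, median

# Hypothetical data with one high outlier
data = [10, 12, 13, 14, 15, 16, 18, 40]
without_outlier = [x for x in data if x != 40]

# How far each measure of center moves when the outlier is removed
mean_shift = mean(data) - mean(without_outlier)        # 17.25 - 14 = 3.25
median_shift = median(data) - median(without_outlier)  # 14.5 - 14 = 0.5
```

The mean moves by 3.25 while the median moves by only 0.5, which is what "resistant" means in practice.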
For skewed distribution use…
median for center and IQR for variability
For symmetric distribution use…
mean for center and SD for variability
Five # summary
min, Q1, median, Q3, max
Skewed right
mean > median
Skewed left
mean < median
Symmetric
mean ≈ median
Percentile
percent of data values less than or equal to a given value
Interpret
the value of _____ is at the pth percentile: about p% of the values are less than or equal to _____
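A small Python sketch of the percentile definition above ("less than or equal to"), using made-up test scores and a hypothetical helper name:

```python
def percentile_of(value, data):
    """Percent of data values less than or equal to the given value."""
    count = sum(1 for x in data if x <= value)
    return 100 * count / len(data)

# Hypothetical test scores for 10 students
scores = [55, 60, 65, 70, 70, 75, 80, 85, 90, 95]
p = percentile_of(80, scores)  # 7 of 10 scores are <= 80, so 70.0
```

Interpretation: the value of 80 is at the 70th percentile; about 70% of the scores are less than or equal to 80.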
z-score (standardized score)
(data value - mean)/SD
Interpret z-score
the value of ______ is _______ (z-score) standard devs above/below the mean
Z-score and percentile can be calculated for distributions with…
any shape
z score positive =
value above mean
z score negative =
value below mean
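The z-score cards can be tried out in Python. The mean of 70 and SD of 10 are made-up numbers, and `z_score` is a hypothetical helper name:

```python
def z_score(value, mean, sd):
    """Standardized score: (data value - mean) / SD."""
    return (value - mean) / sd

# Hypothetical distribution with mean 70 and SD 10
z_high = z_score(85, mean=70, sd=10)  # 1.5: value is 1.5 SDs above the mean
z_low = z_score(60, mean=70, sd=10)   # -1.0: value is 1 SD below the mean
```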
Normal distribution
mound-shape (bell-curve) and symmetric
Many quantitative values can be modeled by a…
normal distribution
Normal distributions are determined by…
mean and SD
Empirical/68-95-99.7 Rule
about 68%, 95%, and 99.7% of the values fall within 1, 2, and 3 SDs of the mean
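The three percentages can be checked with Python's standard-library `statistics.NormalDist`. This is a sketch using the standard Normal distribution (mean 0, SD 1); `area_within` is a hypothetical helper name.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

def area_within(k):
    """Proportion of a Normal distribution within k SDs of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

# Areas within 1, 2, and 3 SDs: roughly 0.68, 0.95, and 0.997
within = {k: area_within(k) for k in (1, 2, 3)}
```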
Area to left of Normal distribution
Find the z-score of the value x
Use the z-score to find the area (probability) to its left, with a table or technology
Area to right of Normal distribution
1 - (area to left of the value)
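Both area calculations can be sketched in Python with the standard-library `statistics.NormalDist` in place of a z-table. The mean of 70, SD of 10, and value of 85 are made-up numbers; the function names are hypothetical.

```python
from statistics import NormalDist

def area_left(x, mean, sd):
    """Area to the left of x: convert x to a z-score, then use the Normal CDF."""
    z = (x - mean) / sd
    return NormalDist().cdf(z)  # NormalDist() defaults to mean 0, SD 1

def area_right(x, mean, sd):
    """Area to the right of x is 1 minus the area to the left."""
    return 1 - area_left(x, mean, sd)

# Hypothetical Normal distribution with mean 70 and SD 10; x = 85 gives z = 1.5
left = area_left(85, mean=70, sd=10)    # roughly 0.9332
right = area_right(85, mean=70, sd=10)  # roughly 0.0668
```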