Population
All the individual items that could be studied
Sample
A selection of items from the population
Subjects/sample
Individual items in a sample
Observations
Data collected from the subjects/sample
Variables
The differences between subjects you may want to study
Data
Information, often numerical, that is interpreted to extract meaning
Quantitative data
Information that possesses a directly measurable (numerical) value
Qualitative data
Information/data that is non-numerical in nature, usually in the form of a description. Sometimes converted to quantitative data for analysis
Categorical data
Qualitative data which fits into named categories with no in between categories (like blood group)
Nominal data
Qualitative data with no ordering to the categories (e.g. species of snail)
Ordinal data
Qualitative data whose categories have an order (e.g. sporting scores, chilli hotness score), but the ordering is not mathematically linked (the gaps between ranks are not meaningful)
Discrete data
Quantitative data that can only take certain values (no half people in a room)
Continuous data
Quantitative data which can take any value within a given range
Distribution
A description of how the values in a dataset are spread out (the shape of the data)
3 ways of measuring central tendency
Mean
Median
Mode
Mean
Sum of all the values divided by the total number of values
What is the mean represented by (population mean)
Mu (μ, Greek letter)
What is the mean represented by (sample from a larger population)
x bar (x̄)
When is the median used
When the data are skewed and the mean is not reliable
Median
The point at which half of the data points are above and half are below
What happens to the median if there is an even number of values in the sample
The median is the mean of the two middle values
Resistant
Not strongly affected by extreme values (the median is resistant; the mean is not)
Mode
Most frequently occurring value
When is the mode most useful
Nominal variables (e.g. eye color) where mean and median are not useful
Name for a distribution with 2 modes
Bimodal
Name for a distribution with more than 2 modes
Multimodal
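A minimal sketch of the three measures of central tendency above, using Python's standard `statistics` module. The numbers are made up for illustration and are not from the course material.

```python
from statistics import mean, median, multimode

values = [2, 3, 3, 5, 7, 10]  # made-up sample with an even number of values

sample_mean = mean(values)      # sum of all values / number of values
sample_median = median(values)  # even count: mean of the two middle values (3 and 5)
modes = multimode(values)       # list of every most frequently occurring value

# A made-up bimodal example: two values tie for most frequent
bimodal_modes = multimode([1, 1, 2, 2, 3])

print(sample_mean, sample_median, modes, bimodal_modes)
```

`multimode` returns a list, so a bimodal or multimodal dataset shows up as a list with more than one entry.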
Why isn't central tendency enough
2 data sets may have the same means but completely different dispersion
Range
Max-min
How is variance represented
s² (sample variance; population variance is σ²)
what limits the usefulness of the range
strongly affected by outliers
what is the IQR
Q3 - Q1 (interquartile range: third quartile minus first quartile)
what does the central box of a boxplot represent
IQR
what does the line in the box of the boxplot represent
median
what do the whiskers of a boxplot represent
the maximum and minimum (excluding any outliers, which are plotted as separate points)
how is an observation determined to be an outlier
if it is more than 1.5 x IQR above the 3rd quartile or below the 1st quartile
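A rough sketch of the 1.5 x IQR outlier rule, using made-up data. Quartile conventions differ between textbooks and software, so the `method="inclusive"` choice here is an assumption, and other methods can give slightly different fences.

```python
from statistics import quantiles

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]  # made-up sample

# Assumed quartile convention: "inclusive" (one of several in common use)
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1  # interquartile range = Q3 - Q1

lower_fence = q1 - 1.5 * iqr  # anything below this is flagged
upper_fence = q3 + 1.5 * iqr  # anything above this is flagged
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(q1, q2, q3, iqr, outliers)
```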
variance definition
measurement of how far each number in a data set is from the mean and from every other number in the set
overall degree that data points differ from the mean
what can the variance tell us
the spread/dispersion
steps to calculate variance
find difference between each value and the mean
square the differences
add the squared differences together
divide by the number of data points minus 1 (if working with a sample - if working with the population, divide by number of data points)
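The steps above can be sketched in Python as follows. The function name `variance_by_steps` and the data are made up for illustration; the result is checked against the standard library's `variance` and `pvariance`.

```python
from statistics import variance, pvariance

def variance_by_steps(values, sample=True):
    """Variance following the four steps; sample=True divides by n - 1."""
    n = len(values)
    mean = sum(values) / n
    deviations = [x - mean for x in values]   # step 1: difference from the mean
    squared = [d ** 2 for d in deviations]    # step 2: square the differences
    total = sum(squared)                      # step 3: add the squared differences
    return total / (n - 1 if sample else n)   # step 4: divide by n - 1 (sample) or n (population)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values

sample_var = variance_by_steps(data, sample=True)
population_var = variance_by_steps(data, sample=False)
print(sample_var, population_var)
```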
why do we square the deviations when calculating variance
to account for negative numbers (so that positive and negative deviations do not cancel out)
what is the most commonly used measure of spread
standard deviation
when is standard deviation useful
as a measure of dispersion of NORMALLY DISTRIBUTED data
what is the 68-95-99.7 rule
for normally distributed data, no matter what the standard deviation and the mean are, the area within the mean ± 1 S.D is about 68%, the area within the mean ± 2 S.D is about 95%, and the area within the mean ± 3 S.D is about 99.7%
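A quick way to check the 68-95-99.7 figures is to compute the areas from a normal distribution directly. This sketch uses `statistics.NormalDist` from the standard library; the helper name `area_within` and the example mean and standard deviation are made up.

```python
from statistics import NormalDist

def area_within(k, mu=0.0, sigma=1.0):
    """Proportion of a normal distribution lying within mu ± k * sigma."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

for k in (1, 2, 3):
    # The proportion is the same whatever mean and standard deviation are used
    print(k, round(area_within(k), 4), round(area_within(k, mu=100, sigma=15), 4))
```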
nearly all values/observations fall ..?
within 2 standard deviations of the mean (about 95% for normally distributed data)
what can you use S.D for
to determine if your data are normally distributed
how to use S.D to determine if data are normally distributed
if the value of mean ± 2 S.D is clearly impossible (e.g. a negative count), the data are NOT normally distributed. Only applies if the sample size is 50+
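A rough sketch of this check, assuming "clearly impossible" means falling outside a known physical limit (for example, below zero for a count). The function name, its parameters and both datasets are made up for illustration.

```python
from statistics import mean, stdev

def two_sd_range_is_possible(values, lowest_possible=None, highest_possible=None):
    """Return False if mean ± 2 S.D falls outside the stated possible limits."""
    m = mean(values)
    sd = stdev(values)  # sample standard deviation
    low, high = m - 2 * sd, m + 2 * sd
    if lowest_possible is not None and low < lowest_possible:
        return False
    if highest_possible is not None and high > highest_possible:
        return False
    return True

# Made-up skewed counts (n = 50): mean - 2 S.D is negative, which a count cannot be
skewed_counts = [0] * 40 + [1] * 5 + [20] * 5
# Made-up roughly symmetric values (n = 51)
symmetric_values = list(range(40, 91))

print(two_sd_range_is_possible(skewed_counts, lowest_possible=0))
print(two_sd_range_is_possible(symmetric_values, lowest_possible=0))
```

Passing the check does not prove the data are normal; it only fails to rule it out.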
symbol for standard deviation
sigma (σ) for a population; s for a sample
symbol for population variance
sigma squared (σ²)
symbol for sample variance
s squared (s²)
symbol for population mean
mu (μ)
symbol for sample mean
x bar (x̄)
description of normal distribution
bell curve
centered on mean
most of data clustered around center
in perfect normal distribution, mean, median and mode would be the same
kurtosis definition
measure of how heavy (fat) or light (thin) the tails of a distribution are
mesokurtic
medium tailed
medium outlier freq
moderate kurtosis
normal distribution
platykurtic
thin tailed
low outlier freq
low kurtosis
uniform distribution
leptokurtic
fat tailed
high outlier freq
high kurtosis
Laplace distribution
z score definition
the number of standard deviation units that an observation is away from the population mean
positive vs negative z score
if an observation is 1 S.D above the population mean, its z score is +1
if it is 1 S.D below the population mean, its z score is -1
formula for z score
deviation from the mean divided by the standard deviation: z = (x - μ) / σ
slide 15 week 14
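A minimal sketch of the z score formula. The function name and the example population (mean 100, standard deviation 15) are made up for illustration.

```python
def z_score(x, pop_mean, pop_sd):
    """Number of standard deviations that x lies from the population mean."""
    return (x - pop_mean) / pop_sd

above = z_score(115, pop_mean=100, pop_sd=15)  # 1 S.D above the mean
below = z_score(85, pop_mean=100, pop_sd=15)   # 1 S.D below the mean
print(above, below)
```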