independent variables influence the
dependent variable
categorical variables
numerals are assigned to represent categories; the numbers are labels with no quantitative meaning, which limits what you can do with them
2 types of categorical variables
nominal and ordinal
nominal
2 or more unranked categories
ordinal
2 or more ranked categories
numeric variables
numbers represent an amount/quantity and not a ranking
2 types of numeric variables
discrete and continuous
discrete
takes on only specific, countable values within a given range (e.g., number of falls)
continuous
scores can in theory occur anywhere along a continuum (e.g., distance, ROM, strength, gait speed, blood pressure)
what is continuous constrained by
the precision of the measuring instrument (e.g., a timer that cannot resolve milliseconds)
parametric testing
assumes the sample data are normally distributed (bell shaped); more powerful and preferred when its assumptions are met
parametric testing 2 requirements
quantitative data and has normal distribution
nonparametric testing
no assumption about normality
what is nonparametric testing done with
categorical data, or quantitative data that is not normally distributed
visual checks for distribution/normality
histograms, stem-and-leaf plots, Q-Q plots
numeric checks for distribution/normality
frequency tables; skewness and kurtosis statistics; Kolmogorov-Smirnov (KS) and Shapiro-Wilk (SW) tests
normal distribution def
most scores fall in the middle, with fewer at the extremes; often assessed with a histogram
histogram
graph with the observed values on the X axis and the frequency (count) of each value on the Y axis
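A minimal sketch of such a histogram in Python (NumPy + Matplotlib; the sample data here is simulated, not from any real study):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)
scores = rng.normal(loc=1.2, scale=0.2, size=200)  # simulated scores

# Observed values on the X axis, frequency of each value on the Y axis
plt.hist(scores, bins=20, edgecolor="black")
plt.xlabel("Observed value")
plt.ylabel("Frequency")
plt.title("Histogram as a visual normality check")
plt.show()
```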
skew
distribution is asymmetrical, normality is not met
positive skew
right skew, scores are bunched at low values
negative skew
left skew, scores are bunched at high values
kurtosis def
refers to peakedness and degree to which scores cluster at tails
positive kurtosis
too peaked, with long tails (leptokurtic)
negative kurtosis
too flat, with short tails (platykurtic)
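Both statistics are easy to check numerically; a small sketch using scipy.stats, with two simulated samples invented for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
symmetric = rng.normal(size=1000)          # roughly normal sample
right_skewed = rng.exponential(size=1000)  # tail toward high values

# skew > 0 -> positive (right) skew; skew < 0 -> negative (left) skew
print(stats.skew(symmetric), stats.skew(right_skewed))

# scipy reports excess kurtosis (normal = 0): positive -> too peaked / long tails,
# negative -> too flat / short tails
print(stats.kurtosis(symmetric), stats.kurtosis(right_skewed))
```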
QQ plot
plots quantiles of variable vs quantiles of theoretical distribution
QQ plot y axis
expected (theoretical) quantiles
QQ plot x axis
observed (sample) quantiles
normality of QQ plot is shown if
data fall along straight line on plot
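A quick way to draw one, assuming SciPy and Matplotlib are available; note that scipy's probplot puts the theoretical quantiles on the X axis, the reverse of the orientation described above, but normality still shows up as points on the straight line:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(1)
scores = rng.normal(loc=50, scale=10, size=100)  # made-up sample

# Points falling along the straight reference line suggest normality
stats.probplot(scores, dist="norm", plot=plt)
plt.title("Q-Q plot against a theoretical normal distribution")
plt.show()
```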
KS and SW tests are for
testing whether a distribution deviates significantly from the normal distribution
If p is less than 0.05
data is significantly different from normal, NOT normally distributed
if p is more than 0.05
data is not significantly different from normal, it is normally distributed
for small samples under 50 use
SW
for large samples over 50 use
SW or KS
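A sketch of both tests with scipy.stats, using a made-up small sample; since the normal's mean and SD are estimated from the same data, the KS p-value here is only approximate:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
scores = rng.normal(loc=100, scale=15, size=40)  # made-up small sample (n < 50)

# Shapiro-Wilk: the choice for small samples
sw_stat, sw_p = stats.shapiro(scores)

# Kolmogorov-Smirnov against a normal with the sample's own mean and SD
ks_stat, ks_p = stats.kstest(scores, "norm", args=(scores.mean(), scores.std(ddof=1)))

for name, p in [("SW", sw_p), ("KS", ks_p)]:
    # p < 0.05 -> significantly different from normal; p >= 0.05 -> consistent with normal
    verdict = "NOT normally distributed" if p < 0.05 else "normally distributed"
    print(f"{name}: p = {p:.3f} -> {verdict}")
```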
3 things for central tendency
mean, median, mode
mode
most frequent score, summarizes categorical data well
median
middle score when the data are ordered; useful with ordinal data or quantitative data
what is preferred for skewed data/outliers
median
mean
the average score; the most stable measure across samples
downside of mean
not good for categorical data, or for quantitative data that is skewed or has outliers
if data has outliers, which is more reliable
median
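A small demonstration of how an outlier pulls the mean but barely moves the median, using only the Python standard library (the scores are made up):

```python
import statistics

scores = [4, 5, 5, 6, 7, 8, 9]
with_outlier = scores + [60]  # one extreme high score

print(statistics.mean(scores), statistics.mean(with_outlier))      # mean jumps from ~6.3 to 13.0
print(statistics.median(scores), statistics.median(with_outlier))  # median barely moves: 6 -> 6.5
print(statistics.mode(with_outlier))                               # most frequent score: 5
```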
5 measures of variability/dispersion
range, quantiles, variance, SD, coefficient of variation
Range
largest minus smallest
quantiles
splitting the ordered data into equal parts
-quartiles or percentiles
quartiles
3 values that split data into 4 even parts
lower (1st) quartile
median of the lower half
2nd quartile
the median of the entire data set
3rd quartile (upper)
median of upper half
interquartile range
upper quartile minus lower quartile (Q3 - Q1)
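A sketch of the quartile and IQR calculations with NumPy (made-up scores; note that np.percentile interpolates linearly by default, so results can differ slightly from the "median of each half" method):

```python
import numpy as np

scores = np.array([2, 4, 4, 5, 7, 8, 9, 10, 12, 15, 21])

q1, q2, q3 = np.percentile(scores, [25, 50, 75])  # lower quartile, median, upper quartile
iqr = q3 - q1                                     # interquartile range = Q3 - Q1
data_range = scores.max() - scores.min()          # range = largest - smallest

print(f"Q1={q1}, median={q2}, Q3={q3}, IQR={iqr}, range={data_range}")
```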
deviance
how far each score is from the center of the distribution
sum of squares indicates
total dispersion; the sum of squared deviations of scores from the mean
standard deviation
square root of the variance (the variance is the sum of squares divided by n - 1)
coefficient of variation (CV)
ratio of the standard deviation to the mean, expressed as a percentage
CV formula
CV = 100 × (SD / mean)
why is the coefficient of variation unitless
the units cancel in the SD-to-mean ratio, which makes it helpful for comparing distributions from different samples with different means or units
Lower CV indicates
less relative variation and more stability
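A minimal sketch of the CV comparison across samples with different units (both samples and their units are invented for illustration):

```python
import numpy as np

# Two made-up samples measured in different units
gait_speed = np.array([1.1, 1.3, 1.2, 1.4, 1.0])  # m/s
grip_strength = np.array([28, 35, 31, 40, 26])    # kg

def cv(x):
    # CV = 100 * (SD / mean); the units cancel in the ratio
    return 100 * np.std(x, ddof=1) / np.mean(x)

# The sample with the lower CV shows less relative variation (more stability)
print(f"gait speed CV    = {cv(gait_speed):.1f}%")
print(f"grip strength CV = {cv(grip_strength):.1f}%")
```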
2 things for statistical inference (prediction)
probability, sampling error
probability
likelihood an event will occur given all possible outcomes
sampling error
extent to which a statistic varies in samples taken from same population
z score
expresses a score in terms of how many SDs it lies from the mean
formula for z score
Z = (X - mean) / SD
5% of z scores lie where
beyond -1.96 and +1.96 (2.5% in each tail)
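A short sketch computing z scores by hand and confirming the 1.96 cutoff with scipy.stats.norm (the sample values are made up):

```python
import numpy as np
from scipy import stats

scores = np.array([95, 100, 105, 110, 85, 120, 90, 115])  # made-up sample

# Z = (X - mean) / SD, using the sample mean and SD
z = (scores - scores.mean()) / scores.std(ddof=1)
print(np.round(z, 2))

# 95% of z scores lie between -1.96 and +1.96 (2.5% in each tail)
print(stats.norm.ppf(0.975))                         # ~1.96
print(stats.norm.cdf(1.96) - stats.norm.cdf(-1.96))  # ~0.95
```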
smaller sampling error means
less difference between sample and population mean
standard error of mean (SEM)
the standard deviation of the sampling distribution of the mean
SEM is an indicator of
how close sample mean is to the true population mean (lower value suggests that sample deviates less)
SEM calculation
SEM = SD / √n, where n is the sample size
as sample size increases, what happens to SEM
it gets smaller, which is good (the sample mean estimates the population mean more precisely)
confidence intervals
range within which we believe the true population parameter lies
95% CI means
we are 95% confident that the population mean lies within this range
CI calculation
CI = mean ± (z × SEM); for a 95% CI, z = 1.96
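A sketch of the SEM and 95% CI calculations with NumPy (made-up scores; for a sample this small, a t critical value rather than z = 1.96 would be slightly more accurate):

```python
import numpy as np

scores = np.array([22, 25, 19, 30, 27, 24, 21, 26, 23, 28])  # made-up sample
n = len(scores)
mean = scores.mean()
sd = scores.std(ddof=1)

sem = sd / np.sqrt(n)  # SEM = SD / sqrt(n); shrinks as n grows

# 95% CI = mean +/- z * SEM, with z = 1.96
ci_low, ci_high = mean - 1.96 * sem, mean + 1.96 * sem

print(f"mean = {mean:.2f}, SEM = {sem:.2f}")
print(f"95% CI: ({ci_low:.2f}, {ci_high:.2f})")
```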
null hypothesis (H0) means
there is no difference
alternative hypothesis (HA) means
there is a difference
Statistical tests are based on which hypothesis
the null; tests either reject it or fail to reject it
if p is less than 0.05
reject H0 and accept HA
-a difference exists
if p is greater than 0.05
fail to reject H0
-no difference
type I errors
reject H0 when H0 is true
-a false positive: saying there is a difference when there is not
type II errors
fail to reject H0 when H0 is false
-a false negative: saying there is no difference when one exists
power
probability of finding a statistical difference when one truly exists (power = 1 - β, where β is the Type II error rate)
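Power can be illustrated with a simple Monte Carlo sketch: simulate many experiments where a real difference exists and count how often a t-test detects it. The effect size, SD, and group size below are arbitrary assumptions:

```python
import numpy as np
from scipy import stats

# Monte Carlo sketch: how often does a two-sample t-test detect a true difference?
# Assumed scenario: true mean difference = 5, SD = 10, n = 30 per group.
rng = np.random.default_rng(3)
alpha, n_sims, hits = 0.05, 5000, 0

for _ in range(n_sims):
    a = rng.normal(loc=0, scale=10, size=30)
    b = rng.normal(loc=5, scale=10, size=30)  # a real difference exists (H0 is false)
    _, p = stats.ttest_ind(a, b)
    if p < alpha:  # the test correctly rejects H0
        hits += 1

# The share of simulations that found the difference estimates the power
print(f"estimated power: {hits / n_sims:.2f}")
```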