Parametric vs non-parametric tests (contrast)
parametric :
known specific distribution, e.g. Normal
big samples
specific parameters
usually quantitative data used
non-para :
distribution unknown
small sample size
qualitative data
don't require specific parameters
tests rely on ranks , signs , frequencies
considering relative position
2 types of data (make a tree , what's the defining factor of ratio or interval quant data , differences btwn the 2 , give 2 examples)
Quantitative - difference? - consider if there's a true zero point
Ratio : meaningful differences + a true 0 → height , weight , temp in K
Interval : meaningful differences (no true 0) → time of day , temp °C , IQ score
Qualitative
Nominal : just categories , no order → make of car, gender, a month name
Ordinal : categories w an order, relative rank → education level (UG,PG,PhD), grade symbol (A,B…)
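A small sketch of how nominal vs ordinal data can be stored in R. The values are made up for illustration; `make` and `edu` are hypothetical names, not from the notes.

```r
# Hypothetical values, made up for illustration only
make <- factor(c("Toyota", "Ford", "Toyota"))       # nominal: categories, no order
edu  <- factor(c("PG", "UG", "PhD", "UG"),
               levels = c("UG", "PG", "PhD"),
               ordered = TRUE)                      # ordinal: UG < PG < PhD

is.ordered(make)            # nominal factor carries no order
is.ordered(edu)             # ordinal factor does
edu[1] > edu[2]             # comparing PG with UG is meaningful for ordinal data

# Ordinal data can be ranked via the level codes (ties share the avg rank)
edu_ranks <- rank(as.integer(edu))
edu_ranks
```

Ranking only makes sense for the ordered factor, which is why rank-based tests need at least ordinal data.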
Non-para tests we look at
Wilcoxon signed rank sum
Mann-Whitney-Wilcoxon test
Kruskal-Wallis
Friedman
Spearman's rank correlation
Ranking data
R function = rank()
order data low to high (unless otherwise specified)
assign ranks according to rel position of data
Ties ? - Yes : take avg of ranks
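Quick sketch of the ranking rules above using `rank()`. The vector `x` is made-up data, only there to show how ties get the average rank.

```r
# Hypothetical scores, made up for illustration only
x <- c(12, 7, 7, 15, 9)

# rank() orders low to high; tied values get the average of the ranks they span
# (ties.method = "average" is the default)
r <- rank(x)
r

# If high to low is asked for instead, rank the negated values
r_desc <- rank(-x)
r_desc
```

The two 7s would take ranks 1 and 2, so each gets (1 + 2)/2 = 1.5.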
Wilcoxon signed rank sum test (when we use , what are we testing , null & alt hypothesis , data used , assumptions)
when : comparing 2 matched (related in some way) quantitative samples
what we test : median of differences
Hypotheses
H0 : median of diffs = 0
H1 : median of diffs ≠ , > or < 0
Data we look at:
2 paired samples
Quantitative interval OR ratio data
Assumptions:
under H0 : pop of differences symmetric around its median
the n paired diffs are independent & a random sample
Calculating test stat (what's W + what's it measuring + overall what are we investigating , what's it mean when W near 0 , what does it mean & what do we do when n > 10)
calc diff for each pair
ignore pairs w diff = 0
n = no. of non-zero diffs
note sign (±) of each paired diff
rank (small to big) abs val of the diffs , avg the ranks of ties
put sign back on each rank
Calc W (test stat) = sum of signed ranks
Note : W close to 0 means no diff
wanna see if there's a diff btwn our matched samples
W is measuring if the signs across the ranks are randomly distributed or there's a pattern
When sample size > 10:
sampling dist of W approx normal , miu = 0 , sigma = root( n(n+1)(2n+1)/6 )
1. calc test stat z = (W - miu) / sigma as normal
2. do hyp test as normal
3. p-value → pnorm(test stat, lower.tail = F) for upper tail ; pnorm(test stat) for lower tail ; 2*pnorm(abs(test stat), lower.tail = F) for 2 sided
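The steps above can be sketched in R like this. The `before` / `after` values are made up (not from the course data), and the normal approximation assumes miu = 0 and sigma = root(n(n+1)(2n+1)/6) for the sum of signed ranks.

```r
# Hypothetical paired measurements, made up for illustration only
before <- c(72, 65, 80, 58, 77, 69, 84, 61, 75, 70, 66, 79)
after  <- c(75, 64, 86, 58, 82, 73, 83, 68, 80, 72, 71, 88)

d <- after - before            # diff for each pair
d <- d[d != 0]                 # ignore pairs with diff = 0
n <- length(d)                 # n = no. of non-zero diffs

# rank the absolute diffs (ties averaged), then put the sign back
signed_ranks <- sign(d) * rank(abs(d))
W <- sum(signed_ranks)         # test stat = sum of signed ranks

# Normal approximation (n > 10), assuming miu = 0 under H0
mu    <- 0
sigma <- sqrt(n * (n + 1) * (2 * n + 1) / 6)
z     <- (W - mu) / sigma

p_upper <- pnorm(z, lower.tail = FALSE)            # H1: median of diffs > 0
p_two   <- 2 * pnorm(abs(z), lower.tail = FALSE)   # H1: median of diffs != 0
c(n = n, W = W, z = z, p_two = p_two)
```

Note that the built-in `wilcox.test(after, before, paired = TRUE)` reports V = sum of the positive ranks only, so its statistic won't equal W; the two are linked by W = 2V - n(n+1)/2.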
Mann-Whitney-Wilcoxon test
objective : see if 2 independent samples of ordinal or quantitative data have same median
Assumptions :
2 random samples of sizes n1 & n2
data quantitative or ordinal
samples + obs in each sample independent
pop distributions same shape , only differ in location (if the median is diff)
H0 : pop medians same
H1 : pop medians diff , or 1st median to the right/left of 2nd ie bigger or smaller
test stat
combine data sets into one
rank all obs small → big (avg the ranks of ties)
sum of ranks per sample (note which sample each obs comes from) → T1 , T2
small sample (n1 and/or n2 < 10):
T = T1
Critical region : TL from table , TU = n1(n1 + n2 + 1) - TL
Reject H0 if : T <= TL or T >= TU
Big sample:
T approx normal dist
calc z score (test stat) : z = (T - miuT) / sigmaT , miuT = n1(n1 + n2 + 1)/2 , sigmaT = root( n1 n2 (n1 + n2 + 1)/12 )
Critical val → qnorm(alpha, lower.tail = F or T) (use alpha/2 if 2 sided)
p value → pnorm(test stat, lower.tail = F)
Reject H0 if z in critical region OR p-value <= alpha
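A minimal sketch of the big-sample version in R. The samples `s1` and `s2` are made up, and ranks run small → big (the usual convention, and what `rank()` does). The rank sum is stored as `T1` since `T` already means TRUE in R.

```r
# Hypothetical independent samples, made up for illustration only
s1 <- c(23, 31, 28, 35, 26, 30, 33, 27, 29, 36)
s2 <- c(21, 25, 19, 24, 32, 22, 20, 34, 18, 17, 16, 15)
n1 <- length(s1)
n2 <- length(s2)

# combine into one data set and rank everything, small to big
r  <- rank(c(s1, s2))
T1 <- sum(r[seq_len(n1)])      # rank sum of sample 1
T2 <- sum(r[-seq_len(n1)])     # rank sum of sample 2

# normal approximation for T = T1
mu_T    <- n1 * (n1 + n2 + 1) / 2
sigma_T <- sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
z       <- (T1 - mu_T) / sigma_T

alpha  <- 0.05
z_crit <- qnorm(alpha / 2, lower.tail = FALSE)     # 2 sided critical value
p_two  <- 2 * pnorm(abs(z), lower.tail = FALSE)    # 2 sided p-value
reject <- abs(z) >= z_crit
c(T1 = T1, T2 = T2, z = z, p_two = p_two)
```

The built-in `wilcox.test(s1, s2)` reports a shifted version of this statistic, W = T1 - n1(n1 + 1)/2, so its number won't match T1 directly.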
Kruskal-Wallis test
when? - compare 2+ independent grps/samples of ordinal or quantitative data by medians (ie see if from same pop)
what we test: medians across groups
Hypotheses:
H0: all population medians are equal
H1: at least two population medians different
Data we look at:
ni >= 3 (at least 3 obs in each sample)
Quantitative or ordinal data
Assumptions:
treatment lvls and obs within treatment lvls are independent and random
distribution of scores in each group is the same shape
Test stat :
Combine obs from all grps to one sample
Rank small to big
Avg the ranks of tied obs
Calc sum of ranks T1 → TK
compute H : H = 12/(n(n+1)) * sum(Ti^2 / ni) - 3(n+1) , n = total no. of obs , ni = no. of obs in grp i
Critical region : H follows chi squared dist , k-1 df , 1 sided upper tail test
Reject H0 : H >= critical val OR p-value < alpha
R : kruskal.test(formula = num ~ name, data = kw1)
kruskal.test(formula = num ~ name, data = kw2)
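Sketch of the H calculation by hand, checked against `kruskal.test()`. The data frame `kw_demo` is made up (it just reuses the `num` / `name` column names); it is not the `kw1` or `kw2` data. The hand formula assumed here is the standard no-ties form H = 12/(n(n+1)) * sum(Ti^2/ni) - 3(n+1).

```r
# Hypothetical data, made up for illustration only (3 groups, 5 obs each)
kw_demo <- data.frame(
  num  = c(27, 31, 25, 29, 34,
           38, 41, 36, 44, 40,
           22, 19, 24, 21, 26),
  name = rep(c("A", "B", "C"), each = 5)
)

n <- nrow(kw_demo)
r <- rank(kw_demo$num)                       # rank the combined sample

T_sums <- tapply(r, kw_demo$name, sum)       # rank sums T1 ... Tk
n_i    <- tapply(r, kw_demo$name, length)    # group sizes
k      <- length(T_sums)

# H by hand (no tie correction, fine here since there are no ties)
H     <- 12 / (n * (n + 1)) * sum(T_sums^2 / n_i) - 3 * (n + 1)
p_val <- pchisq(H, df = k - 1, lower.tail = FALSE)   # upper tail

# same test with the built-in function
fit <- kruskal.test(num ~ name, data = kw_demo)
c(H = H, p = p_val, H_builtin = unname(fit$statistic))
```

With tied observations `kruskal.test()` applies a tie correction, so the hand value and the built-in value can differ slightly.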
Friedman test
When? - compare 2+ related grps/samples of ordinal or quantitative data from matched or blocked samples , by medians (ie see if same pop)
Data
ordinal or quantitative
data from blocked experiment w b blocks (similar exp units grped )
k = no. of treatments
b = no. of blocks
Assumptions
measurements within block dependent
measurements from diff blocks independent
No interaction btwn blocks + treatments (treatments = what's administered to exp units , ie the pops/variables being compared in test)
Friedman test
H0: all population medians are equal
H1: at least two population medians different
Rank small to big within each block
Avg the ranks of tied obs
Calc sum of ranks per treatment T1 → Tk
Test stat (Fr) : if k or b >= 5 ; approx chi squared dist , k-1 df
Reject H0 : Fr >= critical val OR p-value < alpha
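Sketch of the Friedman steps in R, checked against `friedman.test()`. The matrix `y` is made up (rows = blocks, columns = treatments). The hand formula assumed here is the standard no-ties form Fr = 12/(b k (k+1)) * sum(Tj^2) - 3 b (k+1), which is not written out in the notes.

```r
# Hypothetical blocked data, made up for illustration only:
# b = 6 blocks (rows), k = 3 treatments (columns)
y <- matrix(c(10, 14, 12,
               9, 15, 11,
              13, 16, 12,
               8, 13, 10,
              11, 17, 14,
              12, 15, 16),
            nrow = 6, byrow = TRUE,
            dimnames = list(paste0("block", 1:6), c("A", "B", "C")))
b <- nrow(y)
k <- ncol(y)

# rank small to big WITHIN each block (apply returns k x b, so transpose)
r <- t(apply(y, 1, rank))
T_sums <- colSums(r)                         # rank sum per treatment

# Fr by hand (no tie correction, fine here since no ties within blocks)
Fr    <- 12 / (b * k * (k + 1)) * sum(T_sums^2) - 3 * b * (k + 1)
p_val <- pchisq(Fr, df = k - 1, lower.tail = FALSE)

# same test with the built-in function (matrix input: rows are blocks)
fit <- friedman.test(y)
c(Fr = Fr, p = p_val, Fr_builtin = unname(fit$statistic))
```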
Spearman rank correlation coefficient test
when? - measure association btwn 2 samples/variables of quantitative or ordinal data
Data
n randomly chosen paired observations
n = no. pairs in the data
n >= 10 → rs approx normal dist
Assumptions
both variables at least ordinal (ordinal or quantitative) & at least 1 variable not normal
Spearman test
H0 : ρs = 0 (no association between the 2 vars in underlying pop)
H1 : ρs ≠ 0 (there is an association) , > 0 (correlation is positive) , < 0 (correlation is negative)
Test stat :
rank X & Y values separately
calc difference d within each pair of ranks , d = rank(xi) - rank(yi)
calc rs = 1 - 6*sum(d^2) / (n(n^2 - 1)) (when no ties)
(large sample , normal dist) : z = rs * root(n-1)
Reject H0 : p-value <= alpha
p-value for norm dist = pnorm()
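Sketch of the Spearman steps in R. The pairs `x` / `y` are made up, and rs is computed with the standard no-ties shortcut rs = 1 - 6*sum(d^2)/(n(n^2 - 1)), then checked against `cor(..., method = "spearman")`.

```r
# Hypothetical paired observations, made up for illustration only (n = 10)
x <- c(2, 5, 1, 8, 4, 7, 3, 10, 6, 9)
y <- c(52, 60, 48, 75, 63, 70, 50, 88, 66, 80)
n <- length(x)

# rank X and Y separately, then take the difference within each pair
d <- rank(x) - rank(y)

# Spearman's rs via the no-ties shortcut formula
rs <- 1 - 6 * sum(d^2) / (n * (n^2 - 1))

# large-sample z and p-values
z       <- rs * sqrt(n - 1)
p_upper <- pnorm(z, lower.tail = FALSE)            # H1: rho_s > 0
p_two   <- 2 * pnorm(abs(z), lower.tail = FALSE)   # H1: rho_s != 0

# built-in check (Pearson correlation of the ranks)
rs_builtin <- cor(x, y, method = "spearman")
c(rs = rs, rs_builtin = rs_builtin, z = z, p_two = p_two)
```

With ties the shortcut formula is only approximate, so `cor(x, y, method = "spearman")` is the safer value to use there.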
Advantages of non-para tests (4) & disadvantage
Advantages
useful when assumptions of para test uncertain
useful when n small
few assumptions
not restricted to quantitative data
Disadvantage
Info gets lost thru ranking & taking signs → less power compared to parametric tests