59 Terms

1

Mann-Whitney U Test

Nonparametric test used for two independent samples; equivalent to an unpaired t-test.

2

Wilcoxon Signed-Rank Test

Nonparametric test used for two dependent samples (paired data); equivalent to a paired t-test.

3

Nonparametric tests underlying procedure

Rank the data, and use these ranks to perform the test, rather than the raw data itself.

4

Hypotheses for Wilcoxon signed-rank test

H0: the median difference is 0; H1: the median difference is not 0.

5

Corrected R code to install nortest package

install.packages("nortest")

6

Purpose of lillie.test(Differences) in R

Test of normality on 'Differences'.

7

Bootstrapping Method (Confidence Intervals)

Used when the sample size is small (n < 30) and the underlying distribution is not normal.

8

Default Resamples in Bootstrap Confidence Intervals

1000

9

R code for 99% BCa and Percentile Bootstrap Confidence Intervals

boot.ci(name of bootstrap output, conf = 0.99, type = c("bca", "perc"))

10

95% Bootstrap Confidence Interval (using normal approximation)

x̄ ± 1.96 * standard error

11

99% Bootstrap Confidence Interval (using normal approximation)

x̄ ± 2.58 * standard error
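
The two normal-approximation intervals above can be sketched in Python. This is a minimal illustration, assuming the standard error is taken as the standard deviation of the resampled means; the function name, the sample values and the seed are hypothetical, and 1000 resamples follows the default quoted in this deck.

```python
import random
import statistics


def bootstrap_normal_ci(sample, resamples=1000, z=1.96, seed=0):
    """Normal-approximation bootstrap CI for the mean: x-bar +/- z * SE."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    # Resample with replacement, keeping the original sample size each time.
    means = [
        statistics.mean(rng.choices(sample, k=len(sample)))
        for _ in range(resamples)
    ]
    standard_error = statistics.stdev(means)  # bootstrap estimate of the SE
    centre = statistics.mean(sample)
    return centre - z * standard_error, centre + z * standard_error


sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8, 6.1]  # made-up data
print("95%:", bootstrap_normal_ci(sample, z=1.96))
print("99%:", bootstrap_normal_ci(sample, z=2.58))
```

With the same seed, the 99% interval uses the same standard error as the 95% one and is wider only because of the larger multiplier.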

12

Mann-Whitney U Test

A test used to compare two independent groups when the sample size is small and the data is not normally distributed.

13

Null Hypothesis (H0) in Mann-Whitney U Test

The medians of the two groups are equal.

14

Alternative Hypothesis (H1) in Mann-Whitney U Test

The medians of the two groups are not equal.

15

U Statistic Calculation

U1 = R1 − n1(n1 + 1)/2 and U2 = R2 − n2(n2 + 1)/2, where R1 and R2 are the rank sums of the two groups, n1 and n2 are the group sizes, and U = min(U1, U2)

16

Decision Rule for Mann-Whitney U Test (Reject H0)

Reject H0 if U < Ucrit (critical value)

17

When do we use the bootstrapping method to create confidence intervals

Bootstrap confidence intervals are used when the sample is small (n < 30) and the underlying distribution is not normal.

18

Nonparametric equivalent of one-way repeated-measures ANOVA

Friedman Test

19

Nonparametric equivalent of one-way ANOVA

Kruskal-Wallis H Test

20

Nonparametric equivalent of paired t-test

Wilcoxon signed-rank test

21

Nonparametric equivalent of unpaired t-test

Mann-Whitney U test

22

What assumptions need to be broken to use Mann-Whitney U instead of unpaired t-test

-normality of samples

-homogeneity of variances

23

When to use Mann-Whitney U test

-Ordinal or Continuous data

-2 independent groups

-test of difference

24

What assumptions need to be broken to use Wilcoxon signed-rank instead of paired t-test

-the assumption of normality of the differences of the paired data is not met

-or we have ordinal data

25

What assumptions need to be broken to use Kruskal-Wallis H instead of one-way ANOVA

-normality of residuals

-homogeneity of variances

26

What assumptions need to be broken to use Friedman test instead of one-way repeated-measures ANOVA

-normality of residuals

- sphericity

27

When to use Kruskal-Wallis test?

-Extension of Mann-Whitney U

-3 or more independent groups

28

When to use Wilcoxon signed-rank test

-ordinal or continuous data

-Distribution of the differences must be symmetric

-2 dependent groups

-test of difference

29

When to use Friedman test

-Extension of Wilcoxon signed-rank

-3 or more dependent groups

30

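How to use the Wilcoxon signed-rank test

-Work out the difference for each pair

-Put the absolute differences in order, ignoring zero differences and signs

-Rank the absolute differences, averaging tied ranks

-Put the sign of each difference in a column next to its rank

-Sum the ranks of the negative differences to get T- and the ranks of the positive differences to get T+

-The minimum of (T-, T+) is the test statistic T

-Compare T to the critical value T crit from tables (n is the number of values excluding zero differences, and 0.05 is halved to give alpha)

-T < T crit means reject H0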
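
The ranking steps above can be sketched in Python. This is a minimal illustration with made-up paired data; the function names are hypothetical, and the critical value still has to be looked up in tables.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def wilcoxon_signed_rank(before, after):
    """Return (T, T_plus, T_minus, n) where n excludes zero differences."""
    diffs = [a - b for b, a in zip(before, after) if a - b != 0]
    ranks = average_ranks([abs(d) for d in diffs])
    t_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    t_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return min(t_plus, t_minus), t_plus, t_minus, len(diffs)


before = [10, 12, 9, 11, 14]  # made-up paired scores
after = [12, 11, 9, 15, 17]
print(wilcoxon_signed_rank(before, after))  # (T, T+, T-, n)
```

Here the third pair has a zero difference, so it is dropped and n is 4 rather than 5.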

31

How to use the Mann-Whitney U test

-Put data in order

-Work out ranks sorting out tied ranks

-sum up the ranks for each group

-Then calculate U1 and U2 (formulae given)

-The minimum of (U1, U2) is the test statistic U

-Compare to the critical value U crit from tables, using half the significance level

-U < U crit means reject H0
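
The steps above can be sketched in Python using U1 = R1 − n1(n1 + 1)/2 and U2 = R2 − n2(n2 + 1)/2. This is a minimal illustration with made-up data; the function names are hypothetical, and the critical value still comes from tables.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(group1, group2):
    """Return (U, U1, U2) from the rank sums of the pooled data."""
    ranks = average_ranks(list(group1) + list(group2))
    n1, n2 = len(group1), len(group2)
    r1 = sum(ranks[:n1])  # rank sum of group 1
    r2 = sum(ranks[n1:])  # rank sum of group 2
    u1 = r1 - n1 * (n1 + 1) / 2
    u2 = r2 - n2 * (n2 + 1) / 2
    return min(u1, u2), u1, u2


print(mann_whitney_u([3, 5, 8], [10, 12, 15, 9]))  # (U, U1, U2)
```

A quick check on any hand calculation is that U1 + U2 should equal n1 × n2.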

32

How to use Kruskal-Wallis H test?

-Put data in order

-Work out ranks sorting out tied ranks

-Sum up the ranks for each group

-Work out n for each group

-Work out H (formula given)

-Work out H corrected if needed (when there are tied ranks), which is H / CH

-Use χ2 distribution tables with k-1 degrees of freedom

-If H (or H corrected) ≥ χ2crit we reject H0
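
The steps above can be sketched in Python. The card only says "formula given", so this sketch assumes the standard forms H = 12/(N(N + 1)) × Σ(Ri²/ni) − 3(N + 1) and CH = 1 − Σ(t³ − t)/(N³ − N), where t is the size of each group of tied values; the function names and data are hypothetical.

```python
from collections import Counter


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Return (H, H corrected for ties, degrees of freedom)."""
    pooled = [x for group in groups for x in group]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    total, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)
    # Tie correction: each set of t tied values contributes t^3 - t.
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1 - ties / (n_total ** 3 - n_total)
    return h, h / correction, len(groups) - 1


print(kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```

With no tied values the correction factor is 1, so H and H corrected coincide.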

33

How to use Friedman test?

-Count the number of groups k (not including the participants column)

-Rank each participant's scores across their own groups (e.g. if there are 3 groups, rank 1 to 3 across each individual's scores). Ranking goes across the rows, not down the columns as in the other tests

-Account for tied ranks

-Sum each column

-Calculate F (formula given)

-For k=3 and n=2,3,...,9 or k=4 and n=2,3,4 we evaluate the test statistic by comparing it with values from Friedman tables.

-If F (or F corrected) ≥ Fcrit we reject H0

-Otherwise, we use tables of the χ2 distribution with k − 1 degrees of freedom.

-If F (or F corrected) ≥ χ2crit we reject H0
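
The ranking across rows can be sketched in Python. The card only says "formula given", so this sketch assumes the standard form F = 12/(nk(k + 1)) × ΣRj² − 3n(k + 1), where Rj is each column's rank sum, and it leaves out any tie correction; the function names and data are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def friedman_statistic(table):
    """table has one row per participant and one column per group.

    Returns (statistic, column rank sums, degrees of freedom).
    """
    n, k = len(table), len(table[0])
    ranked_rows = [average_ranks(row) for row in table]  # rank within each row
    column_sums = [sum(row[j] for row in ranked_rows) for j in range(k)]
    stat = 12 / (n * k * (k + 1)) * sum(r ** 2 for r in column_sums)
    stat -= 3 * n * (k + 1)
    return stat, column_sums, k - 1


scores = [[5, 7, 9], [4, 6, 8], [3, 5, 6], [6, 8, 10]]  # made-up data
print(friedman_statistic(scores))
```

In this made-up table every participant scores lowest in the first group and highest in the third, which gives the largest statistic possible for n = 4 and k = 3.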

34

How to calculate E(X)?

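For a χ2 distribution, E(X) is the degrees of freedom (the bottom number next to χ). For a sum of independent χ2 variables, add the degrees of freedom together.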

35

How to calculate Var(X)

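Multiply the two little numbers next to χ together (the 2 at the top and the degrees of freedom at the bottom), so Var(X) = 2 × degrees of freedom.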

36

How to calculate SD(X)

Square root of Var(X)
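
The three cards above describe the moments of a χ2 distribution, which can be sketched in Python as follows. The function name is hypothetical; the rules used are E(X) = df, Var(X) = 2 × df and SD(X) = sqrt(Var(X)).

```python
import math


def chi_square_moments(df):
    """Return (E(X), Var(X), SD(X)) for a chi-square variable with df degrees of freedom."""
    mean = df
    variance = 2 * df
    return mean, variance, math.sqrt(variance)


# For a sum of independent chi-square variables the degrees of freedom add,
# e.g. df 3 plus df 5 behaves like a single chi-square variable with df 8.
print(chi_square_moments(3 + 5))
```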

37

When to use chi-square goodness of fit test

-no expected frequencies should be less than 5

-Each participant can only contribute to one category or cell in the frequency table, therefore we have independence

38

How to use chi-square goodness of fit test

-Work out observed − expected for each category

-Square each observed − expected value

-Divide these squared values by the plain expected values

-The test statistic is the sum of the divided values

-Compare to the value in chi-squared tables, using the significance level as is (not halved) and k-1 df (k is the number of categories).

-Table value > test statistic means do not reject H0.
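
The arithmetic above can be sketched in Python. This is a minimal illustration with made-up counts; the function name is hypothetical, and the critical value still comes from χ2 tables.

```python
def chi_square_gof(observed, expected):
    """Return (test statistic, degrees of freedom) for a goodness of fit test."""
    # Sum of (observed - expected)^2 / expected over all categories.
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1


observed = [18, 22, 20, 40]  # made-up counts for 4 categories
expected = [25, 25, 25, 25]  # equal expected frequencies, all at least 5
print(chi_square_gof(observed, expected))
```

For these counts the statistic is 308/25 = 12.32 on 3 degrees of freedom, which exceeds the 5% table value of 7.815, so H0 would be rejected.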

39

When to use χ2-Test Contingency Tables

-test of relationship/association

-no expected frequency should be less than 5

-Each participant can only contribute to one category or cell in the frequency table, therefore we have independence

40

How to use χ2-Test Contingency Tables

-Total up every row and column

-For every position in the table, multiply its row total by its column total and divide by the whole total to get the expected value (not the same as the actual values in the table)

-Then use the formula given to find the test statistic (in this formula y is the value in the table and ỹ is the expected value you just worked out)

-df is (r − 1)(c − 1)

-Compare to the χ2 table value, using the significance level as is (not halved)

-Table value > test statistic means do not reject H0.
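
The expected values and the test statistic can be sketched in Python. This is a minimal illustration with a made-up 2 × 2 table, assuming the "formula given" is the usual sum of (y − ỹ)² / ỹ over all cells; the function name is hypothetical.

```python
def chi_square_contingency(table):
    """Return (test statistic, degrees of freedom, expected values)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    expected = []
    for i, row in enumerate(table):
        expected_row = []
        for j, y in enumerate(row):
            # Expected value = row total x column total / whole total.
            y_tilde = row_totals[i] * col_totals[j] / grand_total
            expected_row.append(y_tilde)
            stat += (y - y_tilde) ** 2 / y_tilde
        expected.append(expected_row)
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df, expected


print(chi_square_contingency([[10, 20], [30, 40]]))
```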

41

When to use Yates continuity correction

-when the degrees of freedom is 1

42

How to use Yates continuity correction

-Change the formula used in χ2-Test Contingency Tables to (|y − ỹ| − 0.5)^2 / ỹ

(the −0.5 is the only difference)
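
The corrected formula can be sketched in Python for a 2 × 2 table, where the degrees of freedom is 1. This is a minimal illustration with made-up counts; the function name is hypothetical.

```python
def yates_chi_square(table):
    """Chi-square statistic for a 2 x 2 table with Yates continuity correction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, y in enumerate(row):
            y_tilde = row_totals[i] * col_totals[j] / grand_total
            # Subtract 0.5 from the absolute difference before squaring.
            stat += (abs(y - y_tilde) - 0.5) ** 2 / y_tilde
    return stat


print(yates_chi_square([[10, 20], [30, 40]]))
```

The corrected statistic is always smaller than the uncorrected one for the same table, which makes the test more conservative.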

43

When to use Phi

-to test Effect Size/Strength of Association

-for 2 × 2 contingency tables

44

When to use Cramer's V

-to test Effect Size/Strength of Association

-two categorical variables when each variable has two or more categories

45

When to use Odds Ratio

-to find the ratio of the odds that an outcome will occur in group 1 and the odds of the outcome occurring in group 2

-2 × 2 contingency tables

46

How to use Phi

φ = sqrt(χ2 / n)

φ = 0.1 SMALL

φ = 0.3 MEDIUM

φ = 0.5 LARGE

47

How to use Cramer's V

V = sqrt(χ2 / (n × dfv))

where dfv = min(c − 1, r − 1)
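
Both effect sizes can be sketched in Python from a χ2 value that has already been calculated. The function names and the example numbers are hypothetical.

```python
import math


def phi(chi2, n):
    """Phi for a 2 x 2 table: square root of chi-square over the sample size."""
    return math.sqrt(chi2 / n)


def cramers_v(chi2, n, rows, cols):
    """Cramer's V: like phi, but also divides by min(c - 1, r - 1)."""
    dfv = min(cols - 1, rows - 1)
    return math.sqrt(chi2 / (n * dfv))


print(phi(9.0, 100))               # 2 x 2 table with chi-square 9 and n = 100
print(cramers_v(18.0, 100, 3, 4))  # 3 x 4 table with chi-square 18 and n = 100
```

For a 2 × 2 table dfv is 1, so Cramer's V reduces to φ.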

48

How to use odds ratio

In terms of the values in the 2 × 2 table, where (i,j) is the value in row i and column j:

OR = ((1,1)/(1,2)) / ((2,1)/(2,2))

We evaluate odds ratios in the following way

• OR = 1: belonging to group 1 has not affected the odds of outcome A;

• OR > 1: belonging to group 1 has increased the odds of outcome A;

• OR < 1: belonging to group 1 has decreased the odds of outcome A
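
The calculation can be sketched in Python. This is a minimal illustration, assuming rows are groups 1 and 2 and the first column counts outcome A; the function name and counts are hypothetical.

```python
def odds_ratio(table):
    """Odds ratio for a 2 x 2 table laid out as [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    odds_group1 = a / b  # odds of outcome A in group 1
    odds_group2 = c / d  # odds of outcome A in group 2
    return odds_group1 / odds_group2


# Made-up counts: group 1 has 30 with outcome A and 10 without,
# group 2 has 20 with outcome A and 20 without.
print(odds_ratio([[30, 10], [20, 20]]))
```

Here the odds are 3 in group 1 and 1 in group 2, so OR = 3 and belonging to group 1 has increased the odds of outcome A.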

49

When to use the Poisson distribution

-the Poisson distribution is typically used in situations where we count events

-also used for claim frequency

(Predicting the number....)

50

When to use logistic regression

-To predict a binary outcome from a linear combination of independent variables

-also used in insurance to calculate the propensity to claim

-predict the probability of......

51

What is the canonical link function of logistic regression

logit which is log(odds)

52

When to use gamma regression

-used to predict a gamma distributed outcome from a linear combination of independent variables

-e.g. predicting the size of an insurance claim based on the age of the driver.

53

What is the canonical link function of gamma regression

Reciprocal function, which is 1/µ

54

What is the canonical link function of the Poisson distribution

Log link function, which is log(µ)

55

What is the canonical link function of the normal distribution

Identity link function
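
The four canonical link functions named in the cards above can be sketched in Python. The function names are hypothetical, and mu stands for the mean of the outcome (for logistic regression, p is the probability of the outcome).

```python
import math


def logit(p):
    """Logistic regression link: log of the odds p / (1 - p)."""
    return math.log(p / (1 - p))


def inverse_logit(eta):
    """Turn a linear predictor back into a probability."""
    return 1 / (1 + math.exp(-eta))


def log_link(mu):
    """Poisson link: log of the mean."""
    return math.log(mu)


def reciprocal_link(mu):
    """Gamma link: one over the mean."""
    return 1 / mu


def identity_link(mu):
    """Normal link: the mean itself."""
    return mu


print(logit(0.8), inverse_logit(logit(0.8)))
```

Each link maps the mean of the outcome onto the scale of the linear combination of independent variables.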

56

When to reject H0

-For MWU and WSR, table > test means reject

-For all other tests, table < test means reject

57

What is the key difference between the binomial distribution and the hypergeometric distribution?

Binomial is sampling with replacement.

Hypergeometric is sampling without replacement.
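
The difference can be seen by computing both probabilities for the same setup in Python. This is a minimal sketch with made-up numbers (10 items of which 4 are successes, 3 draws); the function names are hypothetical.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(k successes in n draws) when sampling with replacement."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def hypergeometric_pmf(k, population, successes, draws):
    """P(k successes in the draws) when sampling without replacement."""
    return (
        comb(successes, k)
        * comb(population - successes, draws - k)
        / comb(population, draws)
    )


# Probability of exactly 2 successes in 3 draws.
print(binomial_pmf(2, 3, 4 / 10))        # with replacement, p stays at 0.4
print(hypergeometric_pmf(2, 10, 4, 3))   # without replacement, p changes each draw
```

With replacement the probability is 0.288, and without replacement it is 0.3, because each draw changes what is left in the population.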

58

Under what assumption is the null hypothesis for these tests that all group medians are equal? Why is this preferred and how would we check this?

The assumption is that the distributions of the groups have the same shape. This is preferred because it keeps the test closer to ANOVA (which compares means) as an alternative method, and it provides more useful interpretations. We could check this using histograms and/or boxplots.

59

PCA or FA

  • PCA is used to simply reduce the observed variables into a smaller set of components.

  • FA is used when there are suspected latent variables/factors.