stats and research methods final


1
New cards

why do we need descriptive stats

  • to interpret results

  • facilitates predictions from patterns

  • describes information (communication)

  • real-world events are probabilistic (based on probability)

2
New cards

variability is a _____, not an exception

rule

  • we use distributions to map out how often different outcomes happen, and use these to determine probability

3
New cards

how do we get probability curves?

measure samples from the same population, and use that data to build a distribution

  • many things happen in a normal distribution, and we can use that to make predictions

4
New cards

central tendency

a center, representative value

  • median (middle number)

  • mean (average)

  • mode (most frequently occurring value)

5
New cards

central tendencies on a skewed curve

mode at the peak (high end), median in the middle, mean pulled furthest toward the tail

6
New cards

variance

how spread out and different the data is

7
New cards

sd is used to represent

variance

8
New cards

how does a sample size impact variance

big sample size → less sampling variability (more stable estimates)

9
New cards

ecological validity and sd

high ecological validity lowers SD (less variance)

10
New cards

bivariate stats

the relation between two variables (continuous or categorical)

11
New cards

pearson’s coefficient

describes the direction and strength of a relationship between variables, offers a shortcut to describing data

12
New cards

issues with descriptive stats

  • interpretation can be complicated

    • data can change but have the same range and central tendency

    • in these situations, box plots don’t work anymore (violin plots better)

  • so, we need something else

13
New cards

why do we normalize data

it is often challenging and/or misleading to use raw data to make predictions

  • for ex, hard to find where the 95% prob cut off lines are

14
New cards

normalizing data

  • transforming data (through a linear transformation) so it has a mean of 0 and an SD of 1 (on the x-axis)

  • easier to communicate info

    • cutoff for 95% confidence is [-1.96,1.96]
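
A minimal sketch of this transformation (z-scoring) with made-up numbers, assuming Python with numpy:

```python
import numpy as np

# Hypothetical raw scores (made-up data for illustration)
scores = np.array([52.0, 61.0, 48.0, 70.0, 55.0, 64.0, 58.0])

# Standardize: subtract the mean, divide by the standard deviation
z = (scores - scores.mean()) / scores.std(ddof=1)

print(round(z.mean(), 10))  # ~0
print(z.std(ddof=1))        # 1.0
# About 95% of a standard normal falls between -1.96 and 1.96
```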

15
New cards

goal in understanding data

we aim to understand the population, but often population parameters are unknown/unknowable

  • so, we collect samples and use them to make educated inferences about the population

16
New cards

how do we normally estimate?

we estimate within a range, rather than pinning down a single value

  • inferential stats predicts a range (an interval) that we hope, with high probability, contains the correct value

17
New cards

confidence interval - theoretically

we are finding a range that is very likely to contain the true population parameter (often the mean)

  • uses a sample statistic (sample mean) as a point estimate

  • incorporates sampling variability (through SEM and a critical value [often a t-critical value])

18
New cards

how we build our confidence interval

use sample stats:

  • sample mean (assuming it's close to Mu)

  • standard error of the mean (from sample variance and sample size)

  • use these on our sampling distribution of sample means

x-bar (≈ Mu) ± 1.96 (z critical value) * ( σ/√n )
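
A minimal sketch of this formula with made-up summary numbers (assumes numpy; σ is treated as known here, which is rare in practice):

```python
import numpy as np

# Made-up sample summary (illustrative numbers only)
x_bar = 100.0   # sample mean, used as the point estimate of Mu
sigma = 15.0    # population SD (assumed known for this sketch)
n = 36

sem = sigma / np.sqrt(n)   # standard error of the mean
z_crit = 1.96              # z critical value for 95% confidence

ci = (x_bar - z_crit * sem, x_bar + z_crit * sem)
print(ci)   # (95.1, 104.9)
```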

19
New cards

central limit theorem

the shape of the sampling distribution of sample means

  • with a sample size big enough, the distribution is approximately normal

    • even if the population distribution is non-normal

20
New cards

parameters of central limit theorem

mean = Mu

standard deviation = σ/√n = SEM
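
A small simulation sketch of the CLT claim, using a made-up skewed population (assumes numpy):

```python
import numpy as np

rng = np.random.default_rng(0)

# Skewed (non-normal) population: exponential with mean 2 and SD 2
mu, sigma, n = 2.0, 2.0, 50

# Draw many samples and record each sample mean
sample_means = rng.exponential(scale=2.0, size=(10_000, n)).mean(axis=1)

print(sample_means.mean())        # close to Mu (2.0)
print(sample_means.std(ddof=1))   # close to sigma / sqrt(n)
print(sigma / np.sqrt(n))         # ≈ 0.283
```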

21
New cards

Standard Error of the Mean

The SEM quantifies the typical distance a sample mean is likely to be from the actual population mean

  • a smaller SEM indicates a more precise estimate of the population mean

22
New cards

how can we calculate CI with only sample statistics?

Law of Large Numbers - as sample size increases, sample statistics get closer to the population parameters

  • so, the sample mean (x-bar) will become closer to the population mean (Mu)

  • we use the standard deviation from a single sample (s) to estimate population standard deviation (σ)

    • we have to do this, since we often don’t know the parameters of the population

23
New cards

symbols and what they mean

x-bar - sample mean

Mu - population mean

σ - population SD

s - sample SD

n - sample size (number of observations)

24
New cards

in CI calculations, when would we replace the z-critical values with t-based critical values?

when n is small and/or the population SD is unknown (so we estimate it with s)

x-bar ± t critical value * SEM, where SEM = s/√n
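
A hedged sketch of a t-based CI on a made-up small sample (assumes numpy and scipy):

```python
import numpy as np
from scipy import stats

# Hypothetical small sample (made-up values)
x = np.array([4.1, 5.3, 3.8, 4.9, 5.6, 4.4, 5.0, 4.7])
n = len(x)

x_bar = x.mean()
sem = x.std(ddof=1) / np.sqrt(n)              # SEM estimated from the sample SD (s)
t_crit = stats.t.ppf(1 - 0.05 / 2, df=n - 1)  # t critical value replaces 1.96

ci = (x_bar - t_crit * sem, x_bar + t_crit * sem)
print(ci)
```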

25
New cards

as n increases, what happens to range of sampling distribution of sample means

as n increases → range decreases → CI range decreases

  • this means our estimates are becoming more precise

26
New cards

how does n size impact SEM

n increases → range becomes more narrow → SEM gets smaller

27
New cards

degrees of freedom

how many independent pieces of info (not relying on info received elsewhere or estimated) we have available to estimate a parameter

  • so, whenever we estimate a parameter, we are “using up” some of that freedom

28
New cards

when do we divide by n and when by n-1?

if we are describing the variability within a sample only, we may divide by n

  • ex, if we are using the sample sd as a descriptive statistic

if we are making inferences about a population, we need to divide by n-1

  • ex, in our sample sd stat we need to divide by n-1

29
New cards

sample mean vs variance calculation

sample mean - divide by n, because every observation is an independent piece of information

variance/sd - divide by n-1, because the deviations depend on the sample mean (one df is used up)
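
A quick illustration with made-up data, using numpy's ddof argument to switch between dividing by n and n-1:

```python
import numpy as np

x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

# Describing only this sample: divide by n
var_descriptive = x.var(ddof=0)

# Inferring the population variance: divide by n - 1 (one df "used up" by x-bar)
var_inferential = x.var(ddof=1)

print(var_descriptive, var_inferential)   # 4.0 vs ~4.57
```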

30
New cards

null hypothesis testing

H0 = baseline model saying “there is no effect, no difference, or no relationship in the population”

  • we start with this, to test if a prediction is statistically possible/meaningful

HA/H1 = alternative hypothesis saying there is an effect (the mean difference does not equal zero)

31
New cards

example of importance of alternate hypothesis

the alternative hypothesis determines how you evaluate the data

  • if your alternative focuses only on a specific direction (A is better than B), your inference will only look at one end (tail) of the data

  • if your alternative is more general (there is a statistical difference between A and B), your inference will consider both sides of the data

    • important to do this so you don't limit yourself

32
New cards

the 2 errors we can make when deciding in stats

type 1 error

type 2 error

33
New cards

type 1 error

rejecting H0 when it is actually true

  • saying there is an effect when there isn’t one

  • a false alarm

34
New cards

type 2 error

accepting H0, when it is not true

  • saying there is no effect when there is one

  • a miss

35
New cards

type 1 error details

alpha set before collecting data (often 0.05, so 5%)

  • with alpha = 0.05, you will make a type 1 error about 5% of the time

based along the null curve

36
New cards

what is alpha? (type 1 error)

the significance level, used as a cutoff for deciding when to reject/accept H0

  • used to carve out the rejection region

37
New cards

alpha used on a null curve

critical values cut off the null curve so that α/2 lies outside the cutoff in each tail

  • if the sample data falls outside of these critical values, we can reject null (H0)

38
New cards

type 2 error details

based on the alternate hypothesis curve

β is something we measure, dependent on:

  • effect size

  • sample size

  • significance level

  • variance

if the sample data falls inside the β region, we fail to reject the null

39
New cards

type 2 error (β) in relation to other parameters

bigger effect size → smaller β

larger n → shrinks standard error → smaller β

smaller significance level (alpha) → stricter threshold → rejecting H0 is harder → increases β

larger variance (σ²) → harder to detect effects → increases β

40
New cards

statistical power

the probability of correctly rejecting H0 when it is false (correctly determining an effect)

power = 1 - β

  • we want a small β

  • before conducting research, the researchers estimate the needed sample size to achieve acceptable power (80%, so β=.20)

AKA the sensitivity of a study
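
A simulation-style sketch of estimating power for a one-sample t-test; the scenario (true mean, SD, n) is a made-up assumption, not from the course (uses numpy and scipy):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Assumed (made-up) scenario: true mean 105 vs null of 100, SD 15, n = 30
mu_null, mu_true, sigma, n, alpha = 100, 105, 15, 30, 0.05

reps = 5_000
rejections = 0
for _ in range(reps):
    sample = rng.normal(mu_true, sigma, size=n)
    t_obs, p = stats.ttest_1samp(sample, popmean=mu_null)
    rejections += p < alpha          # count correct rejections of H0

power = rejections / reps
print(power, 1 - power)              # estimated power and beta
```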

41
New cards

as sd decreases, what happens to β and alpha

their overlapping areas become smaller

  • this helps us minimize errors, decreasing both β and alpha

42
New cards

p-values

the probability of obtaining a result at least as extreme as the one observed, assuming the null hypothesis is true

  • quantifies the improbability of this data being generated under the null

  • does NOT tell us the probability of null being true

43
New cards

low vs high p-value

a low p-value suggests that the observed data is unlikely if the null hypothesis is correct, providing strong evidence for the alternative hypothesis

44
New cards

ronald a fisher (1920-30s)

introduced levels of significance as convenient thresholds for interpreting whether data provide evidence against H0.

45
New cards

Neyman and Pearson (1930s)

  • developed hypothesis testing framework for Type I error and Type II error

    • Their work put the p-value into a decision making context

46
New cards

CI and P value: two sides of the same coin

CI uses sample distribution, centered on sample mean

  • asks if H0 falls inside my plausible range

P-value uses the null hypothesis distribution, centered on the null value (often 0)

  • asks how much variance can be random

  • asks how extreme my data is if H0 were true

47
New cards

data that is seemingly not significant can become significant if we have…

a big enough sample size

48
New cards

problem with relying on P value

does not tell us how big an effect is

even if we have found statistical significance, the effect may be tiny IRL

we can get around this by calculating …..

49
New cards

effect size

the magnitude of an effect/strength of an association

  • standardized mean differences

  • independent of sample size

  • scale free, allows for comparison across studies

50
New cards

2 statistical approaches to evaluate an inference about a population

null hypothesis significance testing (NHST)

confidence intervals (CI)

  • should give us the same conclusions

51
New cards

NHST evaluations

start with null

compute a test statistic relative to the null

compare to critical values (or p-values)

decide whether to reject null or not

binary outcome: reject vs not reject

52
New cards

CI in making evaluations

construct an interval around the sample estimate

if the interval excludes the null value, that implies significance at the corresponding alpha (if it includes the null value, the result is not significant)

  • CI also shows plausible effect sizes, not just a decision

53
New cards

when do we rely on the t distribution

in CI calculations, we often can’t use z critical values because we don’t know population sd

  • z critical values are rarely used for inference

54
New cards

df calculation (in general)

n - 1

  • big sample size → big df (good!!!!)

55
New cards

t distribution

a family of distributions

  • looks like the standard normal (bell shaped, centered at 0)

  • has heavier tails to account for extra uncertainty from estimating σ

  • as n increases, t approaches the standard normal z

  • takes into account degrees of freedom (bigger df → closer to normal)

56
New cards

bigger df leads to what (in t distributions)

big df → means a big sample → thinner curve → thinner confidence intervals

57
New cards

rule of thumb for df in t distributions

the benefits of an increased sample size asymptote at n=30 (df=29)

58
New cards

3 types of t tests

one sample t-test → tests if a sample differs from a known/hypothesized population mean

paired sample t-test → tests if means differ across two related measurements (e.g. before vs after)

independent two-sample t test → tests if two independent group means differ

59
New cards

one sample t test

determines whether mean of single sample differs significantly from specified population mean/hypothesized value

evaluates the null hypothesis that the sample comes from the specified population, under the assumption of approximate normality and independent observations

is the average score in one group far enough from a comparison value that chance alone is unlikely to explain it???

60
New cards

one sample t-test: CI approach

  1. propose a null hypothesis

  2. collect a sample

  3. compute descriptive statistics (x-bar, s, n, df, SEM)

  4. calculate the CI based on df (quantile of the t-distribution with df = n-1 at probability 1 - α/2)

  • calculation based on the sampling distribution

  5. make a decision (if the CI includes the null value, fail to reject the null)

61
New cards

one sample t-test: NHST approach

  1. propose null

  2. collect data

  3. construct null hypothesis distribution that explains probability by CHANCE, and calculate descriptive stats

  4. compute t value, and make a decision
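
A minimal sketch of this NHST approach on made-up data, using scipy's one-sample t-test:

```python
import numpy as np
from scipy import stats

# Made-up reaction-time data and a hypothesized population mean of 250 ms
rt = np.array([262, 241, 255, 270, 248, 259, 266, 252, 245, 261], dtype=float)
mu_0 = 250.0

t_obs, p = stats.ttest_1samp(rt, popmean=mu_0)   # two-tailed by default
print(t_obs, p)

# Decision: reject H0 if p < alpha (0.05)
print("reject H0" if p < 0.05 else "fail to reject H0")
```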

62
New cards

when to reject with t values

reject H0 if | t_obs | > t_(1 - α/2, df)

  • we are comparing our observed t-value to the critical one

63
New cards

t test tables

gives us our critical t value

  • a table of df by alpha

  • meet in the middle to find our critical value

  • lets us compare the absolute value of our observed t-value to the critical

    • if our observed t-value falls within the critical region, it had less than a 5% probability of occurring by chance, so we reject H0

64
New cards

one sample t-test: two tailed

alt hypothesis predicts a difference but not direction, critical region is split on BOTH ends of the distribution

  • α/2 = 0.025 on either side

  • more conservative and requires a larger effect to reject null (since α is divided across both tails)

65
New cards

one sample t-test: one-tailed test

alt hypothesis predicts a specific direction, so critical region is entirely on only one side of the distribution

  • α = 0.05 on one side

  • more statistical power for detecting an effect but blind to effects in opposite direction

    • not ideal in the research world

66
New cards

assumptions of one sample t test

scale of measurement - dependent variable is continuous (interval or ratio scale)

normality - rely on the CLT for large samples; raw data should be close to normal for small samples

independence - each obs is independent of the others (no repeated or paired)

population variance unknown

67
New cards

paired sample t test

  • often same individuals used twice

  • controls for individual differences

  • higher statistical power with small samples

  • but, there are carryover/practice effects and this method is not always possible

68
New cards

paired sample t-test: NHST approach

  1. propose null hypothesis

  2. now we have two conditions, so we can use difference scores between the conditions

  3. construct null hypothesis distribution and calc descriptive stats of the difference scores

  4. compute t value and make a decision
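
A minimal sketch with made-up before/after scores, using scipy's paired t-test (equivalent to a one-sample t-test on the difference scores):

```python
import numpy as np
from scipy import stats

# Made-up before/after scores for the same participants
before = np.array([10.0, 12.0, 9.0, 14.0, 11.0, 13.0, 10.0, 12.0])
after  = np.array([12.0, 13.0, 11.0, 15.0, 12.0, 15.0, 11.0, 13.0])

# Paired t-test on the linked measurements
t_obs, p = stats.ttest_rel(after, before)
print(t_obs, p)

# Same result done by hand on the difference scores
d = after - before
print(stats.ttest_1samp(d, popmean=0.0))
```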

69
New cards

common features of pair-sample t tests

  • same participants

  • two measurements are meaningfully linked (before vs after, often within-subject design, etc.)

  • tests whether the mean difference between paired observations is significantly different from 0

70
New cards

assumptions of paired sample t test

pairs are meaningfully matched (but each pair is independent of the other pairs)

differences between pairs are approximately normally distributed

71
New cards

paired sample vs one sample t tests

basically the same!

the same process is done, except with paired sample the distribution is the differences between 2 conditions

72
New cards

goal of t tests

to help us make a decision about the null hypothesis

73
New cards

t statistic theoretically

t observed = effect/ variability of the effect

  • the effect would be Xbar - Mu, how far the sample mean deviates from the null

  • the variability of the effect is the SEM, variability expected by chance

  • the two critical values are used to make the critical region (1 - α /2, df)

  • used to determine statistical significance

74
New cards

steps of t-statistics

  1. compute the t-statistic (effect/variability of the effect)

  2. determine critical t value (with α and df)

  3. compare

if the absolute value of our t statistic is greater than the critical t values, we reject H0
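
The same three steps done by hand on a made-up sample (assumes numpy and scipy; the null value and alpha are illustrative):

```python
import numpy as np
from scipy import stats

# Made-up sample and null value
x = np.array([5.2, 4.8, 6.1, 5.5, 5.9, 4.9, 5.4, 6.0])
mu_0, alpha = 5.0, 0.05
n, df = len(x), len(x) - 1

# 1. compute the t-statistic: effect / variability of the effect
t_obs = (x.mean() - mu_0) / (x.std(ddof=1) / np.sqrt(n))

# 2. determine the critical t value from alpha and df (two-tailed)
t_crit = stats.t.ppf(1 - alpha / 2, df=df)

# 3. compare: reject H0 if |t_obs| > t_crit
print(t_obs, t_crit, abs(t_obs) > t_crit)
```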

75
New cards

independent sample t test

tests whether independent groups differ significantly from each other

  • works great for naturally distinct groups

  • requires larger sample size

  • more variability from individual differences

  • often cross-cultural, cross-sectional, gender differences, clinical vs non-clinical populations, teaching method comparisons, etc.

76
New cards

assumptions of independent sample t test

groups are independent

data in each group are roughly normal

equal variance across groups!!!

77
New cards

biggest differences in calculating independent t tests compared to the others

in the t statistic, the observed diff and the SEM must be calculated differently

in the critical t value, the degrees of freedom must be calculated differently

78
New cards

calculating t statistic for independent sample t tests

observed diff/SEM

  • for the observed diff, we simply subtract one sample mean (x-bar2) from the other (x-bar1)

  • the SEM is more complicated: with two independent samples it is harder to combine their variances into a single estimate

79
New cards

SEM in t stat calculation for independent sample t tests: 2 ways

variance sum law

pooled variance

80
New cards

variance sum law

if two variables are independent, the variance of their sum (or difference) equals the sum of their variances

  • so, we literally calculate each SEM separately, and just add them

81
New cards

pooled variance

instead of estimating two variances, we combine them into a single pooled estimate

  • this one is more accepted

  • every sample will have its own amount of variability, and sampling error will always be included in the calculation of variance

    • the solution is to average the two sample variances

  • by pooling across the two samples, we reduce the impact of sampling error

  • this also takes into account the size of both samples
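
A sketch of pooled variance and the resulting SEM and t-statistic for two made-up groups; note that the SEM below multiplies the pooled variance by (1/n1 + 1/n2) before taking the square root, which is the standard pooled-variance form:

```python
import numpy as np
from scipy import stats

# Made-up scores for two independent groups
g1 = np.array([12.0, 15.0, 11.0, 14.0, 13.0, 16.0, 12.0])
g2 = np.array([10.0, 11.0,  9.0, 12.0, 10.0, 13.0, 11.0, 10.0])
n1, n2 = len(g1), len(g2)

# Pooled variance: df-weighted average of the two sample variances
pooled_var = ((n1 - 1) * g1.var(ddof=1) + (n2 - 1) * g2.var(ddof=1)) / (n1 + n2 - 2)

# Standard error of the difference between means (pooled-variance form)
sem_diff = np.sqrt(pooled_var * (1 / n1 + 1 / n2))

t_obs = (g1.mean() - g2.mean()) / sem_diff
df = n1 + n2 - 2

# Cross-check against scipy's pooled (Student) independent-samples t-test
print(t_obs, df, stats.ttest_ind(g1, g2, equal_var=True))
```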

82
New cards

what is sampling error

the difference between a sample statistic and the true population parameter

  • arises because the sample is only an approximation of the entire population

  • can be decreased with an increased sample size

83
New cards

SEM calculation for independent sample t tests

square root of the pooled variance

84
New cards

benefits of pooled variance

more accurate estimate

higher statistical power

historical and computational simplicity

connection to ANOVA

85
New cards

df in independent sample t test

df = n1 + n2 - 2

when finding our critical t value, our df changes

  • it becomes bigger since we have two sample sizes

86
New cards

most important assumptions with independent sample t test

equal variance across groups

87
New cards

2 ways we can test if there is equal variance across groups

levene’s test - a statistical method to check if variances are equal across two or more groups, used in ANOVA too

common rule of thumb - check that the ratio of the larger sample variance to the smaller sample variance is less than 4

  • or less than 2 if comparing sample standard deviations

  • more risky and less conservative
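
A sketch of both checks on made-up groups, using scipy's Levene's test plus the variance-ratio rule of thumb:

```python
import numpy as np
from scipy import stats

g1 = np.array([12.0, 15.0, 11.0, 14.0, 13.0, 16.0, 12.0])   # made-up groups
g2 = np.array([10.0, 11.0,  9.0, 12.0, 10.0, 13.0, 11.0])

# Levene's test: H0 = the group variances are equal
stat, p = stats.levene(g1, g2)
print(stat, p)    # a small p-value suggests unequal variances

# Rule-of-thumb check: ratio of larger to smaller sample variance under 4
v1, v2 = g1.var(ddof=1), g2.var(ddof=1)
print(max(v1, v2) / min(v1, v2) < 4)
```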

88
New cards

consequences if the equal variance assumption isn’t met

type 1 error increases

  • in some situations, it decreases too (if the larger n has a larger 𝜎), making it almost impossible to find an effect

  • if the smaller n has a larger 𝜎, the Type 1 error rate can rise to about 1/3

89
New cards

measuring effect in 3 t tests

one-sample: Xbar - Mu

paired: Xbar1 - Xbar2

independent: Xbar1 - Xbar2

90
New cards

SEM in the 3 t tests

one-sample: s/√n

paired: s_differences / √n_pairs

independent: √(pooled variance)

91
New cards

df in the 3 t tests

one sample: n-1

paired: npair - 1

independent: (n1 - 1) + (n2 -1)

92
New cards

what is calculated the same in all 3 t tests?

t critical value and confidence interval

93
New cards

what does n represent in paired sample t tests

the number of paired/grouped samples, not just individual samples

  • for ex, participant 1’s before and after are seen as 1, not split into 2

94
New cards

solution for when the two samples’ variances don’t match

using welch’s t test

  • changes how SEM and df are calculated

95
New cards

welch’s t test

corrects df and SEM to allow for a more conservative estimate

  • as variance ratio grows and n1 shrinks, df shrinks

  • as sample sizes become increasingly unequal, the df will decrease

  • ideally, we want a ratio of 1

welch’s t test allows for type 1 error rate to stay around 0.05, no matter the inequality of variance
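
A minimal sketch comparing Welch's t-test to the pooled version on made-up, unequal groups (scipy's equal_var flag switches between them):

```python
import numpy as np
from scipy import stats

# Made-up groups with unequal spread and unequal n
g1 = np.array([12.0, 15.0, 11.0, 14.0, 13.0, 16.0, 12.0, 18.0, 10.0, 17.0])
g2 = np.array([10.0, 11.0,  9.0, 12.0, 10.0])

# Welch's t-test: equal_var=False adjusts the SEM and the df
print(stats.ttest_ind(g1, g2, equal_var=False))

# Compare with the pooled (Student) version, which assumes equal variances
print(stats.ttest_ind(g1, g2, equal_var=True))
```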

96
New cards

is increasing sample size to get a significant p-value (&lt; 0.05) p-hacking?

no, because we aren’t lying

97
New cards

how n impacts test statistic and p values

bigger n → test statistic grows

big n → even tiny effects can yield small p-values (statistical significance)

small n → moderate effects may fail to reach significance

  • P values do not tell us how big real effects are

98
New cards

raw effect size in relation to t statistic

the numerator

99
New cards

effect size characteristics

magnitude - quantifies how big an effect is

independent of sample size - makes it a better measure of practical importance than the p-value (which is heavily influenced by the number of observations, n)

complements statistical significance - report it alongside statistical significance tests to give the complete picture of the research outcome

100
New cards

cohen’s d

effect size measure in t tests

the standardized difference between two means (divided by sd)

  • 0.2 (small)

  • 0.5 (medium)

  • 0.8 (large)
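
A sketch of computing Cohen's d for two made-up independent groups, using the pooled SD:

```python
import numpy as np

# Made-up scores for two independent groups
g1 = np.array([12.0, 15.0, 11.0, 14.0, 13.0, 16.0, 12.0])
g2 = np.array([10.0, 11.0,  9.0, 12.0, 10.0, 13.0, 11.0, 10.0])
n1, n2 = len(g1), len(g2)

# Pooled standard deviation (df-weighted)
pooled_sd = np.sqrt(((n1 - 1) * g1.var(ddof=1) + (n2 - 1) * g2.var(ddof=1))
                    / (n1 + n2 - 2))

# Cohen's d: standardized difference between the two means
d = (g1.mean() - g2.mean()) / pooled_sd
print(d)   # compare to 0.2 (small), 0.5 (medium), 0.8 (large)
```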