statistical power
the likelihood of a significance test detecting an effect when there actually is one
if it genuinely exists in the population at the size you’re assuming
if there is a real effect, what’s the chance of replicating it
probability of avoiding a type 2 error
aim for 80% or higher
the higher this is, the lower the risk of making a type 2 error
a power of 0.8 means that if there is a real effect, there is an 80% chance of detecting it and finding an effect of the same size or larger (see the simulation sketch below)
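A minimal simulation sketch of this definition, assuming a two-sample t-test with a true effect of d = 0.5 and 64 participants per group (illustrative values): the fraction of simulated studies that reach significance approximates the power.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
d, n, alpha, reps = 0.5, 64, 0.05, 10_000  # assumed illustrative values

significant = 0
for _ in range(reps):
    control = rng.normal(0.0, 1.0, n)    # group with no effect
    treatment = rng.normal(d, 1.0, n)    # group shifted by the true effect
    if stats.ttest_ind(control, treatment).pvalue < alpha:
        significant += 1

print(significant / reps)  # ~0.80: the real effect is detected about 80% of the time
```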
high power
indicates a large chance of a test detecting a true effect
very high power means very high sensitivity to true effects
may lead to statistically significant results for trivially small effects = little practical usefulness
low power/underpowered studies
indicates a small chance of a test detecting a true effect
or the results are likely to be distorted by random and systematic error
more likely to miss real effects
increases false negatives/type 2 error
more likely to only catch large effects
when a statistically significant result is found, the calculated effect size is often an overestimate of the true effect
inflated effect sizes
makes findings unstable and unreproducible
statistical power can be pictured via the distribution of t-statistics you would get assuming a real effect of the size measured in the study, split at the significance threshold
t-statistic below the threshold line
if there was a real effect, what is the chance of finding no effect?
type 2 error/beta
t-statistic above the threshold line
if there is a real effect, what is the chance of replicating it?
statistical power/1-beta
the proportion of t-statistics above the threshold is the % chance of being able to replicate the effect
to increase this %, recruit more participants (see the sketch below)
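The same picture can be computed analytically from the noncentral t distribution: everything beyond the threshold is power, everything inside it is beta. A sketch, assuming a two-sided two-sample t-test with d = 0.5 and n = 64 per group:

```python
import numpy as np
from scipy import stats

d, n, alpha = 0.5, 64, 0.05              # assumed illustrative values
df = 2 * n - 2                           # degrees of freedom for a two-sample t-test
ncp = d * np.sqrt(n / 2)                 # noncentrality when the effect is real
t_crit = stats.t.ppf(1 - alpha / 2, df)  # the threshold line

# t-statistics beyond the threshold: power; inside it: beta
power = stats.nct.sf(t_crit, df, ncp) + stats.nct.cdf(-t_crit, df, ncp)
beta = 1 - power
print(f"power = {power:.3f}, beta = {beta:.3f}")  # ~0.801, ~0.199
```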
what influences power
sample size (df)
effect size
significance level
aka alpha, the tolerance for type 1 error (the threshold p-values are compared against)
use these for a power analysis
to determine the appropriate sample size
how to increase power (each lever is demonstrated in the sketch after this list)
increase effect size
e.g. manipulate the independent variable more strongly
increase sample size, but only up to a point
increase significance level
but this increases risk of type 1 error
reduce measurement error
increase precision and accuracy
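A sketch of how each lever moves power, using statsmodels' TTestIndPower with assumed illustrative values:

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# baseline: medium effect, 64 per group, alpha = 0.05 -> power ~0.80
print(analysis.power(effect_size=0.5, nobs1=64, alpha=0.05))

# each lever raises power relative to the baseline
print(analysis.power(effect_size=0.8, nobs1=64, alpha=0.05))   # larger effect size
print(analysis.power(effect_size=0.5, nobs1=128, alpha=0.05))  # larger sample
print(analysis.power(effect_size=0.5, nobs1=64, alpha=0.10))   # looser alpha (more type 1 risk)
```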
power analysis
the planning tool to help justify sample size
balances it between being:
too small and underpowered to be useful, and
too large and overpowered, i.e. wasteful/more than necessary
requires three of the following four quantities; the fourth, usually sample size, is solved for (see the sketch after this list):
statistical power (typically 80%)
effect size
significance/alpha level (typically 5%/0.05)
sample size
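Given any three of these quantities, the fourth can be solved for. A minimal sketch with statsmodels, assuming the conventional 80% power, 5% alpha and an assumed effect of d = 0.5:

```python
from statsmodels.stats.power import TTestIndPower

# fix power, alpha and the assumed effect size; solve for n per group
n = TTestIndPower().solve_power(effect_size=0.5, power=0.80, alpha=0.05)
print(round(n))  # ~64 participants per group
```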
why a power analysis depends on an assumed effect size
because the chance of detecting an effect depends directly on the magnitude of that effect: larger assumed effects need smaller samples to detect
conduct power analysis before collecting data
want to use a sample size (n) giving reasonable chance of detecting the effect we care about
if N is too small
power drops
study is more likely to miss real effects/false negatives
if N is too large
power increases only marginally past a point
and the study takes more time, money and participant effort than necessary
effect size
magnitude of difference between groups/relationship between variables
indicates practical significance of a finding
defines distance between null and alternative distribution
used to make informed assumption about what the real effect might be
defined independent of the sample size
small does not mean unimportant
small effects especially matter when they affect many people or accumulate over time
always a discrepancy between observed and true effect size
observed effects vary due to random factors, measurement error or natural variation in the sample (see the sketch below)
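A minimal sketch of computing an observed effect size (pooled-SD Cohen's d) from two samples; names and data are illustrative. Rerunning it shows the observed d drifting around the true value of 0.5:

```python
import numpy as np

def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled standard deviation."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * np.var(group_a, ddof=1)
                  + (nb - 1) * np.var(group_b, ddof=1)) / (na + nb - 2)
    return (np.mean(group_b) - np.mean(group_a)) / np.sqrt(pooled_var)

rng = np.random.default_rng(1)
# true d is 0.5, but each observed sample gives a slightly different estimate
print(cohens_d(rng.normal(0.0, 1.0, 64), rng.normal(0.5, 1.0, 64)))
```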
estimating effect size using prior work
use effect sizes from previous, comparable work
with similar design, measures and population
published effects can be inflated, however (e.g. by publication bias)
so treat prior estimates as a starting point
estimating effect size using SESOI
smallest effect size of interest
deciding the smallest effect that is theoretically meaningful/practically important
often better than copying a previous study's effect size
as it forces a justification of what actually matters
estimating effect size using rule of thumb values
be transparent about conventions being used
e.g. Cohen's d for mean differences (plugged into a power analysis in the sketch below):
small - 0.2
medium - 0.5
large - 0.8
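Plugging these conventions into a power analysis shows how strongly the assumed effect size drives the required sample, assuming a two-sample t-test at 80% power and alpha = 0.05:

```python
import math
from statsmodels.stats.power import TTestIndPower

for label, d in [("small", 0.2), ("medium", 0.5), ("large", 0.8)]:
    n = TTestIndPower().solve_power(effect_size=d, power=0.80, alpha=0.05)
    print(f"{label} (d = {d}): ~{math.ceil(n)} per group")
# small ~394, medium ~64, large ~26 per group
```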
significance level
positively related to power
increasing the significance level increases power, e.g. raising alpha from 5% to 10%
decreasing it makes the test less sensitive to true effects
a 5% level means a result is called statistically significant only if data that extreme would have a <5% chance of occurring under the null hypothesis
what a power analysis does when it estimates the minimum sample size
it indicates the smallest sample needed to detect the assumed effect with the desired power (e.g. 80%)
while keeping type 2 error rates low
ensures cost-effective study
type 1 error
rejecting the null hypothesis (concluding there is an effect) when there isn't one
false positive
concluding results are statistically significant when they actually arose by chance/unrelated factors
type 2 error
failing to reject the null hypothesis (concluding no effect) when there actually is an effect
false negative
failing to detect a real effect as statistically significant (both error rates are simulated in the sketch below)
likely to come from low statistical power
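Both error rates can be checked by simulation, assuming a two-sample t-test at alpha = 0.05: under the null the rejection rate approximates the type 1 error rate, and with a real effect but a small sample the miss rate approximates the type 2 error rate.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
alpha, reps = 0.05, 10_000

def rejection_rate(true_d, n):
    """Fraction of simulated studies that reject the null."""
    hits = 0
    for _ in range(reps):
        a = rng.normal(0.0, 1.0, n)
        b = rng.normal(true_d, 1.0, n)
        hits += stats.ttest_ind(a, b).pvalue < alpha
    return hits / reps

print(rejection_rate(0.0, 64))      # ~0.05: false positives match alpha (type 1)
print(1 - rejection_rate(0.5, 20))  # ~0.66: misses in an underpowered study (type 2)
```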