Null Hypothesis Significance Testing
Forming a research question
- confirmatory research - focus on testing specific hypotheses
- exploratory research - doesn’t necessarily start with a hypothesis
good research question:
- researchable and realistic
- informed by the prior research (gap in the literature/need for replication)
- not too broad/narrow
- allows us to generate testable predictions (hypotheses) that are relevant to our research aims
Moving from research question to hypotheses
what is a hypothesis?
- a testable prediction - a statement about what we reasonably believe our data will show
- the prediction is based on prior information (existing literature and research)
conceptual hypothesis
- describes the prediction in conceptual terms
- can be defined in terms of the direction of the effect (directional vs non-directional)
operational hypothesis
- describes the prediction in terms of the specific measures used
- operationalisation - the process of translating the concepts into measures
null vs alternative hypotheses
- alternative hypothesis - prediction that there will be a significant difference
- null hypothesis - negation of the prediction we’re making; there will not be a significant difference
* null hypothesis testing - trying to decide whether we can reject the null hypothesis
Formally testing hypotheses with statistics
decide on the α (alpha) level
- α level - the probability of a false-positive finding that we’re willing to accept if the null hypothesis is true
* how often are we willing to incorrectly reject the null hypothesis (i.e. make a Type I error)? should be based on a cost-benefit analysis - how risky is it to be wrong?
* often use an α rate of 5% (0.05)
calculate the test statistic
- test statistic - a numeric value that we use to test the hypothesis
- different test statistics for different situations (e.g. t, F, χ², MDiff)
compute the p-value
- *p*-value - the probability of observing a test statistic at least as extreme as the one we observed, assuming the null hypothesis is true
reject or retain the null hypothesis
- to reject the null hypothesis, the *p*-value needs to be less than the critical α; otherwise we fail to reject (rather than ‘accept’) the null
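The four steps above can be sketched end to end. This is a minimal stdlib-only illustration using a two-sample z-test approximation (a real analysis would typically use a t-test from a statistics package); the function name and the example data are made up for illustration.

```python
import math
from statistics import mean, stdev

def two_sample_test(group_a, group_b, alpha=0.05):
    """Two-sample test of a mean difference using a normal (z) approximation.

    Steps: fix alpha, compute the test statistic, compute the two-sided
    p-value, then decide whether to reject the null hypothesis.
    """
    na, nb = len(group_a), len(group_b)
    # standard error of the difference in means (unequal variances)
    se = math.sqrt(stdev(group_a) ** 2 / na + stdev(group_b) ** 2 / nb)
    z = (mean(group_a) - mean(group_b)) / se       # test statistic
    p = math.erfc(abs(z) / math.sqrt(2))           # two-sided p-value
    decision = "reject H0" if p < alpha else "fail to reject H0"
    return z, p, decision

# hypothetical scores from two experimental groups
a = [5.1, 4.9, 6.0, 5.5, 5.8, 5.2, 5.7, 6.1]
b = [4.2, 4.8, 4.5, 4.0, 4.6, 4.3, 4.9, 4.4]
z, p, decision = two_sample_test(a, b)
```

Note the decision rule only compares *p* to the pre-set α - the function never ‘accepts’ the null, it either rejects or fails to reject it.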
Pitfalls of null hypothesis significance testing
- doesn’t tell us the probability of the null or alternative hypotheses being true
- dichotomous thinking around the α level - can encourage questionable research practices (can eventually find a statistically significant result somewhere)
- *p*-values are sensitive to sample size - with a large enough sample, even trivially small effects become statistically significant
Ways to avoid pitfalls
- effect sizes - how large/meaningful is the effect
- confidence intervals - what are the plausible limits of our effect?
- preregistration - specifying our analytic plan before we look at the data
- registered reports - scientific paper can get accepted before data collection based on proposed methodology instead of results
- power analysis - determining the necessary sample size before we begin data collection
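Effect sizes and confidence intervals can be computed directly from the group data. A minimal sketch, assuming two independent groups; Cohen’s d is computed with a pooled standard deviation, and the interval uses a normal approximation (the function names are made up for illustration).

```python
import math
from statistics import mean, stdev, NormalDist

def cohens_d(group_a, group_b):
    """Cohen's d: mean difference scaled by the pooled standard deviation."""
    na, nb = len(group_a), len(group_b)
    pooled = math.sqrt(((na - 1) * stdev(group_a) ** 2 +
                        (nb - 1) * stdev(group_b) ** 2) / (na + nb - 2))
    return (mean(group_a) - mean(group_b)) / pooled

def diff_ci(group_a, group_b, level=0.95):
    """Normal-approximation confidence interval for the mean difference."""
    na, nb = len(group_a), len(group_b)
    se = math.sqrt(stdev(group_a) ** 2 / na + stdev(group_b) ** 2 / nb)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)   # ≈ 1.96 for 95%
    diff = mean(group_a) - mean(group_b)
    return diff - z * se, diff + z * se

# hypothetical data: the effect size says how large the difference is,
# the interval says what range of differences is plausible
a = [5.1, 4.9, 6.0, 5.5, 5.8, 5.2, 5.7, 6.1]
b = [4.2, 4.8, 4.5, 4.0, 4.6, 4.3, 4.9, 4.4]
d = cohens_d(a, b)
lo, hi = diff_ci(a, b)
```

Reporting d and the interval alongside the *p*-value counters dichotomous thinking: they convey how large and how precise the effect is, not just whether it crossed α.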
Power analysis
- power analysis - determining the necessary sample size before we begin data collection
* makes non-significant effects easier to interpret
* reduces the probability of missing a true effect (false-negative/Type II error)
* reduces over-sampling + wasting resources
* discourages ‘data-peeking’
* statistical power - the probability of detecting an effect of a certain size as statistically significant (assuming the effect exists)
* generally aim for a statistical power of 80%
* the greater the effect of interest, the fewer participants needed
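An a priori power analysis can be sketched with the standard normal-approximation formula for a two-sided, two-sample comparison: n per group ≈ 2 × ((z₁₋α/₂ + z_power) / d)². This is an illustrative stdlib-only sketch (dedicated tools such as G*Power or statsmodels apply a small t-distribution correction, so their answers differ slightly); the function name is made up.

```python
import math
from statistics import NormalDist

def sample_size_per_group(d, alpha=0.05, power=0.80):
    """Approximate n per group to detect effect size d (Cohen's d)
    in a two-sided two-sample test, via the normal approximation."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # ≈ 1.96 for α = .05
    z_power = NormalDist().inv_cdf(power)           # ≈ 0.84 for 80% power
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)

# a medium effect (d = 0.5) needs far more participants than a large one
n_medium = sample_size_per_group(0.5)
n_large = sample_size_per_group(0.8)
```

Because d appears squared in the denominator, the last note above falls out directly: the larger the effect of interest, the fewer participants are needed.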