Null Hypothesis Significance Testing

Forming a research question

  • confirmatory research - focus on testing specific hypotheses
  • exploratory research - doesn’t necessarily start with a hypothesis

good research question:

  • researchable and realistic
  • informed by prior research (gap in the literature/need for replication)
  • not too broad/narrow
  • allows us to generate testable predictions (hypotheses) that are relevant to our research aims

Moving from research question to hypotheses

what is a hypothesis?

  • a testable prediction - a statement about what we reasonably believe our data will show
  • the prediction is based on some prior info (literature + research)

conceptual hypothesis

  • describes the prediction in conceptual terms
  • can be defined in terms of the direction of the effect (directional vs non-directional)

operational hypothesis

  • operationalisation - the process of translating the concepts into measures

null vs alternative hypotheses

  • alternative hypothesis - prediction that there will be a significant difference
  • null hypothesis - negation of the prediction we’re making; there will not be a significant difference
      * null hypothesis testing - trying to decide whether we can reject the null hypothesis

Formally testing hypotheses with statistics

decide on the α (alpha) level

  • α level - the level of false-positive findings that we’re willing to accept if the null hypothesis is true
      * how often are we willing to incorrectly reject the null hypothesis (i.e. make a Type 1 error)? the choice of α should be based on a cost-benefit analysis - how risky is it to be wrong?
      * often use an α rate of 5% (0.05)
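The meaning of the α level can be checked by simulation: if the null hypothesis really is true, a test run at α = 0.05 should wrongly reject it about 5% of the time. A minimal sketch (the one-sample z-test with known σ and the simulated data are illustrative choices, not from the notes):

```python
import math
import random

def norm_cdf(x):
    # standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2)))

def z_test_p(sample, mu0=0.0, sigma=1.0):
    # two-sided one-sample z-test p-value (sigma assumed known)
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sigma / math.sqrt(n))
    return 2 * (1 - norm_cdf(abs(z)))

random.seed(1)
alpha = 0.05
trials = 5000
# every sample is drawn with mean 0, so the null hypothesis is true:
# any rejection is a false positive (Type 1 error)
false_pos = sum(
    z_test_p([random.gauss(0, 1) for _ in range(30)]) < alpha
    for _ in range(trials)
)
print(false_pos / trials)  # close to 0.05, i.e. the chosen α
```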

calculate the test statistic

  • test statistic - a numeric value that we use to test the hypothesis
  • different test statistics for different situations (e.g. t, F, χ², MDiff)
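As one concrete case, a t statistic for two independent groups is just the mean difference scaled by its standard error. A sketch using Welch's version (the group data are made up for illustration):

```python
import math
from statistics import mean, stdev

def welch_t(x, y):
    # Welch's t statistic: mean difference / standard error,
    # without assuming equal variances in the two groups
    vx, vy = stdev(x) ** 2, stdev(y) ** 2
    se = math.sqrt(vx / len(x) + vy / len(y))
    return (mean(x) - mean(y)) / se

# hypothetical scores for two groups
group_a = [5.1, 4.9, 6.2, 5.8, 5.5, 6.0]
group_b = [4.2, 4.8, 4.5, 5.0, 4.1, 4.6]
print(round(welch_t(group_a, group_b), 2))
```

A large |t| means the observed difference is big relative to its sampling noise, which is what the p-value step then quantifies.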

compute the p-value

  • the probability of observing a test statistic at least as large as the one we observed, if the null hypothesis is true
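For a test statistic with a standard normal reference distribution, this definition translates directly into code: the two-sided p-value is the probability mass at least as far from zero as the observed value. A minimal sketch:

```python
import math

def norm_cdf(x):
    # standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2)))

def two_sided_p(z):
    # P(|Z| >= |z|) under the null, using a standard normal reference
    return 2 * (1 - norm_cdf(abs(z)))

print(round(two_sided_p(1.96), 3))  # ≈ 0.05, the familiar cutoff
```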

reject or fail to reject the null hypothesis

  • to reject the null hypothesis, the p-value for the test statistic needs to be less than the critical α
  • otherwise we fail to reject the null - a non-significant result is not evidence that the null is true
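The decision rule itself is a one-line comparison; writing it out highlights that the outcome is asymmetric:

```python
def nhst_decision(p_value, alpha=0.05):
    # NHST is asymmetric: we either reject the null or fail to reject it -
    # failing to reject does not show the null hypothesis is true
    return "reject H0" if p_value < alpha else "fail to reject H0"

print(nhst_decision(0.032))  # reject H0
print(nhst_decision(0.140))  # fail to reject H0
```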

Pitfalls of null hypothesis significance testing

  • doesn’t tell us the probability of the null or alternative hypotheses being true
  • dichotomous thinking around the α level - can encourage questionable research practices (can eventually find a statistically significant result somewhere)
  • p-values are sensitive to sample size - with a large enough sample, even a tiny effect can reach statistical significance

Ways to avoid pitfalls

  • effect sizes - how large/meaningful is the effect
  • confidence intervals - what are the plausible limits of our effect?
  • preregistration - specifying our analytic plan before we look at the data
  • registered reports - scientific paper can get accepted before data collection based on proposed methodology instead of results
  • power analysis - determining the necessary sample size before we begin data collection
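Effect sizes and confidence intervals, the first two items above, are easy to compute alongside the test. A sketch of Cohen's d (pooled-SD version) and an approximate 95% CI for a mean difference, with made-up group data:

```python
import math
from statistics import mean, stdev

def cohens_d(x, y):
    # standardised mean difference: how large is the effect in SD units?
    nx, ny = len(x), len(y)
    pooled = math.sqrt(((nx - 1) * stdev(x) ** 2 + (ny - 1) * stdev(y) ** 2)
                       / (nx + ny - 2))
    return (mean(x) - mean(y)) / pooled

def mean_diff_ci95(x, y):
    # approximate 95% CI for the mean difference,
    # using the normal critical value 1.96 for simplicity
    se = math.sqrt(stdev(x) ** 2 / len(x) + stdev(y) ** 2 / len(y))
    diff = mean(x) - mean(y)
    return (diff - 1.96 * se, diff + 1.96 * se)

# hypothetical scores for two groups
group_a = [5.1, 4.9, 6.2, 5.8, 5.5, 6.0]
group_b = [4.2, 4.8, 4.5, 5.0, 4.1, 4.6]
print(round(cohens_d(group_a, group_b), 2))
print(mean_diff_ci95(group_a, group_b))
```

The CI answers the "plausible limits" question directly: if it excludes zero, the result is significant at α = 0.05, but its width also shows how precisely the effect is estimated.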

Power analysis

  • power analysis - determining the necessary sample size before we begin data collection
      * makes non-significant effects easier to interpret
      * reduces the probability of missing an important finding (false-negative/Type 2 error)
      * reduces over-sampling + wasting resources
      * discourages ‘data-peeking’
      * generally aim for a statistical power of 80%
  • statistical power - the probability of detecting an effect of a certain size as statistically significant (assuming the effect exists)
      * the greater the effect of interest, the fewer participants needed
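The last point can be made concrete with a standard normal-approximation formula for a two-group comparison: n per group ≈ 2((z₁₋α/₂ + z_power)/d)², where d is the expected effect size. A sketch (this approximation slightly undercounts relative to exact t-based calculations):

```python
import math
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    # normal-approximation sample size for a two-sample comparison:
    # n = 2 * ((z_{1-alpha/2} + z_{power}) / d) ** 2, rounded up
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)

print(n_per_group(0.5))  # 63 per group for a medium effect at 80% power
print(n_per_group(0.2))  # a small effect needs far more participants
```

Running it for d = 0.8 vs d = 0.2 shows the final bullet numerically: the larger the expected effect, the fewer participants are needed.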