Module 12 – 1 Tailed + 2 Tailed Tests + Rejecting The Null

One-Tailed vs. Two-Tailed Tests

Determining Cutoff Sample Score

  • Step 3 of null hypothesis testing involves determining the cutoff sample score on the comparison distribution at which the null hypothesis should be rejected.

  • Typically, alpha is set at 0.05, resulting in a cutoff value of +1.64 or -1.64 for a one-tailed test, or ±1.96 for a two-tailed test.
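The cutoff values come from the standard normal curve, so you can compute them instead of looking them up. The sketch below is one way to do that with Python's standard library; the function name `z_cutoffs` and its `tails` argument are made up for this illustration. Note that the exact one-tailed value is about 1.645, which Z tables commonly round to 1.64.

```python
from statistics import NormalDist

def z_cutoffs(alpha=0.05, tails="two"):
    """Return the Z cutoff(s) on a standard normal comparison distribution.

    `tails` is a hypothetical argument for this sketch:
    "right", "left", or "two".
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tails == "two":
        # Split alpha evenly between the two tails.
        upper = std_normal.inv_cdf(1 - alpha / 2)
        return (-upper, upper)
    if tails == "right":
        return (std_normal.inv_cdf(1 - alpha),)
    if tails == "left":
        return (std_normal.inv_cdf(alpha),)
    raise ValueError("tails must be 'two', 'right', or 'left'")

print(z_cutoffs(0.05, "right"))  # about +1.645
print(z_cutoffs(0.05, "left"))   # about -1.645
print(z_cutoffs(0.05, "two"))    # about -1.96 and +1.96
```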

One-Tailed Tests

  • One-tailed tests are directional, with the 5% significance level placed on either the left or right tail of the distribution.

  • If the sample mean is expected to be higher:

    • The 5% is placed on the right tail.

    • Example:

      • Research hypothesis: People who hear about positive traits will give a higher mean attractiveness rating than people who don’t.

      • Null hypothesis: People who hear about positive traits will give the same mean attractiveness rating as people who don’t.

    • If the sample’s Z score is higher than 1.64, the null hypothesis is rejected. When the null hypothesis is actually true, there is a 5% chance of rejecting it in error.

  • If the sample mean is expected to be lower:

    • The 5% is placed on the left tail.

    • Example:

      • Research hypothesis: People who take a painkiller will give a lower mean pain rating than people who don’t.

      • Null hypothesis: People who take a painkiller will give the same mean pain rating as people who don’t.

    • If the sample’s Z score is lower than -1.64, the null hypothesis is rejected. When the null hypothesis is actually true, there is a 5% chance of rejecting it in error.
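The two one-tailed rules above can be sketched as a single check. The function name and the sample Z scores here are invented for illustration; only the 1.64 cutoff comes from the notes.

```python
def reject_null_one_tailed(z, expected_direction, cutoff=1.64):
    """Decide a one-tailed Z test using the table cutoff for alpha = .05."""
    if expected_direction == "higher":
        return z > cutoff    # the 5% sits in the right tail
    if expected_direction == "lower":
        return z < -cutoff   # the 5% sits in the left tail
    raise ValueError("expected_direction must be 'higher' or 'lower'")

# Hypothetical sample Z scores, not data from the examples above.
print(reject_null_one_tailed(2.10, "higher"))  # True: 2.10 is beyond +1.64
print(reject_null_one_tailed(-1.20, "lower"))  # False: -1.20 is not beyond -1.64
```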

Two-Tailed Tests

  • Two-tailed tests are used to determine if there is any effect, regardless of direction.

  • Example: How will anti-anxiety medication affect the grades of students with really bad test anxiety?

    • Grades might improve or get worse.

  • In this case, the 5% significance level is split, with 2.5% on the right and 2.5% on the left, using cutoff values of ±1.96.

  • Example:

    • Research hypothesis: The GPA of students who take anti-anxiety medication will be different from the GPA of students who don’t ($H_1: \mu_1 \neq \mu_2$).

    • Null hypothesis: Students who take anti-anxiety medication will have the same mean GPA as students who don’t ($H_0: \mu_1 = \mu_2$).

Z-Scores and Decision Making

  • If the sample’s Z score is higher than +1.96 or lower than -1.96, the null hypothesis is rejected. When the null hypothesis is actually true, there is a 5% chance of rejecting it in error.

  • Step 4 involves finding the sample’s score on the comparison distribution.

  • Step 5 involves checking if the score is more extreme than -1.96 or +1.96.

    • If the score is between these points, the null hypothesis is not rejected.

    • If the score is more extreme, the null hypothesis is rejected.
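Step 5 for a two-tailed test boils down to comparing the absolute value of Z with 1.96. Here is a minimal sketch; the function name and the three Z scores are made up for illustration.

```python
def reject_null_two_tailed(z, cutoff=1.96):
    """Reject when Z is more extreme than the cutoff in either direction."""
    return abs(z) > cutoff

# Hypothetical Z scores, one in each region of the distribution.
for z in (-2.50, 0.80, 2.10):
    decision = "reject" if reject_null_two_tailed(z) else "do not reject"
    print(f"Z = {z:+.2f}: {decision} the null hypothesis")
```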

Two-Tailed Example: Cat Ownership and Neuroticism

  • Research question: Does owning a cat affect neuroticism in undergraduates?

  • The average score for non-cat students on the Louisville Undergraduate Neuroticism Index (LUNI) is 5 with a standard deviation of 4.

  • A sample of 36 students is given a cat, and after 30 days, their LUNI score is measured.

  • The mean LUNI score for the group is 7.

Step 1: Hypotheses
  • Research hypothesis: The LUNI scores of students who get a cat will be different from the LUNI scores of students who don’t own a cat ($H_1: \mu_1 \neq \mu_2$).

  • Null hypothesis: Students who get a cat will have the same mean LUNI score as students who don’t ($H_0: \mu_1 = \mu_2$).

Step 2: Comparison Distribution
  • Given population characteristics: $\mu = 5$ and $\sigma = 4$.

  • Mean of the distribution of means: $\mu_M = \mu = 5$.

  • Standard deviation of the distribution of means: $\sigma_M = \frac{\sigma}{\sqrt{n}} = \frac{4}{\sqrt{36}} = \frac{4}{6} = .67$.

Step 3: Cutoff Score
  • Alpha is set at 0.05, using a non-directional test.

  • Cutoff score: $\pm 1.96$.

  • Any score higher than +1.96 or lower than -1.96 will result in rejecting the null hypothesis.

Step 4: Sample's Score
  • Formula: $Z = \frac{\bar{X} - \mu_M}{\sigma_M}$.

  • If the sample’s mean on the LUNI is 7:

    • $Z = \frac{7 - 5}{.67}$

    • $Z = 2.99$

Step 5: Decision
  • Cutoff: $\pm 1.96$.

  • Sample’s Z score: 2.99.

  • 2.99 is more extreme than +1.96, so reject the null hypothesis.

  • Conclusion:

    • The sample likely did not come from the comparison distribution.

    • Cats appear to have an effect on neuroticism in college students.

    • Students receiving cats scored significantly higher on the LUNI than non-cat owners.
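If you want to check the arithmetic, the cat example can be worked through in a few lines. This is only a sketch with variable names chosen for readability. The notes round the standard deviation of the distribution of means to .67 before dividing, which gives 2.99; without that rounding, Z comes out at about 3.00. The decision is the same either way.

```python
from math import sqrt

# Values given in the example.
pop_mean = 5      # mean LUNI score for non-cat students
pop_sd = 4        # standard deviation of LUNI scores
n = 36            # students who were given a cat
sample_mean = 7   # mean LUNI score after 30 days
cutoff = 1.96     # two-tailed cutoff for alpha = .05

# Step 2: comparison distribution (distribution of means).
sigma_m = pop_sd / sqrt(n)

# Step 4: the sample's Z score on the comparison distribution.
z = (sample_mean - pop_mean) / sigma_m
z_with_rounding = (sample_mean - pop_mean) / round(sigma_m, 2)

# Step 5: reject if Z is more extreme than the cutoff in either direction.
reject_null = abs(z) > cutoff

print(f"sigma_M = {sigma_m:.2f}")
print(f"Z = {z:.2f} (or {z_with_rounding:.2f} using the rounded sigma_M)")
print("Reject the null hypothesis" if reject_null else "Do not reject the null hypothesis")
```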

Similarities and Differences

  • Two-tailed tests are very similar to one-tailed tests.

  • Instead of one cutoff (+1.64 or -1.64), there are two cutoffs (+1.96 and -1.96).

  • All other hypothesis testing principles still apply.

Practice vs Reality

  • It’s rare to design a study that makes no directional predictions.

    • Psychologists usually believe something will increase or decrease performance.

    • However, psychologists almost always do two-tailed tests.

  • So basically,

    • The rule is:

      • If research hyp. is nondirectional, then do two-tailed test.

      • If research hyp. is directional, then do one-tailed test.

    • The reality is:

      • If research hyp. is nondirectional, then do two-tailed test.

      • If research hyp. is directional, then do two-tailed test.

Stringency of Two-Tailed Tests

  • Two-tailed tests are more stringent.

  • The sample score has to be more extreme/unusual in order to reject the null hypothesis.

  • If you do a one-tailed test, your sample mean only needs to be more than 1.64 SDs from the mean of the comparison distribution, in the predicted direction, for you to reject the null.

  • If you do a two-tailed test, your sample mean needs to be more than 1.96 SDs from the mean, in either direction.
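To see the difference in stringency, take a Z score that falls between the two cutoffs. The value 1.80 below is hypothetical, picked only because it lands between 1.64 and 1.96.

```python
ONE_TAILED_CUTOFF = 1.64  # alpha = .05, all in the right tail
TWO_TAILED_CUTOFF = 1.96  # alpha = .05, split across both tails

z = 1.80  # hypothetical sample Z score

one_tailed_rejects = z > ONE_TAILED_CUTOFF
two_tailed_rejects = abs(z) > TWO_TAILED_CUTOFF

print(f"One-tailed test rejects the null: {one_tailed_rejects}")  # True
print(f"Two-tailed test rejects the null: {two_tailed_rejects}")  # False
```

The same sample is significant under the one-tailed rule and not significant under the two-tailed rule.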

One- vs Two-Tailed Tests

  • One-tailed test: “My sample score has to be at least this far from the mean before I’ll decide that my sample probably didn’t come from a population like this one.”

  • Two-tailed test: “My sample score has to be at least this much farther from the mean before I’ll decide that my sample probably didn’t come from a population like this one.”

Psychologists' Beliefs on Decision Making

  • Psychologists believe the decision to reject the null hypothesis is more likely to be correct if they do a two- versus a one-tailed test.

  • 1.96 is the point where only 2.5% of scores are higher so it’s almost like setting your alpha to .025.

  • In fact, if I asked you to do a one-tailed test and set your alpha to .025 you would look up the corresponding Z score on the table, which is 1.96, and use that for your cutoff.
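The link between 1.96 and .025 can be checked directly from the normal curve. This sketch uses Python's standard library; the variable names are just for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Proportion of scores above +1.96 on the comparison distribution.
upper_tail_area = 1 - std_normal.cdf(1.96)

# Adding the matching lower tail gives the full two-tailed alpha.
both_tails_area = 2 * upper_tail_area

print(f"Area above +1.96: {upper_tail_area:.4f}")      # about .025
print(f"Area beyond +/-1.96: {both_tails_area:.4f}")   # about .05
```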

FAQ

  • So why not just set your alpha to .025 and follow the rules?

  • If you set your alpha to .025 and do a one-tailed test, then you’re fine if everything works out as planned, but what if you get an extreme score on the other side?

  • For example, you could be testing the efficacy of an anti-smoking campaign for teens and you set your cutoff at -1.96, thinking your campaign would lower the mean number of teens who started smoking. If your sample had a Z score of 4.2, that would be very interesting! Your campaign actually increased the number of teens who started smoking.

  • Unfortunately, if you were doing a one-tailed test, you would fail to reject the null because your sample’s Z score was not more extreme than -1.96.

  • A two-tailed test using -1.96 and +1.96 would catch this extreme score.

  • You might be surprised at how often this occurs.

  • An extreme result in either direction is usually interesting so it’s better to just use a two-tailed test so you can detect any extreme score.

  • Why not just do a one-tailed test and change the direction later if you need to?

  • For one, that’s cheating. It’s like calling heads before flipping a coin, but reserving the right to change to tails after the flip.

  • If you put your 5% on one tail but reserve the right to move the 5% to the other tail, then you actually have a 10% chance of rejecting the null hypothesis when the null hypothesis is true. It’s the same as setting your alpha at .1, which is way too high.
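Both FAQ answers can be checked with a little arithmetic. The sketch below first applies the two decision rules to the anti-smoking Z score of 4.2, then adds up the rejection regions you would effectively be using if you let yourself switch tails after seeing the data. The variable names are made up for this illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Anti-smoking example: predicted a decrease, observed Z = 4.2.
z = 4.2
one_tailed_left_rejects = z < -1.96   # False: wrong side of the distribution
two_tailed_rejects = abs(z) > 1.96    # True: extreme in either direction counts

# Switching tails after the fact: 5% in the right tail OR 5% in the left tail.
one_tailed_cutoff = std_normal.inv_cdf(0.95)  # about 1.645
effective_alpha = std_normal.cdf(-one_tailed_cutoff) + (1 - std_normal.cdf(one_tailed_cutoff))

print(one_tailed_left_rejects, two_tailed_rejects)
print(f"Effective alpha when switching tails: {effective_alpha:.2f}")  # 0.10
```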

Tails?

  • There is no right answer for this.

  • Two-tailed tests with an alpha of .05 are the standard so we’re going to mostly be working with two-tailed tests from this point forward.

  • If I ever expect a one-tailed test in this course then I’ll specifically ask you to use one.

  • If you’re using inferential tests outside of this course, then you will need to decide your alpha level and whether you want to allocate it to one tail or two.

  • We’ll talk more about alpha and errors in the next module. Hopefully, you’ll get a better idea of when you might want to use a different alpha level or why someone might want to only use a one-tailed test.

  • Before we end the module I want to take a moment to talk about rejecting the null hypothesis.

Rejecting the Null Hypothesis

  • When we reject the null hypothesis we are making a decision based on evidence from a sample.

  • Like any decision, it may be correct or incorrect.

  • The fact that a sample score is unusual when compared to a known distribution of scores is not proof that the sample came from a different distribution.

    • It is just evidence that supports the idea that it might be from a different distribution.

  • If the evidence leads us astray, then we make a decision error.

    • Using one- or two-tailed tests, or changing your alpha level, will affect your chances of making different types of decision errors.

  • It might seem strange to you that psychologists can pick any cutoff they want.

  • Keep in mind that changing your cutoff to get a significant difference is not the same as detecting a difference between the sample and the comparison distribution.

  • It’s never good to publish an incorrect decision.

    • Other scientists won’t be able to replicate your finding.

    • Studies based on your study will be flawed.

    • Some research has the potential to cause harm if incorrect.

  • So just remember that rejecting the null hypothesis doesn’t prove anything; it just provides support for your research hypothesis.

  • Always be careful with the language you use when describing the results of an inferential test.

Replication

  • Because there is always a chance of making an incorrect decision, many scientists like to replicate their studies at least once before they publish them.

    • If you repeat the study and get a significant result again, then it is still possible you made another error, but the probability of wrongly rejecting a true null hypothesis both times drops from .05 to .05 × .05 = .0025, assuming the two studies are independent.

    • This is why you will often see research articles with 2 or 3 studies. The researcher is changing the study slightly each time to learn more about what they are studying, but also replicating their finding to reduce the chance of a decision error.
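The .0025 figure is just alpha multiplied by itself. The sketch below extends that to any number of studies; it assumes the null hypothesis is true in every study and that the studies are independent, and the function name is invented for this illustration.

```python
def chance_all_false_rejections(alpha=0.05, studies=2):
    """Chance that every study rejects a true null, assuming independence."""
    return alpha ** studies

for k in (1, 2, 3):
    print(f"{k} study/studies: {chance_all_false_rejections(0.05, k):.6f}")
```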