**__CHAPTER 13 INFERENTIAL STATISTICS__**
13.1 Understanding Null Hypothesis Testing
The Purpose of Null Hypothesis Testing
ㅁ **statistics** → descriptive summary values (eg means, correlation coefficients) computed by measuring 1(+) variables in a sample
ㅁ Researchers’ goal is to draw conclusions about the pop the sample was selected from (not just the sample). Thus, researchers use sample stats to draw conclusions abt the corresponding values in the pop (parameters)
ㅁ **parameters** → corresponding values in the pop
ㅁ sample stats aren’t perfect estimates of their corresponding pop parameters bc there’s a certain amount of random variability in any stat from sample to sample.
ㅁ **sampling error** → the random variability in a stat from sample to sample. (term error refers to random variability, not anyone making a mistake)
ㅁ any stat relationship in a sample can be interpreted in 2 ways
→ there’s a relationship in the pop, & relationship in the sample reflects this
→ there’s no relationship in the pop, & relationship in sample reflects only sampling error
ㅁ purpose of null hypo testing is to help researchers decide btwn these 2 interpretations ^^
The Logic of Null Hypothesis Testing
ㅁ **Null hypothesis testing** → (often called null hypothesis significance testing or NHST) is a formal approach to deciding btwn 2 interpretations of a stat relationship in a sample
ㅁ **Null hypothesis** → one interpretation from ^^. The idea that there’s no relationship in the pop & the relationship in the sample reflects only sampling error (symbolized H0, “H-zero”)
ㅁ **Alternative hypothesis** → another interpretation from ^^. This hypothesis proposes that there’s a relationship in the pop & that relationship in the sample reflects this relationship in the pop. (symbolized as H1)
ㅁ Every statistical relationship in a sample can be interpreted in either of these 2 ways:
→ it might have occurred by chance
→ it might reflect a relationship in the pop
ㅁ Although there r many specific null hypothesis testing techniques, all r based on the same general logic. Steps are:
→ Assume for the moment that the null hypo is true. There’s no relationship btwn the variables in the pop
→ Determine how likely the sample relationship would be if the null hypothesis were true
→ If the sample relationship would be extremely unlikely, then __reject the null hypothesis__ in favor of the alternative hypothesis. If it would not be extremely unlikely, then __retain the null hypothesis__.
ㅁ **Reject the null hypothesis** → a decision made by researchers using null hypothesis testing which occurs when the sample relationship would be extremely unlikely
ㅁ **Retain the null hypothesis** → a decision made by researchers in null hypothesis testing which occurs when the sample relationship would not be extremely unlikely
ㅁ **p value** → a crucial quantity in null hypo testing. The probability of obtaining the sample result, or a more extreme one, if the null hypo were true. Not the probability that any particular *hypo* is true or false. Instead, the probability of obtaining the *sample result* if the null hypo were true.
→ a low *p* value means the sample/more extreme result would be unlikely if the null hypo were true & leads to rejection of the null hypo. A not-low *p* value means the sample/more extreme result would be likely if the null hypo were true & leads to retention of the null hypo.
ㅁ **α (alpha)** → the criterion for how low a *p* value must be before the sample result is considered unlikely enough to reject the null hypothesis (usually set to .05).
ㅁ if there’s a 5% (or less) chance of a result @least as extreme as the sample result if null hypothesis were true, then null hypo is rejected. When this happens, result said to be **statistically significant** → an effect that’s unlikely due to random chance & therefore likely represents a real effect in the pop
ㅁ if there’s a >5% chance of a result as extreme as the sample result when the null hypo is true, then the null hypo is retained. (This doesn’t necessarily mean the researcher accepts the null hypo as true, just that there isn’t enough evidence to reject it. Say “fail to reject the null hypo” or “retain the null hypo”; never say “accept the null hypo.”)
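A minimal Python sketch of this logic (the sample size N = 25, the observed correlation r = .45, & the normal pops are invented for illustration, not values from the text): assume the null hypo is true, simulate many samples, & estimate the *p* value as the proportion of simulated relationships at least as extreme as the observed one.

```python
# Sketch: estimate a p value by simulation under the null hypothesis.
# The sample size, observed correlation, and populations are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
n, observed_r = 25, 0.45      # assumed sample size & sample correlation
sims = 10_000

# Step 1: assume H0 is true (no relationship in the pop).
# Step 2: see how often sampling error alone produces a correlation
#         at least as extreme as the one observed.
count = 0
for _ in range(sims):
    x, y = rng.standard_normal(n), rng.standard_normal(n)  # unrelated variables
    r = np.corrcoef(x, y)[0, 1]
    if abs(r) >= observed_r:
        count += 1

p_value = count / sims
# Step 3: reject H0 if the sample result would be extremely unlikely under it.
print(f"Estimated p value: {p_value:.3f}")
print("Reject H0" if p_value <= 0.05 else "Retain H0")
```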
The Misunderstood p Value
ㅁ the *p* value = one of most misunderstood quantities in psych research. Most common misinterpretation is that the *p* value is the probability that the null hypo is true, ie that the sample result occurred by chance. INCORRECT. The *p* value is really the probability of a result @least as extreme as the sample result *IF* the null hypo *WERE* true.
ㅁ null hypo test involves answering q: “If null hypo were true, what’s the probability of a sample result as extreme as this one?” In other words, “What is the *p* value?”
ㅁ answer to question depends on 2 considerations: → strength of relationship & size of sample. The stronger the sample relationship & the larger the sample, the less likely the result would be if the null hypo were true. That is, the lower the *p* value.
ㅁ Sometimes the result is weak & the sample large, or the result strong & the sample small. In these cases, the 2 considerations trade off against each other, so a weak result can be stat sig if the sample is large enough, & a strong relationship can be stat sig even if the sample is small (see the sketch after this list).
ㅁ Table 13.1 shows roughly how relationship strength & sample size combine to determine whether a sample result is statistically significant.
→ __Columns__ of the table represent 3 levels of relationship strength: weak, medium, & strong.
→ __Rows__ represent 4 sample sizes that can be considered small, medium, large, & extra large in the context of psych research.
→ __Each cell__ represents a combo of relationship strength & sample size.
ㅁ If a cell contains the word *Yes*, then that combo would be stat sig for both Cohen’s *d* & Pearson’s *r*. If it contains the word *No*, it wouldn’t be stat sig for either.
ㅁ There’s 1 cell where the decision for *d* & *r* would be dif & another where it might be dif depending on some additional considerations (discussed in Section 13.2 “Some Basic Null Hypothesis Tests”).
ㅁ Although Table 13.1 provides only a rough guideline, it shows v clearly that weak relationships based on medium/small samples r never stat sig & that strong relationships based on medium/larger samples r always stat sig.
ㅁ If you keep this in mind, you’ll often know whether a result is stat sig based on the descriptive stats alone. Useful to develop this kind of intuitive judgment.
→ One reason is that it allows you to develop expectations about how your formal null hypothesis tests are going to come out, which in turn allows you to detect problems in your analyses.
→ A second reason is that the ability to make this kind of intuitive judgment is an indication that you understand the basic logic of this approach in addition to being able to do the computations.
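A rough Python sketch of how strength & sample size combine (a hypothetical illustration: the Cohen’s *d* benchmarks 0.2/0.5/0.8 & the sample sizes 20/50/100/500 are assumptions, not the exact entries of Table 13.1): for each combo, compute the two-tailed *p* of a one-sample *t*-test using t = d·√N.

```python
# Sketch of the Table 13.1 logic (illustrative assumptions, not the real table):
# for an assumed effect size d and sample size N, the one-sample t statistic is
# t = d * sqrt(N), and the two-tailed p value comes from the t distribution.
import numpy as np
from scipy import stats

effect_sizes = {"weak": 0.2, "medium": 0.5, "strong": 0.8}  # Cohen's d benchmarks
sample_sizes = [20, 50, 100, 500]                           # assumed small ... extra large

for label, d in effect_sizes.items():
    for n in sample_sizes:
        t = d * np.sqrt(n)                # stronger relationship or bigger N -> bigger t
        p = 2 * stats.t.sf(t, df=n - 1)   # two-tailed p value
        print(f"{label:6s} d={d}, N={n:3d}: p = {p:.4f} -> {'Yes' if p <= .05 else 'No'}")
```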
Statistical Significance Versus Practical Significance
ㅁ A stat sig result is not necessarily a strong one. Even a very weak result can be stat sig if it’s based on a large enough sample.
ㅁ The word *significant* can cause people to interpret these difs as strong/important. However, such stat sig difs may actually be quite weak. (this is why it’s important to distinguish btwn the *statistical* significance of a result & the *practical* significance of that result)
ㅁ **practical significance** → refers to importance/usefulness of result in some real-world context
→ in clinical practice, same concept often referred to as “clinical significance”
ex) a study on a new treatment for social phobia might show that it produces a stat sig positive effect. Yet this effect might not be strong enough to justify the time, effort, & other costs of putting it into practice (esp if easier/cheaper treatments that work just as well alr exist). Although stat sig, the result is said to lack practical or clinical sig.
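A small Python sketch of this distinction (the group sizes, means, & SDs are invented, not data from the chapter): with a very large sample, a tiny difference comes out stat sig even though Cohen’s *d* is far below even the “weak” benchmark.

```python
# Sketch: statistical vs. practical significance. All numbers are hypothetical.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 5000                                # very large hypothetical sample per group
treatment = rng.normal(51.0, 10.0, n)   # true pop difference of 1 point (d = 0.1)
control = rng.normal(50.0, 10.0, n)

t, p = stats.ttest_ind(treatment, control)           # independent-samples t-test
pooled_sd = np.sqrt((treatment.var(ddof=1) + control.var(ddof=1)) / 2)
d = (treatment.mean() - control.mean()) / pooled_sd   # Cohen's d

print(f"p = {p:.2g}")          # with N this large, p should come out far below .05
print(f"Cohen's d = {d:.2f}")  # yet the effect is tiny, well under the 'weak' benchmark
```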
13.2 Some Basic Null Hypothesis Tests
ㅁ **t-test** → a test that involves looking @ the dif btwn 2 means. 3 types r used for slightly dif research designs: the one-sample t-test, the dependent-samples t-test, & the independent-samples t-test
ㅁ **One-sample t-test** → used to compare a sample mean (M) w a hypo pop mean (μ0) that provides an interesting standard of comparison
ㅁ The null hypo is that the mean for the pop (μ) is equal to the hypothetical pop mean: μ = μ0.
ㅁ The alternative hypothesis is that the mean for the pop is dif from the hypo pop mean: μ ≠ μ0.
ㅁ To decide btwn these 2 hypos, need to find probability of obtaining the sample mean (or one more extreme) if the null hypo were true. But finding this *p* value requires first computing a test statistic called *t*. The formula for *t* is as follows:
t = (M − μ0) / (SD / √N)
where M is the sample mean, SD is the sample standard deviation, & N is the sample size.
ㅁ **Test statistic** → a statistic (eg F, t, etc) that’s computed to compare against what’s expected under the null hypo, & thus helps find the p value. Useful bc we know how it’s distributed when the null hypo is true.
ㅁ The precise shape of the distribution of *t* under the null hypo depends on a stat concept called __the degrees of freedom__, which for a one-sample *t*-test is *N* − 1. (There are 24 degrees of freedom for the distribution shown in Figure 13.1.)
ㅁ This distribution makes it possible to find the *p* value for any *t* score.
ㅁ We don’t have to deal directly w the distribution of *t* scores. If we were to enter our sample data & hypo mean of interest into 1 of the online statistical tools in Ch 12 or a program like SPSS (Excel doesn’t have a one-sample *t*-test function), the output would include both the *t* score & the *p* value.
ㅁ If *p* is = or less than α (.05), we reject the null hypo; if it’s greater, we retain the null hypo.
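A minimal Python sketch of a one-sample *t*-test (the scores & the hypo pop mean μ0 = 5.0 are invented for illustration; SPSS or the online tools from Ch 12 would report the same *t* & *p*): compute *t* from the formula above & check it against scipy.

```python
# Sketch of a one-sample t-test. The scores and mu0 are hypothetical.
import numpy as np
from scipy import stats

scores = np.array([5.5, 6.1, 4.8, 7.0, 5.9, 6.3, 5.2, 6.8, 5.7, 6.0])  # invented sample
mu0 = 5.0                                                               # hypothetical pop mean

# t by the formula: t = (M - mu0) / (SD / sqrt(N)), with df = N - 1
m, sd, n = scores.mean(), scores.std(ddof=1), len(scores)
t_by_hand = (m - mu0) / (sd / np.sqrt(n))

# Same test via scipy; the p value is two-tailed
t_stat, p = stats.ttest_1samp(scores, popmean=mu0)

print(f"t (formula) = {t_by_hand:.3f}")
print(f"t (scipy)   = {t_stat:.3f}, p = {p:.4f}")
```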