Basic Statistics - Hypothesis Testing
ON TEENAGERS, ADULY
Statistics indicate that teenage pregnancy rates drop significantly after age 25.
Cited contributor: Mary Anne Tebeds, Republican state senator from Colorado Springs, commented by Harry F. Punce.
BASIC STATISTICS FOR THE BEHAVIORAL SCIENCES
Author: Gary W. Heiman
5th edition
Chapter Ten: Introduction to Hypothesis Testing
NEW STATISTICAL NOTATION
Greater than: >
Less than: <
Greater than or equal to: ≥
Less than or equal to: ≤
Not equal to: ≠
THE ROLE OF INFERENTIAL STATISTICS IN RESEARCH
Inferential statistics are pivotal in determining whether observed relationships in samples reflect true relationships within the larger population.
RELATIONSHIPS IN EXPERIMENTS
As the conditions of the independent variable are altered (for example, varying drug doses such as 5mg, 10mg, 20mg), it's expected to observe a pattern of differences in the dependent variable (for example, measuring pain relief).
SAMPLING ERROR
Definition: Sampling error occurs when random chance results in a statistic from the sample that does not accurately represent the population parameter.
Clarification: By chance alone, a sample's mean, median, mode, or standard deviation may be so unusual that it does not represent the population from which the sample was drawn.
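A small simulation can make sampling error concrete. The sketch below assumes a hypothetical normally distributed population with a mean of 100 and a standard deviation of 15 (values chosen only for illustration) and draws many random samples from it. Every sample comes from the same population, yet the sample means differ from one another and from the population mean.

```python
import random
import statistics

# Hypothetical population values, chosen only for illustration.
POP_MEAN = 100.0
POP_SD = 15.0

def sample_means(n_samples, n_per_sample, seed=0):
    """Draw repeated random samples and return the mean of each one."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(n_samples):
        scores = [rng.gauss(POP_MEAN, POP_SD) for _ in range(n_per_sample)]
        means.append(statistics.fmean(scores))
    return means

means = sample_means(n_samples=1000, n_per_sample=25)

# All samples represent the same population, but their means vary by chance.
print("smallest sample mean:", round(min(means), 2))
print("largest sample mean:", round(max(means), 2))

# The spread of the sample means should be close to sigma / sqrt(N) = 15 / 5 = 3.
print("SD of the sample means:", round(statistics.stdev(means), 2))
```

The means at the extremes of this list are the "unusual" samples: nothing about the population changed, only the luck of the draw.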
RELATIONSHIPS & SAMPLING ERROR
Example Scenario 1:
If an experiment is conducted on gender and artistic ability, and the sample inadvertently contains many highly artistic men and less artistic women, the conclusion drawn could wrongfully indicate a difference in creativity between genders.
Example Scenario 2:
Conducting research on gender and height might yield misleading results if the sample includes unusually tall women and short men, possibly suggesting incorrect conclusions about average height differences between genders.
Implication: Sampling error can lead to misleading interpretations, including falsely identifying relationships where none exist or vice versa.
INFERENTIAL STATISTICS
Purpose: Inferential statistics help assess whether a relationship observed in a sample is representative of the overall population or if it's a product of random sampling error.
PARAMETRIC STATISTICS
Definition: Parametric statistics are inferential procedures that require certain assumptions about the parameters of the represented population to be met.
Common assumptions of parametric tests:
The distribution of dependent scores should form a normal distribution.
Scores must be measured on an interval or ratio scale.
NONPARAMETRIC PROCEDURES
Definition: Nonparametric statistics are inferential procedures that do not require strict assumptions about the populations being represented.
Applicability: These procedures are useful with nominal or ordinal data, or skewed interval or ratio data.
ROBUST PROCEDURES
Explanation: A parametric procedure is termed robust when violating its assumptions somewhat results in only a negligible amount of error in the inferences drawn.
SETTING UP INFERENTIAL PROCEDURES
Steps in setting up an experiment:
Create a hypothesis.
Design an experiment to test the hypothesis.
Translate the hypothesis into a statistical hypothesis.
Select the appropriate statistical procedure to test the hypothesis.
EXPERIMENTAL HYPOTHESES
Definition: Experimental hypotheses predict outcomes of experiments.
Types:
Null hypothesis: posits no relationship (the default).
Alternative hypothesis: posits that a relationship does exist.
PREDICTING A RELATIONSHIP
Two-tailed test: Used when predicting a relationship without knowing the direction (e.g., increase or decrease).
One-tailed test: Used when predicting the direction of a relationship.
A ONE-SAMPLE EXPERIMENT
A one-sample experiment requires prior knowledge of the population mean under some other condition of the independent variable.
Example: Administering a “smart pill” to 100 people and measuring resulting IQs where the known population mean is established (e.g., typically, a population mean IQ might be 100).
CREATE YOUR STATISTICAL HYPOTHESIS
Translate the experimental hypotheses into statistical hypotheses, which state in numeric terms the population parameter expected if the predicted relationship exists and if it does not.
NULL HYPOTHESIS
Representation: H_0: \mu = 100
Meaning: Predicts no effect of the drug on IQ, thus suggesting the sample mean will be the same as the population average (100).
ALTERNATIVE HYPOTHESIS
Representation: H_a: \mu \neq 100
Meaning: Predicts the sample mean will differ from the population average (100).
LOGIC OF STATISTICAL TESTING
Hypothetical scenario: If a sample's mean IQ is 105, one might take this as evidence that the pill works. In reality, the pill may do nothing, and the mean of 105 could simply come from an atypical sample produced by sampling error.
PERFORMING THE z-TEST
Definition: The z-test computes a z-score for a sample mean to evaluate its position on a sampling distribution of means.
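As a minimal sketch of this computation, the z-score for a sample mean is the distance between the sample mean and the population mean, divided by the standard error of the mean (the population standard deviation divided by the square root of N). The example reuses the smart-pill numbers (sample mean 105, population mean 100, N = 100). The population standard deviation of 15 is an assumption made here for illustration, and the function name is hypothetical.

```python
import math

def z_obtained(sample_mean, pop_mean, pop_sd, n):
    """Return z_obt: how many standard errors the sample mean lies from mu."""
    # Standard error: the standard deviation of the sampling distribution of means.
    standard_error = pop_sd / math.sqrt(n)
    return (sample_mean - pop_mean) / standard_error

# sigma = 15 is an assumed value; the other numbers follow the smart-pill example.
z_obt = z_obtained(sample_mean=105, pop_mean=100, pop_sd=15, n=100)
print(round(z_obt, 2))  # 3.33
```

Here the standard error is 15 / 10 = 1.5, so a sample mean of 105 sits 5 / 1.5, or about 3.33, standard errors above the population mean.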
ASSUMPTIONS OF THE z-TEST
Conditions include:
Randomly selected sample.
Dependent variable should be at least approximately normally distributed.
Knowledge of the population mean under a condition of the independent variable.
True standard deviation of the population must be known.
SETTING UP FOR A TWO-TAILED TEST
The process involves choosing the alpha level (typically \alpha = 0.05), identifying the region of rejection, and determining the critical value (z_{crit}).
Critical z-value for a two-tailed test at \alpha = 0.05: z_{crit} = \pm 1.96.
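The value 1.96 can be recovered from the standard normal curve rather than memorized. The sketch below, which uses Python's standard-library NormalDist class, splits alpha evenly between the two tails and asks for the z-score that cuts off the upper tail. The function name is hypothetical.

```python
from statistics import NormalDist

def two_tailed_z_crit(alpha):
    """Return the positive critical z; rejection regions lie beyond +/- this value."""
    # Half of alpha goes in each tail, so the upper cutoff leaves alpha / 2 above it.
    return NormalDist().inv_cdf(1 - alpha / 2)

print(round(two_tailed_z_crit(0.05), 2))  # 1.96
```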
REJECTING H0
If z_{obt} lies in the rejection region (beyond the critical value), H_0 is rejected in favor of H_a, and the results are termed significant.
Note: Significant results indicate that the outcomes are unlikely to have occurred due to chance, not necessarily that they are important or useful.
INTERPRETING SIGNIFICANT RESULTS
Rejecting H_0 does not prove H_0 false; it indicates only that the results would be unlikely to occur by chance if H_0 were true.
FAILING TO REJECT H0
If z_{obt} does not fall in the region of rejection, we do not reject H_0.
These results are termed nonsignificant: the observed difference could plausibly result from sampling error, so there is no convincing evidence that a relationship exists in the population.
INTERPRETING NONSIGNIFICANT RESULTS
Failing to reject H_0 does not prove it true, similar to a jury declaring a defendant not guilty without proving innocence.
SUMMARY OF THE z-TEST
Determine experimental hypotheses (theory) and create statistical hypothesis (math).
Compute z_{obt}.
Set up sampling distribution for H_{0} (identify rejection regions and critical values).
Compare z_{obt} with critical z-value (z_{crit}).
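The steps above can be sketched as one short function. This is an illustrative outline rather than a definitive implementation: the function name is hypothetical, and the population standard deviation of 15 used in the example is an assumed value.

```python
import math
from statistics import NormalDist

def two_tailed_z_test(sample_mean, pop_mean, pop_sd, n, alpha=0.05):
    """Run a two-tailed z-test and return (z_obt, z_crit, reject_h0)."""
    # Compute z_obt from the sample mean and the standard error of the mean.
    z_obt = (sample_mean - pop_mean) / (pop_sd / math.sqrt(n))
    # Set up the sampling distribution under H0: find the critical value.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Compare: reject H0 only if z_obt lies beyond +/- z_crit.
    reject_h0 = abs(z_obt) > z_crit
    return z_obt, z_crit, reject_h0

# Sample mean 105, mu = 100, N = 100; sigma = 15 is assumed for illustration.
z_obt, z_crit, reject_h0 = two_tailed_z_test(105, 100, 15, 100)
print(round(z_obt, 2), round(z_crit, 2), reject_h0)  # 3.33 1.96 True
```

With these numbers z_obt is beyond the critical value, so H_0 is rejected. A sample mean of 101 under the same assumptions gives a z_obt of about 0.67, which is not in the region of rejection.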
THE ONE-TAILED TEST
Definition: A one-tailed test is used when the hypothesis predicts the direction in which scores will change.
ONE-TAILED HYPOTHESES
When an increase is predicted:
Null hypothesis: H_0: \mu \leq 100 (predicts no increase or potentially a decrease).
Alternative hypothesis: H_a: \mu > 100 (predicts an increase).
When a decrease is predicted:
Null hypothesis: H_0: \mu \geq 100 (predicts no decrease or potentially an increase).
Alternative hypothesis: H_a: \mu < 100 (predicts a decrease).
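A rough sketch of how these directional hypotheses change the decision rule: all of alpha is placed in the one tail that matches the prediction, so only a z_obt in the predicted direction can lead to rejecting H_0. The function name, the predict parameter, and the population standard deviation of 15 are assumptions made for illustration.

```python
import math
from statistics import NormalDist

def one_tailed_z_test(sample_mean, pop_mean, pop_sd, n, predict="increase", alpha=0.05):
    """Run a one-tailed z-test and return (z_obt, reject_h0)."""
    z_obt = (sample_mean - pop_mean) / (pop_sd / math.sqrt(n))
    # The whole alpha sits in one tail, so the cutoff is about 1.645 at alpha = 0.05.
    z_crit = NormalDist().inv_cdf(1 - alpha)
    if predict == "increase":
        reject_h0 = z_obt > z_crit    # rejection region in the upper tail only
    elif predict == "decrease":
        reject_h0 = z_obt < -z_crit   # rejection region in the lower tail only
    else:
        raise ValueError("predict must be 'increase' or 'decrease'")
    return z_obt, reject_h0

# sigma = 15 and the sample means are assumed values for illustration.
print(one_tailed_z_test(103, 100, 15, 100, predict="increase"))  # (2.0, True)
print(one_tailed_z_test(97, 100, 15, 100, predict="increase"))   # (-2.0, False)
```

The second call shows the cost of a directional prediction: a sample mean two standard errors below 100 cannot reject H_0 when an increase was predicted.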
PRACTICE PROBLEMS
Problem 1: Determine hypotheses for IQ-related television watching (2.5 hours). H_0: \mu and H_a: \mu.
Problem 2: Assess Pepperdine students in memory tests (average of 7 numbers). H_0: \mu … and H_a: \mu ….
ERRORS IN STATISTICAL TESTING
TYPE I ERRORS
Definition: Falsely rejecting H_0 when it is true.
Result: An unlikely sample prompts incorrect belief in a relationship that does not exist.
Probability: The theoretical probability of a Type I error is \alpha.
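One way to see why the Type I error rate equals alpha is to simulate many experiments in which H_0 really is true. The sketch below assumes a hypothetical population (mean 100, standard deviation 15) and a treatment that has no effect at all, then counts how often a two-tailed z-test rejects H_0 anyway. All names and values are chosen for illustration.

```python
import math
import random
from statistics import NormalDist, fmean

def type_i_error_rate(n_experiments=5000, n=25, alpha=0.05, seed=1):
    """Estimate how often a two-tailed z-test rejects H0 when H0 is true."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    mu, sigma = 100.0, 15.0            # hypothetical population; H0 is true
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    standard_error = sigma / math.sqrt(n)
    rejections = 0
    for _ in range(n_experiments):
        # Each "experiment" samples from the unchanged population.
        sample_mean = fmean(rng.gauss(mu, sigma) for _ in range(n))
        z_obt = (sample_mean - mu) / standard_error
        if abs(z_obt) > z_crit:
            rejections += 1            # a Type I error: H0 rejected although true
    return rejections / n_experiments

rate = type_i_error_rate()
print(rate)  # should land close to alpha = 0.05
```

Roughly 5% of these no-effect experiments produce a "significant" result, which is what an alpha of 0.05 means.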
TYPE II ERRORS
Definition: Retaining H_0 when it is false.
Result: The sample is not extreme enough to reject H_0, so a relationship that actually exists in the population goes undetected.
Probability: The theoretical probability of a Type II error is represented as \beta.
SUMMARY OF ERRORS
Error types are depicted in a comparative table showing false positives (Type I) and false negatives (Type II).
BALANCING ACT IN ERRORS
Key takeaway: Lowering the alpha level to reduce Type I errors also lowers the power of the study, increasing the risk of Type II errors.
Researcher strategies include:
Utilizing lower alpha levels to minimize Type I errors.
Structuring studies to maximize statistical power.
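The trade-off can be illustrated numerically. The sketch below computes the Type II error probability of a two-tailed z-test for one hypothetical case: the treatment truly raises the population mean from 100 to 103, with an assumed standard deviation of 15 and N = 100. These values and the function name are assumptions made for illustration.

```python
import math
from statistics import NormalDist

def beta_two_tailed(true_mean, null_mean, pop_sd, n, alpha):
    """Probability of retaining H0 in a two-tailed z-test when the true mean differs."""
    std_normal = NormalDist()
    standard_error = pop_sd / math.sqrt(n)
    # How far the true sampling distribution is shifted, in standard-error units.
    shift = (true_mean - null_mean) / standard_error
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Beta is the area of the shifted distribution between -z_crit and +z_crit.
    return std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)

beta_05 = beta_two_tailed(103, 100, 15, 100, alpha=0.05)
beta_01 = beta_two_tailed(103, 100, 15, 100, alpha=0.01)
print(round(beta_05, 3), round(beta_01, 3))
```

Under these assumptions, tightening alpha from 0.05 to 0.01 raises the chance of missing the real effect from roughly 0.48 to roughly 0.72.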
POWER IN RESEARCH
Definition: Power refers to the probability of correctly rejecting H_0 when it is in fact false.
Strategies to increase power:
Design studies using parametric procedures.
Utilize directionally predictive one-tailed tests.
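To show why a correctly predicted direction increases power, the sketch below compares the power of a one-tailed and a two-tailed z-test at the same alpha. It assumes a hypothetical true effect (population mean raised from 100 to 103, standard deviation 15, N = 100). The values and function names are illustrative only.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def power_two_tailed(true_mean, null_mean, pop_sd, n, alpha=0.05):
    """Probability that a two-tailed z-test rejects H0 given the true mean."""
    shift = (true_mean - null_mean) / (pop_sd / math.sqrt(n))
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    beta = STD_NORMAL.cdf(z_crit - shift) - STD_NORMAL.cdf(-z_crit - shift)
    return 1 - beta

def power_one_tailed_increase(true_mean, null_mean, pop_sd, n, alpha=0.05):
    """Probability that a one-tailed z-test predicting an increase rejects H0."""
    shift = (true_mean - null_mean) / (pop_sd / math.sqrt(n))
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)
    # Power is the area of the shifted distribution above the upper cutoff.
    return 1 - STD_NORMAL.cdf(z_crit - shift)

power_two = power_two_tailed(103, 100, 15, 100)
power_one = power_one_tailed_increase(103, 100, 15, 100)
print(round(power_two, 3), round(power_one, 3))
```

With these assumed numbers the one-tailed test detects the effect about 64% of the time, against about 52% for the two-tailed test. The gain holds only when the predicted direction is correct.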
z_{crit} IN ONE- VS. TWO-TAILED TESTS
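For the same alpha level, the critical z-value of a one-tailed test is closer to the mean than that of a two-tailed test (1.645 versus 1.96 at \alpha = 0.05), so a sample mean in the predicted direction can be less extreme and still lead to rejection of H_0.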
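The difference in critical values can be checked directly from the standard normal curve. This short sketch uses Python's standard-library NormalDist class. The helper name is hypothetical.

```python
from statistics import NormalDist

def z_crit(alpha, tails):
    """Return the positive critical z for a one-tailed (1) or two-tailed (2) test."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    # A two-tailed test splits alpha across both tails; a one-tailed test does not.
    return NormalDist().inv_cdf(1 - alpha / tails)

for alpha in (0.05, 0.01):
    print(alpha, round(z_crit(alpha, 1), 3), round(z_crit(alpha, 2), 3))
# 0.05 1.645 1.96
# 0.01 2.326 2.576
```

At each alpha level the one-tailed cutoff is the smaller of the two, which is why a one-tailed test needs a less extreme z_obt in the predicted direction.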