t-Tests and Inference: One-Sample and Independent-Samples (Comprehensive Notes)
One-Sample t-Test: concepts and interpretation
Purpose: tests whether the mean of a single sample differs from a known population mean μ.
Key language shift: raw scores (X scores) vs. t-scores. The t-score is a standardized value expressed in units of standard error.
Raw score example from transcript:
Population mean μ = 40
Sample mean X̄ = 42
Sample size n = 36
Reported t ≈ 2.03 (positive because X̄ > μ)
Formula for the one-sample t-statistic: $t = \dfrac{\bar{X} - \mu}{s / \sqrt{n}}$
where s is the sample standard deviation and $SE = s / \sqrt{n}$ is the standard error of the mean.
Interpretation of the t-score:
Positive t indicates the sample mean is above the population mean.
Negative t indicates the sample mean is below the population mean.
The population mean has t = 0 (zero standard errors away from itself).
Degrees of freedom (df): df=n−1
Rationale: the sample SD is used to estimate the population SD, and computing it requires the sample mean. That costs one degree of freedom, so only n − 1 deviations are free to vary.
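As a minimal sketch of the calculation described above, the following Python computes the standard error, t, and df from raw scores using only the standard library. The scores and the function name one_sample_t are made up for illustration and do not come from the transcript.

```python
import math
import statistics

def one_sample_t(scores, mu):
    """Return (t, df, se) for a one-sample t-test of the mean of scores against mu."""
    n = len(scores)
    mean = statistics.mean(scores)
    s = statistics.stdev(scores)      # sample SD (divides by n - 1)
    se = s / math.sqrt(n)             # standard error of the mean
    t = (mean - mu) / se              # distance from mu in SE units
    return t, n - 1, se

# Hypothetical raw scores (illustrative only, not from the transcript)
scores = [38, 41, 44, 47, 40, 42, 45, 39, 42]
t, df, se = one_sample_t(scores, mu=40)
print(f"t = {t:.2f}, df = {df}, SE = {se:.2f}")
```

A positive t here means the hypothetical sample mean lies above μ; swapping in a larger μ would flip the sign.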
P-value and decision rule:
For a two-tailed test, the p-value is the proportion of the sampling distribution as extreme or more extreme than the observed t in either tail: $p = P(|T| \ge |t_{\text{obs}}| \mid H_0)$
Example from transcript: observed t = 2.03, with reported p ≈ 0.042 (two-tailed).
Alpha (significance level) choice: commonly α=0.05; sometimes α=0.01 (e.g., FDA drug trials).
Decision rule:
If p < α, reject the null hypothesis (conclude a difference exists).
If p ≥ α, retain (fail to reject) the null hypothesis (insufficient evidence of a difference).
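The p-value and decision rule can be sketched in Python as follows. This is an illustrative standard-library version that integrates the t density numerically with Simpson's rule; in practice a statistics package such as PSPP supplies the p-value. The function names and the observed values (t = 2.50, df = 35) are assumptions chosen for the example.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def two_tailed_p(t_obs, df, steps=2000):
    """P(|T| >= |t_obs|) under H0, via Simpson's rule on [0, |t_obs|]."""
    a = abs(t_obs)
    if a == 0:
        return 1.0
    h = a / steps                       # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3        # area between 0 and |t_obs|
    return 1.0 - 2.0 * central_half     # both tails together

def decide(p, alpha=0.05):
    """Apply the decision rule: reject H0 only when p < alpha."""
    return "reject H0" if p < alpha else "retain H0"

# Hypothetical observed values (illustrative only)
p = two_tailed_p(2.50, df=35)
print(f"p = {p:.3f} -> {decide(p)}")
```

Because the t distribution is symmetric, the sketch integrates only one side and doubles it; a negative observed t gives the same p-value as its positive counterpart.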
Reporting results in APA format:
Statistical symbols in italics and lowercase: t and p (the letter identifies the type of test statistic).
Include degrees of freedom in parentheses after the t (e.g., t(35)).
Report the test statistic and p-value: e.g., t(35) = 2.03, p = .042
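A small helper can assemble the report string from the three numbers the software supplies. This is a sketch only: the function name apa_t_report is hypothetical, plain text cannot show the italics, and the rule of writing p < .001 for very small values is an APA convention added here, not something stated in the transcript.

```python
def apa_t_report(t, df, p):
    """Format a t-test result as plain text in APA style (italics not shown)."""
    if p < 0.001:
        p_text = "p < .001"
    else:
        # APA omits the leading zero for values that cannot exceed 1
        p_text = "p = " + f"{p:.3f}".lstrip("0")
    return f"t({df}) = {t:.2f}, {p_text}"

print(apa_t_report(2.03, 35, 0.042))   # t(35) = 2.03, p = .042
```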
Conceptual interpretation recap:
The t-test assesses whether the sample mean is a significantly different value from the population mean.
“Retaining the null” means the data do not show a significant difference; it does not prove that the two means are equal.
“Rejecting the null” means the two things are significantly different.
Visualization intuition:
The sampling distribution of the mean under H0 is centered at μ with standard error SE.
The critical values $\pm t_{\alpha/2,\,df}$ mark the cut-off points; the areas beyond them form the critical region for a given α. The observed t-score is compared against these cut-offs.
The p-value is the total area in the tails beyond the observed t in both directions for a two-tailed test.
Example recap from transcript:
Population mean μ = 40; sample mean X̄ = 42; n = 36; t = 2.03; p ≈ 0.042; α = 0.05 → reject H0; conclude there is a difference between sample and population means.
Connection to prior ideas:
t scores express how far the sample mean is from the population mean in units of SE, linking to the concept of standard error and the sampling distribution.
The null hypothesis and its rejection are evaluated via a distribution of possible sample means under H0.
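One way to make the "distribution of possible sample means under H0" concrete is to simulate it. The sketch below draws many samples from a population in which H0 is true, converts each sample mean to a t-score, and counts how often the result is at least as extreme as an observed t. The values μ = 40, n = 36, and t = 2.03 follow the example above; the normal population shape, σ = 6, the seed, and the number of repetitions are assumptions made for illustration.

```python
import math
import random
import statistics

def simulate_t_under_h0(mu, sigma, n, reps, seed=0):
    """Draw reps samples of size n from Normal(mu, sigma) and return their t-scores."""
    rng = random.Random(seed)
    t_values = []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        se = statistics.stdev(sample) / math.sqrt(n)
        t_values.append((statistics.mean(sample) - mu) / se)
    return t_values

# mu and n follow the example above; sigma = 6 is an assumed value
t_values = simulate_t_under_h0(mu=40, sigma=6, n=36, reps=5000)
t_obs = 2.03
share_extreme = sum(abs(t) >= t_obs for t in t_values) / len(t_values)
print(f"Share of simulated |t| >= {t_obs}: {share_extreme:.3f}")
```

The printed share is a simulation-based approximation of a two-tailed p-value, so it carries sampling noise and will only roughly agree with a value computed from the t distribution.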
The Language of X scores vs. t scores: practical notes
X scores: raw measurements (e.g., extraversion score on a scale).
t scores: transformed into a standardized metric using the standard error; they tell you how many SEs the sample mean is from the population mean.
Transformation intuition: converting to t scores allows comparison across studies with different scales, because you’re measuring deviation in units of SE rather than raw units.
Example interpretation:
If an extraversion scale has population mean μ = 40 and the sample mean is X̄ = 42 with SE = 1, then t = (42 − 40)/1 = 2.0, indicating the sample mean sits 2 standard errors above μ.
Practical steps you’ll perform (in practice, often by computer):
Compute t, df, and p-value; decide to reject or retain H0; report results.
In lab/PSPP outputs, the computer supplies t, df, and p-value; you then format the result for publication.
Two-tail vs. one-tail tests and the p-value interpretation
Two-tailed test: tests whether the two means differ (either direction).
Alpha split across tails: α/2 in each tail (e.g., 0.025 per tail for α=0.05).
The p-value reflects both tails: the probability of being as extreme or more extreme in either direction.
One-tailed test: tests for difference in a specified direction (not used in the transcript but often taught).
In this lecture, emphasis is on two-tailed interpretation and the symmetrical t distribution.
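The split of α across the two tails can be written out explicitly. The block below is a short restatement that assumes the symmetric t distribution and uses $t_{\alpha/2,\,df}$ for the critical value. The last line, relating one-tailed and two-tailed p-values, holds only when the observed effect lies in the predicted direction.

```latex
\begin{align*}
P\left(T \le -t_{\alpha/2,\,df}\right) &= P\left(T \ge t_{\alpha/2,\,df}\right) = \frac{\alpha}{2} \\
p_{\text{two-tailed}} &= P\left(|T| \ge |t_{\text{obs}}| \mid H_0\right) = 2\,P\left(T \ge |t_{\text{obs}}| \mid H_0\right) \\
p_{\text{one-tailed}} &= \frac{p_{\text{two-tailed}}}{2}
\end{align*}
```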
Independent-samples t-test: concept and workflow
Purpose: compare the means of two independent groups on a numeric dependent variable.
Key structure:
Independent variable: nominal with exactly two levels (e.g., work vs not work; meat-eaters vs vegetarians).
Dependent variable: numeric (interval or ratio scale).
Example variants discussed:
People who work vs people who don’t: numeric outcome could be hours of productivity at work or another measurable metric.
Meat-eaters vs vegetarians: numeric outcome could be muscle mass or another measurable trait.
Two-group design intuition:
The independent variable partitions the sample into two groups.
The t-test assesses whether the two group means differ on the numeric outcome.
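The two-group structure can be sketched as a simple data-handling step: each record carries a group label (the independent variable) and a numeric outcome (the dependent variable), and the label splits the sample into two lists. The records below are invented for illustration and are not data from the lecture.

```python
import statistics
from collections import defaultdict

# Hypothetical records: (group label, numeric outcome); values are illustrative only
records = [
    ("vegetarian", 31.0), ("meat-eater", 34.5), ("vegetarian", 29.5),
    ("meat-eater", 33.0), ("vegetarian", 30.5), ("meat-eater", 36.0),
]

groups = defaultdict(list)
for label, outcome in records:
    groups[label].append(outcome)     # the independent variable partitions the sample

if len(groups) != 2:
    raise ValueError("an independent-samples t-test needs exactly two groups")

group_means = {label: statistics.mean(values) for label, values in groups.items()}
for label, m in group_means.items():
    print(f"{label}: n = {len(groups[label])}, mean = {m:.2f}")
```

The two group means printed at the end are the quantities the independent-samples t-test compares.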
Classic example from Katona (1940) on memorization vs understanding:
Two learning methods: memorization vs learning by understanding.
Benefit observed for the understanding method on a problem-solving task (the matchstick problem).
Design: there were multiple problems (e.g., 12 problems like the matchstick task) to compare learning methods.
Finding: learning by understanding yielded better outcomes than memorization, illustrating a difference between two learning approaches.
Reporting and interpretation parallel the one-sample case, but the standard error comes from a pooled variance estimate when equal variances are assumed, or from an alternative such as Welch's unequal-variance version when they are not (covered with more advanced variants).
Example outcomes and reporting follow APA conventions similar to the one-sample case but with two sample means and the corresponding df.
Formulas and key statistical details to memorize
One-sample t-test statistic:
$t = \dfrac{\bar{X} - \mu}{s / \sqrt{n}}$
Degrees of freedom: df=n−1
Standard error: $SE = s / \sqrt{n}$
Two-sample independent t-test statistic (common version with pooled variance):
Pooled variance: $s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
Test statistic: $t = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{s_p^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}$
Degrees of freedom: $df = n_1 + n_2 - 2$
P-value for two-tailed tests: $p = P(|T| \ge |t_{\text{obs}}|)$
Critical regions (for a two-tailed test):
Cutoffs at $\pm t_{\alpha/2,\,df}$; the total area beyond these cutoffs in both tails equals α. Reject H0 when |t_obs| exceeds the cutoff, which is equivalent to p < α.
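The pooled-variance version of the independent-samples test listed above can be sketched in Python with the standard library. The function name pooled_t and both score lists are hypothetical, and the sketch assumes equal population variances, as the pooled formula does.

```python
import math
import statistics

def pooled_t(group1, group2):
    """Return (t, df) for an independent-samples t-test with pooled variance."""
    n1, n2 = len(group1), len(group2)
    var1 = statistics.variance(group1)   # sample variance s1^2
    var2 = statistics.variance(group2)   # sample variance s2^2
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * var1 + (n2 - 1) * var2) / df       # pooled variance
    se_diff = math.sqrt(sp2 * (1 / n1 + 1 / n2))         # SE of the mean difference
    t = (statistics.mean(group1) - statistics.mean(group2)) / se_diff
    return t, df

# Hypothetical scores for two independent groups (illustrative only)
group_a = [12, 14, 15, 17, 17]
group_b = [9, 11, 12, 13, 15]
t, df = pooled_t(group_a, group_b)
print(f"t({df}) = {t:.2f}")
```

Swapping the order of the two groups flips the sign of t but leaves its magnitude and the two-tailed p-value unchanged.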
Reporting conventions (APA style):
Use italics for statistical symbols, in lowercase: t, p; include the degrees of freedom in parentheses after t (e.g., t(35)).
Always state the null hypothesis clearly: there is no difference between the two means.
Interpret t-score directionally: positive t indicates the sample mean is greater than the comparison mean; negative t indicates it is smaller.
Decide on alpha ahead of time (e.g., 0.05); understand how two-tailed tests split alpha across both tails.
Remember the logic: small p-value means the observed extreme difference would be unlikely if the null were true; thus reject H0.
Note: The relationship between t, p, and alpha is a decision rule, not a claim of practical importance by itself; consider effect size and confidence intervals in addition to p-values.
Connections to broader concepts and real-world relevance
The t-distribution arises due to estimating the population SD from the sample, which introduces extra uncertainty captured by df.
The standard error reflects sampling variability; larger n reduces SE and makes it easier to detect smaller differences.
APA formatting considerations reflect discipline-wide reporting standards and clarity in conveying results to readers.
The two-sample t-test framework applies to a wide range of comparisons in psychology and social sciences (e.g., group differences in behavior, performance, or physiology).
Ethical and practical implications: choosing appropriate alpha levels balances the risk of false positives and false negatives; stricter alpha (e.g., 0.01) reduces false positives but requires stronger evidence to declare a difference.
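The point above about sample size and the standard error can be checked with a few lines of arithmetic. In this sketch the sample SD is held fixed at an assumed value of 6 while n grows; the function name standard_error is illustrative.

```python
import math

def standard_error(s, n):
    """Standard error of the mean for sample SD s and sample size n."""
    return s / math.sqrt(n)

s = 6.0                       # assumed sample SD, held fixed for illustration
for n in (9, 36, 144):
    print(f"n = {n:3d}: SE = {standard_error(s, n):.2f}")
```

Each fourfold increase in n halves the SE, so the same raw difference between X̄ and μ produces a t-score twice as large.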
Quick recap of key terms
Raw score (X-score): the original measurement value.
t-score: standardized score in units of the standard error.
Standard error (SE): the standard deviation of the sampling distribution of the mean, $SE = s / \sqrt{n}$.
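Degrees of freedom (df): the number of independent pieces of information available to estimate variability (n − 1 for one sample; n1 + n2 − 2 for two independent samples with pooled variance); df determines the shape of the t-distribution.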
p-value: probability, under H0, of obtaining a test statistic as extreme or more extreme than observed.
Alpha (α): pre-specified threshold for deciding whether to reject H0.
Critical region: values of the test statistic that lead to rejection of H0 for a given α and df.
APA format conventions: italicize statistical symbols, report t and p with df, etc.