t-Tests and Inference: One-Sample and Independent-Samples (Comprehensive Notes)

One-Sample t-Test: concepts and interpretation

Purpose: tests whether the mean of a single sample differs from a known population mean μ.
Key language shift: raw scores (X scores) vs. t-scores. The t-score is a standardized value expressed in units of standard error.
Raw score example from transcript:
- Population mean μ = 40
- Sample mean X̄ = 42
- Sample size n = 36
- Reported t ≈ 2.03 (positive because X̄ > μ)
Formula for the one-sample t-statistic:
$t = \frac{\overline{X} - \mu}{\dfrac{s}{\sqrt{n}}}$
where $s$ is the sample standard deviation and $SE = \dfrac{s}{\sqrt{n}}$ is the standard error of the mean.
Interpretation of the t-score:
- Positive t indicates the sample mean is above the population mean.
- Negative t indicates the sample mean is below the population mean.
- The population mean has t = 0 (zero standard errors away from itself).
Degrees of freedom (df):
$df = n - 1$
Rationale: the sample SD $s$ stands in for the unknown population SD. Because $s$ is computed from deviations around the sample mean, only n - 1 of those deviations are free to vary, so one degree of freedom is lost.
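As a minimal sketch of the calculation above, the following Python function computes the standard error, the t statistic, and df from raw scores. The function name `one_sample_t` and the eight scores are hypothetical, made up for illustration; they are not the transcript's data.

```python
import math
import statistics


def one_sample_t(scores, mu):
    """Return (t, df, se) for a one-sample t-test of `scores` against `mu`."""
    n = len(scores)
    mean = statistics.mean(scores)
    s = statistics.stdev(scores)   # sample SD (divides by n - 1)
    se = s / math.sqrt(n)          # standard error of the mean
    t = (mean - mu) / se           # distance from mu in SE units
    df = n - 1
    return t, df, se


# Hypothetical extraversion scores (illustrative only).
scores = [38, 42, 45, 41, 44, 40, 43, 43]
t, df, se = one_sample_t(scores, mu=40)
print(f"t({df}) = {t:.2f}, SE = {se:.3f}")
```

The made-up scores have a mean of 42, so the numerator matches the 42 vs. 40 comparison in the example; the SE and t differ because n and s are different.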
P-value and decision rule:
- For a two-tailed test, the p-value is the proportion of the sampling distribution as extreme or more extreme than the observed t in either tail:
 $p = P\left(|T| \ge |t_{obs}| \mid H_0\right)$
- Example from transcript: observed t = 2.03, with reported p ≈ 0.042 (two-tailed).
- Alpha (significance level) choice: commonly $\alpha = 0.05$; sometimes $\alpha = 0.01$ (e.g., FDA drug trials).
- Decision rule:
- If $p < \alpha$, reject the null hypothesis (conclude a difference exists).
- If $p \ge \alpha$, retain (fail to reject) the null hypothesis (no evidence of a difference).
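The p-value and decision rule can be sketched in plain Python. This is one possible approach, not the method used in the lecture: it integrates the t density numerically with Simpson's rule instead of calling a statistics package. The names `t_pdf`, `two_tailed_p`, and `decide`, and the values t = 2.5 and df = 20, are assumptions made for illustration.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with `df` degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_tailed_p(t_obs, df, steps=2000):
    """P(|T| >= |t_obs|) under H0, via Simpson's rule on [0, |t_obs|]."""
    b = abs(t_obs)
    if b == 0:
        return 1.0
    h = b / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3       # area between 0 and |t_obs|
    return max(0.0, 1.0 - 2.0 * central_area)


def decide(p, alpha=0.05):
    """Apply the decision rule: reject H0 only when p < alpha."""
    return "reject H0" if p < alpha else "retain H0"


p = two_tailed_p(2.5, df=20)           # hypothetical t and df
print(f"p = {p:.3f} -> {decide(p)}")
```

Because the t distribution is symmetric, the two-tailed area is 1 minus twice the area between 0 and |t|. Under the same assumptions, `two_tailed_p(2.03, 35)` comes out close to .050 rather than the .042 quoted from the transcript; .042 is roughly what a standard normal curve gives for 2.03, so the transcript value may have been computed or rounded differently.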
Reporting results in APA format:
- Statistical symbols are lowercase and italic: t and p (the letter also identifies the type of test statistic).
- Include degrees of freedom in parentheses after the t (e.g., t(35)).
- Report the test statistic and p-value: e.g., $t(35) = 2.03,\quad p = .042$
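A small helper can produce this reporting string. The sketch below is a hypothetical function (`format_apa` is not from the lecture or from PSPP); it assumes the usual APA conventions of two decimals for t, no leading zero on p, and "p < .001" for very small values. Plain text cannot show italics, so the t and p still need to be italicized in the final write-up.

```python
def format_apa(t, df, p):
    """Format a t-test result APA-style, e.g. 't(35) = 2.03, p = .042'."""
    if p < 0.001:
        p_text = "p < .001"
    else:
        # p cannot exceed 1, so the leading zero is dropped.
        p_text = "p = " + f"{p:.3f}".lstrip("0")
    return f"t({df}) = {t:.2f}, {p_text}"


print(format_apa(2.03, 35, 0.042))
```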
Conceptual interpretation recap:
- The t-test assesses whether the sample mean differs significantly from the population mean.
- “Retaining the null” means the data do not show a significant difference; it does not prove the two means are identical.
- “Rejecting the null” means the sample mean and the population mean are significantly different.
Visualization intuition:
- The sampling distribution of the mean under H0 is centered at μ with standard error SE.
- The critical values ±t_crit (set by α and df) mark the cut-off points; the areas beyond them form the critical region. H0 is rejected when the observed t falls in that region.
- The p-value is the total area in the tails beyond the observed t in both directions for a two-tailed test.
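The picture of a sampling distribution centered at μ with spread SE can be checked by simulation. The sketch below assumes a normally distributed population with μ = 40 and n = 36 as in the example; σ = 6 is a hypothetical value chosen so that σ/√n = 1, and `simulate_sample_means` is an illustrative name.

```python
import math
import random
import statistics


def simulate_sample_means(mu, sigma, n, reps, seed=0):
    """Draw `reps` samples of size `n` from a normal population; return their means."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
            for _ in range(reps)]


# Assumed values: mu and n follow the example; sigma = 6 is hypothetical.
mu, sigma, n = 40, 6, 36
means = simulate_sample_means(mu, sigma, n, reps=5000)

print(f"center of sample means: {statistics.fmean(means):.2f} (mu = {mu})")
print(f"SD of sample means:     {statistics.stdev(means):.2f} "
      f"(sigma / sqrt(n) = {sigma / math.sqrt(n):.2f})")
```

The simulated means should cluster around μ, and their standard deviation should land near σ/√n, which is the standard error.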
Example recap from transcript:
- Population mean μ = 40; sample mean X̄ = 42; n = 36; t = 2.03; p ≈ 0.042; α = 0.05 → reject H0; conclude there is a difference between sample and population means.
Connection to prior ideas:
- t scores express how far the sample mean is from the population mean in units of SE, linking to the concept of standard error and the sampling distribution.
- The null hypothesis and its rejection are evaluated via a distribution of possible sample means under H0.

The Language of X scores vs. t scores: practical notes

X scores: raw measurements (e.g., extraversion score on a scale).
t scores: the sample mean re-expressed in a standardized metric using the standard error; they tell you how many SEs the sample mean is from the population mean.
Transformation intuition: converting to t scores allows comparison across studies with different scales, because you’re measuring deviation in units of SE rather than raw units.
Example interpretation:
- If an extraversion scale has population mean μ = 40 and the sample mean is X̄ = 42 with SE = 1, then $t = (42 - 40)/1 = 2.0$, indicating the sample mean sits 2 standard errors above μ.
Practical steps you’ll perform (in practice, often by computer):
- Compute t, df, and p-value; decide to reject or retain H0; report results.
- In lab/PSPP outputs, the computer supplies t, df, and p-value; you then format the result for publication.

Two-tail vs. one-tail tests and the p-value interpretation

Two-tailed test: tests whether the two means differ (either direction).
- Alpha split across tails: $\alpha / 2$ in each tail (e.g., 0.025 per tail for $\alpha = 0.05$ ).
- The p-value reflects both tails: the probability of being as extreme or more extreme in either direction.
One-tailed test: tests for difference in a specified direction (not used in the transcript but often taught).
In this lecture, emphasis is on two-tailed interpretation and the symmetrical t distribution.

Independent-samples t-test: concept and workflow

Purpose: compare the means of two independent groups on a numeric dependent variable.
Key structure:
- Independent variable: nominal with exactly two levels (e.g., work vs not work; meat-eaters vs vegetarians).
- Dependent variable: numeric (interval or ratio scale).
Example variants discussed:
- People who work vs people who don’t: numeric outcome could be hours of productivity at work or another measurable metric.
- Meat-eaters vs vegetarians: numeric outcome could be muscle mass or another measurable trait.
Two-group design intuition:
- The independent variable partitions the sample into two groups.
- The t-test assesses whether the two group means differ on the numeric outcome.
Classic example from Katona (1940) on memorization vs. understanding:
- Two learning methods: memorization vs learning by understanding.
- Benefit observed for the understanding method on a problem-solving task (the matchstick problem).
- Design: there were multiple problems (e.g., 12 problems like the matchstick task) to compare learning methods.
- Finding: learning by understanding yielded better outcomes than memorization, illustrating a difference between two learning approaches.
Reporting and interpretation parallel the one-sample case, but the standard error is built from two samples: a pooled variance estimate when the groups are assumed to have equal variances, or an alternative (unpooled) estimate when they are not.
Example outcomes and reporting follow APA conventions similar to the one-sample case but with two sample means and the corresponding df.

Formulas and key statistical details to memorize

One-sample t-test statistic: $t = \frac{\overline{X} - \mu}{\dfrac{s}{\sqrt{n}}}$
- Degrees of freedom: $df = n - 1$
- Standard error: $SE = \dfrac{s}{\sqrt{n}}$
Two-sample independent t-test statistic (common version with pooled variance):
- Pooled variance:
 $s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
- Test statistic:
 $t = \frac{\overline{X}_1 - \overline{X}_2}{\sqrt{ s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right) }}$
- Degrees of freedom:
 $df = n_1 + n_2 - 2$
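The pooled-variance formulas translate directly into code. The following is a minimal sketch that assumes equal population variances; `pooled_t` and the two small groups of scores are hypothetical and are not data from the lecture or from the classic study.

```python
import math
import statistics


def pooled_t(group1, group2):
    """Return (t, df, pooled_variance) for an equal-variance independent-samples t-test."""
    n1, n2 = len(group1), len(group2)
    var1 = statistics.variance(group1)   # s1^2, divides by n1 - 1
    var2 = statistics.variance(group2)   # s2^2, divides by n2 - 1
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.mean(group1) - statistics.mean(group2)) / se_diff
    return t, df, pooled_var


# Hypothetical scores for two independent groups (illustrative only).
group_a = [12, 15, 11, 14, 13]
group_b = [9, 11, 10, 8]
t, df, pooled_var = pooled_t(group_a, group_b)
print(f"t({df}) = {t:.2f}, pooled variance = {pooled_var:.3f}")
```

Swapping the order of the groups flips the sign of t but leaves df and the pooled variance unchanged.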
P-value for two-tailed tests:
$p = P\left( |T| \ge |t_{obs}| \right)$
Critical regions (for a two-tailed test):
- Cutoffs at $\pm t_{\alpha/2, df}$; the total area beyond these cutoffs in both tails equals $\alpha$. Reject $H_0$ when $|t_{obs}|$ exceeds the cutoff, which is equivalent to $p < \alpha$.
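Critical values are normally read from a t table or supplied by software. As a rough illustration of where they come from, the sketch below searches by bisection for the cutoff whose two tails hold a total area of α. The numerical approach (Simpson's rule plus bisection) and the names `t_pdf`, `two_tailed_area`, and `critical_t` are assumptions for this example, and t = 2.5 is a hypothetical observed value.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with `df` degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_tailed_area(t, df, steps=2000):
    """Total area beyond -t and +t, via Simpson's rule on [0, t]."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1.0 - 2.0 * total * h / 3


def critical_t(alpha, df):
    """Find the cutoff t such that the two tails beyond +/-t hold area alpha."""
    lo, hi = 0.0, 1.0
    while two_tailed_area(hi, df) > alpha:   # widen until the cutoff is bracketed
        hi *= 2
    for _ in range(50):                      # bisection on the bracket
        mid = (lo + hi) / 2
        if two_tailed_area(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


t_crit = critical_t(alpha=0.05, df=35)
t_obs = 2.5                                  # hypothetical observed t
print(f"cutoffs: +/-{t_crit:.3f}")
print("reject H0" if abs(t_obs) > t_crit else "retain H0")
```

Comparing |t| with the cutoff gives the same decision as comparing p with α, since both describe the same tail area.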
Reporting conventions (APA style):
- Use italics for statistical symbols (lowercase t and p); include the degrees of freedom in parentheses after t (e.g., t(35)).
- Typical reporting format: $t(df) = t_{obs},\quad p = p_{value}$
- Example: $t(35) = 2.03,\quad p = .042$

Practical interpretation and exam-ready tips

Always state the null hypothesis clearly: there is no difference between the two means.
Interpret t-score directionally: positive t indicates the sample mean is greater than the comparison mean; negative t indicates it is smaller.
Decide on alpha ahead of time (e.g., 0.05); understand how two-tailed tests split alpha across both tails.
Remember the logic: a small p-value means a difference at least as extreme as the one observed would be unlikely if the null were true; thus reject H0.
Note: The relationship between t, p, and alpha is a decision rule, not a claim of practical importance by itself; consider effect size and confidence intervals in addition to p-values.

Connections to broader concepts and real-world relevance

The t-distribution arises due to estimating the population SD from the sample, which introduces extra uncertainty captured by df.
The standard error reflects sampling variability; larger n reduces SE and makes it easier to detect smaller differences.
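A quick numerical sketch of this point: with the sample SD held fixed, quadrupling n halves the SE and doubles t for the same raw difference. The SD of 6 is a hypothetical value; the raw difference of 2 matches the 42 vs. 40 example.

```python
import math


def standard_error(s, n):
    """Standard error of the mean for sample SD `s` and sample size `n`."""
    return s / math.sqrt(n)


s = 6.0              # hypothetical sample SD
mean_diff = 2.0      # raw difference between sample mean and mu (42 vs. 40)
for n in (9, 36, 144):
    se = standard_error(s, n)
    print(f"n = {n:3d}: SE = {se:.2f}, t = {mean_diff / se:.2f}")
```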
APA formatting considerations reflect discipline-wide reporting standards and clarity in conveying results to readers.
The two-sample t-test framework applies to a wide range of comparisons in psychology and social sciences (e.g., group differences in behavior, performance, or physiology).
Ethical and practical implications: choosing appropriate alpha levels balances the risk of false positives and false negatives; stricter alpha (e.g., 0.01) reduces false positives but requires stronger evidence to declare a difference.
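The false-positive side of this trade-off can be illustrated by simulation. In the sketch below H0 is true by construction (samples are drawn from a normal population whose mean really is μ), so every rejection is a false positive. The population values, the name `false_positive_rate`, and the number of repetitions are assumptions; the critical values are the standard two-tailed table entries for df = 35, so the sample size is fixed at 36.

```python
import math
import random

N = 36                                   # sample size, so df = 35
# Two-tailed critical values for df = 35, taken from a standard t table.
CRITICAL_T = {0.05: 2.030, 0.01: 2.724}


def false_positive_rate(alpha, reps=4000, seed=1):
    """Share of simulated samples that reject H0 even though H0 is true."""
    rng = random.Random(seed)
    mu, sigma = 40, 6                    # hypothetical population; H0 is true
    rejections = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(N)]
        mean = sum(sample) / N
        s = math.sqrt(sum((x - mean) ** 2 for x in sample) / (N - 1))
        t = (mean - mu) / (s / math.sqrt(N))
        if abs(t) > CRITICAL_T[alpha]:
            rejections += 1
    return rejections / reps


for alpha in (0.05, 0.01):
    print(f"alpha = {alpha}: false-positive rate ~ {false_positive_rate(alpha):.3f}")
```

The simulated rates should sit near the chosen α values. The sketch does not measure false negatives, which would require simulating a population where H0 is false.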

Quick recap of key terms

Raw score (X-score): the original measurement value.
t-score: standardized score in units of the standard error.
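Standard error (SE): the standard deviation of the sampling distribution of the mean, estimated by $SE = \dfrac{s}{\sqrt{n}}$.
Degrees of freedom (df): the number of values free to vary when estimating variability ($n - 1$ for one sample, $n_1 + n_2 - 2$ for two independent samples); it determines the shape of the t-distribution.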
p-value: probability, under H0, of obtaining a test statistic as extreme or more extreme than observed.
Alpha ($\alpha$): pre-specified threshold for deciding whether to reject H0.
Critical region: values of the test statistic that lead to rejection of H0 for a given $\alpha$ and df.
APA format conventions: italicize statistical symbols, report t and p with df, etc.