Signal Detection Theory and Z-scores — Study Notes
Z-scores, standard deviation, and why they matter
- Z-score intuition: a z-score tells you how far an observation is from the mean in units of standard deviation. It’s the ratio of the deviation to the data’s variability.
- Concept: for any data point x, with distribution mean μ and standard deviation σ, the z-score is $z = \frac{x - \mu}{\sigma}$.
- When we convert measurement differences into z-scores, units cancel out, giving a unitless, comparable measure.
- Why standard deviation is fundamental:
- It quantifies how much your data vary trial-to-trial.
- A big difference between conditions is more meaningful if variability (σ) is small; a big σ can make the same difference look trivial.
- Statistics often compare an effect size (difference) to variability (noise) to judge meaningfulness.
- Practical note from the lecture:
- You don’t necessarily need to calculate σ on an exam, but you should understand what σ (the spread) represents and why a standard deviation matters conceptually.
- Z-scores let you compare effects across different measurement units (milliseconds, volts, etc.).
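The unit-cancelling property can be checked with a minimal sketch. The two measurement lists below are hypothetical (not from the lecture), and the helper name `z_scores` is made up for illustration; it uses the population SD, which is an assumption about which σ is meant.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Return the z-score of each value relative to the list's own mean and SD."""
    mu = mean(values)
    sigma = pstdev(values)  # population SD; statistics.stdev would give the sample SD
    return [(x - mu) / sigma for x in values]

# Hypothetical measurements of the same pattern in two different units.
reaction_times_ms = [420.0, 450.0, 480.0, 510.0, 540.0]
amplitudes_volts = [0.42, 0.45, 0.48, 0.51, 0.54]

z_ms = z_scores(reaction_times_ms)
z_volts = z_scores(amplitudes_volts)

# The raw numbers differ by a factor of 1000, but the z-scores agree
# (up to floating-point rounding) because the units cancel in (x - mu) / sigma.
for a, b in zip(z_ms, z_volts):
    print(f"{a:+.3f}  {b:+.3f}")
```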
Introduction to Signal Detection Theory (SDT)
- Core idea:
- There are real physical events (the stimulus) and perceptual decisions our noisy nervous system makes about them.
- Perception is noisy; we cannot observe the internal perceptual response directly, so we infer it from behavior.
- Reality vs. perception:
- Reality: the target is either present or absent (binary).
- Perception: we may perceive it or not, with errors due to internal noise and external conditions.
- The normal curve premise:
- When information from a stimulus is processed by the brain, the resulting perceptual strength is assumed to be normally distributed across trials.
- Noise alone gives a distribution; the presence of a signal shifts that distribution upward (signal+noise).
- Two distributions:
- Noise distribution: represents perceptual strength with no target.
- Signal+Noise distribution: represents perceptual strength with the target present.
- The four possible outcomes (yes/no task):
- Hit: target present and reported present.
- Miss: target present but reported absent.
- False alarm: target absent but reported present.
- Correct rejection: target absent and reported absent.
- Decision criterion (the “threshold”):
- A single criterion (threshold) is used to decide between “target present” and “target absent” on each trial.
- If perceptual strength exceeds the criterion, respond “present”; otherwise respond “absent.”
- Why this matters:
- Different people can have the same ability to distinguish signals (same info extraction) but different response biases (tendency to say yes or no).
- SDT provides metrics that separate perceptual sensitivity from response bias.
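The decision rule and the four outcomes described above can be sketched as a small simulation. Everything numeric here is an assumption for illustration: noise is taken as N(0, 1), the signal shifts the mean by a hypothetical `signal_shift`, and the function names are invented for this sketch.

```python
import random

def classify(target_present, strength, criterion):
    """Label one yes/no trial with one of the four SDT outcomes."""
    said_present = strength > criterion
    if target_present:
        return "hit" if said_present else "miss"
    return "false_alarm" if said_present else "correct_rejection"

def simulate(n_trials, signal_shift, criterion, seed=0):
    """Simulate a yes/no task in which each trial has a 50% chance of containing the target."""
    rng = random.Random(seed)
    counts = {"hit": 0, "miss": 0, "false_alarm": 0, "correct_rejection": 0}
    for _ in range(n_trials):
        present = rng.random() < 0.5
        # Noise alone is N(0, 1); a signal shifts the mean upward by signal_shift.
        strength = rng.gauss(signal_shift if present else 0.0, 1.0)
        counts[classify(present, strength, criterion)] += 1
    return counts

counts = simulate(n_trials=1000, signal_shift=1.5, criterion=0.75)
print(counts)
```

Rerunning with a different `criterion` but the same `signal_shift` mimics two observers with equal sensitivity and different response biases.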
Yes/No task and SDT geometry
- Target present vs absent trials:
- Noise-only trials (absent trials) generate False Alarms and Correct Rejections.
- Target-present trials generate Hits and Misses.
- Visual intuition (two overlapping distributions):
- The noise distribution sits to the left (lower perceptual strength); the signal+noise distribution is shifted to the right (higher perceptual strength).
- The decision criterion sits somewhere along the perceptual-strength axis; moving it changes the balance of Hits and False Alarms without changing the underlying distributions.
- What changes with bias:
- A liberal bias (lower criterion) increases Hits but also increases False Alarms.
- A conservative bias (higher criterion) reduces False Alarms but also reduces Hits.
- Practical takeaway:
- Accuracy alone mixes sensitivity and bias; SDT aims to separate these components.
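The effect of moving the criterion can be sketched analytically, assuming unit-variance normal distributions with the noise mean at 0 and a hypothetical signal shift of 1.5. The criterion values and the `rates` helper are illustrative choices, not values from the lecture.

```python
from statistics import NormalDist

def rates(criterion, signal_shift=1.5):
    """Hit and false alarm rates implied by a criterion on the perceptual-strength axis."""
    noise = NormalDist(0.0, 1.0)
    signal = NormalDist(signal_shift, 1.0)
    hit_rate = 1.0 - signal.cdf(criterion)  # area of signal+noise above the criterion
    fa_rate = 1.0 - noise.cdf(criterion)    # area of noise above the criterion
    return hit_rate, fa_rate

# Same two distributions, three different criterion placements.
for label, criterion in [("liberal", 0.25), ("neutral", 0.75), ("conservative", 1.25)]:
    h, fa = rates(criterion)
    print(f"{label:>12}: H = {h:.3f}, FA = {fa:.3f}")
```

Lowering the criterion raises both rates together and raising it lowers both, while the distributions themselves never change.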
Two key SDT metrics: d′ and criterion c
- Hit rate and False Alarm rate:
- Hit rate: $H = \frac{\text{Hits}}{N_{\text{present}}}$
- False alarm rate: $FA = \frac{\text{False Alarms}}{N_{\text{absent}}}$
- Z-transform of rates (assuming normal distributions):
- $z(H) = \Phi^{-1}(H)$: the z-score whose cumulative probability under the standard normal distribution is H
- $z(FA) = \Phi^{-1}(FA)$: the z-score whose cumulative probability under the standard normal distribution is FA
- d′ (d-prime): perceptual sensitivity (distance between the two distributions in SD units)
- Definition: d′=z(H)−z(FA)
- Interpretation: larger d′ means greater separation between noise and signal+noise distributions; better perceptual discrimination.
- Criterion c (response bias):
- Definition (common convention): $c = -\frac{1}{2}\,[z(H) + z(FA)]$
- Interpretation: negative c = liberal bias; positive c = conservative bias; zero = unbiased (balanced) criterion.
- Relationship between d′ and c:
- d′ measures perceptual sensitivity independent of bias.
- c captures the decision bias (where the criterion lies relative to the two distributions).
- How to interpret a fixed d′ when bias changes:
- If you shift the criterion (bias) but keep the same underlying sensitivity, d′ remains the same.
- Accuracy can go up or down with bias even if d′ stays constant.
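The two definitions above can be sketched in a few lines using the standard library's inverse normal CDF. The function name `dprime_and_c` and the counts in the usage line are hypothetical; the sketch assumes both rates are strictly between 0 and 1.

```python
from statistics import NormalDist

def dprime_and_c(hits, n_present, false_alarms, n_absent):
    """Compute d' and criterion c from yes/no counts (equal-variance model)."""
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    h = hits / n_present
    fa = false_alarms / n_absent
    # inv_cdf raises StatisticsError for rates of exactly 0 or 1,
    # so such rates need a correction before this step.
    d_prime = z(h) - z(fa)
    c = -0.5 * (z(h) + z(fa))
    return d_prime, c

# Hypothetical counts: 90 hits out of 100 present trials, 10 false alarms out of 100 absent trials.
d_prime, c = dprime_and_c(hits=90, n_present=100, false_alarms=10, n_absent=100)
print(f"d' = {d_prime:.2f}, c = {c:.2f}")
```

With these symmetric hypothetical rates (0.90 and 0.10), z(H) and z(FA) cancel in the sum, so c comes out at zero, the unbiased case.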
Worked examples (yes/no task)
- Example 1 (200 trials: 100 present, 100 absent):
- Hits = 80; False Alarms = 35
- Hit rate: H=0.80; FA rate: FA=0.35
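- Compute z-scores: $z(H) = \Phi^{-1}(0.80) \approx 0.842$, $z(FA) = \Phi^{-1}(0.35) \approx -0.385$
- d′: $d' = z(H) - z(FA) \approx 0.842 - (-0.385) \approx 1.23$
- c: $c = -\frac{1}{2}[z(H) + z(FA)] \approx -\frac{1}{2}(0.842 - 0.385) \approx -0.23$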
- Interpretation: d′ ≈ 1.23 (moderate-to-good sensitivity); c ≈ -0.23 (liberal bias: more willing to say “present”).
- Example 2 (200 trials: 100 present, 100 absent):
- Hits = 75; False Alarms = 22
- Hit rate: H=0.75; FA rate: FA=0.22
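- Compute z-scores: $z(H) = \Phi^{-1}(0.75) \approx 0.674$, $z(FA) = \Phi^{-1}(0.22) \approx -0.772$
- d′: $d' = z(H) - z(FA) \approx 0.674 - (-0.772) \approx 1.45$
- c: $c = -\frac{1}{2}[z(H) + z(FA)] \approx -\frac{1}{2}(0.674 - 0.772) \approx 0.049$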
- Interpretation: d′ ≈ 1.45 (greater sensitivity than Example 1); c ≈ 0.05 (slightly conservative bias).
- Key takeaway from the examples:
- d′ values reflect perceptual information strength; higher d′ means better discrimination.
- c reflects bias toward saying “present” or “absent.”
- Two people with the same d′ can differ in accuracy because their criteria differ: a more liberal criterion produces more hits but also more false alarms, and d′ is not affected by this shift.
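The point that accuracy moves with bias while d′ stays fixed can be sketched by inverting the two definitions under the equal-variance model, which gives H = Φ(d′/2 − c) and FA = Φ(−d′/2 − c). The helper names and the `p_present` parameter (the proportion of target-present trials) are assumptions introduced for this illustration.

```python
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF

def predicted_rates(d_prime, c):
    """Hit and false alarm rates implied by d' and c in the equal-variance model."""
    hit_rate = PHI(d_prime / 2 - c)
    fa_rate = PHI(-d_prime / 2 - c)
    return hit_rate, fa_rate

def accuracy(d_prime, c, p_present=0.5):
    """Proportion correct = weighted hits plus weighted correct rejections."""
    h, fa = predicted_rates(d_prime, c)
    return p_present * h + (1 - p_present) * (1 - fa)

# Same sensitivity (d' = 1.23), three criterion placements.
for c in (-0.5, 0.0, 0.5):
    print(f"c = {c:+.1f}: accuracy (50% present) = {accuracy(1.23, c):.3f}, "
          f"accuracy (80% present) = {accuracy(1.23, c, p_present=0.8):.3f}")
```

With equal numbers of present and absent trials, accuracy is highest at c = 0 and drops symmetrically on either side. When present trials dominate (the hypothetical 80% case), a liberal criterion yields higher accuracy than a neutral one even though d′ is identical. Plugging in d′ = 1.23 and c = −0.23 returns rates close to Example 1's 0.80 and 0.35.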
Edge cases and practical adjustments
- When hit rate or false alarm rate is 0% or 100%:
- Directly converting 0% or 100% to z-scores is problematic (they map to ±∞).
- Common correction (often called the log-linear correction): add 0.5 to each cell of the 2x2 table (target present/absent × response present/absent), which increases N_present and N_absent by 1 each and nudges both H and FA slightly away from 0 and 1.
- Rationale: prevents infinite z-scores and yields a finite d′; should be applied to all participants consistently and planned before data collection.
- Negative d′ (mostly theoretical):
- d′ < 0 means the false alarm rate exceeds the hit rate: you are systematically responding in the wrong direction, saying “present” more often when the target is absent than when it is present.
- In normal perception tasks, d′ < 0 is rarely meaningful unless describing an illusion or a reversed task; small negative values usually reflect sampling noise around d′ = 0, so we typically interpret d′ as ≥ 0.
- Alternative measures when variances differ: d′ assumes equal-variance normal distributions.
- When this assumption is violated, the measure d′a (often written d_a) or other ROC-based methods that allow unequal variances can be used.
- d′a comes from fitting an ROC curve with rating data and can handle unequal variances between noise and signal+noise distributions.
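The add-0.5 correction for rates of 0% or 100% described above can be sketched as follows. The counts are hypothetical (a participant with no misses), and `corrected_rates` is an illustrative name rather than a standard function.

```python
from statistics import NormalDist

def corrected_rates(hits, misses, false_alarms, correct_rejections):
    """Add 0.5 to every cell of the 2x2 table before computing the rates.

    Each row total grows by 1 (0.5 + 0.5), so neither rate can be exactly 0 or 1.
    """
    h = (hits + 0.5) / (hits + misses + 1)
    fa = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    return h, fa

z = NormalDist().inv_cdf

# Hypothetical participant: 50/50 hits (raw H = 1.0 would map to +infinity).
h, fa = corrected_rates(hits=50, misses=0, false_alarms=5, correct_rejections=45)
d_prime = z(h) - z(fa)
print(f"corrected H = {h:.3f}, corrected FA = {fa:.3f}, d' = {d_prime:.2f}")
```

The correction is applied to every participant's table, not only to those with extreme rates, which matches the advice to apply it consistently.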
Rating scales and ROC methods
- Beyond binary yes/no responses, you can collect confidence or multiple response categories (e.g., definitely present, probably present, guess present, guess absent, probably absent, definitely absent).
- Treating each boundary between adjacent categories as a separate criterion yields a different (cumulative) hit and false alarm rate, enabling multiple z-scores and an ROC curve with more points.
- You can z-transform these points and fit a line (the z-transformed ROC) to summarize sensitivity.
- Benefits of rating-based SDT:
- Provides richer data (more than a single hits/FA pair).
- Allows modeling of unequal variances (via d′a) and a more nuanced view of decision processes.
- Practical example from research:
- A memory strength study used rating-scale SDT to examine how memory traces differ under conditions; d′a was used to account for unequal variances between memory strengths.
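The rating-scale procedure can be sketched end to end under several assumptions: the rating counts are hypothetical, the line is fitted by ordinary least squares (maximum-likelihood fitting is the more usual choice in practice), and the unequal-variance measure is computed with one common definition, d_a = intercept × sqrt(2 / (1 + slope²)), where slope and intercept come from the fitted line z(H) = intercept + slope × z(FA).

```python
from itertools import accumulate
from math import sqrt
from statistics import NormalDist, mean

# Hypothetical rating counts, ordered from "definitely present" to "definitely absent".
present_counts = [40, 25, 15, 10, 6, 4]   # target-present trials
absent_counts = [5, 10, 15, 20, 25, 25]   # target-absent trials

def cumulative_rates(counts):
    """Proportion of responses at or above each confidence level.

    The last category is dropped because its cumulative rate is always 1.
    """
    total = sum(counts)
    return [running / total for running in accumulate(counts)][:-1]

z = NormalDist().inv_cdf
z_hit = [z(p) for p in cumulative_rates(present_counts)]
z_fa = [z(p) for p in cumulative_rates(absent_counts)]

# Ordinary least-squares line through the z-transformed ROC points.
mx, my = mean(z_fa), mean(z_hit)
slope = (sum((x - mx) * (y - my) for x, y in zip(z_fa, z_hit))
         / sum((x - mx) ** 2 for x in z_fa))
intercept = my - slope * mx

# A slope different from 1 suggests unequal variances; d_a adjusts for it.
d_a = intercept * sqrt(2 / (1 + slope ** 2))
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}, d_a = {d_a:.2f}")
```

When the fitted slope equals 1, this d_a reduces to the ordinary d′, which is a quick sanity check on the formula.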
Applications and extensions of SDT concepts
- Bias-free memory and perception:
- d′ is a measure of perceptual or memory sensitivity independent of bias (criterion c).
- Higher d′ implies stronger discriminability or memory strength, regardless of where the respondent tends to say “present.”
- Real-world and research examples mentioned in the lecture:
- Social psychology and stereotypes: a yes/no task using names (e.g., NBA player names) to study bias and perception; a bias shift can manifest as changes in d′ or in criterion depending on context.
- Stereotype bias can lead to a shifted criterion (pseudo-d′), illustrating that bias can affect response tendencies even when underlying sensitivity is constant.
- How SDT connects to broader science and practice:
- Provides a principled framework for separating perceptual/memory strength from decision criteria.
- Helps interpret performance changes due to environment (noise) vs. instruction/goal (criterion shifts).
- Useful across domains: perception, memory, psychometrics, clinical decision-making, and even weather/event detection.
Quick recap and takeaways
- Z-scores convert raw performance into unitless measures that reflect how far observed signals are from noise, in SD units.
- Signal Detection Theory decomposes performance into two components:
- Sensitivity (d′): how well you can distinguish signal from noise.
- Bias/criterion (c): your default tendency to say “present” vs. “absent.”
- Key formulas:
- Hit rate: $H = \frac{\text{Hits}}{N_{\text{present}}}$
- False alarm rate: $FA = \frac{\text{False Alarms}}{N_{\text{absent}}}$
- $d' = z(H) - z(FA)$
- $c = -\frac{1}{2}\,[z(H) + z(FA)]$
- Examples illustrate how d′ and c can diverge: higher d′ means better discrimination, while c reflects liberal vs. conservative response style.
- Practical data issues: zero/one rate corrections, potential unequal variances (d′a), and the use of rating scales to build ROC curves.
- SDT concepts extend beyond simple perception tasks to memory strength, stereotypes, and many decision-making contexts.