Current Diagnostic Criteria for Testosterone Deficiency (Notes)

Introduction

  • Testosterone (T) therapy began 80 years ago and its modern era is ~20 years old, yet uncertainty remains about how to diagnose testosterone deficiency (TD/hypogonadism).
  • Real-world cases illustrate a “grey zone”: patients have symptoms, but there is no clear T threshold below which treatment is indicated.
  • The lack of an evidence-based probability that a given T value predicts treatment response leads to potential under- or over-treatment.
  • The core message: diagnostic criteria for TD are inadequate and need rethinking to focus on likelihood of symptomatic response rather than strict normal/abnormal thresholds.

The current status of diagnostic criteria

  • Historically, TD diagnosis relied on clinical presentation; now most TD cases are diagnosed primarily by blood test results (total T).
  • Thresholds used in guidelines vary:
    • US FDA and AUA: threshold around 300 ng/dl (10.4 nmol/L).
    • European/Urology guidelines: threshold around 348 ng/dl (12.1 nmol/L).
    • Some experts use 400 ng/dl (13.9 nmol/L).
  • Some studies suggest a threshold near the Framingham 2.5th percentile: 348 ng/dl (12.1 nmol/L).
  • The 2018 Endocrine Society guidelines proposed harmonized reference ranges with a 2.5th percentile of 264 ng/dl (9.2 nmol/L).
  • Clinical utility of these reference ranges is unknown; thresholds remain arbitrary.
  • EMAS data suggest a combined diagnostic signal: symptoms plus a T value around 11 nmol/L (316 ng/dl) and calculated free T (cFT) < 220 pmol/L (64 pg/mL).
  • Lab reference values vary widely across laboratories: in a survey of 25 US labs, 17 provided different reference ranges, with about 300% variation, so the same result can be labeled normal or low depending on the laboratory.
  • Guidelines on whether to use free T (FT) are mixed:
    • EAU: use FT when discrepancy exists between T and symptoms.
    • Endocrine Society: measure FT particularly when SHBG abnormalities are suspected.
    • AUA: does not specifically mention FT.
  • The current system emphasizes total T thresholds despite known limitations.
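
The unit pairs and competing thresholds listed above can be checked with a few lines of arithmetic. The following is a minimal sketch, not a clinical tool: the conversion factor is derived from the molar mass of testosterone (about 288.4 g/mol), and the function names and threshold labels are made up for this illustration.

```python
# Sketch only: names and labels below are illustrative assumptions.
# 1 ng/dl = 10 ng/L, and 10 ng/L divided by 288.4 g/mol is about 0.0347 nmol/L.
T_MOLAR_MASS = 288.4  # g/mol, approximate molar mass of testosterone
NG_DL_TO_NMOL_L = 10 / T_MOLAR_MASS


def ng_dl_to_nmol_l(value_ng_dl):
    """Convert total testosterone from ng/dl to nmol/L."""
    return value_ng_dl * NG_DL_TO_NMOL_L


def nmol_l_to_ng_dl(value_nmol_l):
    """Convert total testosterone from nmol/L to ng/dl."""
    return value_nmol_l / NG_DL_TO_NMOL_L


# Thresholds quoted in these notes, in ng/dl.
THRESHOLDS_NG_DL = {
    "Endocrine Society 2018 (2.5th percentile)": 264,
    "FDA / AUA": 300,
    "European guidelines": 348,
    "Some experts": 400,
}


def classify(total_t_ng_dl):
    """Label one total T result against each threshold."""
    return {
        label: "low" if total_t_ng_dl < cutoff else "not low"
        for label, cutoff in THRESHOLDS_NG_DL.items()
    }


if __name__ == "__main__":
    for label, cutoff in THRESHOLDS_NG_DL.items():
        print(f"{label}: {cutoff} ng/dl = {ng_dl_to_nmol_l(cutoff):.1f} nmol/L")
    print(classify(320))
```

A result of 320 ng/dl is "not low" against the 264 and 300 ng/dl thresholds but "low" against the 348 and 400 ng/dl thresholds, which is the grey zone these notes describe.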

Why has it been so hard to establish a threshold?

  • TD symptoms occur across a wide range of serum T concentrations.
  • Contributing factors for the variability include:
    • Significant inter-individual variation in biology among men with similar T levels.
    • Unclear role of the magnitude of T decline over time versus a single absolute value.
    • Genetic differences in androgen receptor sensitivity (e.g., CAG repeats).
    • Total T is a flawed proxy because SHBG binding determines how much testosterone remains free and biologically active.

The problem with total testosterone

  • Testosterone is lipophilic; 40–60% is bound to SHBG, rendering it unavailable to tissues.
  • Higher SHBG can yield higher total T even when free T (FT) is low, leading to misleading conclusions about androgen status.
  • Observational data show that many symptomatic men have low FT despite normal total T, and those with low FT respond to therapy even if total T is above common thresholds.
  • Symptom burden and treatment response correlate better with FT (or bioavailable T) than with total T.
  • Practical implications:
    • Total T thresholds may misclassify patients; FT should be considered, especially when SHBG is abnormal.
    • Guidelines diverge on FT use; the Endocrine Society favors FT assessment when SHBG issues are suspected; EAU suggests FT in discordant cases; AUA omits FT.
  • Future approach: focus on likelihood of symptomatic response to treatment rather than rigid normal/abnormal thresholds.
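
To make the SHBG effect concrete, free T can be estimated from total T and SHBG. These notes do not say which equation is meant; the sketch below assumes the widely used Vermeulen mass-action formula with its conventional binding constants and a default albumin of 43 g/L. The function name, constants, and example inputs are assumptions for illustration, not values from the source.

```python
import math

# Assumed constants (conventional values used with the Vermeulen formula).
K_SHBG = 1.0e9        # L/mol, association constant of T with SHBG
K_ALB = 3.6e4         # L/mol, association constant of T with albumin
ALBUMIN_MW = 69000.0  # g/mol, approximate molar mass of albumin


def calculated_free_t(total_t_nmol_l, shbg_nmol_l, albumin_g_l=43.0):
    """Estimate free T in pmol/L from total T and SHBG (both in nmol/L).

    Solves the mass balance
        TT = FT * N + SHBG * K_SHBG * FT / (1 + K_SHBG * FT),
    with N = 1 + K_ALB * [albumin], as a quadratic in FT.
    """
    tt = total_t_nmol_l * 1e-9          # mol/L
    shbg = shbg_nmol_l * 1e-9           # mol/L
    albumin = albumin_g_l / ALBUMIN_MW  # mol/L
    n = 1.0 + K_ALB * albumin
    a = n * K_SHBG
    b = n + K_SHBG * (shbg - tt)
    free_t = (-b + math.sqrt(b * b + 4.0 * a * tt)) / (2.0 * a)
    return free_t * 1e12                # pmol/L


if __name__ == "__main__":
    # Hypothetical inputs: same total T, different SHBG.
    for shbg in (20.0, 70.0):
        cft = calculated_free_t(12.5, shbg)
        print(f"total T 12.5 nmol/L, SHBG {shbg:.0f} nmol/L -> cFT {cft:.0f} pmol/L")
```

Under these assumptions, two hypothetical men with the same total T of 12.5 nmol/L fall on opposite sides of the 220 pmol/L cFT value cited from EMAS when SHBG is 20 versus 70 nmol/L, so a single total T threshold would treat them identically despite different estimated free T.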

Other diagnostic rules

  • Various procedural rules exist (e.g., repeat testing on different days, fasting status, early-morning sampling) but there is no strong evidence that these steps improve diagnostic accuracy or predict treatment response.
  • Current emphasis should be on whether tests provide better predictive value for symptomatic improvement rather than on rigid procedural defaults.

What studies are needed?

  • The core question: can we identify a biomarker (ideally FT) that reliably indicates androgen status and predicts treatment response?
  • Recommendations from Morgentaler et al.:
    • FT is the most accurate indicator of androgen status; treat symptomatic men with low FT regardless of total T value.
    • Calculated FT is a practical measure using total T and SHBG (albumin included in equations; albumin has little influence on cFT).
    • Direct FT (analog assay) correlates well with equilibrium dialysis (EqD) and calculated FT, but uses a different scale.
  • Key proposed study: enroll men with defined TD-like symptoms, provide testosterone treatment regardless of baseline T, and assess the likelihood of symptomatic response as a function of baseline total T, FT, and bioavailable T. The most predictive metric would then drive clinical use.
  • Goal: move from arbitrary thresholds to evidence-based decisions that estimate the probability of benefit from therapy.
  • Overall: a shift toward diagnosing TD based on solid evidence of likely benefit rather than adherence to unreliable thresholds.

References (contextual, not required for exam)

  • Historical and guideline sources explore the evolution of thresholds and the role of FT in diagnosis and management.
  • Data emphasize the variability of laboratory reference ranges and the superior relevance of free testosterone in correlating with symptoms and treatment response.