Dr. Chris Bundell, Clinical Immunology, PathWest Laboratory Medicine
21st March 2024
Overview
EBLM: Evidence-Based Laboratory Medicine
NATA Accreditation Requirements
Reference Intervals
Outliers
Examples of Establishing Reference Intervals
Evidence-Based Pathology and Laboratory Medicine (EBLM)
Definition: "The conscientious, explicit, and judicious use of current best evidence in making decisions about the care of individual patients."
Source: Sackett D. Evidence-based medicine. Lancet. 1995;346:1171.
Purpose of Test Investigation
Rule-in diagnosis
Rule-out diagnosis
Assess prognosis
Start intervention or treatment
Adjust intervention or treatment
Stop intervention or treatment
Assess efficacy of a treatment
Assess compliance
Review of Literature
Highly Ranked Articles: Impact Factor
Type of Studies: Randomized Control Study
Patient Groups Selected: Relevance to local population
Power of the Study: Sufficient individuals in the study group to show a clear statistical difference between patients and controls
If an effect (of a specified size) really occurs, what is the chance that an experiment of a certain size will find a "statistically significant" result?
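The question above is what a power calculation answers. A minimal sketch, assuming a two-sided comparison of two equal-sized groups and a normal approximation. The function name `two_sample_power` and the example effect size of 0.5 SD are illustrative assumptions, not taken from the slides:

```python
from statistics import NormalDist


def two_sample_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    effect_size is the standardised difference (difference in means / SD).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic when the effect really exists.
    shift = effect_size * (n_per_group / 2) ** 0.5
    # Probability that the statistic falls beyond either critical value.
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


# Power rises with group size for a fixed (hypothetical) effect of 0.5 SD.
for n in (10, 30, 64, 100):
    print(n, round(two_sample_power(0.5, n), 2))
```

Under this approximation, an effect of 0.5 SD needs roughly 64 individuals per group to reach about 80% power; smaller studies will often miss a real effect of that size.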
Studies Reported - Questions to Ask
Is it important?
How likely are the outcomes over time?
How precise are the prognostic estimates?
Does the evidence show a significant impact on managing the disease in question?
Will the information assist in the treatment decisions of the clinical staff?
Study Population - Questions to Ask
Are the patients in the study similar to the patient group to which the physician is applying the evidence?
Ethnicity
Age
Socioeconomic background
Ask a question that has a measurable outcome
Validity of Evidence - Questions to Ask
Were the patients assembled for the study at the same point of the disease?
Can it be applied to individual patients?
Was the follow-up period sufficiently long and complete?
Were the results validated with a group of test (holdout) cases?
Sources of Bias
Selection bias (samples of convenience and others)
Sample size
Ratio between the number of observations and the number of variables
Characteristics of the control group (healthy cohort effect)
Performance bias
Attrition bias
Detection bias
Distribution of the data (normal vs skewed)
Interpretation of the results
Lack of independent validation group
Publication bias
Measures of Difference
Purpose: Search for a statistically significant difference between the reference control group and the test group for the variable of interest
Methods:
Descriptive statistics (mean, standard deviation, frequency)
t-test (continuous variable, two groups)
Analysis of variance, ANOVA (continuous variable, three or more groups)
Chi-square (categorical data)
Rank tests (Mann Whitney U Test)
Measure of Statistical Difference
Null Hypothesis: There is no difference between the two populations
P-value: The probability of observing a difference at least as large as the one measured in the study, assuming the null hypothesis is true.
If the p-value is smaller than a pre-specified cut-off, e.g. p < 0.05, the null hypothesis is rejected in favour of the alternative.
A cut-off of p < 0.05 means accepting a 5% probability of rejecting the null hypothesis when it is in fact true, i.e. of reporting a difference that arose by chance alone.
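One way to see what a p-value measures is a permutation test: shuffle the group labels many times and count how often chance alone produces a difference as large as the observed one. This is a sketch only; the function name `permutation_p_value` and the two small data sets are hypothetical and not from the slides:

```python
import random


def permutation_p_value(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n_a = len(group_a)

    def mean_gap(values):
        first, second = values[:n_a], values[n_a:]
        return abs(sum(first) / len(first) - sum(second) / len(second))

    pooled = list(group_a) + list(group_b)
    observed = mean_gap(pooled)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # relabel the samples at random (null hypothesis)
        if mean_gap(pooled) >= observed:
            extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical analyte values, for illustration only.
controls = [5.1, 4.8, 5.6, 5.0, 4.7, 5.3, 4.9, 5.2]
patients = [6.4, 6.9, 5.8, 7.1, 6.2, 6.6, 6.0, 6.8]
print(permutation_p_value(controls, patients))
```

A small result says that random relabelling rarely reproduces the observed difference, which is the sense in which the null hypothesis is rejected.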
Testing in the Diagnostic Lab
Collaboration between Laboratory Process and Clinical Teams
EBLM at the Bench
Responding to inquiries from clinicians and carers
Introducing a new test
Decommissioning an old test or another part of the service
Performance, management of, and quality improvement in current services
Research and development and strategic planning
Relevant Terms
Validity: The ability of the test to distinguish between those with the disease and those without.
Sensitivity: The ability to identify those that do have the disease.
Specificity: The ability to correctly identify those who do not have the disease.
Positive Predictive Value: If the result of the test is positive, what is the probability that the patient has the disease?
Positive Predictive Value = \frac{TP}{TP + FP} (TP = true positives, FP = false positives)
Negative Predictive Value: If the result of the test is negative, what is the probability that the patient does not have the disease?
Negative Predictive Value = \frac{TN}{FN + TN} (TN = true negatives, FN = false negatives)
Both predictive values are influenced by the prevalence of the disease
Other Terms:
Pre-test probability
Post-test probability
Accuracy
Precision
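The dependence of the predictive values on prevalence can be sketched as follows. The function name `predictive_values` and the example figures (95% sensitivity and specificity) are assumptions chosen for illustration, not values from the slides:

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    # Expected fraction of the tested population in each cell of the 2x2 table.
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)
    ppv = tp / (tp + fp)
    npv = tn / (fn + tn)
    return ppv, npv


# Same hypothetical test, three different pre-test probabilities.
for prevalence in (0.01, 0.10, 0.50):
    ppv, npv = predictive_values(0.95, 0.95, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.2f}, NPV {npv:.2f}")
```

With these assumed figures the PPV is only about 0.16 at 1% prevalence but 0.95 at 50% prevalence, which is why a positive result in a low-prevalence population needs cautious interpretation.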
Establishing Reference Intervals
Review of the published guidelines
Examples of setting reference intervals
Reference Intervals Definition
Range of measurements for a specific analyte from a population of representative healthy individuals.
Specified interval of the distribution of values taken from a biological reference population (NATA, AS ISO 15189:2023).
Decision Level/Limit
Particular cut-off value for an analyte that enables individuals with a disorder or disease to be distinguished from those without the disorder or disease.
Certain tests have National Guidelines defining a “good” value, e.g., HbA1c for diabetic control.
In these cases, there is no need to establish a reference interval for the analyte.
NATA Requirements for Reference Intervals
ISO 15189 standard specifies that:
Reference intervals or limits must be included with the result report.
The Laboratory must have a documented and monitored Quality System in place that covers information about the laboratory’s reference intervals.
Reference values should be established by the laboratory OR verified by the laboratory on the local patient population.
Guidelines for Establishing Reference Ranges
CLSI EP28-A3c: Defining, Establishing, and Verifying Reference Intervals in the Clinical Laboratory
Approved Guideline, 3rd Edition, published in 2010
Busselton Health Study:
A cross-sectional whole population health survey which included the collection of sera and DNA samples.
The Western Australian Pregnancy Cohort (Raine) Study:
A prospectively collected cohort followed through pregnancy, childhood, adolescence, and now early adulthood. The cohort was established between 1989 and 1991.
Demographics of a Subset of the Busselton Reference Population Cohort
Values are n (% of the total cohort), shown as Males / Females
Total number: 102 (51.5) / 96 (48.5)
Mean age ± SD, years: 51 ± 17 / 50 ± 17
Age group, years:
<30: 14 (7.1) / 11 (5.5)
30-50: 33 (16.7) / 38 (19.2)
50-70: 40 (20.2) / 30 (15.2)
>70: 15 (7.6) / 17 (8.6)
Country of birth:
Australia: 79 (39.9) / 83 (41.9)
Northwest Europe: 2 (1.0) / 10 (5.1)
Other: 17 (8.6) / 2 (1.0)
Not stated: 4 (2.0) / 1 (0.5)
Profile of Life Blood Donors
Distribution of blood donors in Australia by age group and sex (2021, 2023)
Healthy Blood Donors (2021 vs 2023)
Proportion of Males and Females: 50% / 50%
Donors >50 years of age:
Median Age Male: 42 / 44
Median Age Female: 38 / 43
Next Steps
Analyze reference data
Identify possible data errors and outliers
Document all of the above
Parametric Analysis
For normally distributed data:
Does the data have a Gaussian distribution?
Visual inspection
Evaluation of skewness/kurtosis
Chi-squared (goodness of fit) test
Kolmogorov-Smirnov test
Mean (\bar{x}) ± 1.96 × SD encompasses 95% of results
Reference limits
2.5th percentile = \bar{x} – 1.96 SD
97.5th percentile = \bar{x} + 1.96 SD (often rounded to 2 standard deviations)
Lower and upper limits of immunoglobulins:
IgG: 3.3-11.6 g/L
IgA: 0.14-1.10 g/L
IgM: 0.41-1.62 g/L
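The parametric steps above (check the shape of the distribution, then take the mean ± 1.96 SD) can be sketched as follows. The function names and the reference values are hypothetical; skewness and excess kurtosis are used here as a simple numerical check, and a formal test such as Kolmogorov-Smirnov would normally be run in a statistics package:

```python
from statistics import mean, pstdev, stdev


def shape_statistics(values):
    """Return (skewness, excess kurtosis); both are near 0 for Gaussian data."""
    m, s, n = mean(values), pstdev(values), len(values)
    skewness = sum(((v - m) / s) ** 3 for v in values) / n
    excess_kurtosis = sum(((v - m) / s) ** 4 for v in values) / n - 3
    return skewness, excess_kurtosis


def parametric_reference_limits(values, z=1.96):
    """Return (2.5th, 97.5th) percentile estimates as mean -/+ z * SD."""
    m, sd = mean(values), stdev(values)
    return m - z * sd, m + z * sd


# Hypothetical reference values (arbitrary units), for illustration only.
reference_values = [8.0, 9.0, 9.5, 10.0, 10.0, 10.5, 11.0, 12.0]
print(shape_statistics(reference_values))
print(parametric_reference_limits(reference_values))
```

A real reference study would use far more than eight values; the short list only keeps the arithmetic easy to follow.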
Non-Parametric Analysis
For data with a non-Gaussian distribution, the central 95% can be determined by ordering the values from lowest to highest and excluding the lowest 2.5% and the highest 2.5% (rank order analysis)
CLSI recommends:
Sample size (minimum) = 120
Ranked according to magnitude and reference limits calculated as lower 2.5th percentile and upper 97.5th percentile
i.e., lowest and highest 3 values are eliminated
One-Sided Reference Interval
If clinical interest is only in “low” or “high” results, one-sided intervals exclude only the 5% of the population in the “abnormal” tail of the distribution
Anti-CCP < 7 U/mL
Anti-MPO < 3.5 U/mL
Anti-PR3 < 2 U/mL (run on the ImmunoCAP)
Data transformation:
Non-Gaussian data can be transformed into normally distributed data
Example – linear to log transformation
If the transformed data look Gaussian, treat them as parametric
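A sketch of the log transformation route: compute the limits on the log scale, then back-transform them to the original units. The function name `log_transformed_limits` and the example values are assumptions for illustration; the transformed data should still be checked for a Gaussian shape before this is relied on:

```python
import math
from statistics import mean, stdev


def log_transformed_limits(values, z=1.96):
    """Parametric 2.5th/97.5th limits computed on log values, then back-transformed."""
    logs = [math.log(v) for v in values]  # requires strictly positive values
    centre, spread = mean(logs), stdev(logs)
    return math.exp(centre - z * spread), math.exp(centre + z * spread)


# Hypothetical right-skewed results (arbitrary units), for illustration only.
skewed_values = [1.0, 2.0, 4.0, 8.0, 16.0]
print(log_transformed_limits(skewed_values))
```

The back-transformed interval is asymmetric around the geometric mean, which matches the skew of the original data.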
Rank Order Calculation
Example:
Smallest value: r = 1
Largest value: r = n
Lower 2.5th percentile: r_1 = 0.025(n + 1)
Upper 97.5th percentile: r_2 = 0.975(n + 1)
Therefore, if n = 120:
r_1 = 0.025(120 + 1) = 3.025 ≈ 3
r_2 = 0.975(120 + 1) = 117.975 ≈ 118
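The rank calculation can be sketched in a few lines. This version rounds each rank to the nearest whole number, as in the worked example; interpolating between neighbouring values is a common alternative. The function name `rank_based_reference_limits` is hypothetical:

```python
def rank_based_reference_limits(values, minimum_n=120):
    """Non-parametric 2.5th/97.5th limits using ranks r = p(n + 1)."""
    ordered = sorted(values)
    n = len(ordered)
    if n < minimum_n:
        raise ValueError(f"need at least {minimum_n} reference values, got {n}")
    r1 = round(0.025 * (n + 1))  # rank of the lower limit (1-based)
    r2 = round(0.975 * (n + 1))  # rank of the upper limit (1-based)
    return ordered[r1 - 1], ordered[r2 - 1]


# With the values 1..120, each value equals its own rank.
print(rank_based_reference_limits(range(1, 121)))
```

For n = 120 the limits are the 3rd and 118th ordered values, matching r_1 and r_2 above.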
Analysis of Data: Detection of Outliers
Assume that measured reference values represent a “homogeneous” collection of observations
Some reference values arise from a different population of test results
Easily identifiable as outliers – lie well outside the majority of reference values
Outliers – Retain or Reject?
Retain outliers unless there is known to be an aberrant observation, e.g., an analytical error
Statistical techniques for identifying an outlier:
Dixon’s test
Block procedure
Tukey’s 2-stage procedure (Gaussian data)
NB. If an outlier is rejected, remaining data needs to be re-tested for additional outliers
Dixon’s Test or “Reed Rule”
Calculation of the D/R ratio:
D = absolute difference between the extreme observation and the next observation
R = range of all observations, including extremes
Interpretation:
D/R ≥ 1/3: reject the result
D/R < 1/3: retain the result
Example (extreme value 20, next value 17, lowest value 5):
D = 20 – 17 = 3; R = 20 – 5 = 15
D/R = 3 / 15 = 0.2, so the result is retained
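A minimal sketch of the D/R rule as described above. The function names and the data sets are hypothetical; the first list is built so that its extremes match the worked example (20, 17 and 5):

```python
def dr_ratio(values, side="high"):
    """Return D/R for the highest (or lowest) observation in the data set."""
    ordered = sorted(values)
    data_range = ordered[-1] - ordered[0]  # R: range including the extremes
    if len(ordered) < 3 or data_range == 0:
        raise ValueError("need at least three values that are not all equal")
    if side == "high":
        gap = ordered[-1] - ordered[-2]  # D for the largest value
    elif side == "low":
        gap = ordered[1] - ordered[0]  # D for the smallest value
    else:
        raise ValueError("side must be 'high' or 'low'")
    return gap / data_range


def dixon_decision(values, side="high"):
    """Apply the one-third cut-off to the D/R ratio."""
    return "reject" if dr_ratio(values, side) >= 1 / 3 else "retain"


print(dixon_decision([5, 9, 11, 13, 15, 17, 20]))  # D/R = 3/15
print(dixon_decision([5, 6, 7, 8, 20]))            # D/R = 12/15
```

If a value is rejected, the function would be run again on the remaining data, in line with the note on re-testing for additional outliers.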
Block Procedure
If 2 or 3 outliers exist on one side of the distribution:
Apply the D/R rule to the least extreme outlier
If this value is rejected, all the more extreme outliers are rejected as well
If this value is retained, all are retained, or a test that considers all outliers together is applied
Tukey’s Procedure
Considers all outliers together
Reduces / eliminates the potential masking effect of multiple outliers on one side of the distribution
Appropriate only if the data has Gaussian distribution (nb. can transform non-Gaussian data)
Uses middle 50% of sample
Calculate Q1 (25th centile) and Q3 (75th centile) of the data set
IQR (interquartile range) = Q3 – Q1
Lower boundary = Q1 – 1.5 x IQR
Upper boundary = Q3 + 1.5 x IQR
Only those values between the lower and upper boundaries are included
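The boundaries above can be sketched with the standard library. Quartile conventions differ between software packages; this sketch assumes the "inclusive" method of `statistics.quantiles`, and the function names and data are hypothetical:

```python
from statistics import quantiles


def tukey_fences(values, k=1.5):
    """Return (lower boundary, upper boundary) = (Q1 - k*IQR, Q3 + k*IQR)."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def split_by_fences(values, k=1.5):
    """Return (kept, rejected) lists according to the Tukey boundaries."""
    lower, upper = tukey_fences(values, k)
    kept = [v for v in values if lower <= v <= upper]
    rejected = [v for v in values if v < lower or v > upper]
    return kept, rejected


# Hypothetical data with one clearly aberrant value.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
print(tukey_fences(data))
print(split_by_fences(data))
```

Because the boundaries come from the middle 50% of the sample, several outliers on the same side cannot mask one another in the way they can with the D/R rule.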
Assay Validation and Verification
Validation
Means the process of defining an analytical requirement and confirming that the method under consideration has performance capabilities consistent with that requirement.
Verification
Means procedures to test to what extent the performance data obtained by the manufacturers during method validation can be reproduced in the environment of the end-user.
Validation Study – Small Number of Reference Individuals (Transference Study)
Laboratory’s test population, n = 20
Need to:
Satisfy original exclusion and partitioning criteria
Statistically homogeneous group (i.e., no outliers)
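A sketch of how the 20 local results might be checked against a quoted interval. The acceptance rule used here (no more than 2 of the 20 results outside the quoted limits) is an assumption based on common CLSI-style guidance and is not stated on the slides; the function name, limits and results are hypothetical:

```python
def verify_transferred_interval(results, lower, upper, max_outside=2):
    """Return (accepted, values_outside) for a small verification set."""
    outside = [r for r in results if r < lower or r > upper]
    return len(outside) <= max_outside, outside


# Hypothetical local results (n = 20) checked against a quoted interval of 3.5-5.5.
local_results = [4.1, 4.4, 3.9, 5.0, 4.7, 4.2, 5.6, 4.8, 4.0, 4.5,
                 4.3, 5.1, 3.7, 4.6, 4.9, 4.4, 3.4, 4.2, 5.2, 4.1]
print(verify_transferred_interval(local_results, 3.5, 5.5))
```

If more results fall outside than the rule allows, the usual next step is to test a further set of samples or to establish a local interval rather than adopt the quoted one.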
Specimen Required: 5 mL clotted blood Central Sydney Container: Gold Top Tube
Busselton Samples Tested for sIgG to Avian Precipitins and Aspergillus
Aspergillus Reference values:
<60 Neg
60 – 80 Equivocal
>80 Positive
MuSK Antibody
Background: before 2022, the kit was provided as components without a reference value
Reference value established as the mean of 10 non-Myasthenia gravis patient samples
Reference Value = <0.014
March 2022: reagents provided as a kit registered as an IVD, with a reference value of Negative <0.05 nmol/L
Evidence for review: reported positive result inconsistent with clinical presentation
LOW Positives 2023 - 2024
Data from 2023 – 2024 assessed
PPV is 10% (proportion of patients with a positive test who have possible MuSK-positive myasthenia gravis)
New Reference Intervals
Need to periodically review reference intervals in all laboratories
Method changes, New analytes
Who needs to know if a reference interval changes?
How is this advised?
Test directory, GP information, Document notice
Clearly stated on result reports
QAP program
Notification is required for NATA accreditation
Summary
Reference intervals are important for the differentiation between healthy and unhealthy individuals
Reference cohorts may not be readily available (n = 120)
Partition of intervals may be required
Sample size is important to ensure the Reference Interval is representative of the population
Transference and Verification:
Quoted reference intervals need to be validated/verified for the local population
Involves a smaller number of samples
Statistical methods need to take into account the characteristics of the data and any outliers.
Introducing a New Assay
IVD Legislation Requirements (Therapeutic Goods Administration and National Association of Testing Authorities)
Laboratory Validation Requirements
In House In Vitro Diagnostic (IVD)
within the confines or scope of an Australian Medical Laboratory or Australian medical laboratory network:
developed from first principles;
developed or modified from a published source;
developed or modified from any other source;
used for a purpose other than the intended purpose assigned by the manufacturer
not supplied for use outside that medical laboratory or medical laboratory network.
Definition of In Vitro Diagnostic Medical Device (IVD)
means a medical device that is:
a reagent, calibrator, control material, kit, specimen receptacle, software, instrument, apparatus, equipment or system, whether used alone or in combination with another diagnostic product for in vitro use; and
intended by the manufacturer to be used in vitro for the examination of a specimen derived from the human body, solely or principally for:
giving information about a physiological or pathological state or a congenital abnormality; or
determining safety and compatibility with a potential recipient; or
monitoring therapeutic measures
Essential Principles
Compliance with relevant essential principles ensures that use of the IVD does not compromise the health or safety of patients, users, or any other person, and that benefits arising from the use of the IVD outweigh the risks.
The essential principles identify performance levels required, hazards to be addressed, or issues to be considered but do not necessarily specify how the principles can be satisfied or complied with.
It is the manufacturer's responsibility to demonstrate that their IVD complies with the relevant essential principles. Justification must be provided for any specific principle that the manufacturer considers is not applicable.
In-House IVDs
In-house IVDs are separated into two groups for the purposes of determining the appropriate conformity assessment procedure:
Class 1-3 in-house IVDs
Class 4 in-house IVDs
Regulatory Review and Level of Risk
The degree of regulatory review an IVD undergoes is determined by assessing the risk posed to the health of the individual or to the public through the use of that IVD.
Classification rules take into account the likelihood of harm and the severity of that harm.
IVDs are assigned to one of four risk categories, as follows:
Class 1 – No public health risk or low personal risk
Class 2 – Low public health risk or moderate personal risk
Class 3 – Moderate public health risk or high personal risk
Class 4 – High public health risk (HIV testing, transfusion medicine testing).
Summary of EBLM (1)
EBLM is:
asking questions when odd things turn up
applying scientific principles to investigations
questioning the basis of clinical papers and the population they are testing
understanding what the client needs to do their job
continual improvement in providing relevant clinical information as medicine advances
Summary of EBLM (2)
EBLM is important to ensure testing is carried out that is:
relevant
informative
based on appropriate test cohorts
Control cohort reflects the patient group to which the test will be applied.
The question formulated should be written in such a way that there are measurable outcomes