Proficiency Testing & Error Rates in Fingerprint Analysis

Introduction: Proficiency Testing & Training Context

• Central questions raised: How do we establish whether a practitioner is truly competent and able to perform reliably?
• Distinction made between being TRAINED and being PROFICIENT.
• All Forensic Identification Officers (FIOs) receive an initial block of training lasting 9 weeks at either:
– Ontario Police College (OPC)
– Canadian Police College (CPC)
• Fingerprint Identification Assistants / Scenes-of-Crime Officers (FIAs/SOCOs) complete shorter courses of 2–3 weeks.
• Continuous professional development occurs both internally (agency-run) and externally (courses, conferences, workshops, etc.).
• Proficiency testing is introduced after the initial training phase and recurs throughout a career.

Police Colleges & Initial Training Pathways

• Two principal Canadian training centres were pictured (slide showed logos/buildings):
– Ontario Police College (Aylmer, ON)
– Canadian Police College (Ottawa, ON)
• Serve as gateways for foundational knowledge, practical labs, and certification in forensic identification.

Formal Proficiency Testing Programs in Ontario & RCMP

• Mandatory recertification for Ontario FIOs every 3 years after initial qualification; administered by OPC.
• The RCMP runs its own proficiency testing every 2 years for its examiners (per a footnote on the slide).
• Additional/optional tests may be:
– Agency-specific (internal quality-assurance exercises)
– Coordinated through CPC, OPC, or RCMP labs.
• Reference: A formal proficiency-testing document is housed on the CanFRWG (Canadian Forensic Research Working Group) website.

Structure of OPC vs RCMP Proficiency Tests

• OPC Proficiency Test
– Duration: 3 hours.
– Predominantly written / multiple-choice / short-answer questions.
– No hands-on components (e.g., mock scene processing or latent print comparisons).
– Class discussion prompt: “Thoughts on this?” (students encouraged to critique whether a purely theory-based test is sufficient).
• RCMP Proficiency Test
– Duration: 5 hours.
– Largely practical, casework-style examinations involving latent-to-tenprint comparisons.
– Stated Objective: Assess an examiner’s ability to arrive at correct conclusions via the ACE-V methodology.

Additional Proficiency Training – Expert Witness Testimony Course

• Delivered by CPC; length 1 week.
• Candidates submit a REAL fingerprint case in which they effected an identification → prepare a written report + curriculum vitae.
• Culminates in a MOCK TRIAL (direct & cross-examination) on that case.
• Discussion prompts: Is a single one-week mock-court course adequate for courtroom competency? Should testimony receive recurring testing analogous to technical skills?

Concept of Error Rates in Fingerprint Examination

• Common courtroom question: “What is your error rate?”
• Historic (problematic) answer from some examiners: 0% (implying infallibility).
• Modern stance: Error rates exist; multiple empirical studies have attempted to quantify them.
• Cautionary note: Miami-Dade study widely cited yet contains calculation errors.
• No single study has replicated the ENTIRE end-to-end fingerprint process (scene to court); therefore each reported rate is conditional on its experimental design.
• CanFRWG (2019-11-20) guidance: Use error-rate studies to VALIDATE or INVALIDATE a discipline, not to quote absolute operational error rates.

Definitions & Categories of Error

• False Positive Identification: Declaring a person to be the source when they are NOT. (Type I error.)
• False Negative Exclusion: Declaring a person NOT the source when they ARE. (Type II error.)

Two Approaches to Calculating Error Rates

  1. Casework-Based
    – Error Rate = (# of actual errors detected) ÷ (# of cases where comparisons were performed)
    – Problems: Ground truth is usually unknown; undetected errors remain invisible.

  2. Controlled Study-Based
    – Research team knows the ground truth for each print pair.
    – Enables direct computation of false positives/negatives (see the sketch after this list).
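
Under the controlled-study approach, both rates fall straight out of a confusion-matrix count. A minimal sketch in Python, using made-up toy data; `Comparison` and `error_rates` are illustrative names, not from any cited study:

```python
# Minimal sketch (hypothetical data): computing false-positive and
# false-negative rates in a controlled study where ground truth is known.
# Each record pairs an examiner's call with the true mated/non-mated status.

from dataclasses import dataclass

@dataclass
class Comparison:
    mated: bool   # ground truth: the latent and exemplar share a source
    call: str     # examiner decision: "id", "exclusion", or "inconclusive"

def error_rates(comparisons: list[Comparison]) -> tuple[float, float]:
    """Return (false_positive_rate, false_negative_rate).

    FP rate = erroneous identifications / non-mated comparisons (Type I).
    FN rate = erroneous exclusions / mated comparisons (Type II).
    """
    non_mated = [c for c in comparisons if not c.mated]
    mated = [c for c in comparisons if c.mated]
    fp = sum(1 for c in non_mated if c.call == "id")
    fn = sum(1 for c in mated if c.call == "exclusion")
    return fp / len(non_mated), fn / len(mated)

# Toy example: 3 mated pairs (one wrongly excluded), 2 non-mated pairs.
data = [
    Comparison(True, "id"), Comparison(True, "exclusion"),
    Comparison(True, "id"),
    Comparison(False, "exclusion"), Comparison(False, "inconclusive"),
]
fpr, fnr = error_rates(data)
print(f"FP rate: {fpr:.1%}, FN rate: {fnr:.1%}")  # FP 0.0%, FN 33.3%
```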

Ground Truth – Critical Experimental Requirement

• Defined as information KNOWN to be true through direct observation/measurement (not inference).
• Ensures researchers can unequivocally classify each examiner decision as correct or erroneous.

Contextual Bias Study – Dror, Charlton & Péron (2006)

• Citation: Itiel E. Dror et al., “Contextual information renders experts vulnerable to making erroneous identifications,” FSI 156(1), pp. 74-78.
• Objective: Determine whether latent-print experts can remain objective when exposed to misleading context.
• Methodology
– Re-used fingerprint “matches” that the same experts had made 5 years earlier.
– Introduced extraneous context: Told participants the pair was involved in the high-profile FBI/Madrid bombing misidentification.
– 5 examiners from multiple countries; no time limit; instructed to ignore background info.
• Results
– 1/5 (20%) maintained the original ‘match’ conclusion.
– 3/5 (60%) changed to ‘non-match’ (exclusion).
– 1/5 (20%) rendered ‘inconclusive.’
• Interpretation
– Strong evidence of susceptibility to CONTEXTUAL bias.
– Illustrates cognitive factors in biometric decision making.
• Limitations & Future Work
– Very small sample size.
– Sparks call for broader, blinded, large-scale studies.

Verification & AFIS-Assisted Study – Langenburg, Hall & Rosemarie (2015)

• Citation: G. Langenburg et al., “Utilizing AFIS searching tools to reduce errors in fingerprint casework,” FSI 257, pp. 123-133.
• Verification Practice
– Traditionally applied only to IDENTIFICATIONS.
– OSAC & former SWGFAST recommend also verifying EXCLUSIONS & INCONCLUSIVES, but agencies often lack resources.
– Frequently non-blind (verifier sees first examiner’s call).
• Study Proposal
– Use AFIS searches on exclusion/inconclusive decisions as a safeguard → may surface candidate matches missed by the examiner (a workflow sketched below).
– Especially useful in cold-case work pre-verification.
– Caveats: AFIS is not foolproof (cf. the Mayfield error); human evaluation remains essential.
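
A hypothetical sketch of that safeguard as a triage step; `afis_search` is a stand-in for an agency's real AFIS interface, and the threshold logic is assumed for illustration, not taken from the study:

```python
# Hypothetical sketch of the proposed AFIS safeguard: re-search latent prints
# behind exclusion/inconclusive calls and flag strong candidates for a blind
# second examination. `afis_search` stands in for a real AFIS integration.

def afis_search(latent_id: str) -> list[tuple[str, float]]:
    """Placeholder: return (candidate_id, score) pairs from an AFIS search."""
    raise NotImplementedError("agency-specific AFIS integration")

def flag_for_review(decisions: dict[str, str], threshold: float) -> list[str]:
    """Return latent IDs whose exclusion/inconclusive call warrants a blind
    re-examination because AFIS surfaced a high-scoring candidate."""
    flagged = []
    for latent_id, call in decisions.items():
        if call not in ("exclusion", "inconclusive"):
            continue  # identifications are already verified under current practice
        candidates = afis_search(latent_id)
        if any(score >= threshold for _, score in candidates):
            flagged.append(latent_id)
    return flagged
```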

Aggregated Examiner Error Rates Across Studies

• General trend: FALSE POSITIVE rate LOW (<1%).
• FALSE NEGATIVE (erroneous exclusion) rates are higher: 2.2% to 7.9%.
• FBI/Noblis “Black Box” study (detailed below) found 7.5% false negatives.

FBI/Noblis Black Box Study (2011) – Background & Rationale

• Partnership: FBI Laboratory + Noblis (independent non-profit R&D).
• Aimed to satisfy Daubert factor #3 (known/potential error rate) and provide courts objective reliability data.
• Black-box paradigm: Measure accuracy of final decisions WITHOUT probing internal reasoning (ACE-V steps lumped together).

Daubert Standard Refresher (1993) – Relevance to Fingerprints

  1. Testability of theory/technique.

  2. Peer review & publication.

  3. Known or potential error rate.

  4. Standards controlling operation.

  5. General acceptance in scientific community.

• U.S. v. Mitchell (2004) highlighted the need for PRACTITIONER-SPECIFIC error rates rather than generic discipline-wide rates.

FBI/Noblis Study – Design & Procedure

• Participants: 169 latent-print examiners, diverse experience & agencies.
• Sample Sets
– Each examiner received 100 latent–exemplar pairs (randomized) from a master pool of 744 pairs (both mated & non-mated).
– Variety of quality/complexity intentionally included to capture worst-case error rates.
• Double-blind, open-set, randomized design: participant identities were decoupled from their answers, so researchers could not attribute individual decisions to individual examiners.
• Custom Noblis software prevented revisiting decisions (mimics real-world finality once results reported).
• Decision Pathway (slide flowchart; modeled in the sketch after this list):
– Analysis → Value assessment (VID = value for individualization, VEO = value for exclusion only, No Value)
– Comparison/Evaluation → Four possible calls: Individualization, Exclusion, Inconclusive, No Value
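
The flowchart can be read as a two-stage gate: the Analysis-stage value assessment constrains which conclusions are reachable at Comparison/Evaluation. A minimal sketch under that reading (illustrative names, not the study's actual Noblis software):

```python
# Minimal sketch of the slide's decision pathway (illustrative only): a
# value assessment at Analysis gates the conclusions available later.

from enum import Enum

class Value(Enum):
    VID = "value for individualization"
    VEO = "value for exclusion only"
    NO_VALUE = "no value"

class Conclusion(Enum):
    INDIVIDUALIZATION = "individualization"
    EXCLUSION = "exclusion"
    INCONCLUSIVE = "inconclusive"
    NO_VALUE = "no value"

def allowed_conclusions(value: Value) -> set[Conclusion]:
    """Conclusions reachable given the Analysis-stage value assessment."""
    if value is Value.NO_VALUE:
        return {Conclusion.NO_VALUE}  # comparison never proceeds
    if value is Value.VEO:
        return {Conclusion.EXCLUSION, Conclusion.INCONCLUSIVE}
    return {Conclusion.INDIVIDUALIZATION, Conclusion.EXCLUSION,
            Conclusion.INCONCLUSIVE}  # VID: all three calls possible

# A VEO print cannot support an individualization under this reading.
assert Conclusion.INDIVIDUALIZATION not in allowed_conclusions(Value.VEO)
```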

FBI/Noblis Study – Results

• False Positive Rate: 0.1% (≈1 erroneous identification per 1,000 comparisons of non-mated pairs).
• False Negative Rate: 7.5% (≈8 erroneous exclusions per 100 comparisons of mated pairs); a worked conversion follows below.
• 85% of participants made at least one false-negative error.
• Graph (slide 33) showed distribution of decision outcomes for mated vs non-mated pairs.
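
To make those headline rates concrete, a small sketch converting them into expected error counts for an assumed caseload; the volumes below are hypothetical, not the study's actual denominators:

```python
# Hypothetical illustration: converting the study's headline rates into
# expected error counts for an assumed caseload (volumes are made up).

FALSE_POSITIVE_RATE = 0.001   # 0.1% of non-mated comparisons
FALSE_NEGATIVE_RATE = 0.075   # 7.5% of mated comparisons

def expected_errors(n_non_mated: int, n_mated: int) -> tuple[float, float]:
    """Expected false positives and false negatives for a given mix."""
    return (n_non_mated * FALSE_POSITIVE_RATE,
            n_mated * FALSE_NEGATIVE_RATE)

fp, fn = expected_errors(n_non_mated=10_000, n_mated=10_000)
print(f"Expected false positives: {fp:.0f}")   # ~10
print(f"Expected false negatives: {fn:.0f}")   # ~750
```

The asymmetry is the practical point: at these rates, erroneous exclusions dwarf erroneous identifications for any balanced caseload.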

Impact & Subsequent Developments

• Published 2011; rapidly cited in court testimony.
• Downloaded >70,000 times; ranks in the top 5% of all research outputs online.
• Follow-on: Research team produced >15 additional papers expanding on latent-print reliability.
• Demonstrated effectiveness due to:
– Independent collaboration.
– Large, diverse examiner pool.
– Inclusion of challenging prints → provides upper-bound error estimates.

Ethical Case Study – Troop C Fingerprint Fabrication (NY State Police, 1984-1992)

• At least 30 criminal cases involved fabricated prints.
• 5 state troopers charged (4 convicted, 1 acquitted).
• Modus operandi:
– Wait until suspect identified, then “discover” suspect prints at scene.
– Took suspect-handled objects during booking; lifted prints and relabeled as scene evidence.
– Photocopying of prints also used.
• Discovery: One trooper candidly explained scheme during a CIA job interview → investigation launched.
• Consequences: Case reviews, overturned convictions, civil lawsuits.

Procedure Reforms Triggered by Troop C Scandal

• Mandatory photography of all latent prints BEFORE lifting.
• Crime-scene reports must be co-signed by at least TWO investigators.
• Fingerprint discoveries must be verified by at least TWO supervisors.
• Emphasizes the critical role of documentation and chain-of-custody integrity.

Key Takeaways & Practical Implications

• Proficiency testing is multifaceted: written knowledge, practical casework, and courtroom testimony all demand periodic assessment.
• Error rates are NOT zero; understanding both false positives and (often higher) false negatives is essential for quality management and courtroom transparency.
• Contextual bias is real—even seasoned experts can reverse prior decisions when exposed to misleading information; blind verification practices are therefore recommended.
• Large-scale, double-blind black-box studies (e.g., FBI/Noblis) supply robust empirical foundations that satisfy legal admissibility standards.
• Technological aids (AFIS) can mitigate, but not eliminate, human error; careful integration and verification protocols are mandatory.
• Ethical lapses (fabrication) underscore the necessity of rigorous documentation, multi-person verification, and strong organizational culture.
• Continuous research, peer review, and evidence-based policy (e.g., CanFRWG guidelines) are pivotal to maintain public trust in fingerprint science.

Looking Ahead

• Next class (Week 5): Two lectures + Wednesday lab focusing on fingerprint comparison exercise → opportunity to apply ACE-V principles and reflect on error-rate literature.