Natural Language Processing in Health Informatics
Natural Language Processing (NLP)
Motivations and Approaches
Enabling Use of EHR Data: Successful NLP of clinical narrative text can significantly enhance the utility of Electronic Health Record (EHR) data.
Limitations of Coded Data: Current coded data often fails to capture the intricate complexity of clinical narratives (Jollis, 1993; O’Malley, 2005).
Information "Locked" in Text: A substantial amount of critical clinical information remains embedded within free-text notes (Hripcsak, 1995; Hripcsak, 2013).
Historical Context as Artificial Intelligence (AI):
NLP has historically been considered a subfield of AI.
A more accurate description for NLP is "natural language understanding."
Shift to Machine Learning (ML) Approaches: Similar to other AI applications, the focus has evolved from human-developed rules to ML methods (Deng, 2018).
First Era: Characterized by manually developed lexicons, grammars, and algorithms, coinciding with the early stages of AI.
Second Era: Involved the application of ML to pre-existing lexicons and grammars.
Third Era: Utilizes deep learning techniques applied across lexicons, grammars, and algorithms.
Overview of Clinical NLP Tasks
Information Extraction: The process of converting narrative text into structured data.
Summarization: Generating concise summaries from larger volumes of content.
Text Classification: Categorizing text into predefined types.
Information Retrieval (IR): Locating and retrieving relevant documents and other textual information.
Question-Answering: Finding specific answers within text.
Machine Translation: Translating text from one language to another.
Conversational Agents: Developing systems that can engage in human-like conversations.
Sentiment Analysis: Determining the emotional tone or sentiment expressed in text.
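As a small illustration of information extraction, the first task listed above, the sketch below turns a blood pressure mention in free text into a structured record. The pattern, the function name `extract_blood_pressure`, and the sample note are assumptions made for illustration; real clinical systems handle far more variation than a single regular expression can.

```python
import re

# Illustrative pattern (an assumption, not a validated clinical extractor):
# matches "BP 150/90", "BP: 150/90", "blood pressure is 120/80", etc.
BP_PATTERN = re.compile(
    r"\b(?:BP|blood pressure)\s*(?:is|of|:)?\s*(\d{2,3})\s*/\s*(\d{2,3})\b",
    re.IGNORECASE,
)


def extract_blood_pressure(note):
    """Return a list of structured readings found in a narrative note."""
    return [
        {"systolic": int(systolic), "diastolic": int(diastolic)}
        for systolic, diastolic in BP_PATTERN.findall(note)
    ]


# Hypothetical note text, written for this example.
note = "Patient seen in clinic. BP 150/90, pulse 88. Blood pressure is 120/80 on recheck."
readings = extract_blood_pressure(note)
```

Here `readings` holds two dictionaries, one per mention, which is the kind of structured output that downstream queries or decision rules can use.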
Use Cases for NLP in Cancer Care (Linguamatics provides examples)
Identifying potential matches for clinical trials.
Performing advanced information extraction from complex patient documents.
Achieving precise information retrieval for clinical case histories and outcomes studies.
Streamlining cancer registry processes.
Applying predictive models and care coordination rules to unstructured patient narratives.
Semantic Enrichment: Improving search capabilities through the semantic enrichment of patient documentation.
Analyzing patient narratives for insights into treatment outcomes.
Assessing the impact of genetic aberrations on disease.
Supporting Tumor Board discussions and decision-making.
Levels of Human Language
Phonology: Deals with the sound units that constitute language, known as phonemes.
Morphology: Involves the analysis of word parts, called morphemes.
Examples: appendic-, pharyng- (roots); -itis, -ectomy (suffixes).
Syntax: Refers to the rules governing language construction, essentially grammar.
Semantics: Focuses on the meaning of words, phrases, and entire sentences.
Pragmatics: Examines how context influences the meaning of sentences and discourse.
World Knowledge: General knowledge necessary for understanding language effectively.
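The morphology level above can be made concrete with a minimal sketch that splits a medical term into a root and a suffix. The tiny `ROOTS` and `SUFFIXES` tables contain only the examples from the text, with plain-English glosses added; the function names are hypothetical, and a real system would rely on a full morphological lexicon.

```python
# Illustrative tables only: the morphemes come from the examples in the text,
# and the glosses are added for this sketch.
ROOTS = {"appendic": "appendix", "pharyng": "pharynx"}
SUFFIXES = {"itis": "inflammation", "ectomy": "surgical removal"}


def split_morphemes(word):
    """Return (root, suffix) if the word is a known root plus a known suffix."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix):
            root = word[: -len(suffix)]
            if root in ROOTS:
                return root, suffix
    return None  # not analyzable with these small tables


def gloss(word):
    """Build a rough meaning from the glosses of the two morphemes."""
    parts = split_morphemes(word)
    if parts is None:
        return None
    root, suffix = parts
    return f"{SUFFIXES[suffix]} of the {ROOTS[root]}"
```

For example, `gloss("appendicitis")` combines the glosses of "appendic-" and "-itis", while a word such as "fever" is simply not analyzable with these tables.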
Phases of NLP
Three Major Phases in Classical NLP:
Syntax: The recognition of the grammatical constituents of language.
Semantics: The recognition of meaning.
Context: The broader framing of the content.
Difficulty and Value: Each successive level (semantics, context) is progressively more challenging and demands greater knowledge engineering, but successful solutions at these levels offer significantly higher value.
Major Steps in NLP Phases
Syntax via Parsing: Syntax is typically handled through parsing, which necessitates a grammar and rules governing the language's syntax.
Rewrite Rules: The most common method for expressing grammar is as a set of rewrite rules.
Example: S \rightarrow NP \, VP (e.g., "The patient has severe hypertension.")
Example: NP \rightarrow DET \, NP, NP \rightarrow ADJ \, NP, NP \rightarrow NOUN
Terminal Symbols: Symbols that cannot be further broken down (e.g., ADJ, NOUN).
Non-terminal Symbols: Symbols that can be further decomposed (e.g., S, NP).
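The rewrite rules above can be sketched as a small recursive-descent parser. The rules for S and NP are taken from the examples; the rule VP -> VERB NP and the five-word `LEXICON` are assumptions added so that the example sentence can be parsed, since the notes do not define VP or a lexicon.

```python
# Toy lexicon (an assumption): maps each word to a terminal symbol.
LEXICON = {
    "the": "DET",
    "patient": "NOUN",
    "has": "VERB",
    "severe": "ADJ",
    "hypertension": "NOUN",
}


def parse_np(tagged, i):
    """NP -> DET NP | ADJ NP | NOUN. Returns (tree, next_index) or None."""
    if i >= len(tagged):
        return None
    word, tag = tagged[i]
    if tag in ("DET", "ADJ"):
        rest = parse_np(tagged, i + 1)
        if rest is not None:
            subtree, j = rest
            return ("NP", (tag, word), subtree), j
    elif tag == "NOUN":
        return ("NP", ("NOUN", word)), i + 1
    return None


def parse_vp(tagged, i):
    """VP -> VERB NP (assumed rule). Returns (tree, next_index) or None."""
    if i < len(tagged) and tagged[i][1] == "VERB":
        obj = parse_np(tagged, i + 1)
        if obj is not None:
            subtree, j = obj
            return ("VP", ("VERB", tagged[i][0]), subtree), j
    return None


def parse_sentence(text):
    """S -> NP VP. Returns a nested-tuple parse tree, or None if no parse."""
    words = text.lower().rstrip(".").split()
    tagged = [(w, LEXICON.get(w)) for w in words]
    subject = parse_np(tagged, 0)
    if subject is None:
        return None
    np_tree, i = subject
    predicate = parse_vp(tagged, i)
    if predicate is None:
        return None
    vp_tree, j = predicate
    return ("S", np_tree, vp_tree) if j == len(tagged) else None


tree = parse_sentence("The patient has severe hypertension.")
```

A fragment with no subject, such as "Has severe hypertension", returns `None` under this grammar, which hints at why telegraphic clinical text is hard for a grammar written for complete sentences.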
Semantics via Mapping: Semantics is generally achieved by mapping parts of speech into standardized terminology.
Standardized Terminology: SNOMED CT is the most descriptive terminology used for NLP efforts.
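Mapping text to a standardized terminology can be sketched as a longest-match dictionary lookup. The concept identifiers below are hypothetical placeholders, not real SNOMED CT codes, and the three-entry table and the name `map_concepts` are assumptions; production systems use the full terminology together with normalization and disambiguation steps.

```python
# Hypothetical identifiers for illustration; these are NOT real SNOMED CT codes.
TERM_TO_CONCEPT = {
    "hypertension": "CONCEPT-HTN",
    "high blood pressure": "CONCEPT-HTN",  # synonym maps to the same concept
    "chest pain": "CONCEPT-CHEST-PAIN",
}


def map_concepts(text):
    """Scan left to right, preferring the longest phrase found in the table."""
    words = text.lower().replace(".", " ").replace(",", " ").split()
    max_len = max(len(term.split()) for term in TERM_TO_CONCEPT)
    found = []
    i = 0
    while i < len(words):
        for n in range(min(max_len, len(words) - i), 0, -1):
            phrase = " ".join(words[i : i + n])
            if phrase in TERM_TO_CONCEPT:
                found.append((phrase, TERM_TO_CONCEPT[phrase]))
                i += n  # skip past the matched phrase
                break
        else:
            i += 1  # no term starts at this word
    return found


matches = map_concepts("The patient has high blood pressure and chest pain.")
```

Because "high blood pressure" and "hypertension" share one identifier, two notes that use different wording end up with the same structured concept.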
Growing Use of ML in NLP
Automated Parsing Rules: ML is increasingly used to derive parsing rules, rather than human enumeration.
Word Embeddings: Techniques like word2vec are employed to uncover semantic relationships between words.
Transformers: These models utilize large training datasets and are often pre-trained, allowing their models to be reused for various downstream tasks (e.g., Bidirectional Encoder Representations from Transformers (BERT)).
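To show how word embeddings expose semantic relationships, the sketch below compares word vectors with cosine similarity. The three-dimensional vectors are made up for illustration; real word2vec vectors are learned from large corpora and have many more dimensions. The names `EMBEDDINGS`, `cosine_similarity`, and `most_similar` are hypothetical.

```python
import math

# Made-up vectors for illustration only; not output from a trained model.
EMBEDDINGS = {
    "pneumonia": [0.9, 0.1, 0.2],
    "bronchitis": [0.8, 0.2, 0.3],
    "fracture": [0.1, 0.9, 0.1],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(word):
    """Return the other vocabulary word whose vector is closest to `word`."""
    others = (w for w in EMBEDDINGS if w != word)
    return max(others, key=lambda w: cosine_similarity(EMBEDDINGS[word], EMBEDDINGS[w]))


nearest = most_similar("pneumonia")
```

With these toy vectors the nearest neighbor of "pneumonia" is "bronchitis" rather than "fracture", mirroring the idea that words used in similar contexts end up with similar vectors.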
Challenges in Processing the Clinical Narrative
Increased Difficulty: Clinical narratives present more processing challenges than other text types due to several characteristics.
Telegraphic, Elliptical Style: Often written in a concise, incomplete style.
Errors: Frequent spelling and/or grammatical errors are common.
Linguistic License: Physicians and other clinicians may take liberties with language.
Buried Information: Important details can be hidden within routine information.
Types of Challenges:
Syntactic
Semantic
Contextual
Syntactic Challenges
Incomplete Sentences: Clinical narrative text is often syntactically incomplete.
Frequency: Approximately half of all sentences are incomplete.
Minimal English Sentence: A basic English sentence typically requires a subject-verb-object structure.
Examples of Incompleteness (in order of frequency):
Deleted verb and object: "Stiff neck and fever" (implied: "[patient has] stiff neck and fever")
Deleted verb: "Brain scan negative" (implied: "brain scan [is] negative")
Deleted subject and verb: "Positive for heart disease" (implied: "[patient is] positive for heart disease")
Deleted subject: "Was seen by local doctor" (implied: "[patient] was seen by local doctor")
Semantic Challenges
Word Senses and Meanings: Words can have multiple senses and meanings.
Examples:
"Murmur is appreciated" (meaning detected, not liked).
"Eye drops" (compound meaning).
"Mass at 3 o’clock" (refers to position on a clock face or body, not time).
Synonymy: Different words or phrases conveying the same meaning.
Example: "Epigastric pain after eating" vs. "postprandial stomach discomfort."
Polysemy: The same words or phrases having different meanings depending on context.
Example: "The PCP of the patient with PCP advised him to stop using PCP" (referring to Primary Care Physician, Pneumocystis Pneumonia, and phencyclidine, respectively).
Negation: Commonly used in medical text.
Example: "Patient does not have any chest pain."
Uncertainty: Expression of doubt or possibility.
Example: "Patient treated for possible pneumonia."
Temporality: Indicating temporal relationships or historical context.
Examples:
"Patient has history of pneumonia."
"Chest pain resolved after administration of nitroglycerin."
Challenges for Numerical Data in Clinical Notes (Hanauer, 2019)
Spelled Out Numbers: Including negatives ("minus"), fractions ("one-half"), dimensions ("two by two"), and ranges ("one to five").
Invalid Dates: Incorrectly formatted or impossible dates.
Roman Numerals: Such as "IV," "type II," "stage 3" (where 3 is sometimes incorrectly used for stage III).
Biologically Implausible Ages: Ages that are medically impossible or highly improbable.
Ranking Issues: Correct ordinals like "1st" versus incorrect or anomalous forms like "3st", including in dates.
Decades: Terms like "octogenarian."
Imprecision: Ambiguous quantities like "a few," "a million."
Units: Various forms for units, e.g., "lbs," "pounds."
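A few of the numeric issues above can be sketched with small normalization helpers: converting short Roman numerals, checking ordinal suffixes, and mapping unit variants to one form. All function names and the lookup tables are assumptions made for illustration, and each helper covers only the simple cases named in the list.

```python
import re

ROMAN_VALUES = {"I": 1, "V": 5, "X": 10}  # enough for stages and types


def roman_to_int(numeral):
    """Convert a short Roman numeral (I, V, X only) to an integer."""
    values = [ROMAN_VALUES[ch] for ch in numeral.upper()]
    total = 0
    for i, value in enumerate(values):
        # A smaller value before a larger one is subtracted (e.g. IV = 4).
        if i + 1 < len(values) and value < values[i + 1]:
            total -= value
        else:
            total += value
    return total


def is_valid_ordinal(token):
    """True for well-formed ordinals such as '1st'; False for '3st'."""
    match = re.fullmatch(r"(\d+)(st|nd|rd|th)", token.lower())
    if not match:
        return False
    number, suffix = int(match.group(1)), match.group(2)
    if number % 100 in (11, 12, 13):
        expected = "th"
    else:
        expected = {1: "st", 2: "nd", 3: "rd"}.get(number % 10, "th")
    return suffix == expected


# Illustrative synonym table for one unit; a real table would be much larger.
UNIT_SYNONYMS = {"lb": "lb", "lbs": "lb", "pound": "lb", "pounds": "lb"}


def normalize_unit(unit):
    """Map known unit variants to a single form; leave unknown units as-is."""
    return UNIT_SYNONYMS.get(unit.lower().rstrip("."), unit)
```

Conversion alone does not resolve ambiguity: whether "IV" is a numeral or an abbreviation for intravenous still has to be decided from context.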
Contextual Challenges
Coreference: The relationship between linguistic expressions that refer to the same real-world entity.
Example: "Chest x-ray shows nodule in left upper lobe. The tumor has increased in size to 2 cm" (where "The tumor" refers to the "nodule").
Anaphora: A specific type of coreference involving pronouns.
Example: "He complains of chest pain. It awakens him at night." (where "He" and "him" refer to the same patient, and "It" refers to the "chest pain").
Ellipsis: The common deletion of subjects in clinical narratives.
Example: "Complains of chest pain. Increasing frequency. Worse in the morning." (Implied subject "[Patient]" for each phrase).
Evaluation of NLP Systems
Metrics:
Recall: The proportion of correct concepts from the reference standard that were successfully identified by the system.
Formula: \text{Recall} = \frac{\text{Number of correct concepts found}}{\text{Total number of correct concepts}}
Example: If 75 out of 100 correct concepts are found, recall is 75%.
Precision: The proportion of concepts identified by the system that are actually correct.
Formula: \text{Precision} = \frac{\text{Number of correct concepts found}}{\text{Total number of concepts found by system}}
Example: If 150 concepts are found by the system, and 75 of them are correct, precision is 50%.
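The two formulas can be checked with a short calculation over sets of concepts. The sketch below reproduces the numbers from the examples (100 reference concepts, 150 system concepts, 75 in common); the concept labels and the function name `recall_and_precision` are placeholders invented for illustration.

```python
def recall_and_precision(reference, system):
    """Compute (recall, precision) for a system against a reference standard."""
    reference, system = set(reference), set(system)
    correct_found = len(reference & system)  # concepts in both sets
    recall = correct_found / len(reference) if reference else 0.0
    precision = correct_found / len(system) if system else 0.0
    return recall, precision


# Placeholder labels: 100 reference concepts; the system finds 75 of them
# plus 75 concepts that are not in the reference standard.
reference = {f"ref-{i}" for i in range(100)}
system = {f"ref-{i}" for i in range(75)} | {f"spurious-{i}" for i in range(75)}

recall, precision = recall_and_precision(reference, system)
```

The result is a recall of 0.75 and a precision of 0.5, matching the worked examples. The same system output scores differently on the two measures because their denominators differ.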
Challenge Evaluations: NLP systems are often compared in "challenge evaluations," in which multiple research groups benchmark their systems on identical tasks.
i2b2 NLP Shared Tasks: Historically the largest clinical text challenge evaluation (https://www.i2b2.org/NLP/).
National NLP Clinical Challenges (n2c2): The current name for these challenges (https://n2c2.dbmi.hms.harvard.edu/).
Clinical NLP Approaches and Projects
Early approaches and systems.
Recent efforts and advancements.
Systematic reviews of progress in the field.
Challenge evaluations provide a means to compare and advance systems.
Early Approaches and Systems
Linguistic String Project (Sager, 1987):
Proposed that clinical notes represent a "subgrammar" within the broader human grammar.
Suggested that most clinical narrative statements could be reduced to a small number of information formats (e.g., medication, test & result).
Medical Language Extraction and Encoding System (MedLEE) (Friedman, 1994):
Its core approach was a "semantic grammar" that primarily recognized terms and attributes rather than focusing on full syntactic parsing.
Initially developed for radiology reports, it later expanded to other clinical domains.
When compared with human coders, MedLEE's performance fell within the observed range of inter-coder disagreement (Hripcsak, 1995).
More Recent Clinical NLP Tasks and Results
Success Areas: Most success has come from identifying patients and their attributes; complete extraction of all data from notes has not yet been achieved.
Specific Applications:
Identifying postoperative complications (Fitzhenry, 2013; Tien, 2015).
Identifying high-risk heart failure patients (Evans, 2016).
Predicting ICU risk of death and length of stay (Weissman, 2018).
Detecting alcohol misuse (Afshar, 2019).
Identifying geriatric syndromes (Chen, 2019).
Predicting progression and mortality in cancer (Kehl, 2020).
Assessing risk of nosocomial infection (Goodwin, 2020).
Extracting social determinants of health (Feller, 2020).
Measuring Healthcare Quality:
Determining healthcare quality measures (Hazlehurst, 2005; Yetisgen, 2015; Kim, 2017; Meystre, 2017).
Implementing these measures in practical clinical settings (Garvin, 2018).
Assisting Patients:
Linking EHR language to layperson definitions (Chen, 2018).
Conversational Agents:
Assisting physicians with prescribing by connecting to knowledge-based information (Preininger, 2020).
Augmenting Clinical Research:
Finding patients with congestive heart failure (Pakhomov, 2007).
Case detection of diabetes (Zheng, 2016).
Investigating the association between androgen deprivation therapy and risk of dementia (Nead, 2017).
Extracting outcomes for cancer patients from radiology reports (Kehl, 2019) and pathology reports (Alawad, 2020).
Cohort selection for clinical studies (Wang, 2019; Chamberlin, 2020).
Classifying patients into phenotypes using deep learning (Si, 2021).
Electronic Medical Records and Genomics (eMERGE) Network
Website: https://emerge-network.org/
Genotype to Phenotype Link: Recalling basic genetics, the genotype (genes in DNA) determines the phenotype (expressed characteristics of an organism).
Consortium Goal: This large-scale consortium aims to integrate a growing number of DNA biorepositories with EHR systems to facilitate "large-scale, high-throughput genetic research."
Specifically, linking patient phenotype data with their genotype.
Limitations of ICD-9/10: For most phenotypes, ICD-9/10 codes are insufficient; NLP applied to text notes and reports, along with medication data, provides higher accuracy in identification.
i2b2/n2c2 Challenge Evaluations
Annual Challenges: Each annual challenge is documented in an overview paper and in system papers.
Data sets available at: https://www.i2b2.org/NLP/DataSets/Main.php
Renamed: Now known as National NLP Clinical Challenges (n2c2).
Website: https://n2c2.dbmi.hms.harvard.edu/
More Recent Systematic Reviews
Limitations of Early Systems (Kreimeyer, 2017): Many systems were in use, but they often had a narrow task focus, relied on institution-specific data, and used small datasets.
Expanding Scope (Wang, 2018): Noted a growing range of note types and application areas for NLP.
Symptom Detection (Koleck, 2019): Assessment of systems specifically designed for detecting symptoms.
Patient-Authored Text (Dreisbach, 2019): Reviewed systems extracting symptoms from text data created by patients themselves.
Deep Learning Growth (Wu, 2020; Si, 2020): Highlighted the increasing adoption of deep learning for NLP and for deep representation learning of patient data.
ML Use (Spasic, 2020): General assessment of the growing use of machine learning in clinical NLP.
Open Source and Commercial NLP Systems
Open-Source Systems:
MetaMap: From the National Library of Medicine (NLM), leveraging the UMLS Metathesaurus.
Website: https://metamap.nlm.nih.gov/
MetaMap Lite: A simplified and faster version: https://metamap.nlm.nih.gov/MetaMapLite.shtml
cTAKES: Developed by Mayo Clinic.
Website: https://ctakes.apache.org
Canary: From Brigham & Women’s Hospital.
Website: http://canary.bwh.harvard.edu/
CLAMP: An out-of-the-box system from UT Houston.
Website: https://clamp.uth.edu/
Commercial Systems:
Nuance:
Website: https://www.nuance.com/omni-channel-customer-engagement/technologies/natural-language-understanding.html
Linguamatics:
Website: https://www.linguamatics.com/
M*Modal: Acquired by 3M.
Discern nCode: Acquired by Cerner.
Health Fidelity: A commercial version of MedLEE.
Website: https://healthfidelity.com/