Natural Language Processing in Health Informatics

Natural Language Processing (NLP)

Motivations and Approaches

  • Enabling Use of EHR Data: Successful NLP of clinical narrative text can significantly enhance the utility of Electronic Health Record (EHR) data.

    • Limitations of Coded Data: Current coded data often fails to capture the intricate complexity of clinical narratives (Jollis, 1993; O’Malley, 2005).

    • Information "Locked" in Text: A substantial amount of critical clinical information remains embedded within free-text notes (Hripcsak, 1995; Hripcsak, 2013).

  • Historical Context as Artificial Intelligence (AI):

    • NLP has historically been considered a subfield of AI.

    • A more accurate description for NLP is "natural language understanding."

  • Shift to Machine Learning (ML) Approaches: Similar to other AI applications, the focus has evolved from human-developed rules to ML methods (Deng, 2018).

    • First Era: Characterized by manually developed lexicons, grammars, and algorithms, coinciding with the early stages of AI.

    • Second Era: Involved the application of ML to pre-existing lexicons and grammars.

    • Third Era: Utilizes deep learning techniques applied across lexicons, grammars, and algorithms.

Overview of Clinical NLP Tasks

  • Information Extraction: The process of converting narrative text into structured data.

  • Summarization: Generating concise summaries from larger volumes of content.

  • Text Classification: Categorizing text into predefined types.

  • Information Retrieval (IR): Locating and retrieving relevant documents and other textual information.

  • Question-Answering: Finding specific answers within text.

  • Machine Translation: Translating text from one language to another.

  • Conversational Agents: Developing systems that can engage in human-like conversations.

  • Sentiment Analysis: Determining the emotional tone or sentiment expressed in text.

Use Cases for NLP in Cancer Care (examples from Linguamatics)

  • Identifying potential matches for clinical trials.

  • Performing advanced information extraction from complex patient documents.

  • Achieving precise information retrieval for clinical case histories and outcomes studies.

  • Streamlining cancer registry processes.

  • Applying predictive models and care coordination rules to unstructured patient narratives.

  • Improving search capabilities through semantic enrichment of patient documentation.

  • Analyzing patient narratives for insights into treatment outcomes.

  • Assessing the impact of genetic aberrations on disease.

  • Supporting Tumor Board discussions and decision-making.

Levels of Human Language

  • Phonology: Deals with the sound units that constitute language, known as phonemes.

  • Morphology: Involves the analysis of word parts, called morphemes.

    • Examples: appendic-, pharyng- (roots), -itis, -ectomy (suffixes).

  • Syntax: Refers to the rules governing language construction, essentially grammar.

  • Semantics: Focuses on the meaning of words, phrases, and entire sentences.

  • Pragmatics: Examines how context influences the meaning of sentences and discourse.

  • World Knowledge: General knowledge necessary for understanding language effectively.
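
As a rough illustration of the morphology level, the sketch below splits a clinical term into a root and a suffix using the morphemes listed above. The tiny inventories, the glosses, and the function names are assumptions made for this example; a real system would rely on a much larger lexicon.

```python
# Hypothetical, tiny morpheme inventories built from the examples in the text.
# The glosses are illustrative; real lexicons are far larger.
ROOTS = {"appendic": "appendix", "pharyng": "pharynx"}
SUFFIXES = {"itis": "inflammation", "ectomy": "surgical removal"}


def split_morphemes(word):
    """Return (root, suffix) if word is a known root plus a known suffix, else None."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix):
            root = word[: -len(suffix)]
            if root in ROOTS:
                return root, suffix
    return None


def gloss(word):
    """Compose a plain-language gloss from the morphemes, or None if unknown."""
    parts = split_morphemes(word)
    if parts is None:
        return None
    root, suffix = parts
    return f"{SUFFIXES[suffix]} of the {ROOTS[root]}"


print(gloss("appendicitis"))   # inflammation of the appendix
print(gloss("pharyngectomy"))  # surgical removal of the pharynx
```

Words outside the inventories (for example "fever") simply return None, which is the point at which the higher levels of analysis would have to take over.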

Phases of NLP

  • Three Major Phases in Classical NLP:

    • Syntax: The recognition of the grammatical constituents of language.

    • Semantics: The recognition of meaning.

    • Context: The broader framing of the content.

  • Difficulty and Value: Each successive level (semantics, context) is progressively more challenging and demands greater knowledge engineering, but successful solutions at these levels offer significantly higher value.

Major Steps in NLP Phases

  • Syntax via Parsing: Syntax is typically handled through parsing, which necessitates a grammar and rules governing the language's syntax.

    • Rewrite Rules: The most common method for expressing grammar is as a set of rewrite rules.

      • Example: $S \rightarrow NP \, VP$ (e.g., "The patient has severe hypertension.")

      • Examples: $NP \rightarrow DET \, NP$, $NP \rightarrow ADJ \, NP$, $NP \rightarrow NOUN$

    • Terminal Symbols: Symbols that cannot be further broken down (e.g., ADJ, NOUN).

    • Non-terminal Symbols: Symbols that can be further decomposed (e.g., S, NP).

  • Semantics via Mapping: Semantics is generally achieved by mapping parts of speech into standardized terminology.

    • Standardized Terminology: SNOMED CT is the most descriptive terminology used for NLP efforts.
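
The rewrite rules above can be turned into a small recursive-descent parser. This is a minimal sketch, not a production parser: the rule VP -> VERB NP and the five-word lexicon are assumptions added so that the example sentence can be parsed end to end, and all function names are hypothetical.

```python
# Assumed lexicon mapping words to terminal symbols (not from the source text).
LEXICON = {
    "the": "DET",
    "patient": "NOUN",
    "has": "VERB",
    "severe": "ADJ",
    "hypertension": "NOUN",
}


def parse_np(tags, i):
    """NP -> DET NP | ADJ NP | NOUN. Return (tree, next_index) or None."""
    if i >= len(tags):
        return None
    if tags[i] in ("DET", "ADJ"):
        rest = parse_np(tags, i + 1)
        if rest is None:
            return None
        subtree, j = rest
        return ("NP", tags[i], subtree), j
    if tags[i] == "NOUN":
        return ("NP", "NOUN"), i + 1
    return None


def parse_vp(tags, i):
    """VP -> VERB NP (an assumed rule). Return (tree, next_index) or None."""
    if i < len(tags) and tags[i] == "VERB":
        rest = parse_np(tags, i + 1)
        if rest is not None:
            subtree, j = rest
            return ("VP", "VERB", subtree), j
    return None


def parse_sentence(text):
    """S -> NP VP. Return a parse tree, or None if the text does not fit the grammar."""
    words = text.lower().rstrip(".").split()
    if any(w not in LEXICON for w in words):
        return None
    tags = [LEXICON[w] for w in words]
    np_result = parse_np(tags, 0)
    if np_result is None:
        return None
    np_tree, i = np_result
    vp_result = parse_vp(tags, i)
    if vp_result is None:
        return None
    vp_tree, j = vp_result
    return ("S", np_tree, vp_tree) if j == len(tags) else None


print(parse_sentence("The patient has severe hypertension."))
```

A fragment such as "Patient hypertension" yields None because no VP can be found, which hints at why strictly grammar-driven parsing struggles with telegraphic clinical text.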

Growing Use of ML in NLP

  • Automated Parsing Rules: ML is increasingly used to derive parsing rules, rather than having humans enumerate them.

  • Word Embeddings: Techniques like word2vec are employed to uncover semantic relationships between words.

  • Transformers: These models are trained on large datasets and are often pre-trained, so the same model can be reused for various downstream tasks (e.g., Bidirectional Encoder Representations from Transformers (BERT)).
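
To make the idea of word embeddings concrete, the sketch below compares words by the cosine similarity of their vectors. The three-dimensional vectors are invented for illustration and are not output from word2vec or any trained model; real embeddings typically have hundreds of dimensions.

```python
import math

# Invented toy vectors (assumption): a real model would learn these from text.
TOY_EMBEDDINGS = {
    "hypertension": [0.9, 0.2, 0.1],
    "htn": [0.8, 0.3, 0.1],
    "fracture": [0.1, 0.1, 0.9],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(word):
    """Return the other vocabulary word whose vector is closest to `word`."""
    target = TOY_EMBEDDINGS[word]
    candidates = (w for w in TOY_EMBEDDINGS if w != word)
    return max(candidates, key=lambda w: cosine_similarity(target, TOY_EMBEDDINGS[w]))


print(most_similar("hypertension"))  # htn
```

With these toy values the abbreviation "htn" lands next to "hypertension" and far from "fracture", which is the kind of semantic relationship embeddings are meant to surface.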

Challenges in Processing the Clinical Narrative

  • Increased Difficulty: Clinical narratives present more processing challenges than other text types due to several characteristics.

    • Telegraphic, Elliptical Style: Often written in a concise, incomplete style.

    • Errors: Frequent spelling and/or grammatical errors are common.

    • Linguistic License: Physicians and other clinicians may take liberties with language.

    • Buried Information: Important details can be hidden within routine information.

  • Types of Challenges:

    • Syntactic

    • Semantic

    • Contextual

Syntactic Challenges

  • Incomplete Sentences: Clinical narrative text is often syntactically incomplete.

    • Frequency: Approximately half of all sentences are incomplete.

    • Minimal English Sentence: A basic English sentence typically requires a subject-verb-object structure.

    • Examples of Incompleteness (in order of frequency):

      • Deleted verb and object: "Stiff neck and fever" (implied: "[patient has] stiff neck and fever")

      • Deleted verb: "Brain scan negative" (implied: "brain scan [is] negative")

      • Deleted subject and verb: "Positive for heart disease" (implied: "[patient is] positive for heart disease")

      • Deleted subject: "Was seen by local doctor" (implied: "[patient] was seen by local doctor")

Semantic Challenges

  • Word Senses and Meanings: Words can have multiple senses and meanings.

    • Examples:

      • "Murmur is appreciated" (meaning detected, not liked).

      • "Eye drops" (compound meaning).

      • "Mass at 3 o’clock" (refers to position on a clock face or body, not time).

  • Synonymy: Different words or phrases conveying the same meaning.

    • Example: "Epigastric pain after eating" vs. "postprandial stomach discomfort."

  • Polysemy: The same words or phrases having different meanings depending on context.

    • Example: "The PCP of the patient with PCP advised him to stop using PCP" (referring to Primary Care Physician, Pneumocystis Pneumonia, and phencyclidine, respectively).

  • Negation: Commonly used in medical text.

    • Example: "Patient does not have any chest pain."

  • Uncertainty: Expression of doubt or possibility.

    • Example: "Patient treated for possible pneumonia."

  • Temporality: Indicating temporal relationships or historical context.

    • Examples:

      • "Patient has history of pneumonia."

      • "Chest pain resolved after administration of nitroglycerin."
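
Negation, uncertainty, and temporality are often handled first with simple trigger-phrase heuristics. The sketch below labels a sentence using small, assumed trigger lists chosen to cover the examples above; the lists and the label names are hypothetical, and the sketch does not determine which concept a trigger applies to.

```python
import re

# Assumed trigger phrases; real systems use much longer, curated lists.
NEGATION = re.compile(r"\b(no|not|denies|without)\b", re.IGNORECASE)
UNCERTAINTY = re.compile(r"\b(possible|probable|suspected)\b", re.IGNORECASE)
HISTORICAL = re.compile(r"\b(history of|resolved)\b", re.IGNORECASE)


def assertion_status(sentence):
    """Return a list of labels for the sentence; ['affirmed'] if no trigger fires."""
    labels = []
    if NEGATION.search(sentence):
        labels.append("negated")
    if UNCERTAINTY.search(sentence):
        labels.append("uncertain")
    if HISTORICAL.search(sentence):
        labels.append("historical")
    return labels or ["affirmed"]


for text in [
    "Patient does not have any chest pain.",
    "Patient treated for possible pneumonia.",
    "Patient has history of pneumonia.",
    "Chest pain resolved after administration of nitroglycerin.",
]:
    print(assertion_status(text), "<-", text)
```

Sentence-level labels like these are only a starting point: deciding the scope of a trigger (which finding is negated or historical) is where most of the difficulty lies.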

Challenges for Numerical Data in Clinical Notes (Hanauer, 2019)

  • Spelled Out Numbers: Including negatives ("minus"), fractions ("one-half"), dimensions ("two by two"), and ranges ("one to five").

  • Invalid Dates: Incorrectly formatted or impossible dates.

  • Roman Numerals: Such as "IV," "type II," and "stage 3" (where the Arabic numeral 3 is sometimes written in place of the expected "III").

  • Biologically Implausible Ages: Ages that are medically impossible or highly improbable.

  • Ranking Issues: Correct ordinals like "1st" versus incorrect or anomalous forms like "3st", including in dates.

  • Decades: Terms like "octogenarian."

  • Imprecision: Ambiguous quantities like "a few," "a million."

  • Units: Various forms for units, e.g., "lbs," "pounds."
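
A few of these numeric issues can be illustrated with small normalization helpers. The sketch below is hypothetical: the function names, the unit table, and the decision to handle only "stage"/"type" phrases are assumptions for illustration, and a bare "IV" is deliberately not converted because it may mean "intravenous" rather than 4.

```python
import re

ROMAN_VALUES = {"I": 1, "V": 5, "X": 10}
# Assumed synonym table mapping surface forms to one canonical unit.
UNIT_SYNONYMS = {"lb": "lb", "lbs": "lb", "pound": "lb", "pounds": "lb"}


def roman_to_int(numeral):
    """Convert a well-formed Roman numeral made of I, V, X to an integer."""
    total, prev = 0, 0
    for ch in reversed(numeral.upper()):
        value = ROMAN_VALUES[ch]
        if value < prev:
            total -= value  # subtractive notation, e.g. the I in IV
        else:
            total += value
            prev = value
    return total


def parse_grade(text):
    """Map 'stage III', 'stage 3', or 'type II' to an integer, else None."""
    match = re.fullmatch(r"(?:stage|type)\s+([ivx]+|\d+)", text.strip().lower())
    if not match:
        return None
    value = match.group(1)
    return int(value) if value.isdigit() else roman_to_int(value)


def is_valid_ordinal(token):
    """True for '1st', '2nd', '11th'; False for anomalies such as '3st'."""
    match = re.fullmatch(r"(\d+)(st|nd|rd|th)", token.lower())
    if not match:
        return False
    number, suffix = int(match.group(1)), match.group(2)
    if number % 100 in (11, 12, 13):
        expected = "th"
    else:
        expected = {1: "st", 2: "nd", 3: "rd"}.get(number % 10, "th")
    return suffix == expected


def normalize_unit(unit):
    """Return the canonical unit, or None if the form is unknown."""
    return UNIT_SYNONYMS.get(unit.lower().rstrip("."))


print(parse_grade("stage III"), parse_grade("stage 3"), parse_grade("type II"))
print(is_valid_ordinal("1st"), is_valid_ordinal("3st"))
print(normalize_unit("lbs"), normalize_unit("pounds"))
```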

Contextual Challenges

  • Coreference: The relationship between linguistic expressions that refer to the same real-world entity.

    • Example: "Chest x-ray shows nodule in left upper lobe. The tumor has increased in size to 2 cm." (where "The tumor" refers to the "nodule").

    • Anaphora: A specific type of coreference involving pronouns.

      • Example: "He complains of chest pain. It awakens him at night." (where "He" and "him" refer to the same patient, and "It" refers to the "chest pain").

  • Ellipsis: The common deletion of subjects in clinical narratives.

    • Example: "Complains of chest pain. Increasing frequency. Worse in the morning." (Implied subject "[Patient]" for each phrase).

Evaluation of NLP Systems

  • Metrics:

    • Recall: The proportion of correct concepts from the reference standard that were successfully identified by the system.

      • Formula: $\text{Recall} = \frac{\text{Number of correct concepts found}}{\text{Total number of correct concepts}}$

      • Example: If 75 out of 100 correct concepts are found, recall is 75%.

    • Precision: The proportion of concepts identified by the system that are actually correct.

      • Formula: $\text{Precision} = \frac{\text{Number of correct concepts found}}{\text{Total number of concepts found by system}}$

      • Example: If 150 concepts are found by the system, and 75 of them are correct, precision is 50%.

  • Challenge Evaluations: NLP systems are often compared in "challenge evaluations," where multiple research groups benchmark their results on identical tasks.

    • i2b2 NLP Shared Tasks: Historically the largest clinical text challenge evaluation (https://www.i2b2.org/NLP/).

    • National NLP Clinical Challenges (n2c2): The current name for these challenges (https://n2c2.dbmi.hms.harvard.edu/).
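
The recall and precision definitions given under Metrics can be checked with a few lines of code. This sketch reproduces the worked example (100 reference concepts, 150 concepts returned by the system, 75 of them correct); the concept identifiers and the function name are placeholders invented for the illustration.

```python
def evaluate(reference, system_output):
    """Compute recall and precision of system concepts against a reference standard."""
    reference = set(reference)
    system_output = set(system_output)
    correct = reference & system_output  # concepts the system found that are correct
    return {
        "recall": len(correct) / len(reference),
        "precision": len(correct) / len(system_output),
    }


# Placeholder concept identifiers: 100 reference concepts, of which the
# system finds 75, plus 75 spurious concepts (150 returned in total).
reference = [f"concept_{i}" for i in range(100)]
system_output = reference[:75] + [f"spurious_{i}" for i in range(75)]

print(evaluate(reference, system_output))  # {'recall': 0.75, 'precision': 0.5}
```

Both ratios share the same numerator and differ only in the denominator: recall divides by what should have been found, precision by what the system actually returned.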

Clinical NLP Approaches and Projects

  • Early approaches and systems.

  • Recent efforts and advancements.

  • Systematic reviews of progress in the field.

  • Challenge evaluations provide a means to compare and advance systems.

Early Approaches and Systems

  • Linguistic String Project (Sager, 1987):

    • Proposed that clinical notes represent a "subgrammar" within the broader human grammar.

    • Suggested that most clinical narrative statements could be reduced to a small number of information formats (e.g., medication, test & result).

  • Medical Language Extraction and Encoding System (MedLEE) (Friedman, 1994):

    • Its core approach was a "semantic grammar" that primarily recognized terms and attributes rather than focusing on full syntactic parsing.

    • Initially developed for radiology reports, it later expanded to other clinical domains.

    • When compared with human coders, MedLEE's performance fell within the observed range of inter-coder disagreement (Hripcsak, 1995).

More Recent Clinical NLP Tasks and Results

  • Success Areas: Most success has come in identifying patients and their attributes; complete extraction of all data has not yet been achieved.

  • Specific Applications:

    • Identifying postoperative complications (Fitzhenry, 2013; Tien, 2015).

    • Identifying high-risk heart failure patients (Evans, 2016).

    • Predicting ICU risk of death and length of stay (Weissman, 2018).

    • Detecting alcohol misuse (Afshar, 2019).

    • Identifying geriatric syndromes (Chen, 2019).

    • Predicting progression and mortality in cancer (Kehl, 2020).

    • Assessing risk of nosocomial infection (Goodwin, 2020).

    • Extracting social determinants of health (Feller, 2020).

  • Measuring Healthcare Quality:

    • Determining healthcare quality measures (Hazlehurst, 2005; Yetisgen, 2015; Kim, 2017; Meystre, 2017).

    • Implementing these measures in practical clinical settings (Garvin, 2018).

  • Assisting Patients:

    • Linking EHR language to layperson definitions (Chen, 2018).

  • Conversational Agents:

    • Assisting physicians with prescribing by connecting to knowledge-based information (Preininger, 2020).

  • Augmenting Clinical Research:

    • Finding patients with congestive heart failure (Pakhomov, 2007).

    • Case detection of diabetes (Zheng, 2016).

    • Investigating the association between androgen deprivation therapy and risk of dementia (Nead, 2017).

    • Extracting outcomes for cancer patients from radiology reports (Kehl, 2019) and pathology reports (Alawad, 2020).

    • Cohort selection for clinical studies (Wang, 2019; Chamberlin, 2020).

    • Classifying patients into phenotypes using deep learning (Si, 2021).

Electronic Medical Records and Genomics (eMERGE) Network

  • Website: https://emerge-network.org/

  • Genotype to Phenotype Link: Recalling basic genetics, the genotype (genes in DNA) determines the phenotype (expressed characteristics of an organism).

  • Consortium Goal: This large-scale consortium aims to integrate a growing number of DNA biorepositories with EHR systems to facilitate "large-scale, high-throughput genetic research."

    • Specifically, linking patient phenotype data with their genotype.

  • Limitations of ICD-9/10: For most phenotypes, ICD-9/10 codes are insufficient; NLP applied to text notes and reports, along with medication data, provides higher accuracy in identification.
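
One way to picture how codes, medications, and NLP output might be combined is a simple evidence-counting rule. The sketch below is purely hypothetical and is not an eMERGE algorithm: the record structure, the "at least two of three sources" threshold, and the type 2 diabetes example (ICD-10 prefix E11, metformin) are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PatientRecord:
    """Hypothetical, simplified view of one patient's EHR data."""
    icd_codes: set = field(default_factory=set)
    medications: set = field(default_factory=set)
    # Concept name -> assertion status produced by an upstream NLP step.
    nlp_concepts: dict = field(default_factory=dict)


def has_phenotype(record, code_prefixes, drug_names, concept, min_sources=2):
    """Flag the phenotype when at least `min_sources` evidence types agree."""
    has_code = any(
        code.startswith(prefix)
        for code in record.icd_codes
        for prefix in code_prefixes
    )
    has_drug = bool(record.medications & set(drug_names))
    has_text = record.nlp_concepts.get(concept) == "affirmed"
    return sum([has_code, has_drug, has_text]) >= min_sources


patient = PatientRecord(
    icd_codes={"E11.9"},
    medications={"metformin"},
    nlp_concepts={"type 2 diabetes": "affirmed"},
)
print(has_phenotype(patient, {"E11"}, {"metformin"}, "type 2 diabetes"))  # True
```

Requiring agreement between sources reflects the point above that codes alone are often insufficient; the right threshold for a given phenotype would have to be validated against chart review.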

i2b2/n2c2 Challenge Evaluations

  • Annual Challenges: These challenges include overviews and system papers.

    • Data sets available at: https://www.i2b2.org/NLP/DataSets/Main.php

  • Renamed: Now known as National NLP Clinical Challenges (n2c2).

    • Website: https://n2c2.dbmi.hms.harvard.edu/

More Recent Systematic Reviews

  • Limitations of Early Systems (Kreimeyer, 2017): Many systems had been developed, but they often addressed a narrow range of tasks, relied on institution-specific data, and used small datasets.

  • Expanding Scope (Wang, 2018): Noted a growing range of note types and application areas for NLP.

  • Symptom Detection (Koleck, 2019): Assessment of systems specifically designed for detecting symptoms.

  • Patient-Authored Text (Dreisbach, 2019): Reviewed systems extracting symptoms from text data created by patients themselves.

  • Deep Learning Growth (Wu, 2020; Si, 2020): Highlighted the increasing adoption of deep learning for NLP and for deep representation learning of patient data.

  • ML Use (Spasic, 2020): General assessment of the growing use of machine learning in clinical NLP.

Open Source and Commercial NLP Systems

  • Open-Source Systems:

    • MetaMap: From the National Library of Medicine (NLM), leveraging the UMLS Metathesaurus.

      • Website: https://metamap.nlm.nih.gov/

      • MetaMap Lite: A simplified and faster version (https://metamap.nlm.nih.gov/MetaMapLite.shtml).

    • cTAKES: Developed by Mayo Clinic.

      • Website: https://ctakes.apache.org

    • Canary: From Brigham & Women’s Hospital.

      • Website: http://canary.bwh.harvard.edu/

    • CLAMP: An out-of-the-box system from UT Houston.

      • Website: https://clamp.uth.edu/

  • Commercial Systems:

    • Nuance:

      • Website: https://www.nuance.com/omni-channel-customer-engagement/technologies/natural-language-understanding.html

    • Linguamatics:

      • Website: https://www.linguamatics.com/

    • M*Modal: Acquired by 3M.

    • Discern nCode: Acquired by Cerner.

    • Health Fidelity: A commercial version of MedLEE.

      • Website: https://healthfidelity.com/