DBDP-Autonomic Notes

Introduction

  • Chronic physical and mental health conditions are increasingly prevalent (1).
  • Integrating biological, behavioral, and environmental data is crucial for:
    • Early detection.
    • Just-in-time intervention.
    • Outcome monitoring.
  • Mobile monitoring technologies are transforming data collection in non-clinical settings.
  • Challenges hindering progress:
    • Misinterpretation of peripheral psychophysiological signals.
    • Lack of transparency and reproducibility in biobehavioral research.

The Criticality of Context in ANS Data Interpretation

  • The Autonomic Nervous System (ANS) regulates various physiological processes, including heart rate, blood pressure, respiration, and digestion.
  • Its primary function is to maintain homeostasis.
  • Understanding the context of ANS data collection is crucial for accurate biobehavioral interpretations.
  • Factors influencing ANS activity:
    • Stress, affect, cognition.
    • Physical activity, sleep, illness, medications.
    • Environmental demands.
    • Age, genetics, and health conditions (8, 9).
  • ANS responses vary over time and between individuals.
  • Transient states can be misinterpreted when context (e.g., affect, cognition, physical perturbations) is excluded (10, 11, 12).
  • Context helps differentiate stimulus-driven shifts from natural fluctuations.
  • Including context enables quantitative assessment of interventions.
  • Software is needed to enable data fusion and multi-modal analysis.
  • Integration of data types could improve digital health initiatives involving ANS data.

Addressing Reproducibility

  • Scientific endeavors face a crisis of method and results reproducibility (13, 14).
  • Large-scale projects indicate non-replicability in multiple fields (15–19).
  • Reported results are frequently incorrect or misstated (20).
  • Reasons for irreproducibility:
    • Lack of interoperability between software tools and data sets.
    • Limited record-keeping for complicated data sets.
  • Crowdsourced analysis projects reveal analytical flexibility in complex analyses (e.g., 29 teams analyzing an identical dataset reported odds ratios ranging from 0.89 to 2.93, M = 1.31 (21)).
  • Reproducibility requires identical statistical analyses.
  • The reproducibility crisis is pronounced in ANS data collection and analysis using consumer sensors.
  • The field lacks clear guidelines for analyzing autonomic data.
  • Researchers repeatedly build new algorithms without comparisons to existing approaches.

Envisioning an Open-Source Data Processing Framework

  • Proposed open-source framework to address standardization, interpretation, and reproducibility challenges.
  • Identified 10 common problem areas with existing software:
    • Focus on limited biosignals.
    • "Freemium" models.
    • Designed for small lab datasets.
    • No integration across biosignals.
    • No support for open scientific principles.
    • Unsupported, lacking documentation.
    • Inability to analyze context.
    • Static, not designed for community contributions.
    • No ability to archive analysis pathways.
    • Command-line based, challenging for non-programmers.
  • Proprietary software has similar issues, plus higher costs and closed code.
  • A Survey of User Needs (SUN) confirmed analytical barriers experienced by researchers and engineers.
  • Survey demographics: n = 421; 31% researchers, 69% engineers.
  • Survey respondents were from the Society of Psychophysiological Research and the IEEE International Machine Learning for Signal Processing Workshop.
  • Over 70% of researchers faced difficulties syncing data, identifying errors, and combining data from different devices.
  • Respondents found software tools hard to learn and lacking clear instructions.
  • Researchers would benefit from a multimodal data fusion platform with an open-source codebase.
  • SUN respondents were enthusiastic about a community-driven open-source framework.

Core Components of the Framework

Community Driven

  • Sustainability is achieved by inviting scientists to contribute methodologies and algorithms as plugins.
  • Researchers can contribute individual tasks or complete end-to-end pipelines.
  • Collaboration addresses reproducibility challenges.
  • Engineers and computer scientists can validate and refine algorithms.
  • Behavioral scientists and clinicians gain access to cutting-edge tools.
  • The framework includes a foundational set of validated tools.
  • Contributed methods undergo validation through benchmarking datasets and automated testing pipelines.
  • Each plugin includes metadata specifying its purpose and validation outcomes.
  • A continuous improvement cycle is created through community feedback.
  • Standardized templates, documentation, and input/output formats facilitate contributions.
  • Data supply chain (38) steps are saved as metadata for transparency and reproducibility.
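A minimal sketch of how plugin metadata and saved processing steps might be represented. The class names, fields, plugin name, and parameter values are hypothetical illustrations, not part of the framework described in these notes.

```python
from dataclasses import dataclass, field, asdict
import json


@dataclass
class PluginMetadata:
    """Hypothetical metadata record for a contributed plugin."""
    name: str
    purpose: str
    signal_types: list
    # Placeholder for validation outcomes, e.g. benchmark name and result.
    validation: dict = field(default_factory=dict)


@dataclass
class ProvenanceLog:
    """Ordered record of the processing steps applied to a data set."""
    steps: list = field(default_factory=list)

    def record(self, plugin: PluginMetadata, params: dict) -> None:
        # Store the plugin name and a copy of the exact parameters used.
        self.steps.append({"plugin": plugin.name, "params": dict(params)})

    def to_json(self) -> str:
        # Serialize the steps so they can be archived next to the results.
        return json.dumps(self.steps, indent=2, sort_keys=True)


# Illustrative values only; no real plugin or benchmark is implied.
lowpass = PluginMetadata(
    name="eda_lowpass",
    purpose="Low-pass filter for electrodermal activity",
    signal_types=["EDA"],
    validation={"benchmark": "example_benchmark", "passed": True},
)

log = ProvenanceLog()
log.record(lowpass, {"cutoff_hz": 1.0, "order": 4})
print(log.to_json())
```

Archiving the serialized log alongside the output is one way a later reader could rerun the same steps with the same parameters.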

Data Quality Auditing and Preprocessing

  • The framework audits data quality by identifying motion artifacts, environmental factors, and hardware limitations.
  • Semi-automated modules for data cleaning, preprocessing, and artifact removal are implemented (39–41).
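As a rough illustration of what a quality audit could involve, the sketch below flags samples that fall outside a plausible range or jump abruptly from the last accepted sample. The function name, thresholds, and sample values are assumptions chosen for the example; they are not the methods cited in (39–41) and not physiological recommendations.

```python
def flag_artifacts(values, lo, hi, max_jump):
    """Return a list of booleans, True where a sample looks suspect.

    A sample is flagged if it lies outside [lo, hi] or differs from the
    most recent unflagged sample by more than max_jump.
    """
    flags = []
    last_good = None
    for v in values:
        bad = v < lo or v > hi
        if not bad and last_good is not None and abs(v - last_good) > max_jump:
            bad = True
        flags.append(bad)
        if not bad:
            last_good = v
    return flags


# Made-up heart-rate samples in beats per minute.
heart_rate = [72, 74, 73, 140, 75, 0, 76, 77]
flags = flag_artifacts(heart_rate, lo=30, hi=220, max_jump=30)

# Fraction of samples that passed both checks.
quality = flags.count(False) / len(flags)
print(flags, quality)
```

This simple rule trusts the first in-range sample, so a semi-automated workflow would still leave flagged segments for a human to review rather than delete them outright.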

Signal Segmentation and Alignment

  • The framework includes a module for determining appropriate time windows for signal segmentation.
  • The module helps users select optimal window lengths for their research needs.
  • Tools are included to support improved signal alignment (42).
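To make the window-length choice concrete, here is a small sketch that turns a window length and overlap into sample index ranges. The function name and parameters are hypothetical; the notes do not specify how the module is implemented.

```python
def sliding_windows(n_samples, fs_hz, window_s, overlap=0.0):
    """Return (start, stop) sample indices for fixed-length windows.

    Only complete windows are returned; stop is exclusive.
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    size = int(round(window_s * fs_hz))
    if size < 1:
        raise ValueError("window is shorter than one sample")
    # Step between window starts; at least one sample to guarantee progress.
    step = max(1, int(round(size * (1.0 - overlap))))
    return [(s, s + size) for s in range(0, n_samples - size + 1, step)]


# 10 seconds of a 4 Hz signal, cut into 5-second windows with 50% overlap.
windows = sliding_windows(n_samples=40, fs_hz=4, window_s=5, overlap=0.5)
print(windows)
```

Comparing the number and coverage of windows across candidate lengths is one simple way a user might weigh temporal resolution against the amount of data per window.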

Contextual Information Integration

  • Interpreting physiological data requires understanding the recording context.
  • Context includes environmental characteristics and personal factors (affect, physical activity, posture).
  • Integrating contextual features improves stress detection (12).
  • Plots of physiological data with visual overlays describing recording context are provided.
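One way to picture context integration is to attach a context label to each sample and then summarize the signal per context. The sketch below assumes context arrives as (start, end, label) intervals; the function names and the sample data are made up for illustration.

```python
def label_samples(timestamps, context_intervals, default="unlabeled"):
    """Attach a context label to each timestamp.

    context_intervals is a list of (start, end, label) tuples with start
    inclusive and end exclusive; the first matching interval wins.
    """
    labels = []
    for t in timestamps:
        label = default
        for start, end, name in context_intervals:
            if start <= t < end:
                label = name
                break
        labels.append(label)
    return labels


def mean_by_context(values, labels):
    """Average the signal separately within each context label."""
    groups = {}
    for v, lab in zip(values, labels):
        groups.setdefault(lab, []).append(v)
    return {lab: sum(vs) / len(vs) for lab, vs in groups.items()}


# Made-up timestamps (seconds) and heart-rate values (beats per minute).
timestamps = [0, 10, 20, 30, 40, 50]
heart_rate = [70, 72, 110, 114, 80, 76]
context = [(0, 20, "sitting"), (20, 40, "walking")]

labels = label_samples(timestamps, context)
summary = mean_by_context(heart_rate, labels)
print(labels, summary)
```

The same per-sample labels could drive the shaded overlays on a plot, so that a rise in heart rate during "walking" is not mistaken for a stress response.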

Data Fusion and Signal Alignment

  • This involves aligning data from different sensors or modalities, which might vary in sampling frequencies and timestamps.
  • Community-contributed plugins can handle the intricacies of multimodal data by harmonizing signals over both short and long time scales.
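A minimal sketch of aligning two signals with different sampling rates and timestamp offsets, assuming linear interpolation onto a shared time grid is acceptable for the signals in question. The function names and the example data are hypothetical; a real plugin might use a different resampling method.

```python
from bisect import bisect_left


def interpolate(ts, values, t):
    """Linearly interpolate values (sampled at sorted times ts) at time t."""
    if t <= ts[0]:
        return values[0]
    if t >= ts[-1]:
        return values[-1]
    i = bisect_left(ts, t)
    t0, t1 = ts[i - 1], ts[i]
    v0, v1 = values[i - 1], values[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def align(ts_a, a, ts_b, b, step):
    """Resample two signals onto one grid covering their overlapping time span."""
    start = max(ts_a[0], ts_b[0])
    stop = min(ts_a[-1], ts_b[-1])
    # Small tolerance guards against floating-point error in the division.
    n = int((stop - start) / step + 1e-9)
    grid = [start + k * step for k in range(n + 1)]
    a_on_grid = [interpolate(ts_a, a, t) for t in grid]
    b_on_grid = [interpolate(ts_b, b, t) for t in grid]
    return grid, a_on_grid, b_on_grid


# Signal A sampled every 1 s; signal B every 2 s with a 0.5 s offset.
ts_a, a = [0, 1, 2, 3, 4], [0, 10, 20, 30, 40]
ts_b, b = [0.5, 2.5, 4.5], [1.0, 3.0, 5.0]

grid, a_on_grid, b_on_grid = align(ts_a, a, ts_b, b, step=1)
print(grid, a_on_grid, b_on_grid)
```

Restricting the grid to the overlapping span avoids extrapolating either signal beyond what was actually recorded.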

Programming Language and GUI

  • A graphical user interface (GUI) is needed for open-source physiological processing tools.
  • The framework is accessible through both a GUI and a command-line interface (CLI).
  • Common open programming languages (Python, R, Bash, etc.) are used.

Science Gateways and Open Science Integration

  • The framework aligns with Open Science Framework standards.
  • Integration includes leveraging the Digital Health Data Repository.

DBDP Autonomic

  • DBDP-Autonomic expands the Digital Biomarker Discovery Project (DBDP) (43) to include dedicated processing of ANS signals.
  • DBDP serves as a hub for collaborative and open research in digital health.
  • The code repository includes computational building blocks for common measures of ambulatory physiological data:
    • photoplethysmography (PPG)
    • electrocardiography (ECG)
    • electrodermal activity (EDA)
  • The repository comprises four modules:
    • exploratory data analysis
    • data preprocessing
    • feature engineering
    • machine learning model development
  • DBDP hosts an archive of code repositories and a list of open-source digital health data.
  • DBDP is also developing a code-free GUI-based platform (DBDP Discovery).
  • Uploaded code adheres to the framework’s plugin guidelines, allowing it to be cloned as a submodule.
  • DBDP-Autonomic specifically targets signals from the autonomic nervous system (ANS) to derive insights into psychological states.

The Role of Context in DBDP-Autonomic

  • A key challenge in analyzing ANS data is the role of context.
  • DBDP-Autonomic addresses this by integrating contextual data (e.g., activity type, location, and environmental factors).
  • DBDP-Autonomic emphasizes the combination of multi-dimensional signals to create unified constructs that reflect psychological states more comprehensively.

Discussion and Call to Community Action

  • DBDP has established itself as an open-source hub for digital health.
  • DBDP Autonomic could address challenges associated with ANS signal analysis.
  • Engagement from the community is vital to the long-term viability of DBDP Autonomic.
  • Envisioned member engagement follows the Center for Scientific Collaboration and Community Engagement (CSCCE) Community Participation Model (44).
  • Members can:
    • CONVEY/CONSUME: Engage with educational resources.
    • CONTRIBUTE: Add data cleaning algorithms, feature selection methods, and machine learning models.
    • COLLABORATE: Undertake joint research initiatives.
    • CO-CREATE: Organize and lead workshops and working groups.

Conclusion

  • A collaborative effort of the DBDP Autonomic community could enable more robust, transparent, and reproducible research in biobehavioral health.
  • By emphasizing collaboration, transparency, and rigor, this resource could improve our understanding of complex biobehavioral health issues.
  • Such a framework is essential for unlocking the full potential of mobile devices to benefit individual and community health.