NLM DIR Seminar Schedule
UPCOMING SEMINARS
RECENT SEMINARS
- Dec. 17, 2024 Joey Thole
  Training set associations drive AlphaFold initial predictions of fold-switching proteins
- Dec. 10, 2024 Amr Elsawy
  AI for Age-Related Macular Degeneration on Optical Coherence Tomography
- Dec. 3, 2024 Sarvesh Soni
  Toward Relieving Clinician Burden by Automatically Generating Progress Notes
- Nov. 19, 2024 Benjamin Lee
  Reiterative Translation in Stop-Free Circular RNAs
- Nov. 12, 2024 Devlina Chakravarty
  Fold-switching reveals blind spots in AlphaFold predictions
Scheduled Seminars on Dec. 3, 2024
In-person: Building 38A/B2N14 NCBI Library or Zoom
Contact NLMDIRSeminarScheduling@mail.nih.gov with questions about this seminar.
Abstract:
Regular documentation of progress notes is one of the main contributors to clinician burden. The abundance of structured chart information in medical records further exacerbates this burden; however, it also presents an opportunity to automate the generation of progress notes. Thus, we propose a task to automate progress note generation using the structured or tabular information present in electronic health records. To this end, we present a novel framework and a large dataset for the task, ChartPNG, which contains 7089 annotation instances (each having a pair of progress notes and interim structured chart data) across 1616 patients. We established baselines on the dataset using locally run large language models (LLMs) from the general and biomedical domains. Further, we performed automated and manual analyses to identify the challenges of the proposed task and opportunities for future research.
Due to the sensitive nature of patient electronic health records (EHRs), locally run models are preferred for various reasons including privacy, bias, and cost. However, most open-source locally run models (including medical-specific ones) are much smaller and have a more limited input context size than the more powerful closed-source LLMs generally available through web APIs (Application Programming Interfaces). To address these challenges, we propose a framework to harness the superior reasoning capabilities and medical knowledge of closed-source online LLMs in a privacy-preserving manner and seamlessly incorporate them into locally run models. Specifically, we leverage a web-based model to distill the vast patient information available in EHRs into a clinically relevant subset without sending sensitive patient health information online, and a locally run model then uses this distilled knowledge to generate progress notes. Our ablation results indicated that the proposed framework improves the performance of the baseline models on progress note generation by up to 4.6 points on ROUGE (a text-matching-based metric) and 7.56 points on MEDCON F1 (a metric that measures the overlap of clinical concepts).
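
For readers unfamiliar with text-matching metrics, the following is a minimal sketch of a ROUGE-1-style F1 score, i.e. unigram overlap between a reference note and a generated note. It is an illustration only and not the evaluation code behind the reported results: the function name `rouge1_f1`, the lowercase whitespace tokenization, and the example sentences are all assumptions made here. Standard ROUGE implementations add options such as stemming and longer n-gram or longest-common-subsequence variants.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F1: unigram overlap between two texts.

    Assumption: tokens are lowercase whitespace-separated words; real
    ROUGE toolkits tokenize and normalize text more carefully.
    """
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Multiset intersection keeps, per token, the smaller of the two counts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())  # share of candidate tokens matched
    recall = overlap / sum(ref_counts.values())      # share of reference tokens matched
    return 2 * precision * recall / (precision + recall)


# Hypothetical example sentences, not taken from any dataset.
reference_note = "patient remains stable on current medication"
generated_note = "patient stable on medication"
score = rouge1_f1(reference_note, generated_note)
```

In this example every generated token appears in the reference (precision 1.0), while four of the six reference tokens are covered (recall about 0.67), so the harmonic mean rewards the generated note without hiding what it omitted.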