NLM DIR Seminar Schedule

UPCOMING SEMINARS

Sept. 9, 2025 Chih-Hsuan Wei
No Data Left Behind: FAIR-SMart Enables FAIR Access to Supplementary Materials for Research Transparency
Sept. 16, 2025 James Leaman Jr.
TBD
Sept. 23, 2025 Martha Nelson
TBD
Sept. 30, 2025 Erez Persi
TBD
Oct. 7, 2025 Liana Yeganova
TBD

RECENT SEMINARS

July 15, 2025 Noam Rotenberg
Cell phenotypes in the biomedical literature: a systematic analysis and the NLM CellLink text mining corpus
July 3, 2025 Matthew Diller
Using Ontologies to Make Knowledge Computable
July 1, 2025 Yoshitaka Inoue
Graph-Aware Interpretable Drug Response Prediction and LLM-Driven Multi-Agent Drug-Target Interaction Prediction
June 10, 2025 Aleksandra Foerster
Interactions at pre-bonding distances and bond formation for open p-shell atoms: a step toward biomolecular interaction modeling using electrostatics
June 3, 2025 MG Hirsch
Interactions among subclones and immunity controls melanoma progression

Scheduled Seminars on March 25, 2025

Speaker: Yifan Yang
PI/Lab: Zhiyong Lu
Time: 11 a.m.
Presentation Title: Adversarial Manipulation and Data Memorization in Large Language Models for Medicine
Location: Hybrid - TEAMS

Contact NLMDIRSeminarScheduling@mail.nih.gov with questions about this seminar.

Abstract:

Large language models (LLMs) have been integrated into numerous biomedical application frameworks. Despite their significant potential, they possess vulnerabilities that can lead to serious consequences. In this seminar, we will discuss two types of vulnerabilities in LLMs, adversarial manipulation and data memorization, and explore potential solutions to address them.
Adversarial manipulations can cause LLMs to generate harmful medical suggestions or promote specific stakeholder interests. We will demonstrate two methods by which a malicious actor can achieve this: prompt injection and data poisoning. Thirteen models were tested, and all exhibited significant behavioral changes after manipulation in three tasks. Although newer models performed slightly better, they were still greatly affected.
Data memorization is another concern, particularly when LLMs are fine-tuned with medical corpora or patient records for specific tasks. This can lead to the unintended memorization of training data, resulting in the exposure of sensitive patient information and breaches of confidentiality. Controlled text generation can be employed to mitigate such memorization, effectively reducing the risk of exposing patient information during inference and enhancing privacy protection.
