NLM DIR Seminar Schedule
RECENT SEMINARS
- June 11, 2026 Angela Jiang
  Identification and Evolutionary Analysis of Steroid-Metabolism Enzymes in Gut Microbes
- June 10, 2026 Luda Diatchenko
  New Insights on Pain Biology from Human Transcriptomics: How Stimulation of Immune Response Shapes Pain Resolution
- June 9, 2026 Pascal Mutz
  Characterization of covalently closed circular RNA replicators detected in (meta)transcriptomic data
- June 4, 2026 Madeleine Clore
  Explaining why AlphaFold struggles to predict mutational effects
- May 27, 2026 Brian Abraham
  Cis-Regulatory Organization and Transcription Factor Control of Cell Identity and Disease
Scheduled Seminars on Jan. 27, 2026
In-person: Building 38A/B2N14 (NCBI Library), or online via the meeting link
Contact NLMDIRSeminarScheduling@mail.nih.gov with questions about this seminar.
Abstract:
Cross-modal retrieval of medical radiographs is a critical component of clinical decision support, cohort discovery, and large-scale data reuse. While CLIP-based vision–language models enable effective zero-shot retrieval, ranking based solely on embedding similarity does not explicitly capture higher-order relationships among images, reports, and clinical semantics. We propose a heterogeneous graph re-ranking framework that augments CLIP-based retrieval with structured relational reasoning while keeping the backbone representation model frozen. Starting from an initial CLIP ranking, the method constructs a heterogeneous k-nearest-neighbor graph over image and report embeddings and applies relation-aware message passing to refine candidate rankings.
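The pipeline described above (initial similarity ranking, a heterogeneous k-nearest-neighbor graph over image and report embeddings, then neighbor aggregation to refine the candidates) can be illustrated with a minimal sketch. This is not the presented method: a single untrained mean aggregation stands in for the learned relation-aware GNN layers, and the function name `rerank` and the parameters `k`, `top_n`, and `alpha` are assumptions chosen only for illustration. Any embedding matrix of matching dimension can take the place of frozen CLIP embeddings.

```python
import numpy as np

def l2_normalize(x):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def knn_indices(sim, k):
    """Return the indices of the k most similar columns for every row."""
    return np.argsort(-sim, axis=1)[:, :k]

def rerank(query, image_emb, report_emb, k=3, top_n=5, alpha=0.5):
    """Re-rank the top_n images for one text query (hypothetical helper).

    alpha weights the neighbor messages; it is an illustrative
    hyperparameter, not a value taken from the abstract.
    """
    image_emb = l2_normalize(image_emb)
    report_emb = l2_normalize(report_emb)
    query = query / np.linalg.norm(query)

    # Step 1: initial ranking by embedding similarity alone.
    base_scores = image_emb @ query
    candidates = np.argsort(-base_scores)[:top_n]

    # Step 2: heterogeneous kNN edges, one neighbor set per relation type.
    sim_ii = image_emb @ image_emb.T
    np.fill_diagonal(sim_ii, -np.inf)  # exclude self-loops
    img_img = knn_indices(sim_ii, k)                      # image -> image
    img_rep = knn_indices(image_emb @ report_emb.T, k)    # image -> report

    # Step 3: one round of per-relation mean aggregation. This is an
    # untrained stand-in for a learned message-passing layer.
    refined = (
        image_emb[candidates]
        + alpha * image_emb[img_img[candidates]].mean(axis=1)
        + alpha * report_emb[img_rep[candidates]].mean(axis=1)
    )
    refined = l2_normalize(refined)

    # Step 4: re-score the same candidates with the refined embeddings.
    new_scores = refined @ query
    order = np.argsort(-new_scores)
    return candidates[order], new_scores[order]
```

Because only the candidate embeddings are refined and the input embeddings are never updated, the sketch mirrors the frozen-backbone setting: the re-ranker reorders the initial top results but cannot introduce images that the first-stage ranking missed.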
We instantiate the framework using three representative graph neural network layer variants (GraphSAGE, GCN, and GAT), and evaluate it on chest radiograph retrieval using the OpenI-CXR and MIMIC-CXR datasets under both within-dataset validation and cross-dataset transfer. On the smaller OpenI dataset, heterogeneous graph re-ranking yields substantial improvements, with GraphSAGE increasing Strong MRR by 47.7%, Precision@10 by 58.2%, and mAP@10 by 45.3%, alongside consistent gains in nDCG. Text-to-image retrieval benefits most, with MRR improving from 0.254 to 0.384 (50.8%). On the larger MIMIC-CXR dataset, gains are more moderate but consistent: GAT improves Strong Precision@10 by 8.5% and mAP@20 by 4.9%, while GraphSAGE enhances weak retrieval performance and normal CXR screening accuracy by up to 3.1%. Cross-dataset experiments further show that heterogeneous graph re-ranking improves robustness relative to embedding-only retrieval, with attention-based models providing the most stable transfer performance.
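For readers less familiar with the ranking metrics quoted above, the following sketch shows the standard definitions of MRR and Precision@k, plus the relative-gain arithmetic behind the percentage figures. The function names are illustrative, and the Strong and weak relevance criteria used in the study are not specified in this abstract, so relevance is passed in here simply as a set of item identifiers.

```python
def reciprocal_rank(ranked_ids, relevant):
    """1 / rank of the first relevant item, or 0.0 if none is retrieved."""
    for rank, item in enumerate(ranked_ids, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0

def mean_reciprocal_rank(rankings, relevants):
    """Average reciprocal rank over all queries."""
    total = sum(reciprocal_rank(r, rel) for r, rel in zip(rankings, relevants))
    return total / len(rankings)

def precision_at_k(ranked_ids, relevant, k):
    """Fraction of the top-k results that are relevant."""
    return sum(1 for item in ranked_ids[:k] if item in relevant) / k

def relative_gain(before, after):
    """Relative improvement in percent."""
    return 100.0 * (after - before) / before
```

Applied to the rounded text-to-image values, `relative_gain(0.254, 0.384)` gives about 51.2%. The reported 50.8% was presumably computed from the unrounded scores.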
Overall, these results demonstrate that heterogeneous graph re-ranking is an effective and practical extension to CLIP-based medical cross-modal retrieval, improving ranking quality, clinically relevant screening performance, and generalization without modifying the underlying vision–language encoder.