NLM DIR Seminar Schedule
UPCOMING SEMINARS
March 16, 2026 - Janani Ravi, PhD
A bug’s life: a data integration view of microbial genotypes, phenotypes, and diseases

March 17, 2026 - Roman Kogay
Diversification vs. Streamlining: Selection Landscapes of Prokaryotic Genome Evolution

March 24, 2026 - Myeongsang Lee
TBD

March 31, 2026 - Yoshitaka Inoue
TBD

April 7, 2026 - Henrry Secaira Morocho
TBD
RECENT SEMINARS
March 10, 2026 - Zhizheng Wang
Large Language Models for Gene Set Analysis

March 5, 2026 - Hasan Balci
From Sketch to SBGN: An AI-Assisted and Interactive Workflow for Generating Pathway Maps

March 3, 2026 - Gianlucca Goncalves Nicastro
Systematic identification of Salmonella T6SS effectors uncovers a lipid-targeting family

Feb. 24, 2026 - Ajith Viswanathan Asari Pankajam
Systematic Evaluation of Gene Markers in Single-Cell Tissue Atlases

Feb. 19, 2026 - Jean Thierry-Mieg
On Magic2, an innovative hardware-friendly RNA-seq analyzer
Scheduled Seminars on Feb. 17, 2026
In-person: Building 38A/B2N14 NCBI Library or Meeting Link
Contact NLMDIRSeminarScheduling@mail.nih.gov with questions about this seminar.
Abstract:
Cross-modal retrieval of medical radiographs is a critical component of clinical decision support, cohort discovery, and large-scale data reuse. While CLIP-based vision–language models enable effective zero-shot retrieval, ranking based solely on embedding similarity does not explicitly capture higher-order relationships among images, reports, and clinical semantics. We propose a heterogeneous graph re-ranking framework that augments CLIP-based retrieval with structured relational reasoning while keeping the backbone representation model frozen. Starting from an initial CLIP ranking, the method constructs a heterogeneous k-nearest-neighbor graph over image and report embeddings and applies relation-aware message passing to refine candidate rankings.
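The core idea above can be illustrated with a minimal NumPy sketch: build a k-nearest-neighbor graph over the pooled image and report embeddings, smooth each node with a mean aggregation over its neighbors, and re-score text-to-image similarity on the refined embeddings. This is only a toy stand-in, not the authors' implementation: the actual framework uses trained, relation-aware GNN layers (GraphSAGE, GCN, GAT) on a heterogeneous graph, whereas the untrained aggregation step and all function names below are hypothetical.

```python
import numpy as np

def knn_graph(emb, k):
    """Neighbor indices of a directed k-NN graph under cosine similarity."""
    sim = emb @ emb.T
    np.fill_diagonal(sim, -np.inf)          # no self-edges
    return np.argsort(-sim, axis=1)[:, :k]  # shape (n, k)

def mean_aggregate(emb, nbrs, alpha=0.5):
    """One GraphSAGE-style mean-aggregation step (untrained sketch)."""
    agg = emb[nbrs].mean(axis=1)            # average of each node's neighbors
    out = (1 - alpha) * emb + alpha * agg   # mix self and neighborhood signal
    return out / np.linalg.norm(out, axis=1, keepdims=True)

def rerank(img_emb, txt_emb, k=3, alpha=0.5):
    """Refine both modalities on a joint k-NN graph, then re-score."""
    joint = np.vstack([img_emb, txt_emb])
    joint = joint / np.linalg.norm(joint, axis=1, keepdims=True)
    smooth = mean_aggregate(joint, knn_graph(joint, k), alpha)
    n = len(img_emb)
    return smooth[n:] @ smooth[:n].T        # text-to-image score matrix

# Toy stand-ins for frozen CLIP embeddings: 4 images, 4 paired reports
rng = np.random.default_rng(0)
img = rng.normal(size=(4, 8))
txt = img + 0.1 * rng.normal(size=(4, 8))
scores = rerank(img, txt)
print(scores.shape)  # (4, 4)
```

Note that the backbone embeddings are never updated here; only the ranking scores change, which mirrors the paper's frozen-encoder design.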
We instantiate the framework using three representative graph neural network layer variants (GraphSAGE, GCN, and GAT), and evaluate it on chest radiograph retrieval using the OpenI-CXR and MIMIC-CXR datasets under both within-dataset validation and cross-dataset transfer. On the smaller OpenI dataset, heterogeneous graph re-ranking yields substantial improvements, with GraphSAGE increasing Strong MRR by 47.7%, Precision@10 by 58.2%, and mAP@10 by 45.3%, alongside consistent gains in nDCG. Text-to-image retrieval benefits most, with MRR improving from 0.254 to 0.384 (50.8%). On the larger MIMIC-CXR dataset, gains are more moderate but consistent: GAT improves Strong Precision@10 by 8.5% and mAP@20 by 4.9%, while GraphSAGE enhances weak retrieval performance and normal CXR screening accuracy by up to 3.1%. Cross-dataset experiments further show that heterogeneous graph re-ranking improves robustness relative to embedding-only retrieval, with attention-based models providing the most stable transfer performance.
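For readers unfamiliar with the reported metrics, the two most cited ones can be sketched from their standard definitions (these are the textbook formulas, not the authors' evaluation code; the ranked lists and relevance sets below are toy data):

```python
import numpy as np

def mrr(ranked, relevant):
    """Mean reciprocal rank: 1/position of the first relevant hit per query."""
    rr = []
    for r, rel in zip(ranked, relevant):
        hits = [i for i, doc in enumerate(r, start=1) if doc in rel]
        rr.append(1.0 / hits[0] if hits else 0.0)
    return float(np.mean(rr))

def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k results that are relevant, averaged over queries."""
    return float(np.mean([len(set(r[:k]) & rel) / k
                          for r, rel in zip(ranked, relevant)]))

# Two toy queries: ranked document ids and their relevant-id sets
ranked = [[3, 1, 2, 0], [0, 2, 1, 3]]
relevant = [{1}, {2, 3}]
print(mrr(ranked, relevant))                # (1/2 + 1/2) / 2 = 0.5
print(precision_at_k(ranked, relevant, 2))  # (1/2 + 1/2) / 2 = 0.5
```

Under a "strong" match criterion (as in Strong MRR above), the relevant set for each query would typically be restricted to its exact paired report or image.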
Overall, these results demonstrate that heterogeneous graph re-ranking is an effective and practical extension to CLIP-based medical cross-modal retrieval, improving ranking quality, clinically relevant screening performance, and generalization without modifying the underlying vision–language encoder.