NLM DIR Seminar Schedule

UPCOMING SEMINARS

Sept. 9, 2025 Chih-Hsuan Wei
No Data Left Behind: FAIR-SMart Enables FAIR Access to Supplementary Materials for Research Transparency
Sept. 16, 2025 James Leaman Jr.
TBD
Sept. 23, 2025 Martha Nelson
TBD
Sept. 30, 2025 Erez Persi
TBD
Oct. 7, 2025 Liana Yeganova
TBD

RECENT SEMINARS

July 15, 2025 Noam Rotenberg
Cell phenotypes in the biomedical literature: a systematic analysis and the NLM CellLink text mining corpus
July 3, 2025 Matthew Diller
Using Ontologies to Make Knowledge Computable
July 1, 2025 Yoshitaka Inoue
Graph-Aware Interpretable Drug Response Prediction and LLM-Driven Multi-Agent Drug-Target Interaction Prediction
June 10, 2025 Aleksandra Foerster
Interactions at pre-bonding distances and bond formation for open p-shell atoms: a step toward biomolecular interaction modeling using electrostatics
June 3, 2025 MG Hirsch
Interactions among subclones and immunity controls melanoma progression

Scheduled Seminars on Jan. 16, 2025

Speaker

Qingqing Zhu

PI/Lab

Time

11 a.m.

Presentation Title

GPTRadScore and CT-Bench: Advancing Multimodal AI Evaluation and Benchmarking in CT Imaging

Location

Hybrid
In-person: Building 38A/B2N14 NCBI Library or Meeting Link

Contact NLMDIRSeminarScheduling@mail.nih.gov with questions about this seminar.

Abstract:

We introduce GPTRadScore, a groundbreaking evaluation framework for assessing multimodal large language models (LLMs) in CT imaging. Using GPT-4, GPTRadScore measures model performance in tasks like lesion localization, body part identification, and lesion typing. It outperforms traditional metrics such as BLEU and ROUGE, aligning closely with expert clinician assessments. Fine-tuning with specialized datasets significantly boosts performance, as demonstrated by RadFM’s notable improvements in accuracy.
To support the development of AI in CT imaging, we also present CT-Bench, a comprehensive dataset containing 20,335 annotated lesions from 7,795 patient studies. Accompanied by high-quality, GPT-4-enhanced textual descriptions and a visual question-answering (VQA) benchmark with 2,850 QA pairs, CT-Bench enables targeted training and evaluation of AI models for lesion description, localization, and diagnostic reasoning.
Together, GPTRadScore and CT-Bench provide powerful tools to advance multimodal AI, setting new standards for evaluation, training, and performance in CT imaging analysis.
