NLM DIR Seminar Schedule
UPCOMING SEMINARS
- Jan. 20, 2026 Anastasia Gulyaeva
TBD
- Jan. 22, 2026 Mario Flores
AI Pipeline for Characterization of the Tumor Microenvironment
- Jan. 27, 2026 Zhaohui Liang
TBD
- Jan. 29, 2026 Mehdi Bagheri Hamaneh
FastSpel: A simple peptide spectrum predictor that achieves deep learning-level performance at a fraction of the computational cost
- Feb. 3, 2026 Matthew Diller
TBD
RECENT SEMINARS
- Jan. 8, 2026 Won Gyu Kim
LitSense 2.0: AI-powered biomedical information retrieval with sentence and passage level knowledge discovery
- Dec. 16, 2025 Sarvesh Soni
ArchEHR-QA: A Dataset and Shared Task for Grounded Question Answering from Electronic Health Records
- Dec. 2, 2025 Qingqing Zhu
CT-Bench & CARE-CT: Building Reliable Multimodal AI for Lesion Analysis in Computed Tomography
- Nov. 25, 2025 Jing Wang
MIMIC-EXT-TE: Millions Clinical Temporal Event Time-Series Dataset
- Oct. 21, 2025 Yifan Yang
TBD
Scheduled Seminars on Jan. 17, 2025
In-person: Building 38A/B2N14 NCBI Library or Meeting Link
Contact NLMDIRSeminarScheduling@mail.nih.gov with questions about this seminar.
Abstract:
Large language models (LLMs) pretrained on massive data have shown their power as foundation models for a wide range of tasks in natural language understanding and beyond. This inspired us to develop large cellular models (LCMs) to decipher the transcriptomic language of cells. Toward this goal, we have developed LCMs for single-cell transcriptomics using two approaches, which produced two large models: scFoundation and scMulan. Pretrained on tens of millions of human scRNA-seq profiles covering almost all known cell types and states, the models have shown the ability to capture complex contextual relations among gene expressions and meta attributes of cells. Experiments showed that the pretrained models can achieve state-of-the-art performance, either zero-shot or with light fine-tuning, on a diverse array of single-cell analysis tasks such as data enhancement, drug-response prediction at the tissue and single-cell levels, single-cell perturbation prediction, cell type annotation, gene module inference, and conditional cell generation.