NLM DIR Seminar Schedule
UPCOMING SEMINARS
RECENT SEMINARS
- May 2, 2025 Pascal Mutz
  Characterization of covalently closed circular RNAs detected in (meta)transcriptomic data
- May 2, 2025 Dr. Lang Wu
  Integration of multi-omics data in epidemiologic research
- April 22, 2025 Stanley Liang, PhD
  Large Vision Model for medical knowledge adaptation
- April 18, 2025 Valentina Boeva, Department of Computer Science, ETH Zurich
  Decoding tumor heterogeneity: computational methods for scRNA-seq and spatial omics
- April 8, 2025 Jaya Srivastava
  Leveraging a deep learning model to assess the impact of regulatory variants on traits and diseases
Scheduled Seminars on Feb. 25, 2025
In-person: Building 38A/B2N14 NCBI Library or Meeting Link
Contact NLMDIRSeminarScheduling@mail.nih.gov with questions about this seminar.
Abstract:
Gene set analysis allows researchers to explore groups of genes that likely act together in specific biological processes or molecular functions. Recent work in gene set analysis has shown promising performance using large language models (LLMs). Nonetheless, the results are subject to limitations common in LLMs, such as hallucinations. In response, we develop GeneAgent, the first language agent for gene set analysis that self-verifies by autonomously interacting with biological databases, reducing hallucinations and enhancing accuracy. GeneAgent generates novel function names or aligns with notable enriched terms for input gene sets. Benchmarked on 1,106 gene sets from different sources, GeneAgent consistently outperforms vanilla GPT-4 by a significant margin. A detailed manual review confirms the effectiveness of the self-verification module in minimizing hallucinations and generating a more reliable explanatory analysis. We also apply GeneAgent to seven novel gene sets derived from mouse B2905 melanoma cell lines, with expert evaluations showing that GeneAgent offers novel insights into gene functions and expedites knowledge discovery.
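For readers unfamiliar with the idea of self-verification, the following is a minimal illustrative sketch, not GeneAgent's actual implementation. It assumes a toy in-memory annotation table in place of real biological databases and a hard-coded list of candidate function names in place of LLM output. All names (`TOY_ANNOTATIONS`, `verify_claim`, `self_verify`, `min_fraction`) are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical stand-in for a curated biological database:
# maps each gene symbol to the set of terms it is annotated with.
TOY_ANNOTATIONS = {
    "BRCA1": {"DNA repair", "cell cycle"},
    "RAD51": {"DNA repair"},
    "ATM": {"DNA repair", "cell cycle"},
    "TP53": {"apoptosis", "cell cycle"},
}


@dataclass
class Verdict:
    claim: str               # proposed function name for the gene set
    supported: bool          # True if enough genes carry this annotation
    supporting_genes: list   # genes whose annotations back the claim


def verify_claim(claim, genes, annotations, min_fraction=0.5):
    """Check a proposed function name against the annotation table.

    The claim counts as supported when at least `min_fraction` of the
    genes are annotated with it (an assumed, illustrative criterion).
    """
    if not genes:
        return Verdict(claim, False, [])
    supporting = sorted(g for g in genes if claim in annotations.get(g, set()))
    supported = len(supporting) / len(genes) >= min_fraction
    return Verdict(claim, supported, supporting)


def self_verify(candidate_claims, genes, annotations):
    """Return the verdict for the first supported candidate, else None.

    In a real agent the candidates would come from a language model;
    here they are passed in directly to keep the sketch self-contained.
    """
    for claim in candidate_claims:
        verdict = verify_claim(claim, genes, annotations)
        if verdict.supported:
            return verdict
    return None


if __name__ == "__main__":
    result = self_verify(
        ["apoptosis", "DNA repair"],
        ["BRCA1", "RAD51", "ATM"],
        TOY_ANNOTATIONS,
    )
    print(result)
```

In this sketch the unsupported candidate ("apoptosis") is rejected because none of the input genes carry that annotation, and the next candidate is accepted only after the lookup confirms it. This mirrors, in a very simplified form, the idea of checking generated claims against external evidence before reporting them.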