NLM DIR Seminar Schedule
UPCOMING SEMINARS
- April 8, 2025 Jaya Srivastava
  Leveraging a deep learning model to assess the impact of regulatory variants on traits and diseases
- April 15, 2025 Pascal Mutz
  TBD
- April 18, 2025 Valentina Boeva, Department of Computer Science, ETH Zurich
  Decoding tumor heterogeneity: computational methods for scRNA-seq and spatial omics
- April 22, 2025 Stanley Liang
  TBD
- April 29, 2025 MG Hirsch
  TBD
RECENT SEMINARS
- April 1, 2025 Roman Kogay
  Horizontal transfer of bacterial operons into eukaryote genomes
- March 25, 2025 Yifan Yang
  Adversarial Manipulation and Data Memorization in Large Language Models for Medicine
- March 11, 2025 Sofya Garushyants
  Tmn – bacterial anti-phage defense system
- March 4, 2025 Sanasar Babajanyan
  Evolution of antivirus defense in prokaryotes depending on the environmental virus load
- Feb. 25, 2025 Zhizheng Wang
  GeneAgent: Self-verification Language Agent for Gene Set Analysis using Domain Databases
Scheduled Seminars on Jan. 20, 2022
Contact NLMDIRSeminarScheduling@mail.nih.gov with questions about this seminar.
Abstract:
Since a genome is essentially a document written in the alphabet of nucleotides, the field of Computational Biology has been informed by Natural Language Processing techniques since its inception. In this talk I will describe how "MinHash", a relatively obscure algorithm developed for searching the web, has been transformative for the task of genomic similarity estimation. I will go into how and why the algorithm works for sequences of nucleotides and amino acids rather than natural language documents, and I will discuss the creation and validation of tools employing the algorithm, variations for different kinds of searches, and the range of applications it can help with.
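For readers unfamiliar with the idea, the core of MinHash-based similarity estimation can be illustrated with a minimal sketch. This is not the speaker's tool or implementation; it is a generic "bottom-s" variant, and the function names, the k-mer length, the sketch size, and the choice of BLAKE2b as the hash are all assumptions made for illustration. Each sequence is broken into overlapping k-mers, each k-mer is hashed, and only the smallest hash values are kept as a compact sketch. The Jaccard similarity of two k-mer sets can then be estimated from the sketches alone.

```python
import hashlib


def kmers(seq, k):
    """Return the set of all length-k substrings (k-mers) of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hash_kmer(kmer):
    """Map a k-mer to a 64-bit integer with a stable hash.

    Python's built-in hash() is salted per process, so a hashlib
    function is used here to keep sketches reproducible.
    """
    digest = hashlib.blake2b(kmer.encode("ascii"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def sketch(seq, k=5, size=64):
    """Bottom-s sketch: the `size` smallest distinct k-mer hash values."""
    return sorted({hash_kmer(km) for km in kmers(seq, k)})[:size]


def estimate_jaccard(sketch_a, sketch_b, size=64):
    """Estimate the Jaccard similarity of two k-mer sets from their sketches.

    Take the `size` smallest hashes of the merged sketches and count
    how many of them occur in both input sketches.
    """
    merged = sorted(set(sketch_a) | set(sketch_b))[:size]
    shared = set(sketch_a) & set(sketch_b)
    hits = sum(1 for h in merged if h in shared)
    return hits / len(merged) if merged else 0.0


def exact_jaccard(a, b, k=5):
    """Exact Jaccard similarity of the two k-mer sets, for comparison."""
    ka, kb = kmers(a, k), kmers(b, k)
    return len(ka & kb) / len(ka | kb) if ka | kb else 0.0
```

Calling `estimate_jaccard(sketch(a), sketch(b))` on two nucleotide strings gives an approximation of `exact_jaccard(a, b)` while comparing at most `size` hash values per sequence. In this sketch the same `k` and `size` must be used for both sequences, and a larger `size` trades memory for a tighter estimate.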