NLM DIR Seminar Schedule
UPCOMING SEMINARS
- April 1, 2025 Roman Kogay
  Horizontal transfer of bacterial operons into eukaryote genomes
- April 8, 2025 Jaya Srivastava
  TBD
- April 15, 2025 Pascal Mutz
  TBD
- April 18, 2025 Valentina Boeva, Department of Computer Science, ETH Zurich
  Decoding tumor heterogeneity: computational methods for scRNA-seq and spatial omics
- April 22, 2025 Stanley Liang
  TBD
RECENT SEMINARS
- April 1, 2025 Roman Kogay
  Horizontal transfer of bacterial operons into eukaryote genomes
- March 25, 2025 Yifan Yang
  Adversarial Manipulation and Data Memorization in Large Language Models for Medicine
- March 11, 2025 Sofya Garushyants
  Tmn – bacterial anti-phage defense system
- March 4, 2025 Sanasar Babajanyan
  Evolution of antivirus defense in prokaryotes depending on the environmental virus load
- Feb. 25, 2025 Zhizheng Wang
  GeneAgent: Self-verification Language Agent for Gene Set Analysis using Domain Databases
Scheduled Seminars on June 22, 2023
Contact NLMDIRSeminarScheduling@mail.nih.gov with questions about this seminar.
Abstract:
Gene regulation in eukaryotes mainly involves transcription factors (TFs). These proteins bind to regulatory DNA elements such as enhancers and determine the amount and timing of target gene expression. Mutations at TF binding sites (TFBSs) are associated with complex human diseases and traits. Consequently, accurately identifying TFBSs is crucial for pinpointing causal variants.
State-of-the-art computational algorithms typically use position weight matrices (PWMs) to identify TFBSs in the human genome; however, these algorithms produce too many false positives. Here, we use TREDNet, a deep learning model developed in our research group, to accurately identify TFBSs in enhancers of the HepG2 cell line. We identify two classes of TFBS-containing enhancer regions: positive active regions (PARs), where a mutation would damage the enhancer, and negative active regions (NARs), where a mutation would strengthen it. We found that NARs are more GC-enriched than PARs. Clustering analysis of the TFBSs at PARs revealed ~10 groups of binding sites. In addition, analysis of TF pair co-occurrence revealed that the forkhead box (FOX) family of TFs is prevalent in PARs.
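
For readers unfamiliar with the PWM baseline mentioned in the abstract, the following is a minimal sketch of PWM-based site scanning. It is not the speaker's method and not TREDNet. The motif `TOY_PPM`, the uniform background of 0.25, the function names, and the threshold are all hypothetical values chosen for illustration, and the input is assumed to contain only uppercase A, C, G, and T.

```python
import math

# Hypothetical 4-bp motif given as per-position base probabilities.
# These are toy values for illustration, not taken from any real TF.
TOY_PPM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]
BACKGROUND = 0.25  # assumed uniform base frequency


def to_pwm(ppm, background=BACKGROUND):
    """Convert base probabilities to log2-odds weights against the background."""
    return [{base: math.log2(p / background) for base, p in column.items()}
            for column in ppm]


def scan(sequence, pwm, threshold):
    """Slide the PWM along the sequence and report windows scoring >= threshold.

    Returns a list of (start, window, score) tuples. Only the forward strand
    is scanned in this sketch.
    """
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        # The window score is the sum of the per-position log-odds weights.
        score = sum(column[base] for column, base in zip(pwm, window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


if __name__ == "__main__":
    for start, window, score in scan("TTAGCTAA", to_pwm(TOY_PPM), threshold=3.0):
        print(start, window, round(score, 2))
```

Because each window is scored independently of its genomic context, a short motif passes the threshold by chance at many positions in a large genome, which is one common explanation for the false positives the abstract refers to.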