NLM DIR Seminar Schedule
UPCOMING SEMINARS
- April 8, 2025 Jaya Srivastava
Leveraging a deep learning model to assess the impact of regulatory variants on traits and diseases
- April 15, 2025 Pascal Mutz
TBD
- April 18, 2025 Valentina Boeva, Department of Computer Science, ETH Zurich
Decoding tumor heterogeneity: computational methods for scRNA-seq and spatial omics
- April 22, 2025 Stanley Liang
TBD
- April 29, 2025 MG Hirsch
TBD
RECENT SEMINARS
- April 1, 2025 Roman Kogay
Horizontal transfer of bacterial operons into eukaryote genomes
- March 25, 2025 Yifan Yang
Adversarial Manipulation and Data Memorization in Large Language Models for Medicine
- March 11, 2025 Sofya Garushyants
Tmn – bacterial anti-phage defense system
- March 4, 2025 Sanasar Babajanyan
Evolution of antivirus defense in prokaryotes depending on the environmental virus load
- Feb. 25, 2025 Zhizheng Wang
GeneAgent: Self-verification Language Agent for Gene Set Analysis using Domain Databases
Scheduled Seminars on March 28, 2023
Contact NLMDIRSeminarScheduling@mail.nih.gov with questions about this seminar.
Abstract:
Viruses with large double-stranded DNA genomes appear to have captured the majority of their genes from their hosts at different stages of evolution. The origin of many virus genes is readily detected through highly significant sequence similarity with cellular homologs. This is the case, in particular, for virus enzymes, such as DNA and RNA polymerases or nucleotide kinases, that retain their catalytic activity after capture by an ancestral virus. However, a large fraction of virus genes have no readily detectable cellular homologs, so their origin remains enigmatic. We sought to explore potential origins of proteins of unknown provenance encoded in the genomes of orthopoxviruses, a thoroughly studied virus genus that includes major human pathogens. To this end, we used AlphaFold2 to predict the structures of all 214 proteins encoded by orthopoxviruses. Among the proteins of unknown provenance, structure prediction yielded a clear indication of origin for 14 and also validated several inferences previously made by sequence analysis. A notable trend that emerges from these findings is the exaptation of enzymes from cellular organisms for non-enzymatic, structural roles in virus reproduction, which is accompanied by disruption of catalytic sites and by drastic overall divergence that precludes detection of homology at the sequence level. The 16 orthopoxvirus proteins found to be inactivated enzyme derivatives include the poxvirus replication processivity factor A20, an inactivated derivative of NAD-dependent DNA ligase; major core protein A3, an inactivated deubiquitinase; F11, an inactivated prolyl hydroxylase; and other similar cases. However, for nearly one-third of the orthopoxvirus virion proteins, no significantly similar structures were identified, suggesting exaptation with subsequent major structural rearrangement, yielding unique protein folds.