NLM DIR Seminar Schedule
UPCOMING SEMINARS
- May 20, 2025 Ajith Pankajam
  A roadmap from single cell to knowledge graph
- May 27, 2025 Ermin Hodzic
  TBD
- May 29, 2025 Harutyun Saakyan
  TBD
- June 3, 2025 MG Hirsch
  TBD
- June 10, 2025 Aleksandra Foerster
  TBD
RECENT SEMINARS
- May 20, 2025 Ajith Pankajam
  A roadmap from single cell to knowledge graph
- May 2, 2025 Pascal Mutz
  Characterization of covalently closed circular RNAs detected in (meta)transcriptomic data
- May 2, 2025 Dr. Lang Wu
  Integration of multi-omics data in epidemiologic research
- April 22, 2025 Stanley Liang, PhD
  Large Vision Model for medical knowledge adaptation
- April 18, 2025 Valentina Boeva, Department of Computer Science, ETH Zurich
  Decoding tumor heterogeneity: computational methods for scRNA-seq and spatial omics
Scheduled Seminars on May 20, 2025
In-person: Building 38A/B2N14 NCBI Library or Meeting Link
Contact NLMDIRSeminarScheduling@mail.nih.gov with questions about this seminar.
Abstract:
Recent advances in single-cell sequencing technologies have significantly enhanced our understanding of cell population phenotypes by enabling high-resolution profiling at the level of individual cells. In contrast to bulk sequencing approaches, technologies such as single-cell RNA sequencing (scRNA-seq), ATAC-seq, and multimodal platforms including CITE-seq and spatial transcriptomics allow precise characterization of cellular populations within complex tissues, and they have revolutionized disease research. Unlike dissociative single-cell techniques, spatial transcriptomics preserves the spatial organization of gene expression within tissues, enabling researchers to map the location and interactions of individual cells in their native microenvironments. By resolving cellular heterogeneity, single-cell analyses reveal aberrant gene expression patterns, altered chromatin accessibility, and functional changes in rare or previously uncharacterized cellular subpopulations under various pathological conditions, providing insights into the molecular mechanisms driving disease. The high-resolution nature of single-cell data has enabled the construction of comprehensive cell atlases and integrative knowledge graph (KG) databases that incorporate multidimensional genomic and proteomic information. These resources serve as critical foundations for advancing precision medicine, facilitating more accurate diagnostics and the development of targeted therapeutic strategies.
Analyzing single-cell sequencing data involves a series of complex steps, among which clustering plays a critical role, and achieving optimal clustering remains challenging. In this study, we used NS-Forest, a machine learning-based method for identifying combinatorial marker genes, to help detect sub-optimal clustering in retinal pigment epithelium (RPE) single-cell data. To generate the RPE global markers, we ran NS-Forest using the original RPE cell annotations. To identify RPE-subcluster global and local markers, we created two distinct datasets. In the first dataset, the original RPE annotations were replaced with the RPE-subcluster annotations. The second dataset included only the cells with RPE-subcluster annotations. NS-Forest was used to identify markers in both datasets. By integrating the RPE-subcluster local markers with the RPE global markers, we improved the identification of RPE-subcluster global markers. We also propose a standard way of naming newly identified cell type clusters based on their NS-Forest markers and the available Cell Ontology information. Furthermore, we standardized filtering criteria for designing gene probes for spatial transcriptomics experiments.
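The construction of the two datasets described above can be illustrated with a minimal sketch. It uses only the Python standard library and toy data; the `Cell` record, the labels `RPE_1` and `RPE_2`, and both helper functions are hypothetical names introduced for illustration, not part of NS-Forest or of the speaker's pipeline. It also assumes that RPE cells without a subcluster label keep their original annotation in the first dataset, which the abstract does not specify.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Cell:
    cell_id: str
    cell_type: str             # original annotation, e.g. "RPE"
    subcluster: Optional[str]  # hypothetical subcluster label, None if unassigned


def relabel_with_subclusters(cells: List[Cell], parent: str = "RPE") -> List[Cell]:
    """Dataset 1: keep every cell, but swap the parent label for the subcluster label."""
    relabeled = []
    for cell in cells:
        if cell.cell_type == parent and cell.subcluster is not None:
            relabeled.append(replace(cell, cell_type=cell.subcluster))
        else:
            # Assumption: cells without a subcluster label keep their annotation.
            relabeled.append(cell)
    return relabeled


def subcluster_only(cells: List[Cell], parent: str = "RPE") -> List[Cell]:
    """Dataset 2: keep only the parent-type cells that carry a subcluster label."""
    return [
        replace(cell, cell_type=cell.subcluster)
        for cell in cells
        if cell.cell_type == parent and cell.subcluster is not None
    ]


# Toy input: three RPE cells in two made-up subclusters, plus one non-RPE cell.
cells = [
    Cell("c1", "RPE", "RPE_1"),
    Cell("c2", "RPE", "RPE_2"),
    Cell("c3", "Rod", None),
    Cell("c4", "RPE", "RPE_1"),
]

ds1 = relabel_with_subclusters(cells)  # subclusters compared against all other cells
ds2 = subcluster_only(cells)           # subclusters compared only against each other
```

Under this reading, a marker-selection tool run on `ds1` would look for genes that separate each subcluster from every other cell type, while a run on `ds2` would look for genes that separate the subclusters from one another. That mapping onto "global" and "local" markers is an interpretation of the abstract rather than something it states outright.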
Using a validated computational analysis pipeline, we developed an initial version of a knowledge graph (KG) populated with single-cell transcriptomic data from lung and eye tissues. The KG is further enriched through the integration of multiple publicly available biomedical resources, including DrugBank, PubMed, ChEMBL, and Open Targets. This integrative approach has enabled the identification of biologically meaningful associations, such as the link between cystic fibrosis and a specific lung cell type, ionocytes, mediated by the CFTR marker gene. By incorporating data from diverse single-cell and multi-omics technologies such as ATAC-seq, ChIP-seq, CITE-seq, BS-seq, and others, along with genomic risk factors and therapeutic insights, the KG is beginning to reveal how specific cell types contribute to disease pathogenesis.
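To make the cystic fibrosis example concrete, here is a minimal sketch of how a disease-to-cell-type association might be recovered from a graph of subject-predicate-object triples. The predicate names (`associated_with_gene`, `marker_of`, `found_in`) and the helper functions are assumptions made for illustration; the abstract does not describe the actual KG schema or storage backend.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

# Toy triples; the predicate names are hypothetical, not the real KG schema.
triples: List[Triple] = [
    ("cystic fibrosis", "associated_with_gene", "CFTR"),
    ("CFTR", "marker_of", "ionocyte"),
    ("ionocyte", "found_in", "lung"),
]


def build_index(triples: List[Triple]) -> Dict[str, List[Tuple[str, str]]]:
    """Index outgoing edges by subject for simple traversal."""
    index: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for subject, predicate, obj in triples:
        index[subject].append((predicate, obj))
    return index


def cell_types_for_disease(
    index: Dict[str, List[Tuple[str, str]]], disease: str
) -> List[Tuple[str, str]]:
    """Follow disease -> gene -> cell type and return (gene, cell type) pairs."""
    hits = []
    for predicate, gene in index.get(disease, []):
        if predicate != "associated_with_gene":
            continue
        for gene_predicate, cell_type in index.get(gene, []):
            if gene_predicate == "marker_of":
                hits.append((gene, cell_type))
    return hits


index = build_index(triples)
links = cell_types_for_disease(index, "cystic fibrosis")
print(links)
```

A two-hop traversal like this is the simplest form of the query. A production KG would more likely sit in a graph database and be queried with a dedicated query language, but the abstract does not say which.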