NLM DIR Seminar Schedule

UPCOMING SEMINARS

July 15, 2025 Noam Rotenberg
Cell phenotypes in the biomedical literature: a systematic analysis and the NLM CellLink text mining corpus

RECENT SEMINARS

July 15, 2025 Noam Rotenberg
Cell phenotypes in the biomedical literature: a systematic analysis and the NLM CellLink text mining corpus
July 3, 2025 Matthew Diller
Using Ontologies to Make Knowledge Computable
July 1, 2025 Yoshitaka Inoue
Graph-Aware Interpretable Drug Response Prediction and LLM-Driven Multi-Agent Drug-Target Interaction Prediction
June 10, 2025 Aleksandra Foerster
Interactions at pre-bonding distances and bond formation for open p-shell atoms: a step toward biomolecular interaction modeling using electrostatics
June 3, 2025 MG Hirsch
Interactions among subclones and immunity controls melanoma progression

Scheduled Seminars on Nov. 1, 2022

Speaker: Kush Attal

PI/Lab:

Time: 11 a.m.

Presentation Title: Presenting a Dataset for Plain Language Adaptation of Scientific Text

Location: Virtual - see link below

Contact NLMDIRSeminarScheduling@mail.nih.gov with questions about this seminar.

Abstract:

Although a growing amount of health-related literature has been made available to a large audience online, the language of scientific articles can be difficult for the general public to comprehend. Thus, this expert-level language needs to be simplified and adapted into plain language versions so that the public can reliably understand the vast health-related literature. Machine learning and deep learning algorithms for automatic adaptation are a possible solution; however, gold standard datasets are needed to properly evaluate their performance. Current datasets consist of either pairs of comparable professional- and general public-facing documents or pairs of semantically similar sentences mined from such documents. This creates a trade-off between imperfect alignments and small test sets. To address this issue, we created the Plain Language Adaptation of Biomedical Abstracts dataset. This dataset is the first manually adapted dataset that is both document- and sentence-aligned. It contains 750 adapted abstracts, totaling 7,643 sentence pairs. Along with describing the dataset, we benchmark state-of-the-art deep learning approaches on the dataset, setting baselines for future research.
