MS2: Multi-Document Summarization of Medical Studies

From arXiv; 8 pages of content, 20 pages including references and appendix. Code: https://github.com/allenai/ms2/. Data (1.8G, zipped): https://ai2-s2-ms2.s3-us-west-2.amazonaws.com/ms_data_2021-04-12.zip. Published in EMNLP 2021: https://aclanthology.org/2021.emnlp-main.594/

To assess the effectiveness of any medical intervention, researchers must conduct a time-intensive and highly manual literature review. NLP systems can help to automate or assist in parts of this expensive process. In support of this goal, we release MS^2 (Multi-Document Summarization of Medical Studies), a dataset of over 470k documents and 20k summaries derived from the scientific literature. This dataset facilitates the development of systems that can assess and aggregate contradictory evidence across multiple studies, and is the first large-scale, publicly available multi-document summarization dataset in the biomedical domain. We experiment with a summarization system based on BART, with promising early results. We formulate our summarization inputs and targets in both free text and structured forms and modify a recently proposed metric to assess the quality of our system's generated summaries. Data and models are available at https://github.com/allenai/ms2

