Multi-document scientific summarization, which extracts and organizes important information from a large collection of papers, has recently attracted widespread attention. However, existing work focuses on producing lengthy overviews that lack a clear, logical hierarchy. To alleviate this problem, we present an atomic and challenging task named Hierarchical Catalogue Generation for Literature Review (HiCatGLR), which aims to generate a hierarchical catalogue for a review paper given a set of reference papers. We carefully construct a novel English Hierarchical Catalogues of Literature Reviews Dataset (HiCaD) with 13.8k literature review catalogues and 120k reference papers, on which we benchmark both end-to-end and pipeline methods. To assess model performance accurately, we design evaluation metrics that measure similarity to the ground truth in terms of both semantics and structure. Our extensive analyses verify the high quality of the dataset and the effectiveness of the evaluation metrics. Finally, we discuss potential directions for this task to motivate future research.