Open peer review is a growing trend in academic publishing. Public access to peer review data can benefit both the academic and publishing communities. It also supports studies on review comment generation and, further, the realization of automated scholarly paper review. However, most existing peer review datasets do not cover the whole peer review process. In addition, their data are not diverse enough, as they are collected mainly from the field of computer science. These two drawbacks need to be addressed to unlock more opportunities for related studies. To this end, we construct MOPRD, a multidisciplinary open peer review dataset. The dataset consists of paper metadata, multiple manuscript versions, review comments, meta-reviews, authors' rebuttal letters, and editorial decisions. Moreover, we design a modular guided review comment generation method based on MOPRD. Experiments show that our method delivers better performance, as indicated by both automatic metrics and human evaluation. We also explore other potential applications of MOPRD, including meta-review generation, editorial decision prediction, author rebuttal generation, and scientometric analysis. MOPRD provides strong support for further studies in peer review-related research and other applications.