Peer review is a key component of the publishing process in most fields of science. Increasing submission rates strain reviewing quality and efficiency, motivating the development of applications that support reviewing and editorial work. While existing NLP studies focus on the analysis of individual texts, editorial assistance often requires modeling interactions between pairs of texts, yet general frameworks and datasets to support this scenario are missing. Relationships between texts are the core object of intertextuality theory, a family of approaches in literary studies not yet operationalized in NLP. Inspired by prior theoretical work, we propose the first intertextual model of text-based collaboration, which encompasses three major phenomena that make up a full iteration of the review-revise-and-resubmit cycle: pragmatic tagging, linking, and long-document version alignment. While peer review is used across fields of science and publication formats, existing datasets focus solely on conference-style review in computer science. To address this, we instantiate our proposed model in the first annotated multi-domain corpus of journal-style post-publication open peer review, and provide detailed insights into the practical aspects of intertextual annotation. Our resource is a major step towards multi-domain, fine-grained applications of NLP in editorial support for peer review, and our intertextual framework paves the way for general-purpose modeling of text-based collaboration.