Document editing has become a pervasive part of how information is produced, and version control systems allow edits to be stored and applied efficiently. Motivated by this, the task of learning distributed representations of edits was recently proposed. We propose a novel approach that uses variational inference to learn a continuous latent space of vector representations capturing the semantic information that underlies the document editing process. To do so, we introduce a latent variable that explicitly models this semantic information. The latent variable is then combined with a representation of the document to guide the generation of its edited version. In addition, because the evaluation of edit representations has so far relied heavily on direct human input, we propose PEER, a suite of downstream tasks designed to measure the quality of edit representations in natural language processing and to enable standardized automatic evaluation.
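The abstract does not state the training objective, so the following is only a sketch of the kind of bound that a conditional latent-variable model trained with variational inference typically optimizes. All notation is an assumption introduced for illustration: $x_-$ is the original document, $x_+$ its edited version, $z$ the latent edit variable, $q_\phi$ the approximate posterior, $p_\theta$ the decoder, and $p(z)$ a prior assumed independent of $x_-$ (for instance a standard Gaussian).

```latex
\log p_\theta(x_+ \mid x_-)
  \;\ge\;
  \mathbb{E}_{q_\phi(z \mid x_-, x_+)}
    \bigl[ \log p_\theta(x_+ \mid x_-, z) \bigr]
  \;-\;
  \mathrm{KL}\bigl( q_\phi(z \mid x_-, x_+) \,\|\, p(z) \bigr)
```

Under these assumptions, the first term rewards reconstructing the edited document from the original document and the latent edit, while the KL term keeps the latent space close to the prior. The mean of $q_\phi$ could then serve as the edit representation, although the exact formulation used by the authors may differ.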