Neural text generation models are typically trained by maximizing log-likelihood with the sequence cross entropy (CE) loss, which encourages an exact token-by-token match between the target sequence and the generated sequence. Such a training objective is sub-optimal when the target sequence is not perfect, e.g., when the target sequence is corrupted with noise, or when only weak sequence supervision is available. To address this challenge, we propose a novel Edit-Invariant Sequence Loss (EISL), which computes the matching loss of each target n-gram with all n-grams in the generated sequence. EISL is designed to be robust to various noises and edits in the target sequences. Moreover, the EISL computation is essentially an approximate convolution operation with target n-grams as kernels, which is easy to implement and efficient to compute with existing libraries. To demonstrate the effectiveness of EISL, we conduct experiments on a wide range of tasks, including machine translation with noisy target sequences, unsupervised text style transfer with only weak training signals, and non-autoregressive generation with non-predefined generation order. Experimental results show that our method significantly outperforms the common CE loss and other strong baselines on all the tasks. EISL has a simple API that can be used as a drop-in replacement for the CE loss: https://github.com/guangyliu/EISL.
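To make the n-gram matching idea concrete, below is a minimal PyTorch sketch of an edit-invariant n-gram loss. It is an illustrative assumption rather than the authors' released implementation: it scores every target n-gram against every window of the generated sequence using tensor unfolding (instead of the convolution formulation described above), and aggregates with a logsumexp over generated positions; the function name `ngram_match_loss` and these aggregation choices are hypothetical.

```python
import torch
import torch.nn.functional as F


def ngram_match_loss(log_probs, target, n=4):
    """Illustrative edit-invariant n-gram matching loss (sketch, not the official EISL API).

    log_probs: (batch, T_gen, vocab) log-softmax outputs of the generator
    target:    (batch, T_tgt)        reference token ids
    """
    batch, T_gen, _ = log_probs.shape
    T_tgt = target.size(1)

    # Log-probability of every target token at every generated position: (batch, T_gen, T_tgt)
    tok_lp = log_probs.gather(2, target.unsqueeze(1).expand(batch, T_gen, T_tgt))

    # Slide a length-n window over generated positions, then over target positions:
    # result is (batch, T_gen-n+1, T_tgt-n+1, n, n), where the last two dims index
    # the offset inside the generated window and inside the target n-gram.
    windows = tok_lp.unfold(1, n, 1).unfold(2, n, 1)

    # The aligned match of a target n-gram against a generated window is the sum of
    # the diagonal entries (k-th target token scored at the k-th window position).
    ngram_lp = torch.diagonal(windows, dim1=-2, dim2=-1).sum(-1)  # (batch, T_gen-n+1, T_tgt-n+1)

    # Soft-max over generated positions: each target n-gram is rewarded wherever it
    # appears in the generated sequence, which is what makes the loss edit-invariant.
    matched = torch.logsumexp(ngram_lp, dim=1)  # (batch, T_tgt-n+1)

    return -matched.mean()


# Toy usage with random logits (shapes and vocabulary size are arbitrary).
logits = torch.randn(2, 12, 100, requires_grad=True)
loss = ngram_match_loss(F.log_softmax(logits, dim=-1), torch.randint(0, 100, (2, 10)), n=4)
loss.backward()
```

In this sketch the logsumexp plays the role of a differentiable "best matching position", so a target n-gram contributes a low loss no matter where it occurs in the generated sequence; the paper's convolution view computes the same per-position n-gram scores with the target n-grams acting as kernels.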