Neural text generation models are typically trained by maximizing log-likelihood with the sequence cross-entropy loss, which encourages an exact token-by-token match between the target sequence and the generated sequence. Such a training objective is sub-optimal when the target sequence is not perfect, e.g., when the target sequence is corrupted with noise, or when only weak sequence supervision is available. To address this challenge, we propose a novel Edit-Invariant Sequence Loss (EISL), which computes the matching loss of a target n-gram against all n-grams in the generated sequence. EISL draws inspiration from convolutional networks (ConvNets), which are shift-invariant to images, and is hence robust to shifts of n-grams, tolerating edits in the target sequences. Moreover, the computation of EISL is essentially a convolution operation with target n-grams as kernels, which is easy to implement with existing libraries. To demonstrate the effectiveness of EISL, we conduct experiments on three tasks: machine translation with noisy target sequences, unsupervised text style transfer, and non-autoregressive machine translation. Experimental results show that our method significantly outperforms the cross-entropy loss on all three tasks.
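Since the abstract describes EISL as essentially a convolution with target n-grams as kernels, the following is a minimal PyTorch sketch of that idea for a single sequence. It is not the authors' released implementation: the function name `eisl_like_loss`, the logsumexp aggregation over generation positions, and the single n-gram size, batch-free setup are simplifying assumptions made here for illustration.

```python
import torch
import torch.nn.functional as F

def eisl_like_loss(log_probs: torch.Tensor, target: torch.Tensor, n: int = 4) -> torch.Tensor:
    """Sketch of an edit-invariant n-gram matching loss for one sequence.

    log_probs: (T_gen, V) per-position log-probabilities of the generated sequence.
    target:    (T_tgt,)   target token ids, with T_tgt >= n.
    """
    V = log_probs.size(1)
    # One conv kernel per target n-gram: a one-hot (V x n) slice of the target.
    one_hot = F.one_hot(target, num_classes=V).float()        # (T_tgt, V)
    kernels = one_hot.unfold(0, n, 1).contiguous()             # (T_tgt - n + 1, V, n)
    # Slide every target n-gram over the generated sequence in a single conv:
    # scores[j, i] = sum_k log_probs[i + k, target[j + k]].
    scores = F.conv1d(log_probs.t().unsqueeze(0), kernels).squeeze(0)  # (n_grams, T_gen - n + 1)
    # Shift invariance: aggregate each target n-gram's score over all positions
    # of the generated sequence (logsumexp chosen here as one smooth option).
    per_gram = torch.logsumexp(scores, dim=-1)
    return -per_gram.mean()

# Toy usage: vocabulary of 6 tokens, 5 generated steps, 4 target tokens, bigrams.
log_probs = torch.log_softmax(torch.randn(5, 6), dim=-1)
target = torch.tensor([1, 3, 2, 5])
loss = eisl_like_loss(log_probs, target, n=2)
```

The conv formulation is what makes the loss cheap to compute: every target n-gram is matched against every window of the generation in one `conv1d` call, rather than with an explicit double loop over positions.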