Neural text generation models are often autoregressive language models or seq2seq models. These models generate text by sampling words sequentially, with each word conditioned on the previous words, and are state-of-the-art for several machine translation and summarization benchmarks. These benchmarks are often defined by validation perplexity, even though perplexity is not a direct measure of the quality of the generated text. Additionally, these models are typically trained via maximum likelihood and teacher forcing. These methods are well-suited to optimizing perplexity but can result in poor sample quality, since generating text requires conditioning on sequences of words that may never have been observed at training time. We propose to improve sample quality using Generative Adversarial Networks (GANs), which explicitly train the generator to produce high-quality samples and have shown considerable success in image generation. GANs were originally designed to output differentiable values, so discrete language generation is challenging for them. We claim that validation perplexity alone is not indicative of the quality of text generated by a model. We introduce an actor-critic conditional GAN that fills in missing text conditioned on the surrounding context. We show, qualitatively and quantitatively, evidence that this approach produces more realistic conditional and unconditional text samples compared to a maximum likelihood trained model.