Generative Adversarial Networks (GANs) have achieved tremendous success in many continuous generation tasks, especially in the field of image generation. However, for discrete outputs such as language, optimizing GANs remains an open problem with many instabilities, as no gradient can be properly back-propagated from the discriminator output to the generator parameters. An alternative is to learn the generator network via reinforcement learning, using the discriminator signal as a reward, but such a technique suffers from non-stationary rewards and vanishing gradients, and it often falls short of direct maximum-likelihood approaches. In this paper, we introduce Generative Cooperative Networks, in which the discriminator architecture is used cooperatively with the generation policy to output samples of realistic text for the task at hand. We give theoretical guarantees of convergence for our approach, and study various efficient decoding schemes that empirically achieve state-of-the-art results on two main NLG tasks.
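To make the cooperative use of the discriminator at decoding time concrete, the following is a minimal sketch, not the paper's exact algorithm: candidates are drawn from the generation policy and one is resampled in proportion to its discriminator score, so the effective sampling distribution is roughly proportional to p_theta(x) * D(x). The functions `generator_sample` and `discriminator_score` are hypothetical stand-ins for the trained generator and discriminator.

```python
import math
import random

# Hypothetical stand-ins for the trained models (toy implementations so the
# sketch runs on its own):
# - generator_sample() draws a candidate text from the generation policy p_theta
# - discriminator_score(text) returns D(text) in (0, 1), the discriminator's
#   probability that the text is real
def generator_sample():
    return "sample " + str(random.randint(0, 9))

def discriminator_score(text):
    return 1.0 / (1.0 + math.exp(-len(text) / 10.0))

def cooperative_sample(num_candidates=16):
    """Draw candidates from the generator, then resample one of them with
    probability proportional to its discriminator score, so the effective
    distribution is approximately proportional to p_theta(x) * D(x)."""
    candidates = [generator_sample() for _ in range(num_candidates)]
    weights = [discriminator_score(c) for c in candidates]
    return random.choices(candidates, weights=weights, k=1)[0]

if __name__ == "__main__":
    print(cooperative_sample())
```

Increasing `num_candidates` makes the resampled distribution a closer approximation of the reweighted target, at the cost of more generator and discriminator calls per decoded sample.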