This work relates two mysteries in neural text generation: exposure bias and text degeneration. Although exposure bias was identified long ago and numerous remedies have been proposed, to our knowledge its impact on text generation has not yet been verified. Text degeneration is a problem that the widely used pre-trained language model GPT-2 was recently found to suffer from (Holtzman et al., 2020). Motivated by the unknown cause of text degeneration, in this paper we attempt to connect these two mysteries. Specifically, we first qualitatively and quantitatively identify the mistakes made before text degeneration occurs. We then investigate the significance of these mistakes by inspecting the hidden states of GPT-2. Our results show that text degeneration is likely to be partly caused by exposure bias. We also study the self-reinforcing mechanism of text degeneration, explaining why the mistakes amplify. In sum, our study provides a more concrete foundation for further investigation of exposure bias and text degeneration.