Recent advances in deep neural language models, combined with the availability of large-scale datasets, have accelerated the development of natural language generation systems that produce fluent and coherent texts (to varying degrees of success) across a multitude of tasks and application contexts. However, controlling the output of these models to meet user- and task-specific needs remains an open challenge. Such control is crucial not only for customizing the content and style of the generated language, but also for the safe and reliable deployment of these models in the real world. We present an extensive survey on the emerging topic of constrained neural language generation, in which we formally define and categorize the problems of natural language generation by distinguishing between conditions and constraints (the latter being testable conditions on the output text rather than on the input), present constrained text generation tasks, and review existing methods and evaluation metrics for constrained text generation. Our aim is to highlight recent progress and trends in this emerging field, informing on the most promising directions and limitations for advancing the state of the art in constrained neural language generation research.