This paper surveys the current state of the art in Natural Language Generation (NLG), defined as the task of generating text or speech from non-linguistic input. A survey of NLG is timely in view of the changes that the field has undergone over the past decade or so, especially in relation to new (usually data-driven) methods, as well as new applications of NLG technology. This survey therefore aims to (a) give an up-to-date synthesis of research on the core tasks in NLG and the architectures in which such tasks are organised; (b) highlight a number of relatively recent research topics that have arisen partly as a result of growing synergies between NLG and other areas of artificial intelligence; and (c) draw attention to the challenges in NLG evaluation, relating them to similar challenges faced in other areas of Natural Language Processing, with an emphasis on different evaluation methods and the relationships between them.