Recent successes in deep generative modeling have led to significant advances in natural language generation (NLG). Incorporating entities into neural generation models has yielded marked improvements by helping to infer the summary topic and to generate coherent content. To strengthen the role of entities in NLG, in this paper we model entity types in the decoding phase so that contextual words are generated accurately. We develop a novel NLG model that produces a target sequence (i.e., a news article) from a given list of entities. Generation quality depends significantly on whether the input entities are logically connected and expressed in the output. Our model uses a multi-step decoder that injects entity types into the process of entity mention generation: it first predicts whether the next token is a contextual word or an entity and, if an entity, then predicts the entity mention. This effectively embeds the entity's meaning into the hidden states, making the generated words precise. Experiments on two public datasets demonstrate that type injection outperforms baselines that merely concatenate type embeddings.
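To make the two-step decoding concrete, the PyTorch sketch below shows one possible realization of a single decoding step: a switch decides between a contextual word and an entity, and the entity branch fuses a type embedding into the decoder state before scoring mentions. All module names, dimensions, and the fusion scheme (concatenation followed by a linear projection) are our own assumptions for illustration, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class TypeInjectedDecoderStep(nn.Module):
    """Hypothetical sketch of the multi-step decoding described above:
    (1) predict whether the next token is a contextual word or an entity;
    (2) if an entity, inject its type embedding into the hidden state
        before predicting the entity mention."""

    def __init__(self, hidden_size, vocab_size, num_entity_types, num_mentions):
        super().__init__()
        self.switch = nn.Linear(hidden_size, 2)              # contextual word vs. entity
        self.word_head = nn.Linear(hidden_size, vocab_size)  # contextual-word distribution
        self.type_emb = nn.Embedding(num_entity_types, hidden_size)
        self.inject = nn.Linear(2 * hidden_size, hidden_size)  # fuse type into the state
        self.mention_head = nn.Linear(hidden_size, num_mentions)

    def forward(self, h, entity_type_ids):
        # Step 1: probability that the next token is an entity vs. a word.
        switch_probs = torch.softmax(self.switch(h), dim=-1)
        # Word branch: score the vocabulary directly from the decoder state.
        word_logits = self.word_head(h)
        # Step 2 (entity branch): inject the type embedding into the state,
        # then score candidate entity mentions from the type-aware state.
        t = self.type_emb(entity_type_ids)
        h_typed = torch.tanh(self.inject(torch.cat([h, t], dim=-1)))
        mention_logits = self.mention_head(h_typed)
        return switch_probs, word_logits, mention_logits
```

Under these assumptions, the type embedding reshapes the hidden state only on the entity branch, which is one way to interpret "injecting" types into mention generation rather than concatenating a static type embedding to every input.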