We present AGGGEN (pronounced 'again'), a data-to-text model which re-introduces two explicit sentence planning stages into neural data-to-text systems: input ordering and input aggregation. In contrast to previous work using sentence planning, our model is still end-to-end: AGGGEN performs sentence planning at the same time as generating text by learning latent alignments (via semantic facts) between input representation and target text. Experiments on the WebNLG and E2E challenge data show that by using fact-based alignments our approach is more interpretable, expressive, robust to noise, and easier to control, while retaining the advantages of end-to-end systems in terms of fluency. Our code is available at https://github.com/XinnuoXu/AggGen.
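To make the two planning stages concrete, the following toy sketch shows what an ordering decision and an aggregation decision look like when applied to a set of input facts. It is only an illustration under stated assumptions, not the AGGGEN model: AGGGEN learns these decisions as latent alignments, whereas here the plan is supplied by hand. The function name `apply_plan`, the `Fact` triple type, the parameters `order` and `group_sizes`, and the example restaurant facts are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical fact representation: (subject, predicate, object).
Fact = Tuple[str, str, str]


def apply_plan(
    facts: List[Fact], order: List[int], group_sizes: List[int]
) -> List[List[Fact]]:
    """Apply a hand-written sentence plan to a list of facts.

    order:       a permutation of fact indices (input ordering).
    group_sizes: how many consecutive ordered facts go into each
                 sentence (input aggregation).
    Returns one list of facts per planned sentence.
    """
    if sorted(order) != list(range(len(facts))):
        raise ValueError("order must be a permutation of the fact indices")
    if any(size <= 0 for size in group_sizes) or sum(group_sizes) != len(facts):
        raise ValueError("group sizes must be positive and cover every fact once")

    # Stage 1: input ordering.
    ordered = [facts[i] for i in order]

    # Stage 2: input aggregation into sentence-level groups.
    groups: List[List[Fact]] = []
    start = 0
    for size in group_sizes:
        groups.append(ordered[start:start + size])
        start += size
    return groups


# Illustrative input (invented for this sketch, not taken from the datasets).
facts = [
    ("The Mill", "eatType", "pub"),
    ("The Mill", "food", "Italian"),
    ("The Mill", "area", "riverside"),
]

# Plan: mention the area first in its own sentence, then merge the other two.
plan = apply_plan(facts, order=[2, 0, 1], group_sizes=[1, 2])
for sentence_facts in plan:
    print(sentence_facts)
```

Because the plan is an explicit object, changing `order` or `group_sizes` changes which facts appear in which sentence, which is the kind of control and interpretability the abstract attributes to fact-based alignments.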