Data-to-text (D2T) generation is the task of generating text from structured inputs. We observed that when the same target sentence is repeated twice, a Transformer (T5)-based model generates an output composed of asymmetric sentences from the structured inputs; that is, the two sentences differ in length and quality. We call this phenomenon "Asymmetric Generation" and exploit it for D2T generation. Once asymmetric sentences are generated, we concatenate the first part of the output with a non-repeated target. As this passes through progressive edit (ProEdit), recall increases, so the method covers the structured inputs better than before editing. ProEdit is a simple but effective way to improve performance in D2T generation, and it achieves a new state-of-the-art result on the ToTTo dataset.
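To make the procedure concrete, the following is a minimal inference-time sketch of the progressive-edit loop, assuming a T5 checkpoint already fine-tuned for ToTTo-style data-to-text generation and loaded via Hugging Face transformers. The checkpoint name, the input linearization, the draft-plus-input concatenation format, and the sentence-splitting heuristic are illustrative assumptions rather than the authors' released code, and the training-side step of repeating the target sentence twice is not shown.

```python
# A minimal sketch of the ProEdit loop described above (assumptions noted in
# comments); it illustrates the idea of feeding a draft back with the input
# and regenerating, not the paper's exact implementation.
from transformers import T5ForConditionalGeneration, T5Tokenizer

MODEL_NAME = "t5-base"  # placeholder; the paper fine-tunes its own T5 model
tokenizer = T5Tokenizer.from_pretrained(MODEL_NAME)
model = T5ForConditionalGeneration.from_pretrained(MODEL_NAME)


def generate(text: str, max_new_tokens: int = 128) -> str:
    """Greedy decoding from the (assumed) fine-tuned D2T model."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


def first_part(output: str) -> str:
    """Keep only the first of the asymmetric sentences (naive period split)."""
    return output.split(". ")[0].rstrip(".") + "."


def progressive_edit(linearized_table: str, steps: int = 2) -> str:
    """Repeatedly concatenate the current draft with the structured input and
    regenerate, so each pass can edit the draft and cover more of the input."""
    draft = first_part(generate(linearized_table))
    for _ in range(steps):
        draft = first_part(generate(draft + " " + linearized_table))
    return draft


if __name__ == "__main__":
    # ToTTo-style linearized table (placeholder content).
    table = "<page_title> Example </page_title> <table> <cell> ... </cell> </table>"
    print(progressive_edit(table))
```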