We propose to tackle data-to-text generation tasks by directly splicing together retrieved segments of text from "neighbor" source-target pairs. Unlike recent work that conditions on retrieved neighbors but generates text token-by-token, left-to-right, we learn a policy that directly manipulates segments of neighbor text, by inserting or replacing them in partially constructed generations. Standard techniques for training such a policy require an oracle derivation for each generation, and we prove that finding the shortest such derivation can be reduced to parsing under a particular weighted context-free grammar. We find that policies learned in this way perform on par with strong baselines in terms of automatic and human evaluation, but allow for more interpretable and controllable generation.
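As a rough illustration of what "inserting or replacing" neighbor segments in a partially constructed generation could look like, the following minimal sketch applies a hand-written sequence of splice actions to an initially empty canvas. The names `Splice`, `apply_splice`, and `derive`, the token-level span representation, and the example sentences are all assumptions made for illustration; they are not the authors' implementation, and the action sequence here is fixed by hand rather than chosen by a learned policy.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Splice:
    """Hypothetical action: copy neighbors[neighbor][src_start:src_end]
    over canvas[dst_start:dst_end] (end indices are exclusive)."""
    neighbor: int
    src_start: int
    src_end: int
    dst_start: int
    dst_end: int  # equal to dst_start for a pure insertion


def apply_splice(canvas: List[str], neighbors: List[List[str]], action: Splice) -> List[str]:
    # Take the segment from the chosen neighbor and splice it into the canvas.
    segment = neighbors[action.neighbor][action.src_start:action.src_end]
    return canvas[:action.dst_start] + segment + canvas[action.dst_end:]


def derive(neighbors: List[List[str]], actions: List[Splice]) -> List[str]:
    # Start from an empty generation and apply each action in order.
    canvas: List[str] = []
    for action in actions:
        canvas = apply_splice(canvas, neighbors, action)
    return canvas


# Made-up neighbor targets, tokenized on whitespace.
neighbors = [
    "The Mill is a cheap pub near the river .".split(),
    "Aromi serves Italian food in the city centre .".split(),
]

actions = [
    Splice(neighbor=0, src_start=0, src_end=10, dst_start=0, dst_end=0),  # insert all of neighbor 0
    Splice(neighbor=1, src_start=4, src_end=8, dst_start=6, dst_end=9),   # replace "near the river"
]

print(" ".join(derive(neighbors, actions)))
```

Because each step names the neighbor and span it copies from, the resulting derivation can be read back as a trace of where every segment of the output came from, which is one way to see the interpretability the abstract refers to.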