In this paper, we explore how to use a small amount of new data to update a task-oriented semantic parsing model when the desired output for some examples has changed. When making updates in this way, one potential problem that arises is the presence of conflicting data, or out-of-date labels in the original training set. To evaluate the impact of this understudied problem, we propose an experimental setup for simulating changes to a neural semantic parser. We show that the presence of conflicting data greatly hinders learning of an update, then explore several methods to mitigate its effect. Our multi-task and data selection methods lead to large improvements in model accuracy compared to a naive data-mixing strategy, and our best method closes 86% of the accuracy gap between this baseline and an oracle upper bound.