Prompt-based methods with large pre-trained language models (PLMs) have shown impressive unaided performance across many NLP tasks. These models improve even further with the addition of a few labeled in-context exemplars to guide output generation. However, for more complex tasks such as dialogue state tracking (DST), designing prompts that reliably convey the desired intent is nontrivial, leading to unstable results. Furthermore, building in-context exemplars for dialogue tasks is difficult because conversational contexts are long while model input lengths are relatively short. To overcome these issues, we first adapt a meta-learning scheme to the dialogue domain which stabilizes the model's ability to perform well under various prompts. We additionally design a novel training method that improves upon vanilla retrieval mechanisms for finding ideal in-context examples. Finally, we introduce a saliency model to limit dialogue text length, allowing us to include more exemplars per query. As a result, we achieve highly competitive results for few-shot DST on MultiWOZ.