We introduce Span-ConveRT, a light-weight model for dialog slot-filling which frames the task as turn-based span extraction. This formulation allows for a simple integration of conversational knowledge coded in large pretrained conversational models such as ConveRT (Henderson et al., 2019). We show that leveraging such knowledge in Span-ConveRT is especially useful for few-shot learning scenarios: we report consistent gains over 1) a span extractor that trains representations from scratch in the target domain, and 2) a BERT-based span extractor. In order to inspire more work on span extraction for the slot-filling task, we also release RESTAURANTS-8K, a new challenging data set of 8,198 utterances, compiled from actual conversations in the restaurant booking domain.
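To make the span-extraction framing concrete, the following is a minimal sketch (an illustrative assumption, not the paper's actual code): a model is assumed to predict start and end token indices for each slot in a user turn, and the slot value is simply the corresponding span of the utterance.

```python
# Hypothetical illustration of slot-filling as turn-based span extraction:
# a span predictor outputs (start, end) token indices per slot, and the
# slot value is read off the user's turn. The predictor itself (e.g. a
# head over ConveRT or BERT representations) is omitted here.

def extract_slot_value(tokens, start_idx, end_idx):
    """Return the slot value for a predicted span (inclusive indices),
    or None if the slot is not mentioned in this turn."""
    if start_idx is None or end_idx is None or start_idx > end_idx:
        return None
    return " ".join(tokens[start_idx:end_idx + 1])

turn = "book a table for four people at seven".split()

# Hypothetical per-slot span predictions from some span extractor:
predictions = {"people": (4, 4), "time": (7, 7), "date": (None, None)}

slots = {name: extract_slot_value(turn, start, end)
         for name, (start, end) in predictions.items()}
# slots == {"people": "four", "time": "seven", "date": None}
```

The design choice here is that an unmentioned slot maps to an empty (None) span rather than a special token, which is one common way to handle requestable slots that are absent from a given turn.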