Generalising dialogue state tracking (DST) to new data is especially challenging due to the strong reliance on abundant and fine-grained supervision during training. Sample sparsity, distributional shift and the occurrence of new concepts and topics frequently lead to severe performance degradation during inference. In this paper we propose a training strategy to build extractive DST models without the need for fine-grained manual span labels. Two novel input-level dropout methods mitigate the negative impact of sample sparsity. We propose a new model architecture with a unified encoder that supports value as well as slot independence by leveraging the attention mechanism. We combine the strengths of the triple copy strategy for DST and value matching to benefit from complementary predictions without violating the principle of ontology independence. Our experiments demonstrate that an extractive DST model can be trained without manual span labels. Our architecture and training strategies improve robustness towards sample sparsity, new concepts and topics, leading to state-of-the-art performance on a range of benchmarks. We further highlight our model's ability to effectively learn from non-dialogue data.
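As a rough illustration of what input-level dropout can look like in this setting (the abstract does not spell out the two proposed variants), the sketch below randomly replaces non-special input tokens with the tokenizer's mask token during training, forcing the model to rely less on any single surface form. The function name, the dropout rate `p`, and the choice of masking (rather than deletion) are assumptions for illustration, not the authors' exact method.

```python
import torch


def token_dropout(input_ids: torch.Tensor,
                  mask_token_id: int,
                  special_ids: set,
                  p: float = 0.1) -> torch.Tensor:
    """One generic form of input-level dropout: randomly mask input tokens.

    Hyperparameters and masking behaviour are illustrative placeholders;
    the paper's two concrete dropout variants are not specified here.
    """
    dropped = input_ids.clone()
    # Sample a Bernoulli drop decision for every token position.
    drop = torch.rand_like(input_ids, dtype=torch.float) < p
    # Never drop special tokens such as [CLS], [SEP], or padding.
    for sid in special_ids:
        drop &= input_ids != sid
    dropped[drop] = mask_token_id
    return dropped
```

In a hypothetical training loop one would apply this to each batch before the encoder forward pass, e.g. `token_dropout(batch["input_ids"], tokenizer.mask_token_id, {tokenizer.cls_token_id, tokenizer.sep_token_id, tokenizer.pad_token_id})`, while leaving inputs untouched at inference time.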