Practical dialogue systems require robust methods of detecting out-of-scope (OOS) utterances to avoid conversational breakdowns and related failure modes. Directly training a model with labeled OOS examples yields reasonable performance, but obtaining such data is a resource-intensive process. To tackle this limited-data problem, previous methods focus on better modeling the distribution of in-scope (INS) examples. We introduce GOLD as an orthogonal technique that augments existing data to train better OOS detectors operating in low-data regimes. GOLD generates pseudo-labeled candidates using samples from an auxiliary dataset and keeps only the most beneficial candidates for training through a novel filtering mechanism. In experiments across three target benchmarks, the top GOLD model outperforms all existing methods on all key metrics, achieving relative gains of 52.4%, 48.9% and 50.3% against median baseline performance. We also analyze the unique properties of OOS data to identify key factors for optimally applying our proposed method.
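To make the generate-then-filter pipeline concrete, the sketch below shows one plausible instantiation of the two stages the abstract describes. It is not the paper's implementation: TF-IDF similarity stands in for learned sentence representations, a simple majority vote over weak detectors stands in for the paper's novel filtering mechanism, and the function names and parameters (`k`, `min_votes`) are hypothetical.

```python
# Minimal sketch of a GOLD-style generate-then-filter pipeline.
# Assumptions (not from the abstract): TF-IDF similarity replaces
# learned sentence embeddings, and majority voting by an ensemble of
# weak OOS detectors replaces the paper's filtering mechanism.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def generate_candidates(seed_oos, auxiliary, k=5):
    """Retrieve the k auxiliary utterances most similar to each OOS seed."""
    vec = TfidfVectorizer().fit(seed_oos + auxiliary)
    sims = cosine_similarity(vec.transform(seed_oos), vec.transform(auxiliary))
    top_k = np.argsort(-sims, axis=1)[:, :k]        # per-seed nearest neighbors
    return [auxiliary[i] for i in np.unique(top_k)]  # deduplicated candidates

def filter_candidates(candidates, detectors, min_votes=2):
    """Keep pseudo-labeled candidates that most weak detectors call OOS."""
    kept = []
    for utt in candidates:
        votes = sum(1 for d in detectors if d(utt))  # d(utt) is True if OOS
        if votes >= min_votes:
            kept.append(utt)
    return kept
```

Under this reading, the surviving candidates would be added as pseudo-labeled OOS training examples alongside the labeled INS data, which is what lets the detector improve in the low-data regime without collecting real OOS annotations.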