As more and more conversational and translation systems are deployed in production, it is essential to develop and implement effective control mechanisms that guarantee their proper functioning and security. An essential component for ensuring safe system behavior is out-of-distribution (OOD) detection, which aims to detect whether an input sample is statistically far from the training distribution. Although OOD detection is a widely covered topic in classification tasks, it has received much less attention in text generation. This paper addresses the problem of OOD detection for machine translation and dialog generation from an operational perspective. Our contributions include: (i) RAINPROOF, a Relative informAItioN Projection OOD detection framework; and (ii) a more operational evaluation setting for OOD detection. Surprisingly, we find that OOD detection is not necessarily aligned with task-specific measures: an OOD detector may filter out samples that are well processed by the model while keeping samples that are not, leading to weaker performance. Our results show that RAINPROOF breaks this curse, achieving good OOD detection results while also increasing performance.