分发外文字类型和如何检测这些文字 (Types of Out-of-Distribution Texts and How to Detect Them) - 专知论文

会员服务 ·

0

样例 · 估计/估计量 · 语言模型化 · Performer · MoDELS ·

2021 年 9 月 14 日

Types of Out-of-Distribution Texts and How to Detect Them

翻译：分发外文字类型和如何检测这些文字

Udit Arora,William Huang,He He

from arxiv, EMNLP 2021

Despite agreement on the importance of detecting out-of-distribution (OOD) examples, there is little consensus on the formal definition of OOD examples and how to best detect them. We categorize these examples by whether they exhibit a background shift or a semantic shift, and find that the two major approaches to OOD detection, model calibration and density estimation (language modeling for text), have distinct behavior on these types of OOD data. Across 14 pairs of in-distribution and OOD English natural language understanding datasets, we find that density estimation methods consistently beat calibration methods in background shift settings, while performing worse in semantic shift settings. In addition, we find that both methods generally fail to detect examples from challenge data, highlighting a weak spot for current methods. Since no single method works well across all settings, our results call for an explicit definition of OOD examples when evaluating different detection methods.

翻译：尽管人们一致认为探测分配外(OOD)实例的重要性,但对于OOD实例的正式定义和如何最佳地探测这些实例,几乎没有共识。我们将这些实例分类,方法是显示背景变化还是语义变化,发现OOD检测、模型校准和密度估计(文本语言模型)的两个主要方法在这些类型的OOD数据上有着不同的行为。在分布内和OOD英语自然语言理解数据集的14对中,我们发现密度估计方法始终比背景变化环境中的校准方法强,而在语义转变环境中则表现得更差。此外,我们发现这两种方法一般都无法从挑战数据中发现实例,突出当前方法的弱点。由于没有一种方法在所有环境中都能很好地发挥作用,我们的结果要求在评估不同的探测方法时对OOD示例作出明确的定义。

0

相关内容

分布外泛化(Out-Of-Distribution Generalization) 综述论文，22页pdf240篇文献

专知会员服务

64+阅读 · 2021年9月2日

【商汤科技】可变形Transformers端到端对象检测，Deformable DETR

【商汤科技】可变形Transformers端到端对象检测，Deformable DETR

专知会员服务

33+阅读 · 2020年10月11日

【文献综述】Text Detection and Recognition in the Wild: A Review 自然文本检测与识别

【文献综述】Text Detection and Recognition in the Wild: A Review 自然文本检测与识别

专知会员服务

46+阅读 · 2020年6月11日

【视频描述综述论文】Video Description: A Survey of Methods, Datasets, and Evaluation Metrics

【视频描述综述论文】Video Description: A Survey of Methods, Datasets, and Evaluation Metrics

专知会员服务

65+阅读 · 2020年5月12日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

【ICLR2020-牛津大学】自动发现和学习新的视觉类别与排名统计，13页pdf，Automatically Discovering and Learning New Visual Categories with Ranking Statistics

【ICLR2020-牛津大学】自动发现和学习新的视觉类别与排名统计，13页pdf，Automatically Discovering and Learning New Visual Categories with Ranking Statistics

专知会员服务

10+阅读 · 2020年2月15日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

已删除

将门创投

4+阅读 · 2018年6月26日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

计算机类 | 期刊专刊截稿信息9条

计算机类 | 期刊专刊截稿信息9条

Call4Papers

4+阅读 · 2018年1月26日

【推荐】GAN架构入门综述(资源汇总)

【推荐】GAN架构入门综述(资源汇总)

机器学习研究会

10+阅读 · 2017年9月3日

【推荐】深度学习目标检测概览

【推荐】深度学习目标检测概览

机器学习研究会

10+阅读 · 2017年9月1日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

Unsupervised and Distributional Detection of Machine-Generated Text

Unsupervised and Distributional Detection of Machine-Generated Text

Arxiv

0+阅读 · 2021年11月4日

Bounding Means of Discrete Distributions

Arxiv

0+阅读 · 2021年11月3日

Generalized Out-of-Distribution Detection: A Survey

Generalized Out-of-Distribution Detection: A Survey

Arxiv

15+阅读 · 2021年10月21日

Learning Neural Models for Natural Language Processing in the Face of Distributional Shift

Arxiv

11+阅读 · 2021年9月3日

Towards Out-Of-Distribution Generalization: A Survey

Arxiv

38+阅读 · 2021年8月31日

Text Detection and Recognition in the Wild: A Review

Arxiv

20+阅读 · 2020年6月8日

Applying Faster R-CNN for Object Detection on Malaria Images

Applying Faster R-CNN for Object Detection on Malaria Images

Arxiv

5+阅读 · 2019年3月11日

Strong-Weak Distribution Alignment for Adaptive Object Detection

Arxiv

6+阅读 · 2018年12月12日

Polarity Loss for Zero-shot Object Detection

Polarity Loss for Zero-shot Object Detection

Arxiv

3+阅读 · 2018年11月22日

Zero-Shot Object Detection

Zero-Shot Object Detection

Arxiv

9+阅读 · 2018年7月27日

VIP会员

文章信息

相关主题

估计/估计量

语言模型化

相关VIP内容

分布外泛化(Out-Of-Distribution Generalization) 综述论文，22页pdf240篇文献

专知会员服务

64+阅读 · 2021年9月2日

【商汤科技】可变形Transformers端到端对象检测，Deformable DETR

【商汤科技】可变形Transformers端到端对象检测，Deformable DETR

专知会员服务

33+阅读 · 2020年10月11日

【文献综述】Text Detection and Recognition in the Wild: A Review 自然文本检测与识别

【文献综述】Text Detection and Recognition in the Wild: A Review 自然文本检测与识别

专知会员服务

46+阅读 · 2020年6月11日

【视频描述综述论文】Video Description: A Survey of Methods, Datasets, and Evaluation Metrics

【视频描述综述论文】Video Description: A Survey of Methods, Datasets, and Evaluation Metrics

专知会员服务

65+阅读 · 2020年5月12日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

【ICLR2020-牛津大学】自动发现和学习新的视觉类别与排名统计，13页pdf，Automatically Discovering and Learning New Visual Categories with Ranking Statistics

【ICLR2020-牛津大学】自动发现和学习新的视觉类别与排名统计，13页pdf，Automatically Discovering and Learning New Visual Categories with Ranking Statistics

专知会员服务

10+阅读 · 2020年2月15日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

【NeurIPS2025教程】大语言模型规划

NeurIPS 2025 教程：深度学习训练不稳定性的理论洞见

《印度弹药格局的演变》报告

数据要素发展报告(2025年)：附下载

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

已删除

将门创投

4+阅读 · 2018年6月26日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

计算机类 | 期刊专刊截稿信息9条

计算机类 | 期刊专刊截稿信息9条

Call4Papers

4+阅读 · 2018年1月26日

【推荐】GAN架构入门综述(资源汇总)

【推荐】GAN架构入门综述(资源汇总)

机器学习研究会

10+阅读 · 2017年9月3日

【推荐】深度学习目标检测概览

【推荐】深度学习目标检测概览

机器学习研究会

10+阅读 · 2017年9月1日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

相关论文

Unsupervised and Distributional Detection of Machine-Generated Text

Unsupervised and Distributional Detection of Machine-Generated Text

Arxiv

0+阅读 · 2021年11月4日

Bounding Means of Discrete Distributions

Arxiv

0+阅读 · 2021年11月3日

Generalized Out-of-Distribution Detection: A Survey

Generalized Out-of-Distribution Detection: A Survey

Arxiv

15+阅读 · 2021年10月21日

Learning Neural Models for Natural Language Processing in the Face of Distributional Shift

Arxiv

11+阅读 · 2021年9月3日

Towards Out-Of-Distribution Generalization: A Survey

Arxiv

38+阅读 · 2021年8月31日

Text Detection and Recognition in the Wild: A Review

Arxiv

20+阅读 · 2020年6月8日

Applying Faster R-CNN for Object Detection on Malaria Images

Applying Faster R-CNN for Object Detection on Malaria Images

Arxiv

5+阅读 · 2019年3月11日

Strong-Weak Distribution Alignment for Adaptive Object Detection

Arxiv

6+阅读 · 2018年12月12日

Polarity Loss for Zero-shot Object Detection

Polarity Loss for Zero-shot Object Detection

Arxiv

3+阅读 · 2018年11月22日

Zero-Shot Object Detection

Zero-Shot Object Detection

Arxiv

9+阅读 · 2018年7月27日

微信扫码咨询专知VIP会员