In many machine learning applications, it is important for the model to provide confidence scores that accurately capture its prediction uncertainty. Although modern learning methods have achieved great success in predictive accuracy, generating calibrated confidence scores remains a major challenge. Mixup, a popular yet simple data augmentation technique based on taking convex combinations of pairs of training examples, has been empirically found to significantly improve confidence calibration across diverse applications. However, when and how Mixup helps calibration remains poorly understood. In this paper, we theoretically prove that Mixup improves calibration in \textit{high-dimensional} settings by investigating natural statistical models. Interestingly, the calibration benefit of Mixup grows with model capacity. We support our theories with experiments on common architectures and datasets. In addition, we study how Mixup improves calibration in semi-supervised learning. While incorporating unlabeled data can sometimes make the model less calibrated, adding Mixup training mitigates this issue and provably improves calibration. Our analysis provides new insights and a framework to understand Mixup and calibration.
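The Mixup augmentation described above can be sketched in a few lines. This is a minimal NumPy illustration, not the paper's implementation: it draws a mixing coefficient from a Beta distribution (the standard Mixup recipe) and forms convex combinations of randomly paired examples and their one-hot labels; the function name and `alpha` default are illustrative.

```python
import numpy as np

def mixup_batch(x, y, alpha=0.2, rng=None):
    """Mixup: convex combinations of pairs of training examples and labels.

    x: (n, d) feature matrix; y: (n, k) one-hot label matrix.
    A minimal sketch; training code would typically draw a fresh
    lambda for every batch.
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)           # mixing coefficient in [0, 1]
    perm = rng.permutation(len(x))         # random pairing of examples
    x_mix = lam * x + (1 - lam) * x[perm]  # convex combination of inputs
    y_mix = lam * y + (1 - lam) * y[perm]  # matching combination of labels
    return x_mix, y_mix, lam
```

The mixed labels are themselves valid probability vectors (each row still sums to one), which is why Mixup interacts directly with the model's confidence scores rather than only with its inputs.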