Diagnosing Model Performance Under Distribution Shift - 专知论文


Model diagnosis · Performance degradation · Distribution change · Distribution shift · Decomposition

April 17, 2023

Diagnosing Model Performance Under Distribution Shift


Tiffany Tianhui Cai, Hongseok Namkoong, Steve Yadlowsky

Prediction models can perform poorly when deployed to target distributions different from the training distribution. To understand these operational failure modes, we develop a method, called DIstribution Shift DEcomposition (DISDE), to attribute a drop in performance to different types of distribution shifts. Our approach decomposes the performance drop into terms for 1) an increase in harder but frequently seen examples from training, 2) changes in the relationship between features and outcomes, and 3) poor performance on examples infrequent or unseen during training. These terms are defined by fixing a distribution on $X$ while varying the conditional distribution of $Y \mid X$ between training and target, or by fixing the conditional distribution of $Y \mid X$ while varying the distribution on $X$. In order to do this, we define a hypothetical distribution on $X$ consisting of values common in both training and target, over which it is easy to compare $Y \mid X$ and thus predictive performance. We estimate performance on this hypothetical distribution via reweighting methods. Empirically, we show how our method can 1) inform potential modeling improvements across distribution shifts for employment prediction on tabular census data, and 2) help to explain why certain domain adaptation methods fail to improve model performance for satellite image classification.
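The three-term decomposition described above can be sketched with importance reweighting. This is a minimal illustration, not the authors' implementation. The abstract does not give a formula for the hypothetical shared distribution, so the form $s(x) \propto p(x)q(x)/(p(x)+q(x))$ used here is an assumption, as are the function names `weighted_mean` and `decompose_shift` and the requirement that the density ratio $q(x)/p(x)$ is supplied by the caller.

```python
import numpy as np


def weighted_mean(values, weights):
    """Self-normalized importance-weighted average."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return float(np.sum(weights * values) / np.sum(weights))


def decompose_shift(loss_train, loss_target, ratio_train, ratio_target):
    """Split a performance drop into three distribution-shift terms.

    loss_train, loss_target: per-example losses of one fixed model on
        training and target samples.
    ratio_train, ratio_target: density ratio q(x)/p(x) evaluated at the
        training and target inputs (assumed known or estimated elsewhere).
    """
    r_p = np.asarray(ratio_train, dtype=float)
    r_q = np.asarray(ratio_target, dtype=float)

    # Assumed shared density s(x) proportional to p(x) q(x) / (p(x) + q(x)),
    # which gives s/p proportional to r / (1 + r) and s/q to 1 / (1 + r).
    w_p = r_p / (1.0 + r_p)
    w_q = 1.0 / (1.0 + r_q)

    risk_train = float(np.mean(loss_train))
    risk_target = float(np.mean(loss_target))
    # X drawn from the shared distribution, Y | X from training.
    shared_train = weighted_mean(loss_train, w_p)
    # X drawn from the shared distribution, Y | X from target.
    shared_target = weighted_mean(loss_target, w_q)

    return {
        # 1) more hard-but-familiar examples: X moves from training to shared
        "x_shift_train_to_shared": shared_train - risk_train,
        # 2) changed feature-outcome relationship: Y | X moves, X held fixed
        "y_given_x_shift": shared_target - shared_train,
        # 3) rare or unseen examples: X moves from shared to target
        "x_shift_shared_to_target": risk_target - shared_target,
        "total_drop": risk_target - risk_train,
    }


if __name__ == "__main__":
    # Made-up losses and ratios, purely to show the call shape.
    terms = decompose_shift(
        loss_train=[0.1, 0.2, 0.3],
        loss_target=[0.5, 0.4, 0.9, 0.2],
        ratio_train=[0.5, 1.0, 2.0],
        ratio_target=[1.0, 3.0, 0.2, 1.5],
    )
    for name, value in terms.items():
        print(f"{name}: {value:.4f}")
```

By construction the three terms telescope, so they sum to the total drop. In practice the ratio $q(x)/p(x)$ is unknown. One common option, again an assumption here, is to fit a probabilistic domain classifier on pooled data and convert its predicted probability $\pi(x)$ of "target" into $\frac{\pi(x)}{1-\pi(x)} \cdot \frac{n_{\text{train}}}{n_{\text{target}}}$.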

