利用特定类信任分数估算在域变换下的模型性绩效 (Estimating Model Performance under Domain Shifts with Class-Specific Confidence Scores) - 专知论文

会员服务 ·

0

估计/估计量 · Performer · 置信度 · domain shift · 模型评估 ·

2022 年 7 月 20 日

Estimating Model Performance under Domain Shifts with Class-Specific Confidence Scores

翻译：利用特定类信任分数估算在域变换下的模型性绩效

Zeju Li,Konstantinos Kamnitsas,Mobarakol Islam,Chen Chen,Ben Glocker

from arxiv, Accepted at MICCAI 2022

Machine learning models are typically deployed in a test setting that differs from the training setting, potentially leading to decreased model performance because of domain shift. If we could estimate the performance that a pre-trained model would achieve on data from a specific deployment setting, for example a certain clinic, we could judge whether the model could safely be deployed or if its performance degrades unacceptably on the specific data. Existing approaches estimate this based on the confidence of predictions made on unlabeled test data from the deployment's domain. We find existing methods struggle with data that present class imbalance, because the methods used to calibrate confidence do not account for bias induced by class imbalance, consequently failing to estimate class-wise accuracy. Here, we introduce class-wise calibration within the framework of performance estimation for imbalanced datasets. Specifically, we derive class-specific modifications of state-of-the-art confidence-based model evaluation methods including temperature scaling (TS), difference of confidences (DoC), and average thresholded confidence (ATC). We also extend the methods to estimate Dice similarity coefficient (DSC) in image segmentation. We conduct experiments on four tasks and find the proposed modifications consistently improve the estimation accuracy for imbalanced datasets. Our methods improve accuracy estimation by 18\% in classification under natural domain shifts, and double the estimation accuracy on segmentation tasks, when compared with prior methods.

翻译：机械学习模型通常在不同于培训设置的测试环境中部署,可能会导致因域变换而降低模型性能。如果我们能够估计一个经过预先训练的模型在特定部署环境(例如某个诊所)的数据上能够实现的性能,我们可以判断该模型能否安全部署,或者该模型的性能在具体数据上降低令人无法接受。现有办法根据对未贴标签的测试数据与部署领域相比的预测的可信度来估计。我们发现,现有方法与目前类别不平衡的数据存在矛盾,因为用于校准信心的方法没有考虑到因阶级不平衡造成的偏差,因此无法估计等级准确性。在这里,我们在对不平衡数据集的性能估计框架内引入了等级性校准。具体地说,我们对基于信任的状态模型评价方法,包括温度缩放(TS)、信任差异(DoC)和平均临界信任(ATC)等,进行了估算。我们还扩大了在图像分割中估算Dice相似性系数(DSC)时所使用的方法。我们在四个任务上进行实验,并发现在对不平衡的数据集进行持续调整时,在18项之前的精确性估算方法下,在改进了我们的自然域的精确性估算。

0

相关内容

估计/估计量

估计/估计量

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

【快讯】ICML 2020论文出炉，1088篇上榜，你的paper中了吗？

【快讯】ICML 2020论文出炉，1088篇上榜，你的paper中了吗？

专知会员服务

52+阅读 · 2020年6月1日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

中国图象图形学学会CSIG

2+阅读 · 2021年11月12日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

vae 相关论文表示学习 1

vae 相关论文表示学习 1

CreateAMind

12+阅读 · 2018年9月6日

原子层组装非晶氧化物半导体柔性TFT及可靠性研究

国家自然科学基金

0+阅读 · 2014年12月31日

NFATc1通过ATF3增强足细胞损伤的机制

国家自然科学基金

0+阅读 · 2014年12月31日

Calderon问题和边界刚性问题

国家自然科学基金

0+阅读 · 2013年12月31日

EAST高功率低杂波与边界等离子体非线性相互作用的机理研究

国家自然科学基金

0+阅读 · 2013年12月31日

TMOD1调节actin聚合影响胰岛素信号转导的分子机制

国家自然科学基金

0+阅读 · 2013年12月31日

EAST装置上主动控制电阻壁模的数值研究

国家自然科学基金

0+阅读 · 2013年12月31日

激光烧蚀过程中电子热输运的Fokker-Planck与流体模拟研究

国家自然科学基金

0+阅读 · 2012年12月31日

函数域中的Vinogradov中值定理

国家自然科学基金

0+阅读 · 2012年12月31日

能量临界情形的非线性Schrodinger方程

国家自然科学基金

0+阅读 · 2011年12月31日

靶向Bcl2的小分子抗白血病药物的开发研究

国家自然科学基金

0+阅读 · 2009年12月31日

Pan More Gold from the Sand: Refining Open-domain Dialogue Training with Noisy Self-Retrieval Generation

Arxiv

0+阅读 · 2022年9月15日

Estimating Classification Confidence Using Kernel Densities

Arxiv

0+阅读 · 2022年9月15日

Robust Product Classification with Instance-Dependent Noise

Arxiv

0+阅读 · 2022年9月14日

Self-Supervised Domain Calibration and Uncertainty Estimation for Place Recognition

Arxiv

0+阅读 · 2022年9月14日

A Scalable Model Specialization Framework for Training and Inference using Submodels and its Application to Speech Model Personalization

Arxiv

0+阅读 · 2022年9月13日

A Benchmark and a Baseline for Robust Multi-view Depth Estimation

Arxiv

0+阅读 · 2022年9月13日

K-AID: Enhancing Pre-trained Language Models with Domain Knowledge for Question Answering

Arxiv

15+阅读 · 2021年9月22日

Learning Neural Models for Natural Language Processing in the Face of Distributional Shift

Arxiv

11+阅读 · 2021年9月3日

On Feature Normalization and Data Augmentation

On Feature Normalization and Data Augmentation

Arxiv

15+阅读 · 2020年2月25日

Conditional Random Field and Deep Feature Learning for Hyperspectral Image Segmentation

Arxiv

11+阅读 · 2017年12月27日

VIP会员

文章信息

相关主题

估计/估计量

相关VIP内容

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

【快讯】ICML 2020论文出炉，1088篇上榜，你的paper中了吗？

【快讯】ICML 2020论文出炉，1088篇上榜，你的paper中了吗？

专知会员服务

52+阅读 · 2020年6月1日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《物联网（IoT）中的无人机通信高效控制》135页

《在GNSS信号降级环境中利用共识实现无人机集群稳健协调》

中程单向攻击无人机的战略意义：俄乌战争启示

《面向无人机集群的避障动态传感器覆盖算法》最新38页

相关资讯

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

中国图象图形学学会CSIG

2+阅读 · 2021年11月12日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

vae 相关论文表示学习 1

vae 相关论文表示学习 1

CreateAMind

12+阅读 · 2018年9月6日

相关论文

Pan More Gold from the Sand: Refining Open-domain Dialogue Training with Noisy Self-Retrieval Generation

Arxiv

0+阅读 · 2022年9月15日

Estimating Classification Confidence Using Kernel Densities

Arxiv

0+阅读 · 2022年9月15日

Robust Product Classification with Instance-Dependent Noise

Arxiv

0+阅读 · 2022年9月14日

Self-Supervised Domain Calibration and Uncertainty Estimation for Place Recognition

Arxiv

0+阅读 · 2022年9月14日

A Scalable Model Specialization Framework for Training and Inference using Submodels and its Application to Speech Model Personalization

Arxiv

0+阅读 · 2022年9月13日

A Benchmark and a Baseline for Robust Multi-view Depth Estimation

Arxiv

0+阅读 · 2022年9月13日

K-AID: Enhancing Pre-trained Language Models with Domain Knowledge for Question Answering

Arxiv

15+阅读 · 2021年9月22日

Learning Neural Models for Natural Language Processing in the Face of Distributional Shift

Arxiv

11+阅读 · 2021年9月3日

On Feature Normalization and Data Augmentation

On Feature Normalization and Data Augmentation

Arxiv

15+阅读 · 2020年2月25日

Conditional Random Field and Deep Feature Learning for Hyperspectral Image Segmentation

Arxiv

11+阅读 · 2017年12月27日

相关基金

原子层组装非晶氧化物半导体柔性TFT及可靠性研究

国家自然科学基金

0+阅读 · 2014年12月31日

NFATc1通过ATF3增强足细胞损伤的机制

国家自然科学基金

0+阅读 · 2014年12月31日

Calderon问题和边界刚性问题

国家自然科学基金

0+阅读 · 2013年12月31日

EAST高功率低杂波与边界等离子体非线性相互作用的机理研究

国家自然科学基金

0+阅读 · 2013年12月31日

TMOD1调节actin聚合影响胰岛素信号转导的分子机制

国家自然科学基金

0+阅读 · 2013年12月31日

EAST装置上主动控制电阻壁模的数值研究

国家自然科学基金

0+阅读 · 2013年12月31日

激光烧蚀过程中电子热输运的Fokker-Planck与流体模拟研究

国家自然科学基金

0+阅读 · 2012年12月31日

函数域中的Vinogradov中值定理

国家自然科学基金

0+阅读 · 2012年12月31日

能量临界情形的非线性Schrodinger方程

国家自然科学基金

0+阅读 · 2011年12月31日

靶向Bcl2的小分子抗白血病药物的开发研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员