Machine Learning (ML) models now inform a wide range of human decisions, but using ``black box'' models carries risks such as relying on spurious correlations or errant data. To address this, researchers have proposed methods for supplementing models with explanations of their predictions. However, robust evaluations of these methods' usefulness in real-world contexts have remained elusive, with experiments tending to rely on simplified settings or proxy tasks. We present an experimental study extending a prior explainable ML evaluation experiment and bringing the setup closer to the deployment setting by relaxing its simplifying assumptions. Our empirical study draws dramatically different conclusions than the prior work, highlighting how seemingly trivial experimental design choices can yield misleading results. Beyond the present experiment, we believe this work holds lessons about the necessity of situating the evaluation of any ML method and choosing appropriate tasks, data, users, and metrics to match the intended deployment contexts.