The rising popularity of explainable artificial intelligence (XAI) to understand high-performing black boxes has also raised the question of how to evaluate explanations of machine learning (ML) models. While interpretability and explainability are often presented as a binary property validated subjectively, we consider explainability a multi-faceted concept. We identify 12 conceptual properties, such as Compactness and Correctness, that should be evaluated to comprehensively assess the quality of an explanation. Our so-called Co-12 properties serve as a categorization scheme for systematically reviewing the evaluation practice of more than 300 papers that introduce an XAI method, published in the last 7 years at major AI and ML conferences. We find that 1 in 3 papers evaluates exclusively with anecdotal evidence, and 1 in 5 papers evaluates with users. We also contribute to the call for objective, quantifiable evaluation methods by presenting an extensive overview of quantitative XAI evaluation methods. This systematic collection of evaluation methods provides researchers and practitioners with concrete tools to thoroughly validate, benchmark, and compare new and existing XAI methods. This also opens up opportunities to include quantitative metrics as optimization criteria during model training in order to optimize for accuracy and interpretability simultaneously.
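To illustrate the final point, the following is a minimal sketch (not a method from the paper) of using a quantitative interpretability metric as an optimization criterion during training: a task loss is combined with an L1 penalty on input-gradient saliency, a simple proxy for the Co-12 Compactness property. The model, data, and the `lambda_c` trade-off weight are illustrative assumptions.

```python
# Sketch: joint optimization of accuracy and a compactness proxy.
# Assumptions: a toy classifier, random data, and a hypothetical
# trade-off weight lambda_c; none of this is specified by the paper.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()
lambda_c = 0.1  # hypothetical weight balancing accuracy and compactness

x = torch.randn(32, 20, requires_grad=True)  # toy batch of 32 examples
y = torch.randint(0, 2, (32,))               # toy binary labels

logits = model(x)
task_loss = criterion(logits, y)

# Input-gradient saliency: gradient of the summed logits w.r.t. the input.
# create_graph=True lets us backpropagate through the penalty itself.
saliency = torch.autograd.grad(logits.sum(), x, create_graph=True)[0]
compactness_penalty = saliency.abs().mean()  # sparser attributions => lower penalty

loss = task_loss + lambda_c * compactness_penalty
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Any of the quantitative Co-12 evaluation metrics surveyed in the paper could, in principle, replace the saliency penalty here, provided it is differentiable or can be relaxed into a differentiable surrogate.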