Many methods for explaining black-box models, whether local or global, are additive. In this paper, we study global additive explanations of non-additive models, focusing on four explanation methods: partial dependence, Shapley explanations adapted to a global setting, distilled additive explanations, and gradient-based explanations. We show that different explanation methods characterize the non-additive components of a black-box model's prediction function in different ways. We use the concepts of main and total effects to anchor additive explanations, and quantitatively evaluate both additive and non-additive explanations. Although distilled explanations are generally the most accurate additive explanations, non-additive explanations, such as tree-based explanations that explicitly model non-additive components, tend to be even more accurate. Despite this, our user study showed that machine learning practitioners were better able to leverage additive explanations across a range of tasks. These trade-offs should be taken into account when deciding which explanation to trust and use to explain a black-box model.
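To make the notion of a global additive explanation concrete, the following is a minimal sketch (not the paper's implementation) of one-dimensional partial dependence, the first method named above. The helper `partial_dependence_1d` and the toy interaction model are hypothetical illustrations, assuming only NumPy; the example shows how an additive view can hide a pure interaction, which is exactly the kind of non-additive component the paper studies.

```python
import numpy as np

def partial_dependence_1d(predict, X, feature, grid_resolution=20):
    """Estimate partial dependence of `predict` on one feature.

    For each grid value v, every row's `feature` column is set to v and
    the model's predictions are averaged: PD_j(v) = E_x[f(x with x_j = v)].
    (Hypothetical helper for illustration, not the paper's code.)
    """
    grid = np.linspace(X[:, feature].min(), X[:, feature].max(), grid_resolution)
    averages = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, feature] = value      # intervene on the chosen feature
        averages.append(predict(X_mod).mean())
    return grid, np.array(averages)

# Toy non-additive model: f(x) = x0 * x1, a pure interaction (assumed example).
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(1000, 2))
f = lambda X: X[:, 0] * X[:, 1]

grid, pd_curve = partial_dependence_1d(f, X, feature=0)
# With x1 ~ Uniform(-1, 1), E[x1] is approximately 0, so the partial
# dependence of x0 is nearly flat: the additive summary misses the interaction.
print(pd_curve.round(2))
```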