We propose to explain the behavior of black-box prediction methods (e.g., deep neural networks trained on image pixel data) using causal graphical models. Specifically, we explore learning the structure of a causal graph where the nodes represent prediction outcomes along with a set of macro-level "interpretable" features, while allowing for arbitrary unmeasured confounding among these variables. The resulting graph may indicate which of the interpretable features, if any, are possible causes of the prediction outcome and which may be merely associated with it due to confounding. The approach is motivated by a counterfactual theory of causal explanation, wherein good explanations point to factors that are "difference-makers" in an interventionist sense. The resulting analysis may be useful in algorithm auditing and evaluation by identifying features that make a causal difference to the algorithm's output.
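As a rough illustration of this kind of analysis, the sketch below applies a constraint-based structure-learning algorithm that tolerates arbitrary latent confounding (FCI, via the open-source causal-learn package) to a set of interpretable features together with a prediction outcome. The feature names, the data-generating process, and the choice of FCI with a Fisher-z test are illustrative assumptions, not the paper's reported setup.

```python
# Minimal sketch: learn a causal structure over interpretable features
# and a prediction outcome, allowing for unmeasured confounding.
# All variables and data here are hypothetical.
import numpy as np
from causallearn.search.ConstraintBased.FCI import fci
from causallearn.utils.cit import fisherz

rng = np.random.default_rng(0)
n = 500
texture = rng.normal(size=n)             # interpretable feature 1
shape = rng.normal(size=n)               # interpretable feature 2
confounder = rng.normal(size=n)          # unmeasured in practice
prediction = 2.0 * shape + confounder + rng.normal(size=n)
color = confounder + rng.normal(size=n)  # associated with the prediction
                                         # only via the hidden confounder

# Columns: texture (X1), shape (X2), color (X3), prediction (X4).
data = np.column_stack([texture, shape, color, prediction])

# FCI returns a partial ancestral graph (PAG) over the measured variables;
# its edge marks distinguish possible causes of the prediction outcome
# from features that are merely associated with it due to confounding.
pag, edges = fci(data, fisherz, alpha=0.05, verbose=False)
for edge in edges:
    print(edge)
```

In this toy setup, an ideal output marks shape (X2) as a possible cause of the prediction (X4) while leaving the color-prediction association attributable to a latent common cause, mirroring the distinction the abstract describes.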