Counterfactual examples are one of the most commonly cited methods for explaining the predictions of machine learning models in key areas such as finance and medical diagnosis. Counterfactuals are often discussed under the assumption that the model on which they will be used is static, but in deployment models may be periodically retrained or fine-tuned. This paper studies the consistency of model predictions on counterfactual examples in deep networks under small changes to initial training conditions, such as weight initialization and leave-one-out variations in the data, of the kind that often occur during model deployment. We demonstrate experimentally that counterfactual examples for deep models are often inconsistent across such small changes, and that increasing the cost of the counterfactual, a stability-enhancing mitigation suggested by prior work in the context of simpler models, is not a reliable heuristic in deep networks. Rather, our analysis shows that a model's local Lipschitz continuity around the counterfactual is key to its consistency across related models. To this end, we propose Stable Neighbor Search as a way to generate more consistent counterfactual explanations, and illustrate the effectiveness of this approach on several benchmark datasets.
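To make the notion of consistency concrete, the sketch below (not the paper's code, and not Stable Neighbor Search itself) shows one plausible setup: a generic gradient-based counterfactual search for a single input, followed by a consistency score defined as the fraction of retrained models that still assign the counterfactual its target class. The step size, regularization weight, and model/data objects are illustrative assumptions.

```python
import torch
import torch.nn as nn

def gradient_counterfactual(model, x, target_class, lr=0.01, n_steps=200, dist_weight=0.1):
    """Generic gradient-based search for a nearby input that `model` assigns to
    `target_class`. Hyperparameters here are placeholders, not the paper's settings."""
    cf = x.clone().detach().requires_grad_(True)
    opt = torch.optim.Adam([cf], lr=lr)
    target = torch.tensor([target_class])
    for _ in range(n_steps):
        opt.zero_grad()
        logits = model(cf.unsqueeze(0))
        # Push toward the target class while penalizing distance from the original input.
        loss = nn.functional.cross_entropy(logits, target) + dist_weight * torch.norm(cf - x)
        loss.backward()
        opt.step()
    return cf.detach()

def consistency(counterfactual, target_class, retrained_models):
    """Fraction of retrained/fine-tuned models that still place the counterfactual
    in the intended target class (a simple consistency measure)."""
    hits = sum(
        int(m(counterfactual.unsqueeze(0)).argmax(dim=1).item() == target_class)
        for m in retrained_models
    )
    return hits / len(retrained_models)
```

Under this framing, a counterfactual whose consistency score is low is exactly the failure mode described above: a nominally valid explanation on the original model that no longer holds once the model is retrained with a different seed or a slightly perturbed dataset.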