Recent years have seen growing interest in capturing and maintaining causal relationships in neural network (NN) models. In this work, we study causal approaches for estimating and maintaining input-output attributions in NN models. Existing efforts in this direction assume independence among input variables (by virtue of the NN architecture) and hence study only direct causal effects. Viewing an NN as a structural causal model (SCM), we instead go beyond direct effects: we introduce edges among input features and provide a simple yet effective methodology to capture and maintain both direct and indirect causal effects while training an NN model. We also propose effective approximation strategies for quantifying causal attributions in high-dimensional data. A wide range of experiments on synthetic and real-world datasets shows that the proposed ante-hoc method learns causal attributions, for both direct and indirect effects, that are close to the ground-truth effects.
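The distinction between direct and indirect (total) causal effects that motivates this work can be illustrated on a toy SCM. The sketch below is purely illustrative and not the paper's method: it assumes a hypothetical linear SCM with an edge among inputs (x1 -> x2) in addition to the paths x1 -> y and x2 -> y, and estimates effects by Monte-Carlo interventions. Methods that assume independent inputs recover only the direct effect; the total effect also includes the mediated path through x2.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100_000

# Hypothetical linear SCM (illustrative, not from the paper):
#   x2 := 2*x1 + u2
#   y  := 3*x1 + 4*x2 + uy
def sample_y(do_x1, do_x2=None, size=N):
    """Sample y under the intervention do(x1); optionally also do(x2)."""
    u2 = rng.normal(size=size)
    uy = rng.normal(size=size)
    x1 = np.full(size, do_x1, dtype=float)
    # If x2 is not intervened on, it responds to x1 via the edge x1 -> x2.
    x2 = 2.0 * x1 + u2 if do_x2 is None else np.full(size, do_x2, dtype=float)
    return 3.0 * x1 + 4.0 * x2 + uy

# Total effect of do(x1=1) vs do(x1=0): direct path (3) + mediated path (4*2) = 11
total = sample_y(1.0).mean() - sample_y(0.0).mean()

# Direct effect: hold x2 fixed, so only the x1 -> y edge contributes: 3
direct = sample_y(1.0, do_x2=0.0).mean() - sample_y(0.0, do_x2=0.0).mean()

print(f"total effect ≈ {total:.2f}, direct effect ≈ {direct:.2f}")
```

An attribution method that severs the x1 -> x2 edge (treating inputs as independent) would report only the direct effect of roughly 3, missing the indirect contribution of roughly 8 carried through x2.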