Since neural networks play an increasingly important role in critical sectors, explaining network predictions has become a key research topic. Counterfactual explanations can help us understand why classifier models arrive at particular class assignments and, moreover, how the respective input samples would have to be modified for the class prediction to change. Previous approaches mainly focus on image and tabular data. In this work we propose SPARCE, a generative adversarial network (GAN) architecture that generates SPARse Counterfactual Explanations for multivariate time series. Our approach provides a custom sparsity layer and regularizes the counterfactual loss function in terms of similarity, sparsity, and smoothness of trajectories. We evaluate our approach on real-world human motion datasets as well as a synthetic time series interpretability benchmark. Although we make significantly sparser modifications than other approaches, we achieve comparable or better performance on all metrics. Moreover, we demonstrate that our approach predominantly modifies salient time steps and features, leaving non-salient inputs untouched.
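To make the regularization idea concrete, the following is a minimal illustrative sketch of a counterfactual loss with similarity, sparsity, and smoothness terms applied to a perturbation `delta` (counterfactual minus query) over a multivariate time series. The function name, the specific norms, and the weights `lam_*` are assumptions for illustration only, not the paper's exact formulation:

```python
import numpy as np

def counterfactual_regularizers(delta, lam_sim=1.0, lam_sparse=1.0, lam_smooth=1.0):
    """Illustrative regularizers on a perturbation `delta` of shape
    (time_steps, features). Hypothetical weights lam_*; the paper's actual
    loss and sparsity layer may differ."""
    # Similarity: keep the counterfactual close to the original query (L2).
    similarity = np.mean(delta ** 2)
    # Sparsity: an L1 penalty encourages modifying only a few entries.
    sparsity = np.mean(np.abs(delta))
    # Smoothness: penalize jagged changes between consecutive time steps.
    smoothness = np.mean(np.diff(delta, axis=0) ** 2)
    return lam_sim * similarity + lam_sparse * sparsity + lam_smooth * smoothness
```

In a GAN-based setup such as the one described, a term of this kind would typically be added to the generator's adversarial objective, so that generated counterfactuals both flip the classifier's prediction and stay sparse and smooth.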