Many existing counterfactual explanation methods ignore the intrinsic relationships between data attributes and therefore fail to generate realistic counterfactuals. Moreover, existing methods that do account for these relationships require domain knowledge, which limits their applicability in complex real-world settings. In this paper, we propose a novel approach that generates realistic counterfactual explanations preserving the relationships between data attributes. The model learns these relationships directly from data with a variational auto-encoder, without domain knowledge, and then learns to perturb the latent space accordingly. We conduct extensive experiments on both synthetic and real-world datasets. The results demonstrate that the proposed method learns relationships from the data and preserves them in the generated counterfactuals.
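To make the idea of perturbing the latent space concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical linear decoder `W_DEC` in place of a trained variational auto-encoder, a hypothetical black-box classifier `classifier_score`, and a simple finite-difference gradient search. All names, the penalty weight `lam`, and the toy relationship x2 = 2 * x1 are illustrative assumptions.

```python
import numpy as np

# Assumption: a toy linear decoder stands in for a trained VAE decoder.
# It maps a 1-D latent code to two attributes tied by x2 = 2 * x1.
W_DEC = np.array([[1.0, 2.0]])


def decode(z):
    """Map a latent code to attribute space."""
    return z @ W_DEC


def encode(x):
    """Map attributes to the latent code via the decoder's pseudo-inverse."""
    return x @ np.linalg.pinv(W_DEC)


def classifier_score(x):
    """Hypothetical black-box classifier: probability of the positive class."""
    return 1.0 / (1.0 + np.exp(-(np.sum(x, axis=-1) - 6.0)))


def latent_counterfactual(x, target=1.0, lam=0.01, lr=0.5, steps=200, eps=1e-5):
    """Search for a latent perturbation that flips the classifier decision."""
    z = encode(x)
    delta = np.zeros_like(z)

    def loss(d):
        # Prediction error toward the target class plus a proximity penalty.
        return (classifier_score(decode(z + d)) - target) ** 2 + lam * np.sum(d ** 2)

    for _ in range(steps):
        if classifier_score(decode(z + delta)) > 0.5:
            break  # the decision has flipped; stop early
        # Central finite-difference estimate of the gradient w.r.t. delta.
        grad = np.zeros_like(delta)
        for i in range(delta.size):
            step = np.zeros_like(delta)
            step[i] = eps
            grad[i] = (loss(delta + step) - loss(delta - step)) / (2 * eps)
        delta = delta - lr * grad
    return decode(z + delta)


x = np.array([1.0, 2.0])          # original instance, negative class
x_cf = latent_counterfactual(x)   # counterfactual decoded from the latent space
```

Because the search moves only in the latent space, every candidate is produced by the decoder and therefore keeps the relationship the decoder encodes (here, x2 = 2 * x1). A search carried out directly in attribute space would carry no such guarantee.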