Nowadays, deep vision models are widely deployed in safety-critical applications, e.g., autonomous driving, and the explainability of such models is becoming a pressing concern. Among explanation methods, counterfactual explanations aim to find minimal and interpretable changes to the input image that would also change the output of the model to be explained. Such explanations point end-users to the main factors that impact the decision of the model. However, previous methods struggle to explain decision models trained on images with many objects, e.g., urban scenes, which are more difficult to work with but arguably more critical to explain. In this work, we propose to tackle this issue with an object-centric framework for counterfactual explanation generation. Our method, inspired by recent generative modeling works, encodes the query image into a latent space that is structured to ease object-level manipulations. In doing so, it gives the end-user control over which search directions (e.g., spatial displacement of objects, style modification) are explored during counterfactual generation. We conduct a set of experiments on counterfactual explanation benchmarks for driving scenes and show that our method can be adapted beyond classification, e.g., to explain semantic segmentation models. To complete our analysis, we design and run a user study that measures the usefulness of counterfactual explanations for understanding a decision model. Code is available at https://github.com/valeoai/OCTET.
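As a rough illustration of the kind of latent-space search the abstract describes, the sketch below optimizes a masked subset of object latents until a decision model flips its prediction, while penalizing large deviations from the original latents. All names here (`generator`, `decision_model`, the mask, and the loss weights) are hypothetical placeholders for illustration only; this is not the actual OCTET implementation, which is available in the linked repository.

```python
# Minimal sketch of counterfactual search in an object-structured latent space.
# The components below (generator, decision_model) are hypothetical stand-ins.
import torch
import torch.nn.functional as F

def counterfactual_search(z_init, generator, decision_model, target_class,
                          mask, num_steps=200, lr=0.05, dist_weight=0.1):
    """Optimize a masked subset of latents so the generated image flips the decision.

    z_init:         (1, K, D) latents for K objects, obtained by encoding the query image.
    generator:      maps latents to an image (e.g., a BlobGAN-style generator).
    decision_model: the classifier to be explained; returns class logits.
    target_class:   LongTensor of shape (1,) with the desired counterfactual class.
    mask:           (1, K, D) binary mask selecting which latent dimensions may change
                    (e.g., only object positions, or only style codes).
    """
    delta = torch.zeros_like(z_init, requires_grad=True)
    optimizer = torch.optim.Adam([delta], lr=lr)

    for _ in range(num_steps):
        z = z_init + delta * mask                 # only edit the user-selected directions
        image = generator(z)
        logits = decision_model(image)
        # Push the decision toward the target class...
        flip_loss = F.cross_entropy(logits, target_class)
        # ...while keeping the edit minimal (proximity to the query image's latents).
        dist_loss = (delta * mask).pow(2).mean()
        loss = flip_loss + dist_weight * dist_loss

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    # Decode the edited latents into the counterfactual image.
    return generator(z_init + delta.detach() * mask)
```

The mask is what makes the search object-centric in spirit: by restricting updates to, say, the position coordinates of one object's latent code, the user chooses which directions the counterfactual is allowed to explore.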