Counterfactuals have been established as a popular explainability technique which leverages a set of minimal edits to alter the prediction of a classifier. When considering conceptual counterfactuals on images, the edits requested should correspond to salient concepts present in the input data. At the same time, conceptual distances are defined by knowledge graphs, ensuring the optimality of conceptual edits. In this work, we extend previous endeavors on graph edits as counterfactual explanations by conducting a comparative study which encompasses both supervised and unsupervised Graph Neural Network (GNN) approaches. To this end, we pose the following significant research question: should we represent input data as graphs, which is the optimal GNN approach in terms of performance and time efficiency to generate minimal and meaningful counterfactual explanations for black-box image classifiers?
翻译:暂无翻译