Predictive uncertainties in classification tasks are often a consequence of model inadequacy or insufficient training data. In popular applications, such as image processing, we are often required to scrutinise these uncertainties by meaningfully attributing them to input features, which helps to improve interpretability assessments. However, few effective frameworks exist for this purpose. Vanilla forms of popular saliency methods, such as SHAP or integrated gradients, adapt poorly when the attribution target is a measure of uncertainty. State-of-the-art tools therefore proceed instead by constructing counterfactual or adversarial feature vectors, and assign attributions through direct comparison with the original image. In this paper, we present a novel framework that combines path integrals, counterfactual explanations and generative models to produce attributions that contain little noise and few observable artefacts. We demonstrate that this framework outperforms existing alternatives through quantitative evaluations using popular benchmarking methods and data sets of varying complexity.
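To make the combination of path integrals and counterfactuals concrete, the sketch below shows one plausible instantiation, not the paper's exact method: an integrated-gradients-style path integral of predictive entropy taken along the straight line from a counterfactual image (where the classifier is confident) to the original input. The names `model` and `x_cf`, the choice of entropy as the uncertainty measure, the straight-line path, and the step count are all illustrative assumptions.

```python
# Minimal sketch, assuming a PyTorch classifier `model` mapping images to
# logits and a precomputed counterfactual `x_cf` of the same shape as `x`.
import torch

def predictive_entropy(logits: torch.Tensor) -> torch.Tensor:
    """Shannon entropy of the softmax distribution, one value per sample."""
    log_p = torch.log_softmax(logits, dim=-1)
    return -(log_p.exp() * log_p).sum(dim=-1)

def uncertainty_path_attribution(model, x, x_cf, n_steps=50):
    """Attribute predictive entropy to pixels by integrating its gradient
    along the straight-line path from the counterfactual x_cf to x."""
    alphas = torch.linspace(0.0, 1.0, n_steps)
    grad_sum = torch.zeros_like(x)
    for alpha in alphas:
        # Point on the path, detached so each step has a fresh graph.
        point = (x_cf + alpha * (x - x_cf)).detach().requires_grad_(True)
        entropy = predictive_entropy(model(point)).sum()
        grad_sum += torch.autograd.grad(entropy, point)[0]
    # Riemann approximation of the path integral, scaled by displacement.
    return (x - x_cf) * grad_sum / n_steps
```

Under this formulation the completeness property of path methods distributes the entropy gap between the confident counterfactual and the uncertain input over individual pixels, which is one way to obtain attributions anchored to a counterfactual rather than to an arbitrary black baseline.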