An important step towards explaining deep image classifiers lies in identifying the image regions that contribute to individual class scores in the model's output. However, doing this accurately is difficult due to the black-box nature of such networks. Most existing approaches find such attributions either by using activations and gradients or by repeatedly perturbing the input. We instead address this challenge by training a second deep network, the Explainer, to predict attributions for a pre-trained black-box classifier, the Explanandum. These attributions take the form of masks that show only the classifier-relevant parts of an image, masking out the rest. Our approach produces sharper and more boundary-precise masks than the saliency maps generated by other methods. Moreover, unlike most existing approaches, ours can directly generate highly distinct class-specific masks. Finally, the proposed method is very efficient at inference time, since a single forward pass through the Explainer generates all class-specific masks. We show that our attributions are superior to established methods both visually and quantitatively by evaluating them on the PASCAL VOC-2007 and Microsoft COCO-2014 datasets.
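To make the described pipeline concrete, below is a minimal, hypothetical PyTorch sketch of the inference step: a trained Explainer maps an image to one mask per class in a single forward pass, and a chosen class mask is applied to the input before it is fed to the black-box classifier. All module names, layer choices, and shapes here are illustrative assumptions, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class Explainer(nn.Module):
    """Illustrative Explainer: maps an image to C class-specific masks
    in one forward pass (architecture is an assumption, not the paper's)."""
    def __init__(self, num_classes: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
        )
        # One output channel per class; sigmoid keeps mask values in [0, 1].
        self.mask_head = nn.Conv2d(32, num_classes, 1)

    def forward(self, x):
        return torch.sigmoid(self.mask_head(self.features(x)))

# Usage sketch: all class masks come from one Explainer forward pass.
explainer = Explainer(num_classes=20)       # e.g. the 20 PASCAL VOC classes
image = torch.rand(1, 3, 224, 224)          # dummy input image
masks = explainer(image)                    # shape (1, 20, 224, 224)
masked_input_c = image * masks[:, 7:8]      # keep only regions relevant to class 7
```

The masked image would then be passed to the frozen Explanandum to verify that the retained regions preserve the corresponding class score.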