Various saliency map methods have been proposed to interpret and explain the predictions of deep learning models. Saliency maps indicate which parts of the input signal strongly influence the prediction result. However, because a saliency map is produced by complex computations within a deep learning model, it is often difficult to assess how reliable the map itself is. In this study, we propose a method that quantifies the reliability of a salient region in the form of a p-value. Our idea is to regard a salient region as a hypothesis selected by the trained deep learning model and to employ the selective inference framework. The proposed method provably controls the probability of false positive detections of salient regions. We demonstrate the validity of the proposed method through numerical examples on synthetic and real datasets. Furthermore, we develop a Keras-based framework for conducting the proposed selective inference for a wide class of CNNs without additional implementation cost.
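To make the notion of a "salient region as a selected hypothesis" concrete, the following is a minimal sketch, not the authors' released framework, of how a vanilla-gradient saliency map for a trained Keras CNN can be thresholded into a binary salient region. The names `model`, `x`, and the threshold `tau` are illustrative assumptions; the selective inference step that assigns a p-value to the resulting region is not shown here.

```python
# Minimal illustrative sketch (assumed setup, not the paper's framework):
# compute a vanilla-gradient saliency map for a trained Keras CNN and
# threshold it into a binary salient region.
import tensorflow as tf

def salient_region(model, x, tau=0.5):
    """Return a binary mask of input locations whose saliency exceeds tau."""
    x = tf.convert_to_tensor(x)
    with tf.GradientTape() as tape:
        tape.watch(x)
        score = tf.reduce_max(model(x), axis=-1)  # score of the top class
    grad = tape.gradient(score, x)                # d(score)/d(input)
    saliency = tf.reduce_max(tf.abs(grad), axis=-1)   # max over channels
    saliency /= tf.reduce_max(saliency) + 1e-12       # normalize to [0, 1]
    # The thresholded mask is the data-dependent "selected hypothesis";
    # selective inference would then test it while conditioning on the
    # selection event.
    return saliency >= tau
```

Because the region depends on the same data used to test it, a naive p-value for the region would be biased; conditioning on this selection event is what the selective inference framework in the abstract addresses.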