This paper analyzes the predictions of image captioning models with attention mechanisms beyond visualizing the attention itself. We develop variants of layer-wise relevance propagation (LRP) and gradient-based explanation methods tailored to image captioning models with attention mechanisms. We systematically compare the interpretability of attention heatmaps against the explanations produced by methods such as LRP, Grad-CAM, and Guided Grad-CAM. We show that these explanation methods provide, for each word in a predicted caption, both pixel-wise image explanations (supporting and opposing pixels of the input image) and linguistic explanations (supporting and opposing words of the preceding sequence). We demonstrate through extensive experiments that explanation methods 1) reveal additional evidence, beyond attention, that the model uses to make decisions; 2) correlate with object locations with high precision; and 3) help to "debug" the model, e.g. by analyzing the causes of hallucinated object words. Building on these observed properties, we further design an LRP-inference fine-tuning strategy that reduces object hallucination in image captioning models while maintaining sentence fluency. We conduct experiments with two widely used attention mechanisms: the adaptive attention mechanism computed with additive attention and the multi-head attention mechanism computed with the scaled dot product.
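The abstract does not spell out how relevance is propagated; as a rough, non-authoritative illustration of the LRP principle that the proposed variants build on, the sketch below implements the standard epsilon rule for a single linear layer. The function name `lrp_linear_epsilon`, the toy shapes, and the random inputs are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def lrp_linear_epsilon(x, w, b, relevance_out, eps=1e-6):
    """Redistribute relevance through a linear layer with the LRP epsilon rule:
    R_i = sum_j (x_i * w_ij / (z_j + eps * sign(z_j))) * R_j.

    x:             input activations, shape (d_in,)
    w:             weight matrix, shape (d_in, d_out)
    b:             bias, shape (d_out,)
    relevance_out: relevance assigned to the layer output, shape (d_out,)
    """
    z = x @ w + b                                   # forward pre-activations, shape (d_out,)
    denom = z + eps * np.where(z >= 0, 1.0, -1.0)   # stabilized denominator
    contributions = (x[:, None] * w) / denom        # each input's share of each output
    return contributions @ relevance_out            # relevance redistributed to the inputs

# Toy usage: the relevance of one output unit (e.g. a predicted word score)
# is propagated back to the input features.
rng = np.random.default_rng(0)
x = rng.standard_normal(8)
w = rng.standard_normal((8, 4))
b = rng.standard_normal(4)
R_out = np.array([0.0, 1.0, 0.0, 0.0])  # explain only the second output unit
R_in = lrp_linear_epsilon(x, w, b, R_out)
print(R_in.shape)  # (8,)
```

Applied layer by layer through the captioning model, this kind of rule yields the pixel-wise and linguistic relevance scores referred to above; the paper's actual variants additionally handle the attention modules themselves.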