ATCON: Attention Consistency for Vision Models

Attention map methods, also called attribution map methods, are designed to highlight the regions of a model's input that were discriminative for its predictions. However, different attention map methods can highlight different regions of the same input, sometimes yielding contradictory explanations for a single prediction. This effect is exacerbated when the training set is small. Such disagreement indicates either that the model learned incorrect representations or that the attention map methods did not accurately estimate the model's representations. We propose an unsupervised fine-tuning method that optimizes the consistency of attention maps, and we show that it improves both classification performance and the quality of the attention maps. We propose an implementation for two state-of-the-art attention computation methods, Grad-CAM and Guided Backpropagation, which relies on an input masking technique. We also show results for the combination of Grad-CAM and Integrated Gradients in an ablation study. We evaluate the method on our own dataset for event detection in continuous video recordings of hospital patients, aggregated and curated for this work. As a sanity check, we also evaluate the method on PASCAL VOC and SVHN. With small training sets, the proposed method achieves a 6.6-point improvement in F1 score over the baselines on our video dataset, a 2.9-point improvement in F1 score on PASCAL, and a 1.8-point improvement in mean Intersection over Union over Grad-CAM for weakly supervised detection on PASCAL. These improved attention maps may help clinicians better understand vision model predictions and ease the deployment of machine learning systems into clinical care. We share part of the code for this article at the following repository: https://github.com/alimirzazadeh/SemisupervisedAttention.
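
To make the notion of attention consistency concrete, the following minimal sketch scores the agreement between two attention maps of the same input. The abstract does not specify the consistency objective, so the use of Pearson correlation here is an assumption chosen for illustration, and the names `consistency_loss`, `grad_cam_like`, and `guided_bp_like_*` are hypothetical. The input masking technique is not shown, and an actual fine-tuning setup would need a differentiable version of this score inside a deep learning framework.

```python
import math
from typing import List, Sequence

Map2D = Sequence[Sequence[float]]


def flatten(attention_map: Map2D) -> List[float]:
    """Flatten a 2D attention map (rows of floats) into a single list."""
    return [float(v) for row in attention_map for v in row]


def pearson_correlation(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences.

    Returns 0.0 when either sequence is constant (correlation undefined).
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("maps must be non-empty and have the same number of pixels")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def consistency_loss(map_a: Map2D, map_b: Map2D) -> float:
    """Hypothetical loss (an assumption, not the paper's objective).

    Close to 0 when the maps are perfectly correlated, 1 when they are
    uncorrelated, and close to 2 when they are perfectly anti-correlated.
    """
    return 1.0 - pearson_correlation(flatten(map_a), flatten(map_b))


# Toy 3x3 maps standing in for the outputs of two attention methods.
grad_cam_like = [[0.0, 0.1, 0.0], [0.2, 0.9, 0.3], [0.0, 0.2, 0.0]]
# A scaled copy of the first map: same regions highlighted, different scale.
guided_bp_like_agree = [[0.0, 0.2, 0.0], [0.4, 1.8, 0.6], [0.0, 0.4, 0.0]]
# A map that highlights the corners instead of the center.
guided_bp_like_disagree = [[0.9, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.8]]

loss_agree = consistency_loss(grad_cam_like, guided_bp_like_agree)
loss_disagree = consistency_loss(grad_cam_like, guided_bp_like_disagree)
print(f"agreeing maps:    loss = {loss_agree:.3f}")
print(f"disagreeing maps: loss = {loss_disagree:.3f}")
```

Because correlation is invariant to positive scaling and shifting, this score compares where the two maps place their mass rather than their absolute magnitudes, which matters when the two methods produce values on different scales.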
