Attention mechanisms are frequently used to learn discriminative features for better feature representations. In this paper, we extend the attention mechanism to the task of weakly supervised object localization (WSOL) and propose the dual-attention guided dropblock module (DGDM), which aims to learn informative and complementary visual patterns for WSOL. The module contains two key components: channel attention guided dropout (CAGD) and spatial attention guided dropblock (SAGD). To model channel interdependencies, CAGD ranks the channel attentions and treats the top-k attentions with the largest magnitudes as the important ones. It also keeps some low-valued elements so that their values can increase if they become important during training. SAGD efficiently removes the most discriminative information by erasing contiguous regions of the feature maps rather than individual pixels, which guides the model to capture the less discriminative parts for classification. Furthermore, it can distinguish foreground objects from background regions to alleviate attention misdirection. Experimental results demonstrate that the proposed method achieves new state-of-the-art localization performance.
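The two components described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify how the attentions are computed, so the sketch assumes global average pooling for channel attention and a min-max normalized channel-wise mean for spatial attention. The function names and the parameters `k`, `keep_prob`, `block_size`, and `drop_threshold` are hypothetical and chosen only for illustration.

```python
import numpy as np


def channel_attention_guided_dropout(features, k, keep_prob=0.5, rng=None):
    """Keep the top-k channels by attention magnitude and a random subset of the rest.

    features: array of shape (C, H, W).
    Assumption: channel attention is the spatial mean of each channel.
    """
    rng = np.random.default_rng() if rng is None else rng
    attention = features.mean(axis=(1, 2))        # shape (C,)
    order = np.argsort(-np.abs(attention))        # largest magnitude first
    mask = np.zeros(features.shape[0], dtype=features.dtype)
    mask[order[:k]] = 1.0                         # top-k channels are always kept
    low = order[k:]
    # Keep some low-valued channels at random so they can still be updated
    # and gain importance later in training.
    mask[low] = rng.random(low.size) < keep_prob
    return features * mask[:, None, None], mask


def spatial_attention_guided_dropblock(features, block_size=3, drop_threshold=0.8):
    """Erase contiguous blocks centered on the most strongly attended positions.

    features: array of shape (C, H, W).
    Assumption: spatial attention is the channel-wise mean, min-max normalized.
    """
    assert block_size % 2 == 1, "this sketch assumes an odd block size"
    attention = features.mean(axis=0)             # shape (H, W)
    span = attention.max() - attention.min()
    attention = (attention - attention.min()) / (span + 1e-12)
    seeds = attention >= drop_threshold           # most discriminative positions
    # Grow every seed into a block_size x block_size square (binary dilation).
    pad = block_size // 2
    padded = np.pad(seeds, pad, mode="constant", constant_values=False)
    height, width = seeds.shape
    drop = np.zeros_like(seeds)
    for dy in range(block_size):
        for dx in range(block_size):
            drop |= padded[dy:dy + height, dx:dx + width]
    keep = ~drop
    return features * keep[None, :, :], keep


# Usage on a random feature map with 8 channels and 7x7 spatial resolution.
rng = np.random.default_rng(0)
feature_map = rng.random((8, 7, 7))
channel_out, channel_mask = channel_attention_guided_dropout(feature_map, k=3, rng=rng)
spatial_out, keep_mask = spatial_attention_guided_dropblock(feature_map, block_size=3)
```

Erasing square blocks around the high-attention seeds, instead of only the seed pixels, is what removes contiguous regions rather than individual pixels; a larger `block_size` hides more of the discriminative region and pushes the classifier toward the remaining parts of the object.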