Channel and spatial attention mechanisms have been shown to noticeably improve the performance of deep convolutional neural networks (CNNs). Most existing methods use only one of the two, or simply run them in parallel or in series, neglecting the collaboration between the two types of attention. To better establish feature interaction between them, we propose a plug-and-play attention module, which we term "CAT": activating the Collaboration between spatial and channel Attentions based on learned Traits. Specifically, we represent traits as trainable coefficients (i.e., colla-factors) that adaptively combine the contributions of the different attention modules to better fit different image hierarchies and tasks. Moreover, in addition to the global average pooling (GAP) and global maximum pooling (GMP) operators, we propose global entropy pooling (GEP), which effectively suppresses noise signals by measuring the information disorder of feature maps. We introduce a three-way pooling operation into the attention modules and apply the adaptive mechanism to fuse their outcomes. Extensive experiments on MS COCO, Pascal-VOC, CIFAR-100, and ImageNet show that CAT outperforms existing state-of-the-art attention mechanisms in object detection, instance segmentation, and image classification. The model and code will be released soon.
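
The three-way pooling and its adaptive fusion can be illustrated with a minimal sketch. It is not the released implementation. The abstract does not give the exact form of GEP or of the colla-factors, so the sketch makes two assumptions: GEP is taken to be the Shannon entropy of the softmax-normalised activations of a feature map, and the colla-factors are taken to be three trainable scalars normalised by a softmax. All function names are hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def global_avg_pool(fmap):
    """GAP: mean activation of one flattened feature map."""
    return sum(fmap) / len(fmap)


def global_max_pool(fmap):
    """GMP: largest activation of one flattened feature map."""
    return max(fmap)


def global_entropy_pool(fmap):
    """GEP (assumed form): Shannon entropy of the softmax-normalised map.

    A flat map gives high entropy (high disorder); a sharply peaked map
    gives entropy close to zero.
    """
    probs = softmax(fmap)
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def fused_channel_descriptor(channels, colla_logits):
    """Fuse GAP, GMP and GEP per channel with learned weights.

    channels:     list of flattened feature maps, one per channel.
    colla_logits: three scalars standing in for trainable colla-factors
                  (assumption: normalised here with a softmax).
    """
    w_avg, w_max, w_ent = softmax(colla_logits)
    return [
        w_avg * global_avg_pool(c)
        + w_max * global_max_pool(c)
        + w_ent * global_entropy_pool(c)
        for c in channels
    ]


if __name__ == "__main__":
    flat = [1.0, 1.0, 1.0, 1.0]      # uniform map, maximal entropy
    peaked = [100.0, 0.0, 0.0, 0.0]  # one dominant activation
    print(fused_channel_descriptor([flat, peaked], [0.0, 0.0, 0.0]))
```

In a real module the three logits would be learned by backpropagation together with the rest of the network, and a corresponding fusion would be applied on the spatial attention branch as well; the sketch covers only the per-channel descriptor.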