Attention mechanisms, especially self-attention, play an increasingly important role in deep feature representation for visual tasks. Self-attention updates the feature at each position by computing a weighted sum of features over all positions, using pairwise affinities to capture long-range dependencies within a single sample. However, self-attention has quadratic complexity and ignores potential correlations between different samples. This paper proposes a novel attention mechanism, which we call external attention, based on two external, small, learnable, and shared memories; it can be implemented easily using two cascaded linear layers and two normalization layers, and it conveniently replaces self-attention in existing popular architectures. External attention has linear complexity and implicitly considers the correlations between all samples. Extensive experiments on image classification, semantic segmentation, image generation, point cloud classification, and point cloud segmentation tasks show that our method provides comparable or superior performance to the self-attention mechanism and some of its variants, at much lower computational and memory cost.
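The sketch below illustrates the idea in PyTorch: the two external memories are realized as two bias-free linear layers, followed by a softmax over positions and an L1 normalization over memory units. The class name, the memory size S=64, the epsilon, and the tensor shapes are illustrative assumptions for this sketch, not the paper's reference implementation.

```python
import torch
import torch.nn as nn


class ExternalAttention(nn.Module):
    """Minimal sketch of external attention: two small, learnable, shared
    memories (M_k, M_v) implemented as cascaded linear layers plus double
    normalization. d_model and S are illustrative hyperparameters."""

    def __init__(self, d_model: int, S: int = 64):
        super().__init__()
        self.mk = nn.Linear(d_model, S, bias=False)  # external key memory M_k
        self.mv = nn.Linear(S, d_model, bias=False)  # external value memory M_v

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, N, d_model), where N is the number of positions (pixels/points)
        attn = self.mk(x)                                      # (batch, N, S): affinities to memory units
        attn = torch.softmax(attn, dim=1)                      # normalize over positions
        attn = attn / (attn.sum(dim=2, keepdim=True) + 1e-9)   # L1-normalize over memory units
        return self.mv(attn)                                   # (batch, N, d_model)


# Usage example (shapes are arbitrary):
ea = ExternalAttention(d_model=256, S=64)
x = torch.randn(2, 1024, 256)
y = ea(x)  # same shape as x
```

Because the number of memory units S is fixed and independent of the input length N, the attention map costs O(N·S), i.e., linear in N, and the learned memories are shared across all samples rather than computed per sample as in self-attention.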