In NLP, convolutional neural networks (CNNs) have benefited less than recurrent neural networks (RNNs) from attention mechanisms. We hypothesize that this is because the attention in CNNs has been mainly implemented as attentive pooling (i.e., it is applied to pooling) rather than as attentive convolution (i.e., it is integrated into convolution). Convolution is the differentiator of CNNs in that it can powerfully model the higher-level representation of a word by taking into account its local fixed-size context in the input text t^x. In this work, we propose an attentive convolution network, ATTCONV. It extends the context scope of the convolution operation, deriving higher-level features for a word not only from local context, but also from information extracted from nonlocal context by the attention mechanism commonly used in RNNs. This nonlocal context can come (i) from parts of the input text t^x that are distant or (ii) from extra (i.e., external) contexts t^y. Experiments on sentence modeling with zero context (sentiment analysis), single context (textual entailment), and multiple contexts (claim verification) demonstrate the effectiveness of ATTCONV in sentence representation learning with the incorporation of context. In particular, attentive convolution outperforms attentive pooling and is a strong competitor to popular attentive RNNs.
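To make the idea concrete, the following is a minimal NumPy sketch of one way a convolution step could integrate an attention-derived context vector, as opposed to applying attention only at the pooling stage. The dot-product scoring, the parameter names (W_local, W_att), and the shapes are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attentive_convolution(X, Y, W_local, W_att, b, window=3):
    """Sketch of an attentive convolution layer.

    X: (n, d) word vectors of the input text t^x
    Y: (m, d) word vectors of a context text t^y (Y can be X itself,
       so that attention covers distant parts of t^x)
    W_local: (d_out, window * d) filter over the local window
    W_att:   (d_out, d)          filter over the attentive context
    b:       (d_out,)            bias
    Returns H: (n, d_out) higher-level word representations.
    """
    n, d = X.shape
    pad = window // 2
    X_pad = np.vstack([np.zeros((pad, d)), X, np.zeros((pad, d))])
    H = np.zeros((n, W_local.shape[0]))
    for i in range(n):
        # Attention of word i over all context words; dot-product
        # scoring is an assumption made for brevity.
        alpha = softmax(Y @ X[i])
        c_i = alpha @ Y                           # attentive context vector
        local = X_pad[i:i + window].reshape(-1)   # local fixed-size window
        # Convolution output combines the local window with the
        # nonlocal, attention-derived context.
        H[i] = np.tanh(W_local @ local + W_att @ c_i + b)
    return H
```

In this sketch, setting Y = X corresponds to case (i), attention over distant parts of t^x, while passing a separate text gives case (ii), attention over an external context t^y.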