DGA-Net: Dynamic Gaussian Attention Network for Sentence Semantic Matching

Sentence semantic matching requires an agent to determine the semantic relation between two sentences. Much recent progress on this task has been driven by advances in representation learning and by inspiration from human behavior. Among these methods, the attention mechanism plays an essential role by effectively selecting the important parts of the input. However, current attention methods either focus on all the important parts in a static way or dynamically select only one important part at each attention step, which leaves considerable room for improvement. To this end, in this paper, we design a novel Dynamic Gaussian Attention Network (DGA-Net) that combines the advantages of current static and dynamic attention methods. More specifically, we first leverage a pre-trained language model to encode the input sentences and construct semantic representations from a global perspective. Then, we develop a Dynamic Gaussian Attention (DGA) mechanism to dynamically capture the important parts and their corresponding local contexts from a detailed perspective. Finally, we combine the global information and the detailed local information to decide the semantic relation of the sentences comprehensively and precisely. Extensive experiments on two popular sentence semantic matching tasks demonstrate that the proposed DGA-Net is effective in improving the capability of the attention mechanism.
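The abstract does not give the DGA formulation, so the following is only a minimal sketch of the general idea of selecting one focus point and attending to its local context with a Gaussian window. It is not the authors' implementation. The function names, the choice of the highest-scoring token as the focus point, and the `sigma` width parameter are all assumptions made for illustration, and only a single attention step is shown.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gaussian_attention(scores, sigma=1.0):
    """Return (center, weights) for one attention step.

    scores: one relevance score per token (hypothetical input, e.g. a
            query-key dot product computed elsewhere).
    sigma:  assumed width of the local window, measured in tokens.
    """
    base = softmax(scores)
    # Assumption: the focus point is the highest-scoring token.
    center = max(range(len(base)), key=base.__getitem__)
    # Gaussian prior over positions, peaked at the focus point.
    prior = [
        math.exp(-((i - center) ** 2) / (2 * sigma ** 2))
        for i in range(len(base))
    ]
    # Combine content-based weights with the positional prior, then renormalize.
    weighted = [b * p for b, p in zip(base, prior)]
    total = sum(weighted)
    return center, [w / total for w in weighted]


def attend(vectors, weights):
    """Weighted sum of token vectors (each a list of floats)."""
    dim = len(vectors[0])
    return [sum(w * v[k] for w, v in zip(weights, vectors)) for k in range(dim)]


# Toy example: five tokens, the third one is the most relevant.
scores = [0.1, 0.2, 2.0, 0.3, 0.1]
center, weights = gaussian_attention(scores, sigma=1.0)
vectors = [[float(i), 1.0] for i in range(5)]
context = attend(vectors, weights)
```

In this sketch a smaller `sigma` concentrates the weights on the focus token, while a larger `sigma` approaches ordinary softmax attention over the whole sentence.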


Related content

Attention mechanism


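The attention mechanism was first proposed in the field of computer vision, but it really took off with the Google DeepMind paper "Recurrent Models of Visual Attention" [14], which applied attention to an RNN model for image classification. Subsequently, Bahdanau et al., in "Neural Machine Translation by Jointly Learning to Align and Translate" [1], used an attention-like mechanism to perform translation and alignment jointly in a machine translation task; their work is regarded as the first to apply attention to NLP. Attention-based extensions of RNN models were then applied to a wide range of NLP tasks. More recently, how to use attention in CNNs has also become an active research topic. The figure below shows the general trend of attention research.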

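Digital Health White Paper, 17-page PDF

Zhuanzhi Member Service

109+ reads · January 6, 2021

[Practical book] Machine Learning Cheat Sheet, 135-page PDF

Zhuanzhi Member Service

127+ reads · November 20, 2020

[AAAI 2020] Hyperbolic Graph Attention Network

Zhuanzhi Member Service

94+ reads · June 15, 2020

[CVPR 2020, Institute of Computing Technology, Chinese Academy of Sciences] Self-supervised Equivariant Attention Mechanism for Weakly Supervised Semantic Segmentation

Zhuanzhi Member Service

76+ reads · April 10, 2020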