Recently, a large number of neural mechanisms and models have been proposed for sequence learning, among which self-attention, as exemplified by the Transformer model, and graph neural networks (GNNs) have attracted much attention. In this paper, we propose an approach that combines these two methods and draws on their complementary strengths. Specifically, we propose contextualized non-local neural networks (CN$^{\textbf{3}}$), which can both dynamically construct a task-specific structure for a sentence and leverage rich local dependencies within a particular neighborhood. Experimental results on ten NLP tasks spanning text classification, semantic matching, and sequence labeling show that our proposed model outperforms competitive baselines and discovers task-specific dependency structures, thus providing better interpretability to users.
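To make the high-level idea concrete, the following is a minimal, hypothetical sketch (not the authors' exact CN$^{\textbf{3}}$ implementation) of a block that first contextualizes tokens with a local convolution over a fixed neighborhood, then dynamically builds a task-specific affinity structure over the sentence via attention and aggregates non-local information along it. All module names and hyperparameters (e.g., ContextualizedNonLocalBlock, kernel_size) are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ContextualizedNonLocalBlock(nn.Module):
    """Sketch: fuse local neighborhood features with non-local aggregation
    over a dynamically constructed, task-specific affinity graph."""

    def __init__(self, d_model: int, kernel_size: int = 3):
        super().__init__()
        # Local branch: 1-D convolution captures rich local dependencies
        # within a fixed neighborhood around each token.
        self.local_conv = nn.Conv1d(d_model, d_model,
                                    kernel_size, padding=kernel_size // 2)
        # Non-local branch: projections used to build pairwise affinities
        # (the dynamically constructed sentence structure).
        self.query = nn.Linear(d_model, d_model)
        self.key = nn.Linear(d_model, d_model)
        self.value = nn.Linear(d_model, d_model)
        self.out = nn.Linear(2 * d_model, d_model)

    def forward(self, x):  # x: (batch, seq_len, d_model)
        # Contextualize each token with its local neighborhood.
        local = self.local_conv(x.transpose(1, 2)).transpose(1, 2)
        # Build the task-specific structure dynamically: pairwise affinities
        # computed from the locally contextualized representations.
        q, k, v = self.query(local), self.key(local), self.value(local)
        affinity = torch.softmax(
            q @ k.transpose(1, 2) / (x.size(-1) ** 0.5), dim=-1)
        non_local = affinity @ v
        # Fuse local and non-local views of every token.
        return self.out(torch.cat([local, non_local], dim=-1))

# Usage: a batch of 2 sentences, 10 tokens each, 64-dim embeddings.
block = ContextualizedNonLocalBlock(d_model=64)
out = block(torch.randn(2, 10, 64))  # -> (2, 10, 64)
```

The design choice illustrated here is the pairing of the two branches: the convolutional branch plays the role of the local dependencies, while the attention-derived affinity matrix stands in for the dynamically constructed, task-specific graph that a GNN-style layer would propagate over.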