Sentence semantic matching requires a model to determine the semantic relation between two sentences, and is widely used in various natural language tasks, such as Natural Language Inference (NLI) and Paraphrase Identification (PI). Much recent progress has been made in this area, especially with attention-based methods and methods built on pre-trained language models. However, most of these methods attend to all the important parts of sentences in a static way and only emphasize how important the words are to the query, limiting the ability of the attention mechanism. To overcome this problem and boost the performance of the attention mechanism, we propose a novel dynamic re-read attention, which can pay close attention to one small region of a sentence at each step and re-read the important parts for better sentence representations. Based on this attention variant, we develop a novel Dynamic Re-read Network (DRr-Net) for sentence semantic matching. Moreover, selecting one small region in dynamic re-read attention seems insufficient for capturing full sentence semantics, and employing pre-trained language models as input encoders introduces incomplete and fragile representation problems. To this end, we extend DRr-Net to the Locally-Aware Dynamic Re-read Attention Net (LadRa-Net), in which the local structure of sentences is employed to alleviate the shortcoming of Byte-Pair Encoding (BPE) in pre-trained language models and to boost the performance of dynamic re-read attention. Extensive experiments on two popular sentence semantic matching tasks demonstrate that DRr-Net can significantly improve the performance of sentence semantic matching. Meanwhile, LadRa-Net achieves better performance by considering the local structure of sentences. In addition, it is exceedingly interesting that some discoveries in our experiments are consistent with findings of psychological research.
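The core idea of dynamic re-read attention described above can be illustrated with a minimal sketch, assuming a toy formulation: at each step, attention scores over the words are conditioned on the current state, a single small region (here, the highest-scoring word) is selected, and the state is updated by re-reading that region. All function names, the projection matrix, and the state-update rule below are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def dynamic_reread(H, steps=3, seed=0):
    """Toy sketch of dynamic re-read attention (illustrative only).

    H     : (n_words, d) array of word representations for one sentence.
    steps : number of re-read steps.

    Returns the final sentence state and the indices of the regions
    attended at each step.
    """
    rng = np.random.default_rng(seed)
    n, d = H.shape
    W = rng.standard_normal((d, d)) * 0.1   # toy projection (assumption)
    v = H.mean(axis=0)                      # initial state: mean of word vectors
    attended = []
    for _ in range(steps):
        scores = H @ (W @ v)                # scores conditioned on current state
        alpha = softmax(scores)
        idx = int(alpha.argmax())           # focus on one small region per step
        attended.append(idx)
        v = 0.5 * v + 0.5 * H[idx]          # re-read: blend state with that region
    return v, attended
```

Because the scores are recomputed from the updated state at every step, the model can shift its focus across steps instead of committing to a single static attention distribution, which is the key contrast with standard attention drawn in the abstract.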