Transformer-based pre-trained language models such as BERT have achieved remarkable results in Semantic Sentence Matching. However, existing models still lack the ability to capture subtle differences: minor noise such as word additions, deletions, or modifications can flip their predictions. To alleviate this problem, we propose a novel Dual Attention Enhanced BERT (DABERT) that strengthens BERT's ability to capture fine-grained differences between sentence pairs. DABERT comprises (1) a Dual Attention module, which measures soft word matches through a new dual-channel alignment mechanism that models both affinity and difference attention, and (2) an Adaptive Fusion module, which uses attention to learn how to aggregate the difference and affinity features and generates a vector describing the matching details of a sentence pair. Extensive experiments on well-studied semantic matching and robustness test datasets demonstrate the effectiveness of the proposed method.
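To make the two channels concrete, the sketch below gives one plausible PyTorch reading of the described architecture: an affinity channel using standard scaled dot-product attention, a difference channel that scores token pairs by their element-wise differences, and a gated adaptive fusion. The class name `DualAttention`, the `diff_score` scorer, and the sigmoid gate are illustrative assumptions under this reading, not the authors' exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualAttention(nn.Module):
    """Sketch of a dual-channel attention layer (assumed design):
    an affinity channel (scaled dot-product) and a difference channel
    that scores token pairs by their element-wise difference, followed
    by a gated adaptive fusion of the two channels."""

    def __init__(self, d_model: int):
        super().__init__()
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)
        self.v = nn.Linear(d_model, d_model)
        # Hypothetical scorer: maps a (query - key) difference vector to a logit.
        self.diff_score = nn.Linear(d_model, 1)
        # Hypothetical gate that mixes affinity and difference features.
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        q, k, v = self.q(x), self.k(x), self.v(x)           # (B, T, D)
        scale = q.size(-1) ** 0.5

        # Affinity channel: standard scaled dot-product attention.
        aff_logits = q @ k.transpose(-2, -1) / scale         # (B, T, T)
        aff = F.softmax(aff_logits, dim=-1) @ v              # (B, T, D)

        # Difference channel: attend using pairwise differences q_i - k_j.
        # Note the pairwise tensor costs O(T^2 * D) memory.
        diff = q.unsqueeze(2) - k.unsqueeze(1)               # (B, T, T, D)
        diff_logits = self.diff_score(diff).squeeze(-1) / scale
        dif = F.softmax(diff_logits, dim=-1) @ v             # (B, T, D)

        # Adaptive fusion: a learned gate decides, per position and
        # feature, how much of each channel to keep.
        g = torch.sigmoid(self.gate(torch.cat([aff, dif], dim=-1)))
        return g * aff + (1 - g) * dif

# Usage: a toy batch of 2 "sentences" of 16 tokens with hidden size 64.
x = torch.randn(2, 16, 64)
out = DualAttention(64)(x)   # -> torch.Size([2, 16, 64])
```

The gated sum is one common way to realize "adaptive fusion"; the paper's module may instead use attention pooling over the two channels, but the intent, letting the model weigh affinity against difference evidence per token, is the same.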