Machine comprehension is a representative task in natural language understanding. Typically, we are given a context paragraph and the objective is to answer a question that depends on that context. Such a problem requires modeling the complex interactions between the context paragraph and the question. Recently, attention mechanisms have proven quite successful at these tasks; in particular, mechanisms in which attention flows both from context to question and from question to context have been shown to be especially useful. In this paper, we study two state-of-the-art attention mechanisms, Bi-Directional Attention Flow (BiDAF) and the Dynamic Co-Attention Network (DCN), and propose a hybrid scheme that combines the two architectures and improves overall performance. Moreover, we propose a new, simpler attention mechanism that we call Double Cross Attention (DCA), which outperforms both the BiDAF and co-attention mechanisms while matching the performance of the hybrid scheme. Our work focuses specifically on the attention layer and on improving it. Our experimental evaluation shows that both proposed models achieve superior results on the Stanford Question Answering Dataset (SQuAD) compared to the BiDAF and DCN attention mechanisms.
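To make the notion of bidirectional attention flow concrete, the NumPy sketch below illustrates the attention computation in the style of BiDAF (Seo et al.). It is a minimal illustration, not this paper's implementation: the context encodings `C`, question encodings `Q`, and the similarity weight vector `w` are placeholder inputs with assumed shapes.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def bidirectional_attention(C, Q, w):
    """BiDAF-style bidirectional attention (illustrative sketch).

    C: (T, d) context encodings, Q: (J, d) question encodings,
    w: (3d,) similarity weights -- all hypothetical placeholders.
    Returns context-to-question and question-to-context summaries.
    """
    T, d = C.shape
    J, _ = Q.shape
    # Similarity matrix S[t, j] = w . [c_t ; q_j ; c_t * q_j]
    S = np.empty((T, J))
    for t in range(T):
        for j in range(J):
            S[t, j] = w @ np.concatenate([C[t], Q[j], C[t] * Q[j]])
    # Context-to-question: each context word attends over question words.
    A = softmax(S, axis=1)        # (T, J) row-wise attention weights
    c2q = A @ Q                   # (T, d) attended question vector per word
    # Question-to-context: attend over context words using the
    # maximum similarity each context word achieves with any question word.
    b = softmax(S.max(axis=1))    # (T,) attention over context positions
    q2c = np.tile(b @ C, (T, 1))  # (T, d) single summary tiled over time
    return c2q, q2c

# Toy usage with random placeholder encodings.
rng = np.random.default_rng(0)
C, Q, w = rng.normal(size=(5, 4)), rng.normal(size=(3, 4)), rng.normal(size=12)
c2q, q2c = bidirectional_attention(C, Q, w)
```

Note the asymmetry: context-to-question attention produces one attended question vector per context word, whereas question-to-context attention collapses to a single context summary (a softmax over the row-wise maximum of S) that is tiled across time steps.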