Extractive question answering (ExQA) is an essential task in Natural Language Processing. The dominant approach to ExQA represents the input sequence tokens (question and passage) with a pre-trained transformer, then uses two learned query vectors to compute distributions over the start and end positions of the answer span. These query vectors are independent of the input, which can be a bottleneck for model performance. To address this problem, we propose \textit{DyREx}, a generalization of the \textit{vanilla} approach in which the query vectors are computed dynamically from the input, using an attention mechanism through transformer layers. Empirical observations demonstrate that our approach consistently improves performance over the standard one. The code and accompanying files for running the experiments are available at \url{https://github.com/urchade/DyReX}.
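To make the contrast concrete, below is a minimal, self-contained PyTorch sketch, not the authors' implementation: the cross-attention decoder, layer count, and module names are illustrative assumptions. It contrasts the vanilla head, which scores positions with two static learned query vectors, with a DyREx-style head in which the queries first attend to the input tokens through transformer layers.

```python
# Minimal sketch (illustrative assumptions, not the released DyREx code)
# contrasting a static-query span head with a dynamic-query variant.
import torch
import torch.nn as nn


class VanillaSpanHead(nn.Module):
    """Two input-independent learned query vectors score start/end positions."""

    def __init__(self, hidden_size: int):
        super().__init__()
        # One query vector per boundary (start, end); fixed after training.
        self.queries = nn.Parameter(torch.randn(2, hidden_size))

    def forward(self, token_reps: torch.Tensor):
        # token_reps: (batch, seq_len, hidden) from a pre-trained transformer.
        logits = torch.einsum("bld,qd->bql", token_reps, self.queries)
        return logits[:, 0], logits[:, 1]  # start logits, end logits


class DynamicQuerySpanHead(nn.Module):
    """DyREx-style head: the two queries are contextualized on the input
    via cross-attention through transformer decoder layers before scoring."""

    def __init__(self, hidden_size: int, num_layers: int = 2, num_heads: int = 8):
        super().__init__()
        self.query_embed = nn.Parameter(torch.randn(2, hidden_size))
        layer = nn.TransformerDecoderLayer(
            d_model=hidden_size, nhead=num_heads, batch_first=True
        )
        self.decoder = nn.TransformerDecoder(layer, num_layers=num_layers)

    def forward(self, token_reps: torch.Tensor):
        batch = token_reps.size(0)
        # Let the start/end queries attend to the question+passage tokens,
        # so the scoring vectors depend on the input.
        queries = self.query_embed.unsqueeze(0).repeat(batch, 1, 1)
        queries = self.decoder(tgt=queries, memory=token_reps)
        logits = torch.einsum("bld,bqd->bql", token_reps, queries)
        return logits[:, 0], logits[:, 1]


if __name__ == "__main__":
    reps = torch.randn(4, 128, 768)  # e.g. BERT-base token representations
    for head in (VanillaSpanHead(768), DynamicQuerySpanHead(768)):
        start, end = head(reps)
        print(type(head).__name__, start.shape, end.shape)  # (4, 128) each
```

Both heads return per-token start and end logits; the dynamic variant simply lets the two queries attend to the input before scoring, which is the generalization of the vanilla approach described above.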