Existing pre-training methods for extractive Question Answering (QA) generate cloze-like queries that differ from natural questions in syntactic structure, which can overfit pre-trained models to simple keyword matching. To address this problem, we propose a novel Momentum Contrastive pRe-training fOr queStion anSwering (MCROSS) method for extractive QA. Specifically, MCROSS introduces a momentum contrastive learning framework to align the answer probabilities between cloze-like and natural query-passage sample pairs. Hence, the pre-trained models can better transfer the knowledge learned from cloze-like samples to answering natural questions. Experimental results on three benchmark QA datasets show that our method achieves noticeable improvements over all baselines in both supervised and zero-shot scenarios.
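The core idea sketched below is a minimal, hypothetical illustration of the alignment described above, not the paper's actual implementation: a momentum-updated "key" encoder scores cloze-like queries, an online "query" encoder scores natural questions, a KL divergence between their answer-position distributions serves as the alignment loss, and the key encoder is updated as an exponential moving average of the online encoder. All names (`W_query`, `W_key`, `answer_probs`) and the linear-encoder simplification are assumptions for illustration.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over answer-position scores.
    e = np.exp(x - x.max())
    return e / e.sum()

def kl_div(p, q, eps=1e-9):
    # KL(p || q): divergence between two answer distributions.
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

rng = np.random.default_rng(0)
# Hypothetical linear "encoders" mapping passage features to
# scores over 4 candidate answer positions.
W_query = rng.normal(size=(4, 6))  # online encoder (natural questions)
W_key = rng.normal(size=(4, 6))    # momentum encoder (cloze-like queries)

def answer_probs(W, features):
    return softmax(W @ features)

features = rng.normal(size=6)  # toy passage representation
m = 0.99                       # momentum coefficient

# Alignment loss: KL between the cloze-view and natural-view
# answer distributions for the same passage.
p_natural = answer_probs(W_query, features)
p_cloze = answer_probs(W_key, features)
loss = kl_div(p_cloze, p_natural)

# Momentum update: the key encoder slowly tracks the online encoder,
# so the cloze-view targets evolve smoothly during pre-training.
W_key = m * W_key + (1 - m) * W_query
```

In practice the encoders would be full Transformer QA models producing start/end span distributions, and the momentum update would be applied to every parameter; the EMA keeps the contrastive targets stable while the online encoder trains.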