Neural retrieval models have achieved significant effectiveness gains over term-based methods in recent years. Nevertheless, these models may be brittle when faced with typos or distribution shifts, and vulnerable to malicious attacks. For instance, several recent papers have demonstrated that such variations severely impact model performance, and have then tried to train more resilient models. Usual approaches include synonym replacement or typo injection as data augmentation, and the use of more robust tokenizers (CharacterBERT, BPE-dropout). To further complement the literature, in this paper we investigate adversarial training as another possible solution to this robustness issue. Our comparison covers the two main families of BERT-based neural retrievers, i.e. dense and sparse, with and without distillation techniques. We then demonstrate that one of the simplest adversarial training techniques, the Fast Gradient Sign Method (FGSM), can improve the robustness and effectiveness of first-stage rankers. In particular, FGSM improves model performance on both in-domain and out-of-domain distributions, as well as on queries with typos, for multiple neural retrievers.
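To make the core idea concrete, below is a minimal, hypothetical sketch of an FGSM perturbation, not the paper's actual training setup. A simple linear relevance scorer with a binary cross-entropy loss stands in for the (much larger) BERT-based retriever, and the input vector `x` stands in for the continuous embeddings that FGSM would perturb during training; the names `score`, `bce_loss`, and `fgsm_example` are illustrative choices, not from the paper.

```python
import numpy as np

def score(w, x):
    # Dot-product relevance score (stand-in for a neural ranker's output).
    return float(w @ x)

def bce_loss(s, y):
    # Binary cross-entropy on a sigmoid of the score.
    p = 1.0 / (1.0 + np.exp(-s))
    return -(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

def fgsm_example(w, x, y, eps=0.1):
    # FGSM: take one step of size eps in the direction of the SIGN of the
    # loss gradient w.r.t. the input representation. For this linear model
    # with BCE, d(loss)/dx = (sigmoid(score) - y) * w, so no autograd needed.
    s = score(w, x)
    p = 1.0 / (1.0 + np.exp(-s))
    grad_x = (p - y) * w
    return x + eps * np.sign(grad_x)

# Toy example: one "relevant" (y = 1) query-document representation.
w = np.array([0.5, -0.3, 0.8])
x = np.array([1.0, 2.0, -1.0])
x_adv = fgsm_example(w, x, y=1.0, eps=0.1)

# By construction the perturbation increases the loss, i.e. x_adv is a
# harder (adversarial) input the model would then be trained on.
assert bce_loss(score(w, x_adv), 1.0) >= bce_loss(score(w, x), 1.0)
```

In adversarial training, the model is updated on these perturbed inputs (often mixed with the clean ones), which is what the paper evaluates for dense and sparse first-stage rankers. For an actual BERT-based retriever the perturbation would be applied to the token embedding layer and the gradient obtained by backpropagation rather than in closed form.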