Long document re-ranking has been a challenging problem for neural re-rankers based on deep language models like BERT. Early work breaks documents into short, passage-like chunks. These chunks are independently mapped to scalar scores or latent vectors, which are then pooled into a final relevance score. These encode-and-pool methods, however, inevitably introduce an information bottleneck: the low-dimensional chunk representations. In this paper, we propose instead to model full query-to-document interaction, leveraging the attention operation and a modular Transformer re-ranker framework. First, document chunks are encoded independently with an encoder module. An interaction module then encodes the query and performs joint attention from the query to all document chunk representations. We demonstrate that the model can use this new degree of freedom to aggregate important information from the entire document. Our experiments show that this design produces effective re-ranking on two classical IR collections, Robust04 and ClueWeb09, and on the large-scale supervised MS-MARCO document ranking collection.
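The two-stage design described above (an independent chunk encoder followed by an interaction module that attends from the query to all chunk representations) can be illustrated with a minimal sketch. The code below is an assumption-laden PyTorch illustration: the module sizes, the mean-pooled scoring head, and the simple concatenation of chunk representations are hypothetical choices for clarity, not the paper's actual configuration.

```python
# Minimal sketch of a modular chunk-encoder + interaction re-ranker.
# Dimensions, pooling, and scoring head are illustrative assumptions.
import torch
import torch.nn as nn


class ChunkEncoder(nn.Module):
    """Encodes each document chunk independently (stand-in for a BERT-style encoder)."""

    def __init__(self, vocab_size=30522, d_model=256, n_layers=2, n_heads=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)

    def forward(self, chunk_ids):
        # chunk_ids: (num_chunks, chunk_len) -> (num_chunks, chunk_len, d_model)
        return self.encoder(self.embed(chunk_ids))


class InteractionModule(nn.Module):
    """Encodes the query and attends from query tokens to all chunk representations."""

    def __init__(self, vocab_size=30522, d_model=256, n_heads=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.score = nn.Linear(d_model, 1)

    def forward(self, query_ids, chunk_reprs):
        # query_ids: (1, query_len); chunk_reprs: (num_chunks, chunk_len, d_model)
        q = self.embed(query_ids)                                # (1, query_len, d_model)
        doc = chunk_reprs.reshape(1, -1, chunk_reprs.size(-1))   # concatenate chunks along length
        attended, _ = self.cross_attn(q, doc, doc)               # query-to-document attention
        return self.score(attended.mean(dim=1)).squeeze(-1)      # scalar relevance score


# Usage: score one query against one long document split into chunks.
encoder, interaction = ChunkEncoder(), InteractionModule()
chunks = torch.randint(0, 30522, (8, 128))   # 8 chunks of 128 tokens each
query = torch.randint(0, 30522, (1, 16))     # 16-token query
relevance = interaction(query, encoder(chunks))
print(relevance)
```

In this sketch, the interaction module sees representations of every chunk at once, so relevance signals scattered across the document can be aggregated by attention rather than squeezed through per-chunk scalar scores.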