BERT-based information retrieval models are expensive in both time (query latency) and computational resources (energy, hardware cost), making many of these models impractical, especially under resource constraints. By relying on a query encoder that only performs tokenization and on pre-processing passage representations at indexing time, the recently proposed TILDE method overcomes the high query latency typical of BERT-based models. This, however, comes at the expense of lower effectiveness compared to other BERT-based re-rankers and dense retrievers. In addition, the original TILDE method produces indexes with a very high memory footprint, as it expands each passage into the size of the BERT vocabulary. In this paper, we propose TILDEv2, a new model that stems from the original TILDE but addresses its limitations. TILDEv2 relies on contextualized exact term matching with expanded passages. This requires storing in the index only the scores of tokens that appear in the expanded passages (rather than the entire vocabulary), thus producing indexes that are 99% smaller than those of TILDE. This matching mechanism also improves ranking effectiveness by 24%, without adding to the query latency. This makes TILDEv2 the state-of-the-art passage re-ranking method for CPU-only environments, capable of maintaining query latency below 100ms on commodity hardware.
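The exact term matching described above can be sketched as a simple lookup-and-sum at query time: each passage stores scores only for tokens in its expanded form, and a query token contributes to the passage score only if it appears in that set. The sketch below is illustrative, assuming the per-passage token scores were produced offline by a BERT-based model at indexing time; the index contents and function names are hypothetical.

```python
def score(query_tokens, passage_index):
    """Sum the stored scores of query tokens that appear in the passage's
    (expanded) token set; tokens absent from the passage contribute nothing."""
    return sum(passage_index.get(t, 0.0) for t in query_tokens)

# Toy index: each passage keeps scores only for tokens in its expanded form,
# not the full BERT vocabulary -- this is what keeps the index small.
index = {
    "p1": {"bert": 1.2, "retrieval": 0.9, "latency": 0.4},
    "p2": {"bert": 0.7, "energy": 1.1},
}

# Re-ranking is then just sorting candidate passages by their matched score.
query = ["bert", "retrieval"]
ranked = sorted(index, key=lambda pid: score(query, index[pid]), reverse=True)
```

Because the query side involves no neural inference, only tokenization and dictionary lookups, this step runs comfortably on a CPU.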