Modern neural encoders offer unprecedented text-image retrieval (TIR) accuracy. However, their high computational cost impedes their adoption in large-scale image search. We propose a novel image ranking algorithm that uses a cascade of increasingly powerful neural encoders to progressively filter images by how well they match a given text. Our algorithm reduces lifetime TIR costs by over 3x.
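The cascade idea can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `cascade_rank`, the `(scorer, keep)` stage format, and the toy word-overlap scorers standing in for neural encoders are all assumptions made for illustration. Each stage scores only the candidates that survived the previous stage, so the more expensive scorer runs on a shrinking subset of the images.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical interface: a scorer maps (text, image) to a similarity, higher is better.
# In a real system each scorer would wrap a neural text/image encoder pair.
Scorer = Callable[[str, str], float]


def cascade_rank(
    text: str,
    images: Sequence[str],
    stages: Sequence[Tuple[Scorer, int]],
) -> List[str]:
    """Rank images for a text by filtering them through successive stages.

    Each stage is (scorer, keep): the surviving candidates are scored,
    sorted best first, and only the top `keep` move on to the next stage.
    """
    candidates = list(images)
    for scorer, keep in stages:
        ranked = sorted(candidates, key=lambda img: scorer(text, img), reverse=True)
        candidates = ranked[:keep]
    return candidates


# Toy stand-ins for encoders (assumption): images are represented by captions.
def cheap_score(text: str, image: str) -> float:
    # Cheap, coarse signal: number of distinct shared words.
    return float(len(set(text.split()) & set(image.split())))


strong_calls = {"n": 0}  # counts how often the expensive scorer is invoked


def strong_score(text: str, image: str) -> float:
    # More discriminative signal: Jaccard similarity of the word sets.
    strong_calls["n"] += 1
    a, b = set(text.split()), set(image.split())
    return len(a & b) / len(a | b)


images = [
    "a dog on a beach",
    "a cat on a sofa",
    "a dog",
    "a red car",
    "beach at sunset",
]

# Stage 1 keeps the 3 best images under the cheap scorer;
# stage 2 re-ranks those 3 with the strong scorer and keeps 2.
ranked = cascade_rank("a dog on a beach", images, [(cheap_score, 3), (strong_score, 2)])
print(ranked)
print("strong scorer calls:", strong_calls["n"])
```

In this toy run the strong scorer is evaluated on three of the five images rather than all of them. How many candidates each stage keeps trades accuracy against cost, and the values here are arbitrary.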