Hashing has been widely studied as a solution to large-scale approximate nearest neighbor search, owing to its efficiency in both time and storage. In recent years, a number of online hashing methods have emerged that update the hash functions to adapt to new stream data and thereby support dynamic retrieval. However, existing online hashing methods must re-encode the whole database with the latest hash functions whenever a query arrives, so retrieval efficiency degrades as the stream data keeps growing. Moreover, these methods ignore the supervision relationships among examples, especially in the multi-label case. In this paper, we propose a novel Fast Online Hashing (FOH) method that updates the binary codes of only a small part of the database. Specifically, we first build a query pool in which the nearest neighbors of each central point are recorded. When a new query arrives, only the binary codes of the corresponding potential neighbors are updated. In addition, we construct a similarity matrix that takes the multi-label supervision information into account, and we introduce a multi-label projection loss to further preserve the similarity among multi-label data. Experimental results on two common benchmarks show that the proposed FOH reduces query time by up to 6.28 seconds relative to state-of-the-art baselines while retaining competitive retrieval accuracy.
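The query-pool idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how central points are chosen, how the hash functions are learned, or how the multi-label similarity matrix is defined. The sketch therefore assumes that central points are given, uses a plain sign-of-projection hash with a hypothetical projection matrix `W`, and uses cosine similarity between multi-hot label vectors as a stand-in similarity. All function names (`build_query_pool`, `encode`, `search`, `label_similarity`) are hypothetical.

```python
import numpy as np


def build_query_pool(features, centers, k):
    """For each central point, record the indices of its k nearest database items."""
    # Squared Euclidean distance between every center and every database item.
    d2 = ((centers[:, None, :] - features[None, :, :]) ** 2).sum(axis=2)
    return np.argsort(d2, axis=1)[:, :k]


def encode(x, W):
    """Sign-of-projection hash (an assumed stand-in): one bit per column of W."""
    return (x @ W > 0).astype(np.uint8)


def search(query, features, codes, centers, pool, W, top=5):
    """Refresh only the pooled neighbors' codes, then rank them by Hamming distance."""
    c = np.argmin(((centers - query) ** 2).sum(axis=1))  # nearest central point
    cand = pool[c]                                        # its recorded neighbors
    codes[cand] = encode(features[cand], W)               # partial update, in place
    ham = (codes[cand] != encode(query, W)).sum(axis=1)   # Hamming distances
    order = np.argsort(ham, kind="stable")[:top]
    return cand[order]


def label_similarity(labels):
    """Cosine similarity between multi-hot label vectors (an assumed definition)."""
    labels = labels.astype(float)
    norms = np.linalg.norm(labels, axis=1, keepdims=True)
    unit = labels / np.maximum(norms, 1e-12)  # guard against all-zero label rows
    return unit @ unit.T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.normal(size=(1000, 32))            # toy database
    centers = features[rng.choice(1000, size=10, replace=False)]
    pool = build_query_pool(features, centers, k=50)

    W_old = rng.normal(size=(32, 16))                 # hash functions before an update
    W_new = rng.normal(size=(32, 16))                 # hash functions after an update
    codes = encode(features, W_old)

    # Only the 50 pooled neighbors of the nearest center are re-encoded with W_new.
    print(search(features[0], features, codes, centers, pool, W_new))
```

In this sketch the per-query cost of refreshing codes depends on the pool size `k` rather than on the database size, which is the property the abstract attributes to FOH. The trade-off is that true neighbors outside the selected pool cannot be returned.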