Locality-sensitive hashing pictures a list-wise sorting problem: its evaluation metrics, e.g., mean average precision, rely on a candidate list ranked by pair-wise code similarity. However, deep hashing models are rarely trained end-to-end on these sorted results because the sorting operation is non-differentiable. This inconsistency between the training and test objectives can lead to sub-optimal performance, since the training loss often fails to reflect the actual retrieval metric. In this paper, we tackle this problem by introducing Naturally-Sorted Hashing (NSH). We sort samples by the Hamming distances of their hash codes and gather their latent representations accordingly for self-supervised training. Thanks to recent advances in differentiable sorting approximations, the hash head receives gradients from the sorter, so the hash encoder can be optimized jointly during training. Additionally, we describe a novel Sorted Noise-Contrastive Estimation (SortedNCE) loss that selectively picks positive and negative samples for contrastive learning, which allows NSH to mine semantic relations among data in an unsupervised manner. Extensive experiments show that the proposed NSH model significantly outperforms existing unsupervised hashing methods on three benchmark datasets.
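To make the sorting step concrete: one common differentiable sorting approximation (this is an illustrative sketch of the NeuralSort-style relaxation, not necessarily the exact sorter used by NSH) replaces the hard permutation with a row-stochastic matrix computed from pairwise score differences, so gradients can flow from a list-wise objective back into the encoder. The function name `soft_sort` and the temperature value below are illustrative choices.

```python
import numpy as np

def soft_sort(s, tau=0.1):
    """Relaxed (differentiable) descending sort of scores s.

    Returns a row-stochastic matrix P_hat; P_hat @ s approximates
    the scores sorted in descending order. Smaller tau -> closer to
    a hard permutation matrix, but with sharper (larger) gradients.
    """
    n = len(s)
    A = np.abs(s[:, None] - s[None, :])            # pairwise |s_j - s_k|
    B = A.sum(axis=1)                              # row sums: A_s @ 1
    scaling = n + 1 - 2 * np.arange(1, n + 1)      # (n + 1 - 2i), i = 1..n
    logits = (scaling[:, None] * s[None, :] - B[None, :]) / tau
    e = np.exp(logits - logits.max(axis=1, keepdims=True))  # stable softmax
    return e / e.sum(axis=1, keepdims=True)

s = np.array([0.3, 1.2, -0.5, 0.8])
P = soft_sort(s, tau=0.05)
print(np.round(P @ s, 2))  # close to s sorted descending: [1.2, 0.8, 0.3, -0.5]
```

In a training loop, the rows of `P` would be used as soft weights to gather latent representations of the nearest candidates (e.g., for the positives in a SortedNCE-style loss), while every operation above remains differentiable with respect to the scores.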