Existing unsupervised hashing methods typically adopt a feature similarity preservation paradigm. As a result, they overlook the intrinsic similarity capacity discrepancy between the continuous feature space and the discrete hash code space. Specifically, since the feature similarity distribution is intrinsically biased (e.g., moderately positive similarity scores on negative pairs), the hash code similarities of positive and negative pairs often become inseparable (i.e., the similarity collapse problem). To solve this problem, this paper introduces a novel Similarity Distribution Calibration (SDC) method. Instead of matching individual pairwise similarity scores, SDC aligns the hash code similarity distribution towards a calibration distribution (e.g., a beta distribution) with sufficient spread across the entire similarity capacity/range, alleviating the similarity collapse problem. Extensive experiments show that our SDC outperforms state-of-the-art alternatives on both coarse category-level and instance-level image retrieval tasks, often by a large margin. Code is available at https://github.com/kamwoh/sdc.
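The core idea, aligning the empirical hash code similarity distribution to a wide calibration distribution, can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that); the beta shape parameters, the batch setup, and the use of quantile (sort) matching as the alignment loss are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical batch of relaxed (real-valued) hash codes before binarization.
codes = rng.standard_normal((128, 64))

# Pairwise cosine similarities between codes in the batch (upper triangle only).
normed = codes / np.linalg.norm(codes, axis=1, keepdims=True)
sims = (normed @ normed.T)[np.triu_indices(128, k=1)]

# Calibration targets: Beta(a, b) samples rescaled from [0, 1] to the full
# similarity range [-1, 1], so the target distribution spreads across the
# entire similarity capacity of the code space.
a, b = 2.0, 2.0  # illustrative shape parameters, not from the paper
targets = 2.0 * rng.beta(a, b, size=sims.size) - 1.0

# Distribution-level alignment via quantile matching: penalize the gap
# between the sorted similarities and the sorted calibration samples,
# rather than matching individual pairwise scores.
loss = float(np.mean((np.sort(sims) - np.sort(targets)) ** 2))
print(loss)
```

In a training loop this loss would be minimized alongside a quantization term, pushing the similarity histogram to occupy the full [-1, 1] range instead of collapsing into a narrow band.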