Combinatorial optimization (CO) has been a hot research topic because of its theoretical and practical importance. As a classic CO problem, deep hashing aims to find an optimal binary code for each data item from a finite set of discrete possibilities, and this discrete nature poses a significant challenge to the optimization process. Previous methods usually mitigate this challenge by binary approximation, relaxing binary codes to real values via activation functions or regularization terms. However, such approximation introduces uncertainty between the real values and their binary counterparts, degrading retrieval performance. In this paper, we propose a novel Deep Momentum Uncertainty Hashing (DMUH). It explicitly estimates the uncertainty during training and leverages this uncertainty information to guide the approximation process. Specifically, we model bit-level uncertainty by measuring the discrepancy between the output of the hashing network and that of a momentum-updated network. The discrepancy of each bit indicates the uncertainty of the hashing network about the approximate output of that bit. Meanwhile, the mean discrepancy over all bits of a hashing code can be regarded as image-level uncertainty, embodying the uncertainty of the hashing network about the corresponding input image. Hashing bits and images with higher uncertainty receive more attention during optimization. To the best of our knowledge, this is the first work to study uncertainty at the level of individual hashing bits. Extensive experiments are conducted on four datasets to verify the superiority of our method: CIFAR-10, NUS-WIDE, MS-COCO, and the million-scale dataset Clothing1M. Our method achieves the best performance on all of these datasets and surpasses existing state-of-the-art methods by a large margin.
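The bit-level and image-level uncertainty described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names are hypothetical, the absolute difference is assumed as the discrepancy measure (the abstract does not specify the metric), and the momentum update is the standard exponential moving average commonly used for momentum-updated networks.

```python
import numpy as np

def uncertainty(hash_out, momentum_out):
    """Estimate per-bit and per-image uncertainty from two network outputs.

    hash_out, momentum_out: arrays of shape (..., num_bits) holding the
    real-valued (pre-binarization) codes of the hashing network and the
    momentum-updated network, respectively.
    """
    # Bit-level uncertainty: discrepancy between the hashing network's
    # output and the momentum network's output for each bit
    # (absolute difference is an assumed choice of discrepancy measure).
    bit_u = np.abs(hash_out - momentum_out)
    # Image-level uncertainty: mean discrepancy over all bits of the code.
    img_u = bit_u.mean(axis=-1)
    return bit_u, img_u

def momentum_update(m_params, h_params, tau=0.999):
    """Standard EMA update for the momentum network's parameters
    (tau value is illustrative)."""
    return tau * m_params + (1.0 - tau) * h_params
```

Bits and images with larger `bit_u` / `img_u` values would then be weighted more heavily in the training loss, following the attention scheme the abstract describes.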