Unsupervised binary representation allows fast data retrieval without any annotations, enabling practical applications such as fast person re-identification and multimedia retrieval. It is argued that conflicts in binary space are one of the major barriers to high-performance unsupervised hashing, as current methods fail to capture the precise code conflicts in the full domain. A novel relaxation method called Shuffle and Learn is proposed to tackle code conflicts in unsupervised hashing. Approximated derivatives for the joint probability and gradients for the binary layer are introduced to bridge the update from the hash to the input. A proof of the $\epsilon$-convergence of the joint probability with approximated derivatives is provided to guarantee the precision of the updates applied to the mutual information. The proposed algorithm performs iterative global updates that minimize mutual information, so that the codes diverge before regular unsupervised optimization begins. Experiments suggest that the proposed method helps code optimization escape local optima and generates binary representations that are more discriminative and informative without any annotations. Performance benchmarks on image retrieval with the unsupervised binary codes are conducted on three open datasets, and the model achieves state-of-the-art accuracy on the image retrieval task for all three datasets. Datasets and reproducible code are provided.
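The abstract does not state how mutual information is estimated, so the following is only a minimal sketch of the quantity involved, not the paper's procedure. It assumes that redundancy between two hash bits can be measured by their empirical mutual information over a set of binary codes. The function name `bit_mutual_information` and the toy codes are hypothetical and serve only as illustration.

```python
import math
from collections import Counter


def bit_mutual_information(codes, i, j):
    """Empirical mutual information (in bits) between hash bits i and j.

    `codes` is a list of equal-length sequences of 0/1 values.
    This is an illustrative estimator, not the paper's method.
    """
    n = len(codes)
    # Empirical joint and marginal counts of the two bit positions.
    joint = Counter((c[i], c[j]) for c in codes)
    marg_i = Counter(c[i] for c in codes)
    marg_j = Counter(c[j] for c in codes)

    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        p_a = marg_i[a] / n
        p_b = marg_j[b] / n
        # Only observed pairs contribute, so p_ab > 0 here.
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


# Toy codes (hypothetical): bit 1 copies bit 0, bit 2 is independent of bit 0.
toy_codes = [
    (0, 0, 0),
    (0, 0, 1),
    (1, 1, 0),
    (1, 1, 1),
]
redundant = bit_mutual_information(toy_codes, 0, 1)
independent = bit_mutual_information(toy_codes, 0, 2)
```

In this toy case the duplicated pair of bits carries one full bit of shared information, while the independent pair carries none. Lower pairwise mutual information means that each bit contributes information the others do not already encode.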