Nearest neighbor search aims to find the data points in a database that have the smallest distances to a query. It is a fundamental problem in domains such as computer vision, recommendation systems, and machine learning. Hashing is one of the most widely used approaches to this problem because of its computational and storage efficiency. With the development of deep learning, deep hashing methods have shown advantages over traditional methods. In this paper, we present a comprehensive survey of deep hashing algorithms. Specifically, we categorize deep supervised hashing methods according to how they preserve similarity: pairwise similarity preserving, multiwise similarity preserving, implicit similarity preserving, classification-oriented preserving, and quantization. We also cover related topics such as deep unsupervised hashing and multi-modal deep hashing. In addition, we describe commonly used public datasets and the scheme used to measure the performance of deep hashing algorithms. Finally, we discuss potential research directions in the conclusion.
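To make the efficiency argument concrete, the following is a minimal sketch of hashing-based nearest neighbor search: items are mapped to short binary codes and ranked by Hamming distance to the query code. It is an illustration under stated assumptions, not a method from the survey. The function names `to_binary_codes` and `hamming_search` are hypothetical, and a random linear projection stands in for the learned deep hash function that the surveyed methods would provide.

```python
import numpy as np


def to_binary_codes(features, projection):
    """Map real-valued features to {0, 1} codes using the sign of a linear projection.

    Assumption: a deep hashing model would replace this projection with a learned network.
    """
    return (features @ projection > 0).astype(np.uint8)


def hamming_search(query_code, db_codes, k=3):
    """Return the indices and Hamming distances of the k database codes closest to the query."""
    # Hamming distance = number of bit positions where the codes differ.
    distances = np.count_nonzero(db_codes != query_code, axis=1)
    # A stable sort keeps the lower index first when distances tie.
    order = np.argsort(distances, kind="stable")[:k]
    return order, distances[order]


# Toy data: 1000 database items with 64-dimensional features, hashed to 32-bit codes.
rng = np.random.default_rng(0)
database = rng.normal(size=(1000, 64))
projection = rng.normal(size=(64, 32))  # hypothetical stand-in for a learned hash function

db_codes = to_binary_codes(database, projection)
query_code = to_binary_codes(database[42:43], projection)[0]

indices, distances = hamming_search(query_code, db_codes, k=5)
print(indices, distances)
```

Each item is stored in 32 bits instead of 64 floating-point values, and comparing codes needs only bitwise inequality tests, which is where the storage and computational savings come from. A production system would typically pack the bits into machine words and use popcount instructions rather than byte arrays.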