Approximate nearest neighbour (ANN) search is one of the most important problems in computer science fields such as data mining or computer vision. In this paper, we focus on ANN for high-dimensional binary vectors and we propose a simple yet powerful search method that uses Random Binary Search Trees (RBST). We apply our method to a dataset of 1.25M binary local feature descriptors obtained from a real-life image-based localisation system provided by Google as a part of Project Tango. An extensive evaluation of our method against the state-of-the-art variations of Locality Sensitive Hashing (LSH), namely Uniform LSH and Multi-probe LSH, shows the superiority of our method in terms of retrieval precision with performance boost of over 20%
翻译:近邻搜索(ANN)是数据挖掘或计算机视觉等计算机科学领域最重要的问题之一。在本文中,我们侧重于高维二进矢量的ANN,我们提出了使用随机二进搜索树(RBST)的简单而有力的搜索方法。我们将我们的方法应用到一组数据集中:1.25M二进制本地特征描述器,该数据集来自谷歌作为Tango项目的一部分提供的基于真实生活的图像定位系统,由谷歌提供,作为Tango项目的一部分。我们针对当地敏感散列(LSH)的最新变化,即统一LSH和多方案LSH,对方法进行了广泛的评估,显示我们的方法在检索精确度方面优于20%以上的性能加速率。