Nearest neighbor search supports important applications in many domains, such as database, machine learning, computer vision. Since the computational cost for accurate search is too high, the community turned to the research of approximate nearest neighbor search (ANNS). Among them, graph-based algorithm is one of the most important branches. Research by Fu et al. shows that the algorithms based on Monotonic Search Network (MSNET), such as NSG and NSSG, have achieved the state-of-the-art search performance in efficiency. The MSNET is dedicated to achieving monotonic search with minimal out-degree of nodes to pursue high efficiency. However, the current MSNET designs did not optimize the probability of the monotonic search, and the lower bound of the probability is only 50%. If they fail in monotonic search stage, they have to suffer tremendous backtracking cost to achieve the required accuracy. This will cause performance problems in search efficiency. To address this problem, we propose (r,p)-MSNET, which achieves guaranteed probability on monotonic search. Due to the high building complexity of a strict (r,p)-MSNET, we propose TBSG, which is an approximation with low complexity. Experiment conducted on four million-scaled datasets show that TBSG outperforms existing state-of-the-art graph-based algorithms in search efficiency. Our code has been released on Github.
翻译:近邻搜索支持了许多领域的重要应用, 如数据库、 机器学习、 计算机视觉等。 由于精确搜索的计算成本太高, 社区转向了近邻搜索( ANNS) 的研究。 其中, 基于图形的算法是最重要的分支之一。 Fu 等人的研究显示, 基于单声搜索网络的算法( MSNET), 如 NSG 和NSSG 等, 已经实现了效率最先进的搜索性能。 MSNET 致力于实现单声波搜索, 且节点的外度最低, 以追求更高的效率。 然而, 当前的MSNET 设计并没有优化单声搜索的概率, 而概率的下限只有50 % 。 如果在单声搜索阶段失败, 他们必须承受巨大的回溯跟踪成本才能达到所要求的准确性。 为了解决这个问题, 我们建议 (r, p) MSNET, 保证单声搜索的概率。 但是, 目前的MSNET 设计没有优化(r, p) MSNET) 的建筑复杂度, 而概率范围更小的概率只有50 % 。 我们提议在四度的GTBSG 的模型上显示, 。 正在推出的GLBSDSG 。