Graph search is one of the most successful algorithmic trends in near neighbor search. Several of the most popular and empirically successful algorithms are, at their core, a simple walk along a pruned near neighbor graph. Such algorithms consistently perform at the top of industrial speed benchmarks for applications such as embedding search. However, graph traversal applications often suffer from poor memory access patterns, and near neighbor search is no exception to this rule. Our measurements show that popular search indices such as the hierarchical navigable small-world graph (HNSW) can have poor cache miss performance. To address this problem, we apply graph reordering algorithms to near neighbor graphs. Graph reordering is a memory layout optimization that groups commonly-accessed nodes together in memory. We present exhaustive experiments applying several reordering algorithms to a leading graph-based near neighbor method based on the HNSW index. We find that reordering improves the query time by up to 40%, and we demonstrate that the time needed to reorder the graph is negligible compared to the time required to construct the index.
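To make the idea of graph reordering concrete, the following minimal sketch relabels the nodes of a small adjacency-list graph so that nodes visited close together in a traversal also receive nearby ids, and therefore nearby positions in memory. Breadth-first order is used here purely for illustration, since the abstract does not specify which reordering algorithms are evaluated. The names `bfs_order` and `relabel` and the toy graph are hypothetical and are not taken from any HNSW implementation.

```python
from collections import deque


def bfs_order(adj, start=0):
    """Return all node ids in breadth-first visiting order.

    Illustrative heuristic only: neighbors are visited together, so they
    end up with consecutive positions in the returned order.
    """
    n = len(adj)
    seen = [False] * n
    order = []
    # Try the requested start first, then any node not yet reached,
    # so that disconnected components are also covered.
    for root in [start] + [v for v in range(n) if v != start]:
        if seen[root]:
            continue
        seen[root] = True
        queue = deque([root])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if not seen[v]:
                    seen[v] = True
                    queue.append(v)
    return order


def relabel(adj, vectors, order):
    """Apply the permutation: the node order[i] becomes new node i.

    Both the adjacency lists and the per-node data (here, vectors) are
    permuted, so the graph is unchanged up to renaming of its nodes.
    """
    new_id = [0] * len(adj)
    for new, old in enumerate(order):
        new_id[old] = new
    new_adj = [[new_id[v] for v in adj[old]] for old in order]
    new_vectors = [vectors[old] for old in order]
    return new_adj, new_vectors, new_id


# Toy graph (hypothetical): two triangles whose node ids are interleaved,
# so neighbors are initially far apart in the node array.
adj = [[3, 5], [2, 4], [1, 4], [0, 5], [1, 2], [0, 3]]
vectors = [[float(i)] for i in range(6)]

order = bfs_order(adj)
new_adj, new_vectors, new_id = relabel(adj, vectors, order)
print(order)    # [0, 3, 5, 1, 2, 4]
print(new_adj)  # [[1, 2], [0, 2], [0, 1], [4, 5], [3, 5], [3, 4]]
```

After relabeling, each triangle occupies a contiguous block of ids, so a walk that hops between neighbors touches adjacent entries of `new_vectors` instead of scattered ones. Query results must be mapped back through the permutation if callers expect the original ids.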