The k-nearest neighbor (k-NN) graph is a key data structure in many disciplines, such as manifold learning, machine learning, and information retrieval. NN-Descent was proposed as an effective solution to the graph construction problem. However, it cannot be directly ported to the GPU due to the intensive memory accesses the approach requires. In this paper, NN-Descent is redesigned to fit the GPU architecture. In particular, the number of memory accesses is reduced significantly, and the redesign fully exploits the parallelism of the GPU hardware while preserving the genericness and simplicity of NN-Descent. In addition, a simple but effective k-NN graph merge approach is presented. It allows two graphs to be merged efficiently on GPUs and, more importantly, makes the construction of high-quality k-NN graphs tractable for datasets that do not fit in GPU memory. The results show that our approach is 100-250x faster than single-thread NN-Descent and 2.5-5x faster than existing GPU-based approaches.
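Since the abstract only names the algorithm, a minimal CPU sketch of NN-Descent's core iteration may help orient readers. It illustrates the general principle ("a neighbor of a neighbor is likely a neighbor") only, not the paper's GPU redesign; the function name nn_descent_sketch and its parameters are our own, and the full algorithm additionally uses reverse neighbors, neighbor sampling, and incremental search, all omitted here for brevity.

import numpy as np

def nn_descent_sketch(data, k, iters=10, seed=0):
    """Refine a random k-NN graph by repeatedly checking
    neighbors-of-neighbors. Brute-force distances; a toy
    illustration, not the paper's GPU implementation."""
    rng = np.random.default_rng(seed)
    n = len(data)
    dist = lambda a, b: float(np.linalg.norm(data[a] - data[b]))
    # Initialize each point's list with k random neighbors (excluding itself).
    graph = []
    for u in range(n):
        others = [int(v) for v in rng.permutation(n) if v != u]
        graph.append(set(others[:k]))
    for _ in range(iters):
        updates = 0
        for u in range(n):
            # Candidate set: neighbors of u's current neighbors.
            cand = {w for v in graph[u] for w in graph[v]} - {u}
            # Keep the k closest among the old neighbors and the candidates.
            best = set(sorted(graph[u] | cand, key=lambda v: dist(u, v))[:k])
            updates += len(best - graph[u])
            graph[u] = best
        if updates == 0:  # no neighbor list changed: converged
            break
    return graph

if __name__ == "__main__":
    pts = np.random.default_rng(1).normal(size=(200, 8))
    g = nn_descent_sketch(pts, k=10)
    print(len(g), len(g[0]))  # 200 neighbor lists of 10 point ids each

Note that the dominant costs in this loop are the scattered reads of graph[v] and the distance evaluations; this is the memory-access intensity the abstract refers to, and the target of the GPU redesign.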