K-nearest neighbor search is a fundamental task in many applications, and the hierarchical navigable small world (HNSW) graph has recently drawn attention in large-scale cloud services because it scales easily with database size while offering fast search. Meanwhile, computational storage devices (CSDs), which combine programmable logic and storage modules on a single board, have become popular as a way to address the data bandwidth bottleneck of modern computing systems. In this paper, we propose a computational storage platform, based on the SmartSSD CSD, that accelerates a large-scale graph-based nearest neighbor search algorithm. To this end, we modify the algorithm to make it more amenable to hardware and implement two types of accelerators, using HLS- and RTL-based methodologies with various optimization methods. In addition, we scale the proposed platform up to 4 SmartSSDs and apply graph parallelism to further boost system performance. As a result, the proposed computational storage platform achieves a throughput of 75.59 queries per second on the SIFT1B dataset at 258.66 W power dissipation, which is 12.83x and 17.91x faster and 10.43x and 24.33x more energy efficient than conventional CPU-based and GPU-based server platforms, respectively. With multi-terabyte storage and custom acceleration capability, we believe that the proposed computational storage platform is a promising solution for cost-sensitive cloud datacenters.
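One possible reading of the graph parallelism across 4 SmartSSDs can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the database is partitioned across devices, each device searches its own partition independently, and the host merges the per-device top-k lists. The abstract does not spell out the partitioning or merge scheme. The function names (`make_shards`, `search_shard`, `sharded_knn`) are hypothetical, and an exact brute-force scan stands in for the per-device HNSW search.

```python
import heapq
import random


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def make_shards(vectors, num_shards):
    """Round-robin partition of the database (assumed scheme).

    Each shard keeps (global_id, vector) pairs so results can be merged.
    """
    shards = [[] for _ in range(num_shards)]
    for gid, vec in enumerate(vectors):
        shards[gid % num_shards].append((gid, vec))
    return shards


def search_shard(shard, query, k):
    """Exact k-NN inside one shard.

    Stand-in for the per-device HNSW search, which would be approximate.
    Returns up to k (distance, global_id) pairs, nearest first.
    """
    return heapq.nsmallest(
        k, ((squared_l2(vec, query), gid) for gid, vec in shard)
    )


def sharded_knn(shards, query, k):
    """Search every shard, then merge the partial top-k lists on the host."""
    partial = [search_shard(shard, query, k) for shard in shards]
    return heapq.nsmallest(k, (hit for hits in partial for hit in hits))


if __name__ == "__main__":
    rng = random.Random(0)
    database = [[rng.random() for _ in range(8)] for _ in range(1000)]
    query = [rng.random() for _ in range(8)]
    shards = make_shards(database, num_shards=4)  # 4 mirrors the 4 SmartSSDs
    for dist, gid in sharded_knn(shards, query, k=5):
        print(gid, round(dist, 4))
```

Because each shard here is scanned exactly, the merged result matches a single global search. With an approximate per-shard index such as HNSW, the merged list would itself be approximate, and its recall would depend on the per-shard search quality.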