Recent trends in business and technology (e.g., machine learning, social network analysis) benefit from storing and processing growing amounts of graph-structured data in databases and data science platforms. FPGAs, used as graph processing accelerators with a customizable memory hierarchy, promise to solve the performance problems that inherently irregular memory access patterns cause on traditional hardware (e.g., CPUs). However, developing such hardware accelerators is still time-consuming and difficult, and benchmarking is not standardized. This hinders both the understanding of how changes in memory access patterns affect performance and the systematic engineering of graph processing accelerators. In this work, we propose a simulation environment for analyzing graph processing accelerators based on simulating their memory access patterns. Further, we evaluate our approach on two state-of-the-art FPGA graph processing accelerators and demonstrate reproducibility, comparability, and a shortened development process by example. Because the approach does not implement the cycle-accurate internal data flow on accelerator hardware such as FPGAs, it significantly reduces implementation time, increases the transparency of benchmark parameters, and allows graph processing approaches to be compared.
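To make the notion of an irregular memory access pattern concrete, the following minimal sketch logs the array reads of a breadth-first search over a graph in compressed sparse row (CSR) form. It is an illustration only and not the simulation environment proposed here: the function names `bfs_access_trace` and `irregular_fraction`, the array names, and the sample graph are all hypothetical, and the sketch models neither DRAM timing nor any accelerator's memory hierarchy.

```python
from collections import deque


def bfs_access_trace(offsets, neighbors, root):
    """Run BFS over a CSR graph and log every array element it reads.

    Hypothetical helper for illustration; returns a list of
    (array_name, index) pairs in the order the reads occur.
    """
    trace = []
    visited = [False] * (len(offsets) - 1)
    visited[root] = True
    queue = deque([root])
    while queue:
        v = queue.popleft()
        # Two reads of the offsets array delimit v's edge range.
        trace.append(("offsets", v))
        trace.append(("offsets", v + 1))
        for e in range(offsets[v], offsets[v + 1]):
            trace.append(("neighbors", e))   # edge list is scanned in order
            u = neighbors[e]
            trace.append(("visited", u))     # index depends on the graph data
            if not visited[u]:
                visited[u] = True
                queue.append(u)
    return trace


def irregular_fraction(trace, array_name):
    """Share of consecutive reads to one array that jump by more than one element."""
    idx = [i for name, i in trace if name == array_name]
    jumps = sum(1 for a, b in zip(idx, idx[1:]) if abs(b - a) > 1)
    return jumps / max(len(idx) - 1, 1)


# Assumed sample graph: 0 -> {3, 5}, 3 -> {1, 4}, 5 -> {2}
offsets = [0, 2, 2, 2, 4, 4, 5]
neighbors = [3, 5, 1, 4, 2]
trace = bfs_access_trace(offsets, neighbors, root=0)
print("neighbors:", irregular_fraction(trace, "neighbors"))
print("visited:  ", irregular_fraction(trace, "visited"))
```

In this small example the edge array is read strictly sequentially, while the reads of the per-vertex `visited` array jump around because their indices come from the graph data itself. A trace of this kind, rather than a cycle-accurate model of the accelerator's internal data flow, is the sort of input a memory-focused simulation could consume.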