Graph analysis involves a large number of random memory accesses. Earlier research has shown that cache miss latency is responsible for more than half of graph processing time, with CPU execution accounting for the smaller share. There has been significant work on decreasing the CPU computing time, for example by employing better cache prefetching and replacement policies. In this paper, we study the various methods that do so by attempting to decrease the CPU cache miss ratio. Graph reordering attempts to exploit the power-law degree distribution of graphs (a small number of vertices have a very high number of connections) to keep the frequently accessed vertices close together in memory and hence decrease cache misses. However, reordering the graph by keeping the hot vertices together may harm the spatial locality of the graph and thus add to the total CPU compute time. We also need to control the total reordering time and its inverse relation with the final CPU execution time. To balance this trade-off between reordering by vertex hotness and preserving spatial locality, we introduce the lightweight Community-based Reordering. We attempt to maintain the community structure of the graph by storing the hot members of each community together locally. The implementation also takes into consideration the impact of graph diameter on the execution time. We compare our implementation with other reordering implementations and find significantly better results on five graph processing algorithms: BFS, CC, CCSV, PR and BC. Lorder achieved a speed-up of up to 7x and an average speed-up of 1.2x compared to other reordering algorithms.
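The idea of keeping hot vertices together inside their community can be illustrated with a minimal sketch. This is not the Lorder implementation: it assumes community labels are already available (the hypothetical `community` list), treats a vertex as hot when its degree exceeds the average degree (an assumed threshold), and ignores the graph-diameter consideration mentioned in the abstract.

```python
from collections import defaultdict


def community_hot_reorder(edges, num_vertices, community):
    """Return new_id, where new_id[v] is the new label of old vertex v.

    Vertices are laid out community by community; inside each community
    the hot vertices come first, followed by the cold ones.
    Assumption: "hot" means degree above the average degree.
    """
    # Count the degree of every vertex (undirected edge list).
    degree = [0] * num_vertices
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    avg_degree = sum(degree) / num_vertices if num_vertices else 0.0

    # Group vertices by their (precomputed, assumed) community label.
    members = defaultdict(list)
    for v in range(num_vertices):
        members[community[v]].append(v)

    # Assign contiguous new ids: hot members first, then cold members.
    # Original order is kept within each group, so no sorting is needed.
    new_id = [0] * num_vertices
    next_id = 0
    for c in sorted(members):
        hot = [v for v in members[c] if degree[v] > avg_degree]
        cold = [v for v in members[c] if degree[v] <= avg_degree]
        for v in hot + cold:
            new_id[v] = next_id
            next_id += 1
    return new_id


def relabel(edges, new_id):
    """Rewrite an edge list using the new vertex ids."""
    return [(new_id[u], new_id[v]) for u, v in edges]
```

Calling `community_hot_reorder(edges, n, community)` and then `relabel(edges, new_id)` yields an edge list in which each community occupies a contiguous id range with its high-degree members at the front. Keeping the original order within the hot and cold groups avoids a sort, which keeps the reordering cost linear in the number of edges and vertices apart from ordering the community labels.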