Subgraph isomorphism is a well-known NP-hard problem that is widely used in applications such as social network analysis and knowledge graph queries. Because of this inherent hardness, its performance is often a bottleneck in real-world applications. We address this by designing an efficient subgraph isomorphism algorithm that leverages features of the GPU architecture, such as massive parallelism and the memory hierarchy. Existing GPU-based solutions adopt a two-step output scheme, performing the same join process twice in order to write intermediate results concurrently. They also lack the GPU architecture-aware optimizations needed to scale to large graphs. In this paper, we propose a GPU-friendly subgraph isomorphism algorithm, GSI. Unlike existing edge join-based GPU solutions, GSI uses a Prealloc-Combine strategy based on a vertex-oriented framework, which avoids the repeated join of existing solutions. We also propose a GPU-friendly data structure, PCSR, to represent an edge-labeled graph. Extensive experiments on both synthetic and real graphs show that GSI outperforms state-of-the-art algorithms by up to several orders of magnitude and scales well to graphs with hundreds of millions of edges.
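To make the two-step output scheme of existing GPU solutions concrete, the following is a minimal serial sketch in Python. It is an illustration under stated assumptions, not code from GSI or from any of the compared systems: the function names `candidates` and `two_step_extend`, the toy join condition, and the sample graph are all hypothetical. It shows why the join runs twice: a first pass only counts results so that a prefix sum can assign each row a disjoint output range, and a second pass repeats the join to write into that range.

```python
from itertools import accumulate

def candidates(row, adj):
    """Toy join (assumed for illustration): neighbors of the row's last
    vertex that are not already part of the partial match."""
    return [v for v in adj[row[-1]] if v not in row]

def two_step_extend(rows, adj):
    # Pass 1: run the join only to count how many results each row yields.
    counts = [len(candidates(r, adj)) for r in rows]
    # Exclusive prefix sum: offsets[i] is where row i starts writing.
    offsets = [0] + list(accumulate(counts))
    out = [None] * offsets[-1]
    # Pass 2: run the same join again, writing into each row's own slice.
    for i, r in enumerate(rows):
        for k, v in enumerate(candidates(r, adj)):
            out[offsets[i] + k] = r + (v,)
    return out

# Hypothetical 4-vertex graph and two partial matches of length 2.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
rows = [(0, 1), (0, 2)]
result = two_step_extend(rows, adj)
```

Because the output slices are disjoint, the second pass could run one thread per row without locks on a GPU. The price is that `candidates` is evaluated twice per row, which is the redundancy the abstract says the Prealloc-Combine strategy avoids.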