HGMatch: 高效和平行子海相匹配的逐级匹配框架 (HGMatch: A Match-by-Hyperedge Framework for Efficient and Parallel Subhypergraph Matching)

Hypergraphs are generalisation of graphs in which a hyperedge can connect any number of vertices. It can describe n-ary relationships and high-order information among entities compared to conventional graphs. In this paper, we study the fundamental problem of subhypergraph matching in hypergraphs. Existing methods directly extend subgraph matching algorithms to the case of hypergraphs. However, this approach delays hyperedge verification and underutilises the high-order information in hypergraphs, which leads to large search space and high enumeration cost. Furthermore, with the growing size of hypergraphs, it is becoming hard to compute subhypergraph matching sequentially. Thus, we propose an efficient and parallel subhypergraph matching system, HGMatch, to handle subhypergraph matching in massive hypergraphs. We proposes a novel match-by-hyperedge framework to utilise high-order information in hypergraphs and uses set operations for efficient candidates generation. Moreover, we develop an optimised parallel execution engine in HGMatch based on the dataflow model, which features a task-based scheduler and fine-grained dynamic work stealing to achieve bounded memory execution and better load balancing. Experimental evaluation on 10 real-world datasets shows that HGMatch outperforms the extended version of the state-of-the-art subgraph matching algorithms (CFL, DAF, CECI and RapidMatch) by orders of magnitude when using a single thread, and achieves almost linear scalability when the number of threads increases.

翻译：超格是图表的概括化, 高端可以连接任何数量的脊椎。它可以描述实体与常规图形相比的 nary 关系和高端信息。在本文中, 我们研究高压下子血压匹配的根本问题。现有方法将子血压匹配算法直接扩展至高压。但是, 这种方法会延迟高端核查, 并低估高端高端信息, 从而导致巨大的搜索空间和高查点成本。此外, 随着高端图的不断增长, 很难按顺序对各实体进行子血压匹配。因此, 我们提出一个高效和平行的子血压匹配系统( HGMatch ), 以大型高压匹配法处理子血压匹配算法。我们提出一个新的按高端校准框架, 使用高端校准的操作来高效的候选人生成。此外, 我们根据数据流模型在HGMatch 开发一个优化的平行执行引擎, 以基于基于基于任务表和精确的直线匹配的直径匹配比。因此, 将快速的直径直径直径匹配的直径直线匹配系统匹配系统匹配系统运行。