Frequent Subgraph Mining (FSM) is a key task in many graph mining and machine learning applications. Numerous FSM systems have been proposed over the past decade. Although these systems perform well on small patterns (no more than four vertices), we found that they struggle to mine larger patterns. In this work, we propose a novel two-vertex exploration strategy to accelerate the mining process. Compared with the single-vertex exploration adopted by previous systems, two-vertex exploration avoids excessive memory consumption and significantly reduces memory access overhead. We further improve performance with an index-based quick pattern technique that reduces the overhead of isomorphism checks, and a subgraph sampling technique that mitigates subgraph explosion. Experimental results show that our system achieves significant speedups over state-of-the-art graph pattern mining systems and supports larger pattern mining tasks that none of the existing systems can handle.
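For readers unfamiliar with the baseline, the following is a minimal sketch of single-vertex exploration as it is commonly understood: each connected vertex set is grown by one neighbouring vertex per level. It is not the system described in this abstract. The function names, the adjacency-dict representation, and the toy graph are all assumptions made for illustration, and the two-vertex strategy itself is not reproduced because the abstract gives no algorithmic detail.

```python
from itertools import chain


def extend_by_one_vertex(graph, embeddings):
    """Grow every connected vertex set by one neighbouring vertex (hypothetical helper)."""
    extended = set()
    for emb in embeddings:
        # Candidates: neighbours of any vertex in the set that are not already in it.
        candidates = set(chain.from_iterable(graph[v] for v in emb)) - emb
        for u in candidates:
            # frozenset | set yields a frozenset, so duplicates collapse in `extended`.
            extended.add(emb | {u})
    return extended


def connected_subgraphs(graph, k):
    """Enumerate all connected vertex sets of size k, one level at a time."""
    level = {frozenset([v]) for v in graph}
    for _ in range(k - 1):
        # The whole intermediate level is materialized before the next step.
        level = extend_by_one_vertex(graph, level)
    return level


# Toy graph (assumed): the 4-cycle 0-1-2-3-0 plus the chord 0-2.
graph = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2]}
sizes = [len(connected_subgraphs(graph, k)) for k in (1, 2, 3, 4)]
print(sizes)
```

In this sketch every intermediate level is held in memory before it is extended, which hints at why level-by-level exploration becomes expensive as the pattern size grows.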