In many real datasets such as social media streams and cyber data sources, graphs change over time through a graph update stream of edge insertions and deletions. Detecting critical patterns in such dynamic graphs plays an important role in various application domains such as fraud detection, cyber security, and recommendation systems for social networks. Given a dynamic data graph and a query graph, the continuous subgraph matching problem is to find all positive matches for each edge insertion and all negative matches for each edge deletion. The state-of-the-art algorithm TurboFlux uses a spanning tree of a query graph for filtering. However, using the spanning tree may have a low pruning power because it does not take into account all edges of the query graph. In this paper, we present a symmetric and much faster algorithm SymBi which maintains an auxiliary data structure based on a directed acyclic graph instead of a spanning tree, which maintains the intermediate results of bidirectional dynamic programming between the query graph and the dynamic graph. Extensive experiments with real and synthetic datasets show that SymBi outperforms the state-of-the-art algorithm by up to three orders of magnitude in terms of the elapsed time.
翻译:在许多真实的数据集中, 如社交媒体流和网络数据源, 图表会通过边缘插入和删除的图形更新流而随时间变化。 检测这些动态图形中的关键模式在欺诈检测、 网络安全和社交网络推荐系统等各种应用领域起着重要作用。 根据动态数据图表和查询图, 连续的子图匹配问题是要找到每个边缘插入的所有正匹配, 以及每个边缘删除的所有负匹配 。 最先进的TurboFlus 算法在过滤时使用一个查询图的横贯树。 但是, 使用横贯的树可能具有较低的调整能力, 因为它没有考虑到查询图的所有边缘 。 在本文中, 我们提出了一个对称和快速的算法 SymBi, 它维持一个辅助数据结构, 以定向的环图而不是覆盖树为基础, 维持在查询图和动态图之间的双向动态动态规划的中间结果 。 使用真实和合成的数据集进行的广泛实验显示 Symbi 超越了查询图上的三个星级的状态。