Mining Frequent Neighborhood Patterns in Large Labeled Graphs

Over the years, frequent subgraphs have been an important class of target patterns in the pattern mining literature, where most work deals with databases holding many graph transactions, e.g., chemical structures of compounds. These methods rely heavily on the downward-closure property (DCP) of the support measure to prune candidate patterns efficiently. In the emerging scenario of single-graph databases, such as the Google Knowledge Graph and the Facebook social graph, the traditional support measure becomes trivial (either 0 or 1). However, to the best of our knowledge, all attempts to redefine support for a single graph have resulted in measures that either lose DCP or are no longer semantically intuitive. This paper targets pattern mining in the single-graph setting. We resolve the "DCP-intuitiveness" dilemma by shifting the mining target from frequent subgraphs to frequent neighborhoods. A neighborhood is a specific topological pattern in which a vertex is embedded, and the pattern is frequent if it is shared by a large portion (above a given threshold) of vertices. We show that the new patterns not only maintain DCP but also carry semantics as significant as those of subgraph patterns. Experiments on real-life datasets demonstrate the feasibility of our algorithms on relatively large graphs, as well as their ability to mine interesting knowledge not discovered in prior work.
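
To make the vertex-based notion of support concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a heavily simplified pattern shape: a pivot vertex label plus a multiset of labels required among its direct neighbors (one hop, unlabeled undirected edges). It also assumes support is the fraction of all vertices that match. The function name `neighborhood_support` and the toy graph are hypothetical.

```python
from collections import Counter

def neighborhood_support(adj, labels, pivot_label, pattern):
    """Fraction of all vertices that carry pivot_label and whose direct
    neighbors include every label in `pattern` (with multiplicity).

    Simplifying assumption: patterns are one-hop stars over vertex labels.
    """
    need = Counter(pattern)
    hits = 0
    for v, nbrs in adj.items():
        if labels[v] != pivot_label:
            continue
        have = Counter(labels[u] for u in nbrs)
        # The vertex matches if each required label occurs often enough.
        if all(have[lab] >= cnt for lab, cnt in need.items()):
            hits += 1
    return hits / len(adj)

# Hypothetical toy graph: three people, one city, one company.
labels = {1: "person", 2: "person", 3: "person", 4: "city", 5: "company"}
adj = {1: [2, 4, 5], 2: [1, 4], 3: [5], 4: [1, 2], 5: [1, 3]}

s_small = neighborhood_support(adj, labels, "person", ["city"])
s_large = neighborhood_support(adj, labels, "person", ["city", "company"])
print(s_small, s_large)
```

In this sketch, extending the pattern can only shrink the set of matching vertices, so `s_large` never exceeds `s_small`. That is the downward-closure behavior the abstract refers to, shown here only for this restricted pattern shape.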
