Querying cohesive subgraphs on temporal graphs with various time constraints has attracted intensive research interests recently. In this paper, we study a novel Temporal k-Core Query (TCQ) problem: given a time interval, find all distinct k-cores that exist within any subintervals from a temporal graph, which generalizes the previous historical k-core query. This problem is challenging because the number of subintervals increases quadratically to the span of time interval. For that, we propose a novel Temporal Core Decomposition (TCD) algorithm that decrementally induces temporal k-cores from the previously induced ones and thus reduces "intra-core" redundant computation significantly. Then, we introduce an intuitive concept named Tightest Time Interval (TTI) for temporal k-core, and design an optimization technique with theoretical guarantee that leverages TTI as a key to predict which subintervals will induce duplicated k-cores and prunes the subintervals completely in advance, thereby eliminating "inter-core" redundant computation. The complexity of optimized TCD (OTCD) algorithm no longer depends on the span of query time interval but only the scale of final results, which means OTCD algorithm is scalable. Moreover, we propose a compact in-memory data structure named Temporal Edge List (TEL) to implement OTCD algorithm efficiently in physical level with bounded memory requirement. TEL organizes temporal edges in a "timeline" and can be updated instantly when new edges arrive, and thus our approach can also deal with dynamic temporal graphs. We compare OTCD algorithm with the incremental historical k-core query on several real-world temporal graphs, and observe that OTCD algorithm outperforms it by three orders of magnitude, even though OTCD algorithm needs none precomputed index.
翻译:近期,使用各种时间约束条件查询时间图上的连通子图已经引起了广泛关注。本文研究了一个新颖的时间k-核查询(Temporal k-Core Query, TCQ)问题:给定一个时间区间,在时间图中找到所有存在于任一子区间中的不同k-核,这个问题推广了先前的历史k-核查询。这个问题具有挑战性,因为子区间的数量随时间区间的跨度呈二次方增长。因此,我们提出了一种新颖的时间核分解(Temporal Core Decomposition, TCD)算法。该算法从先前获得的核中递减地诱导时间k-核,因此显著减少了“核内”的冗余计算量。然后,我们引入了一种名为最紧密时间区间(Tightest Time Interval, TTI)的直观概念,用作时间k-核的优化策略,并通过TTI预测哪些子区间会诱导出重复的k-核,并提前完全剪除这些子区间,从而消除“核间”的冗余计算。我们优化的TCD(OTCD)算法的复杂度不再取决于查询时间区间的跨度,而仅取决于最终结果的规模,这意味着OTCD算法是可扩展的。此外,我们还提出了一种名为时间边缘列表(Temporal Edge List, TEL)的紧凑型内存数据结构,以有限的内存要求高效地实现OTCD算法。 TEL按照“时间轴”组织时间边缘,并可以在新边缘到来时立即更新。因此,我们的方法还可以处理动态时间图。我们将OTCD算法与增量历史k-核查询在几组真实时间图上进行比较,并观察到OTCD算法的性能优于它三个数量级,即使OTCD算法不需要预先计算索引。