A great variety of complex systems, from user interactions in communication networks to transactions in financial markets, can be modeled as temporal graphs consisting of a set of vertices and a series of timestamped, directed edges. Temporal motifs generalize subgraph patterns in static graphs by considering edge orderings and durations in addition to topology. Counting the occurrences of temporal motifs is a fundamental problem in temporal network analysis. However, existing methods either do not support temporal motifs or suffer from performance issues. Moreover, they do not work in the streaming model, where edges are observed incrementally over time. In this paper, we focus on approximate temporal motif counting via random sampling. We first propose two sampling algorithms for temporal motif counting in the offline setting. The first is an edge sampling (ES) algorithm for estimating the number of instances of any temporal motif. The second is an improved edge-wedge sampling (EWS) algorithm that combines edge sampling with wedge sampling for counting temporal motifs with $3$ vertices and $3$ edges. Furthermore, we propose two algorithms, referred to as SES and SEWS, that extend ES and EWS to count temporal motifs incrementally in temporal graph streams. We provide comprehensive analyses of the theoretical bounds and complexities of our proposed algorithms. Finally, we perform extensive experimental evaluations of our proposed algorithms on several real-world temporal graphs. The results show that, in the offline setting, ES and EWS have higher efficiency, better accuracy, and greater scalability than state-of-the-art sampling methods for temporal motif counting. Moreover, in the streaming setting, SES and SEWS achieve speedups of up to three orders of magnitude over ES and EWS while having comparable estimation errors.
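To make the idea of counting by edge sampling concrete, the following is a minimal sketch and not the ES algorithm as specified in the paper. It assumes a deliberately simple motif (a directed two-edge path u→v, v→w on three distinct vertices, with the second edge strictly later than the first and at most `delta` time units after it), a graph without self-loops, and independent sampling of each edge with probability `p`. The function names, the motif, and the parameters `delta`, `p`, and `seed` are all illustrative assumptions. Each sampled edge contributes the number of motif instances that contain it, and the sum is rescaled by `1 / (p * l)`, where `l = 2` is the number of edges in the motif, since every instance can be reached through any of its `l` edges.

```python
import random
from collections import defaultdict

MOTIF_EDGES = 2  # l: number of edges in the illustrative two-edge path motif


def build_indexes(edges):
    """Index edges by source and by target vertex."""
    out_edges, in_edges = defaultdict(list), defaultdict(list)
    for u, v, t in edges:
        out_edges[u].append((v, t))
        in_edges[v].append((u, t))
    return out_edges, in_edges


def exact_count(edges, delta):
    """Exact number of instances: u->v at t1, v->w at t2, t1 < t2 <= t1 + delta."""
    out_edges, _ = build_indexes(edges)
    total = 0
    for u, v, t1 in edges:
        for w, t2 in out_edges[v]:
            if w != u and t1 < t2 <= t1 + delta:
                total += 1
    return total


def local_count(edge, out_edges, in_edges, delta):
    """Number of motif instances that contain `edge` as first or second edge."""
    u, v, t = edge
    as_first = sum(1 for w, t2 in out_edges[v] if w != u and t < t2 <= t + delta)
    as_second = sum(1 for x, t0 in in_edges[u] if x != v and t0 < t <= t0 + delta)
    return as_first + as_second


def es_estimate(edges, delta, p, seed=0):
    """Estimate the motif count from edges sampled independently with probability p."""
    rng = random.Random(seed)
    out_edges, in_edges = build_indexes(edges)
    sampled_sum = sum(
        local_count(e, out_edges, in_edges, delta)
        for e in edges
        if rng.random() < p
    )
    # Each instance is counted once per sampled edge it contains, so divide by p * l.
    return sampled_sum / (p * MOTIF_EDGES)


edges = [("a", "b", 1), ("b", "c", 3), ("b", "d", 4), ("c", "a", 5), ("a", "b", 20)]
print(exact_count(edges, delta=5))
print(es_estimate(edges, delta=5, p=0.5, seed=42))
```

With `p = 1.0` every edge is sampled and the estimate coincides with the exact count; smaller values of `p` trade accuracy for less counting work.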