Many networks, such as communication, social media, and criminal networks, have event-driven dynamics in which the mean rate of events at a node changes in response to other events in the network. In particular, events associated with one node can increase the rate of events at other nodes, depending on their influence relationship. It is therefore of interest to use temporal data to uncover the directional, time-dependent influence structure of a given network, while also quantifying uncertainty, even when knowledge of a physical network is lacking. Typically, methods for inferring the influence structure in networks require knowledge of a physical network or can infer only small network structures. In this paper, we model event-driven dynamics on a network with a multidimensional Hawkes process. We then develop a novel ensemble-based filtering approach for time series of count data (i.e., data that give the number of events per unit time for each node in the network) that not only tracks the influence network structure over time but also approximates the uncertainty via ensemble spread. The method overcomes several deficiencies of existing approaches: existing methods for inferring multidimensional Hawkes processes are too slow to be practical for networks of more than ~50 nodes and can only handle timestamp data (i.e., data on when events occur rather than the number of events at each node), whereas our method works with count data and does not require a physical network as a starting point. Our method is massively parallelizable, which allows it to infer the influence structure of large networks (~10,000 nodes). We demonstrate the method on large networks using both synthetic and real-world email communication data.
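To make the notion of count data from a multidimensional Hawkes process concrete, the following is a minimal sketch of a discrete-time simulator. It is not the ensemble-based filter described above. The exponential decay kernel, the convention that `K[i, j]` is the influence of node `j` on node `i`, and all names (`simulate_hawkes_counts`, `mu`, `K`, `beta`, `dt`) are assumptions made for illustration only.

```python
import numpy as np


def simulate_hawkes_counts(mu, K, beta, n_steps, dt=1.0, seed=0):
    """Simulate per-node event counts from a discrete-time Hawkes-type model.

    Assumptions (illustrative, not taken from the paper):
      - mu[i] is the baseline event rate of node i,
      - K[i, j] >= 0 is the influence of events at node j on the rate at node i,
      - past events excite future rates through an exponential kernel with decay beta.
    """
    rng = np.random.default_rng(seed)
    mu = np.asarray(mu, dtype=float)
    K = np.asarray(K, dtype=float)
    n_nodes = mu.shape[0]

    decay = np.exp(-beta * dt)          # per-step decay of past excitation
    excitation = np.zeros(n_nodes)      # exponentially weighted sum of past counts
    counts = np.zeros((n_steps, n_nodes), dtype=np.int64)

    for t in range(n_steps):
        # Conditional rate: baseline plus influence-weighted past activity.
        rate = mu + K @ (beta * excitation)
        # Number of events per node in this time bin.
        counts[t] = rng.poisson(rate * dt)
        # Add the new events to the excitation state, then decay it by one step.
        excitation = decay * (excitation + counts[t])

    return counts


# Hypothetical 3-node chain: node 0 influences node 1, node 1 influences node 2.
mu = [0.2, 0.2, 0.2]
K = [[0.0, 0.0, 0.0],
     [0.5, 0.0, 0.0],
     [0.0, 0.5, 0.0]]
counts = simulate_hawkes_counts(mu, K, beta=1.0, n_steps=1000)
```

Each row of `counts` is the number of events per node in one time bin, which is the kind of input the abstract calls count data. Inferring `K` from such a matrix, with uncertainty, is the inverse problem the paper addresses.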