We study the problem of graph structure identification, i.e., recovering the graph of dependencies among time series. We model the time series as components of the state of a linear stochastic networked dynamical system. We assume partial observability, where the state evolution of only a subset of the nodes comprising the network is observed. We devise a new feature vector computed from the observed time series and prove that these features are linearly separable, i.e., there exists a hyperplane that separates the cluster of features associated with connected pairs of nodes from the cluster associated with disconnected pairs. This renders the features amenable to training a variety of classifiers to perform causal inference. In particular, we use these features to train Convolutional Neural Networks (CNNs). The resulting causal inference mechanism outperforms state-of-the-art counterparts in terms of sample complexity. The trained CNNs generalize well across structurally distinct networks (dense or sparse) and noise-level profiles. Remarkably, they also generalize well to real-world networks despite being trained on a synthetic network (a realization of a random graph). Finally, the proposed method reconstructs the graph consistently in a pairwise manner, that is, by deciding whether an edge or arrow is present or absent between each pair of nodes, using only the time series of that pair. This fits the framework of large-scale systems, where observing or processing all nodes in the network is prohibitive.
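The following is a minimal sketch of the pairwise classification pipeline described above: compute a feature vector from the observed time series of a pair of nodes and feed it to a small CNN that decides whether an edge is present. The abstract does not specify the feature construction or the network architecture, so both are placeholder assumptions here (lagged empirical cross-covariances and a generic 1D CNN), not the paper's actual design.

```python
import numpy as np
import torch
import torch.nn as nn

def pairwise_features(x_i, x_j, max_lag=20):
    """Hypothetical placeholder features: lagged empirical cross-covariances
    between the observed time series of nodes i and j. The paper defines its
    own feature vector; this is only an illustrative stand-in."""
    x_i = x_i - x_i.mean()
    x_j = x_j - x_j.mean()
    T = len(x_i)
    feats = [np.dot(x_i[:T - k], x_j[k:]) / (T - k) for k in range(max_lag)]
    return np.asarray(feats, dtype=np.float32)

class PairwiseCNN(nn.Module):
    """Small 1D CNN classifying a node pair as connected / disconnected from
    its feature vector (architecture is an assumption, not the paper's)."""
    def __init__(self, feat_len=20):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv1d(8, 8, kernel_size=3, padding=1), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(8 * feat_len, 2),   # logits: edge absent / edge present
        )

    def forward(self, x):                 # x: (batch, feat_len)
        return self.net(x.unsqueeze(1))   # add a channel dimension

# Toy usage: random surrogate data standing in for the observed node states.
rng = np.random.default_rng(0)
x_i, x_j = rng.standard_normal(1000), rng.standard_normal(1000)
feat = torch.from_numpy(pairwise_features(x_i, x_j)).unsqueeze(0)
model = PairwiseCNN(feat_len=20)
logits = model(feat)  # train with nn.CrossEntropyLoss on labeled node pairs
```

Because the decision is made independently for each pair of observed nodes, the same trained classifier can be applied to arbitrarily many pairs without processing the full network at once, which is the property the abstract highlights for large-scale systems.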