The amount and variety of data have been increasing drastically for several years. These data are often represented as networks, which are then explored with approaches arising from network theory. Recent years have witnessed the extension of network exploration methods to leverage more complex and richer network frameworks. Random walks, for instance, have been extended to explore multilayer networks. However, current random walk approaches are limited in the combination and heterogeneity of network layers they can handle. New analytical and numerical random walk methods are needed to cope with the increasing diversity and complexity of multilayer networks. Here we propose MultiXrank, a Python package that enables Random Walk with Restart (RWR) on any kind of multilayer network with an optimized implementation. This package is supported by a universal mathematical formulation of the RWR. We evaluated MultiXrank with leave-one-out cross-validation and link prediction, and introduced protocols to measure the impact of adding or removing multilayer network data on prediction performance. We further measured the sensitivity of MultiXrank to input parameters through an in-depth exploration of the parameter space. Finally, we illustrate the versatility of MultiXrank with different use cases of unsupervised node prioritization and supervised classification in the context of human genetic diseases.
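To make the core idea concrete, the following is a minimal sketch of RWR on a single-layer (monoplex) network, written with plain NumPy. It is not the MultiXrank API and does not reproduce the multilayer formulation. The function name `random_walk_with_restart`, its parameters, the restart probability of 0.7, and the toy four-node path graph are all assumptions chosen for illustration.

```python
import numpy as np


def random_walk_with_restart(adjacency, seeds, restart_prob=0.7,
                             tol=1e-10, max_iter=1000):
    """Illustrative RWR on one network layer (hypothetical helper, not MultiXrank).

    Iterates p <- (1 - r) * W @ p + r * p0 until the L1 change falls below tol,
    where W is the column-normalized adjacency matrix and p0 is uniform on the seeds.
    """
    A = np.asarray(adjacency, dtype=float)
    col_sums = A.sum(axis=0)
    if np.any(col_sums == 0):
        raise ValueError("every node needs at least one edge in this sketch")
    W = A / col_sums  # each column sums to 1 (column-stochastic)

    # Restart distribution: equal probability on each seed node.
    p0 = np.zeros(A.shape[0])
    p0[list(seeds)] = 1.0 / len(seeds)

    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1 - restart_prob) * (W @ p) + restart_prob * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p


# Toy path graph 0 - 1 - 2 - 3 (assumed example data), seeded at node 0.
path_graph = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
scores = random_walk_with_restart(path_graph, seeds=[0])
ranking = np.argsort(-scores)  # nodes ordered from most to least proximal to the seed
```

The resulting scores form a probability distribution over nodes and can be read as a proximity measure to the seed set, which is the basis for the node prioritization mentioned above. A multilayer version would replace `W` with a transition matrix that also encodes jumps between layers.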