Relationships between agents can be conveniently represented by graphs. When these relationships have different modalities, they are better modelled by multilayer graphs in which each layer is associated with one modality. Such graphs arise naturally in many contexts, including biological and social networks. Clustering is a fundamental problem in network analysis, where the goal is to group nodes with similar connectivity profiles. In the past decade, various clustering methods have been extended from the unilayer setting to multilayer graphs in order to incorporate the information provided by each layer. While most existing works assume, rather restrictively, that all layers share the same set of nodes, we propose a new framework that allows layers to be defined on different sets of nodes. In particular, the nodes not recorded in a layer are treated as missing. Within this paradigm, we investigate several generalizations of well-known clustering methods from the complete setting to the incomplete one and prove some consistency results under the Multi-Layer Stochastic Block Model assumption. Our theoretical results are complemented by thorough numerical comparisons of our proposed algorithms on both synthetic and real datasets, highlighting the promising behaviour of our methods in various settings.
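To make the incomplete setting concrete, the following is a minimal sketch of one way to cluster a multilayer graph whose layers are observed on different node sets. It is not claimed to be any of the algorithms studied in the paper. The layer format (a set of observed nodes plus an edge list), the normalisation by co-observation counts, and the helper names `aggregate_layers`, `top_eigenspace` and `kmeans` are all assumptions made for illustration. Missing nodes are zero-filled, the layers are averaged, and a plain spectral step is applied to the aggregate.

```python
import random


def aggregate_layers(n, layers):
    """Average zero-filled adjacency matrices over layers where both endpoints are observed.

    `layers` is a list of (observed_nodes, edges) pairs. Both this format and the
    co-observation normalisation are illustrative assumptions, not the paper's method.
    """
    total = [[0.0] * n for _ in range(n)]  # summed edge indicators
    seen = [[0] * n for _ in range(n)]     # number of layers observing both i and j
    for observed, edges in layers:
        for i in observed:
            for j in observed:
                if i != j:
                    seen[i][j] += 1
        for i, j in edges:
            if i in observed and j in observed:
                total[i][j] += 1.0
                total[j][i] += 1.0
    # Pairs that are never co-observed keep the value 0 (zero-filling).
    return [[total[i][j] / seen[i][j] if seen[i][j] else 0.0 for j in range(n)]
            for i in range(n)]


def top_eigenspace(mat, k, iters=100, seed=0):
    """Orthogonal iteration on mat + shift*I; returns one k-dimensional row per node."""
    n = len(mat)
    # Gershgorin bound: the shifted matrix has no negative eigenvalues, so the
    # iteration converges to the k algebraically largest eigenvectors of mat.
    shift = max(sum(abs(x) for x in row) for row in mat)
    rng = random.Random(seed)
    cols = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(k)]
    for _ in range(iters):
        cols = [[sum(mat[i][j] * c[j] for j in range(n)) + shift * c[i]
                 for i in range(n)] for c in cols]
        for a in range(k):  # Gram-Schmidt orthonormalisation
            for b in range(a):
                dot = sum(x * y for x, y in zip(cols[a], cols[b]))
                cols[a] = [x - dot * y for x, y in zip(cols[a], cols[b])]
            norm = sum(x * x for x in cols[a]) ** 0.5
            cols[a] = [x / norm for x in cols[a]]
    return [[cols[a][i] for a in range(k)] for i in range(n)]


def kmeans(points, k, iters=20):
    """Lloyd's algorithm with a deterministic farthest-point initialisation."""
    def d2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(d2(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: d2(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = [sum(x) / len(members) for x in zip(*members)]
    return labels


# Toy data (hypothetical): communities {0, 2, 4} and {1, 3, 5}; each layer misses one node.
layers = [
    ({0, 1, 2, 3, 4}, [(0, 2), (0, 4), (2, 4), (1, 3), (2, 3)]),  # node 5 missing
    ({0, 1, 2, 3, 5}, [(0, 2), (1, 3), (1, 5), (3, 5)]),          # node 4 missing
    ({1, 2, 3, 4, 5}, [(2, 4), (1, 3), (1, 5), (3, 5)]),          # node 0 missing
]
agg = aggregate_layers(6, layers)
labels = kmeans(top_eigenspace(agg, k=2), k=2)
print(labels)
```

Dividing by the co-observation count keeps rarely observed pairs on the same scale as frequently observed ones. A plain sum of the zero-filled matrices is the simpler alternative, at the cost of down-weighting nodes that appear in few layers.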