Spectral embedding finds vector representations of the nodes of a network, based on the eigenvectors of a properly constructed matrix, and has found applications throughout science and technology. Many networks are multipartite, meaning that they contain nodes of fundamentally different types, e.g. drugs, diseases and proteins, and edges are only observed between nodes of different types. When the network is multipartite, this paper demonstrates that the node representations obtained via spectral embedding lie near type-specific low-dimensional subspaces of a higher-dimensional ambient space. For this reason we propose a follow-on step after spectral embedding, to recover node representations in their intrinsic rather than ambient dimension, proving uniform consistency under a low-rank, inhomogeneous random graph model. We demonstrate the performance of our procedure on a large 6-partite biomedical network relevant for drug discovery.
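The two-stage idea described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy bipartite model (edge probability `u[i] * v[j]`), the choice of the adjacency matrix as the embedded matrix, the use of an uncentered truncated SVD per node type as the follow-on step, and the hypothetical helper names `spectral_embed` and `reduce_by_type`. In this toy model the ambient embedding dimension is 2, while each node type is expected to concentrate near a 1-dimensional subspace.

```python
import numpy as np

def spectral_embed(A, d):
    """Embed nodes using the d eigenvectors of A with largest |eigenvalue|,
    each scaled by the square root of its absolute eigenvalue."""
    vals, vecs = np.linalg.eigh(A)
    top = np.argsort(np.abs(vals))[::-1][:d]
    return vecs[:, top] * np.sqrt(np.abs(vals[top]))

def reduce_by_type(X, types, dims):
    """Assumed follow-on step: for each node type, project that type's
    rows onto their leading right singular directions (no centering)."""
    reduced = {}
    for t, d_t in dims.items():
        rows = X[types == t]
        _, _, Vt = np.linalg.svd(rows, full_matrices=False)
        reduced[t] = rows @ Vt[:d_t].T
    return reduced

# Toy bipartite graph: edges occur only between type 0 and type 1.
rng = np.random.default_rng(0)
n1, n2 = 60, 40
u = rng.uniform(0.2, 0.8, size=n1)   # hypothetical latent scalars, type 0
v = rng.uniform(0.2, 0.8, size=n2)   # hypothetical latent scalars, type 1
block = (rng.random((n1, n2)) < np.outer(u, v)).astype(float)

A = np.zeros((n1 + n2, n1 + n2))
A[:n1, n1:] = block
A[n1:, :n1] = block.T
types = np.repeat([0, 1], [n1, n2])

X = spectral_embed(A, d=2)                      # ambient dimension 2
reduced = reduce_by_type(X, types, {0: 1, 1: 1})  # intrinsic dimension 1 per type

for t, Y in reduced.items():
    s = np.linalg.svd(X[types == t], compute_uv=False)
    print(f"type {t}: reduced shape {Y.shape}, singular values {np.round(s, 3)}")
```

For each type, the printed singular values of the ambient embedding indicate how much of that type's variation falls outside its leading direction; a second value that is small relative to the first is consistent with the points lying near a one-dimensional subspace. The intrinsic dimensions are supplied by hand here; choosing them from data is a separate question that this sketch does not address.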