Heterogeneous Information Network (HIN) embedding refers to low-dimensional projections of HIN nodes that preserve the structure and semantics of the HIN. HIN embedding has emerged as a promising research field for network analysis because it enables downstream tasks such as clustering and node classification. In this work, we propose \ours for jointly learning cluster embeddings and cluster-aware HIN embeddings. We assume that connected nodes are highly likely to fall in the same cluster, and adopt a variational approach to preserve the information in pairwise relations in a cluster-aware manner. In addition, we deploy contrastive modules to utilize the information in multiple meta-paths simultaneously, thereby alleviating the meta-path selection problem, a challenge faced by many well-known HIN embedding approaches. The HIN embedding thus learned not only improves clustering performance but also preserves pairwise proximity as well as the high-order HIN structure. We show the effectiveness of our approach by comparing it with many competitive baselines on three real-world datasets, on both clustering and downstream node classification.