Network embedding has recently become a hot research topic because it provides low-dimensional feature representations for many machine learning applications. Existing work treats the embedding either (1) as an unsupervised learning task that explicitly preserves the structural connectivity of the network, or (2) as a by-product of supervised learning for a specific discriminative task in a deep neural network. In this paper, we focus on bridging the gap between these two lines of research. We propose adapting the generative adversarial model to perform network embedding: the generator tries to generate vertex pairs, while the discriminator tries to distinguish the generated vertex pairs from real connections (edges) in the network. The Wasserstein-1 distance is adopted to train the generator for better stability. We develop three model variants: GANE, which applies cosine similarity; GANE-O1, which preserves first-order proximity; and GANE-O2, which tries to preserve the second-order proximity of the network in the low-dimensional embedded vector space. We then prove that GANE-O2 has the same objective function as GANE-O1 when negative sampling is applied to simplify the training process in GANE-O2. Experiments on real-world network datasets demonstrate that our models consistently outperform state-of-the-art solutions, with significant improvements in link-prediction precision, as well as in visualization and in accuracy on clustering tasks.
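To make the adversarial setup concrete, the following is a minimal sketch of a Wasserstein-style critic loss over vertex pairs, not the authors' implementation. It assumes, purely for illustration, that the critic score of a pair is the cosine similarity of the two vertex embeddings; the function names `cosine_score` and `wasserstein_critic_loss`, the toy graph, and the random "generated" pairs are all hypothetical.

```python
import numpy as np


def cosine_score(emb, pairs):
    """Cosine similarity between the embeddings of each (u, v) pair.

    emb:   (num_vertices, dim) embedding matrix (hypothetical layout).
    pairs: (num_pairs, 2) integer array of vertex indices.
    """
    u = emb[pairs[:, 0]]
    v = emb[pairs[:, 1]]
    num = np.sum(u * v, axis=1)
    # Small constant guards against division by zero for zero vectors.
    den = np.linalg.norm(u, axis=1) * np.linalg.norm(v, axis=1) + 1e-12
    return num / den


def wasserstein_critic_loss(emb, real_pairs, fake_pairs):
    """WGAN-style critic loss: mean score of generated pairs minus
    mean score of real edges. Minimizing it pushes real edges to
    score higher than generated pairs."""
    fake = cosine_score(emb, fake_pairs).mean()
    real = cosine_score(emb, real_pairs).mean()
    return fake - real


# Toy data (assumed): 6 vertices, 4-dimensional embeddings.
rng = np.random.default_rng(0)
emb = rng.normal(size=(6, 4))
real_pairs = np.array([[0, 1], [1, 2], [2, 3]])   # observed edges
fake_pairs = rng.integers(0, 6, size=(3, 2))      # stand-in for generator output
loss = wasserstein_critic_loss(emb, real_pairs, fake_pairs)
```

In a full training loop the generator would be updated to raise the score of its pairs, and the critic would need a Lipschitz constraint (for example weight clipping or a gradient penalty, as in standard Wasserstein GAN training); both are omitted here.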