Network representation learning seeks to embed networks into a low-dimensional space while preserving their structural and semantic properties, thereby facilitating downstream tasks such as classification, trait prediction, edge identification, and community detection. Motivated by challenges in brain connectivity data analysis, which is characterized by subject-specific, high-dimensional, and sparse networks lacking node or edge covariates, we propose a novel contrastive learning-based statistical approach for network edge embedding, which we term Adaptive Contrastive Edge Representation Learning (ACERL). It builds on two key components: contrastive learning over augmented network pairs, and a data-driven adaptive random masking mechanism. We establish non-asymptotic error bounds and show that our method achieves the minimax optimal convergence rate for edge representation learning. We further demonstrate the applicability of the learned representations to multiple downstream tasks, including network classification, important edge detection, and community detection, and establish the corresponding theoretical guarantees. We validate our method on both synthetic data and real brain connectivity studies, showing competitive performance against the baseline of sparse principal component analysis.
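To make the two components concrete, the following is a minimal, illustrative sketch of (i) random edge masking to create augmented network views and (ii) an InfoNCE-style contrastive loss on paired views. This is not the ACERL algorithm itself: the fixed masking probability, the linear stand-in encoder `W`, and the specific loss form are all assumptions for illustration (ACERL's masking is adaptive and data-driven).

```python
import numpy as np

rng = np.random.default_rng(0)

def random_mask(edges, keep_prob):
    """Zero out edge weights at random to create an augmented view.
    In ACERL the keep probabilities are adaptive; here a scalar is used."""
    mask = rng.random(edges.shape) < keep_prob
    return edges * mask

def info_nce(z1, z2, tau=0.5):
    """InfoNCE-style contrastive loss: rows of z1 and z2 are embeddings
    of the two augmented views of the same subject (positives)."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = z1 @ z2.T / tau                              # pairwise similarities
    logits = sim - sim.max(axis=1, keepdims=True)      # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))                 # match the paired row

# toy example: 4 subjects, 10 vectorized edge weights each
X = rng.random((4, 10))
view1 = random_mask(X, keep_prob=0.8)
view2 = random_mask(X, keep_prob=0.8)
W = rng.standard_normal((10, 3))   # hypothetical linear encoder
loss = info_nce(view1 @ W, view2 @ W)
```

Minimizing such a loss pulls the embeddings of two masked views of the same subject's network together while pushing apart embeddings of different subjects, which is the mechanism the abstract's first component refers to.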