GeomGCL: Geometric Graph Contrastive Learning for Molecular Property Prediction

Many recent efforts have applied graph neural networks (GNNs) to molecular property prediction, a fundamental task in computational drug and material discovery. A major obstacle to the successful prediction of molecular properties by GNNs is the scarcity of labeled data. Although graph contrastive learning (GCL) methods have achieved extraordinary performance with insufficient labeled data, most focus on designing data augmentation schemes for general graphs. However, augmentation methods such as random perturbation can alter the fundamental properties of a molecule when applied to molecular graphs. Meanwhile, the critical geometric information of molecules remains rarely explored under current GNN and GCL architectures. To this end, we propose GeomGCL, a novel graph contrastive learning method that utilizes the geometry of a molecule across 2D and 3D views. Specifically, we first devise a dual-view geometric message passing network (GeomMPNN) to adaptively leverage the rich information in both the 2D and 3D graphs of a molecule. Incorporating geometric properties at different levels greatly facilitates molecular representation learning. We then design a novel geometric graph contrastive scheme in which the two geometric views collaboratively supervise each other, improving the generalization ability of GeomMPNN. We evaluate GeomGCL on various downstream property prediction tasks via fine-tuning. Experimental results on seven real-life molecular datasets demonstrate the effectiveness of GeomGCL against state-of-the-art baselines.
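To make the idea of two views "supervising each other" concrete, the following is a minimal sketch of a symmetric cross-view contrastive loss. It is not taken from the paper: the abstract does not specify the loss, so an InfoNCE-style objective is assumed here, and the names `cross_view_info_nce`, `z2d`, `z3d`, and `temperature` are hypothetical. The embeddings stand in for the outputs a GeomMPNN-like encoder might produce for the 2D and 3D graphs of the same batch of molecules.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def _mean_diag_nll(sim):
    """Mean negative log-softmax of the diagonal entry in each row."""
    total = 0.0
    for i, row in enumerate(sim):
        m = max(row)  # subtract the row maximum for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(s - m) for s in row))
        total += log_sum_exp - row[i]
    return total / len(sim)


def cross_view_info_nce(z2d, z3d, temperature=0.5):
    """Assumed InfoNCE-style loss between 2D-view and 3D-view embeddings.

    z2d[i] and z3d[i] are embeddings of the same molecule (a positive pair);
    all other pairings in the batch are treated as negatives.
    """
    n = len(z2d)
    # sim[i][j]: temperature-scaled similarity of molecule i (2D) and j (3D)
    sim = [[cosine(z2d[i], z3d[j]) / temperature for j in range(n)]
           for i in range(n)]
    loss_2d_to_3d = _mean_diag_nll(sim)
    # Transpose so that each row is anchored on a 3D-view embedding
    sim_t = [list(col) for col in zip(*sim)]
    loss_3d_to_2d = _mean_diag_nll(sim_t)
    return 0.5 * (loss_2d_to_3d + loss_3d_to_2d)


if __name__ == "__main__":
    z2d = [[1.0, 0.0], [0.0, 1.0]]
    aligned = cross_view_info_nce(z2d, [[1.0, 0.0], [0.0, 1.0]])
    shuffled = cross_view_info_nce(z2d, [[0.0, 1.0], [1.0, 0.0]])
    print(aligned < shuffled)
```

Under this assumed objective, the loss is low when the 2D and 3D embeddings of the same molecule agree and high when they are mismatched, which is one way the two views could regularize each other during pre-training before fine-tuning on a downstream task.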
