Can we inject pocket-ligand interaction knowledge into a pretrained model and jointly learn the chemical spaces of pockets and ligands? Pretraining on molecules and proteins has attracted considerable attention in recent years, but most of these approaches learn only one of the two chemical spaces and do not inject biological knowledge. We propose a co-supervised pretraining (CoSP) framework to simultaneously learn 3D pocket and ligand representations. We use a gated geometric message passing layer to model both 3D pockets and ligands, where each node's chemical features, geometric position, and orientation are considered. To learn biologically meaningful embeddings, we inject pocket-ligand interaction knowledge into the pretraining model via a contrastive loss. Considering the specificity of molecules, we further propose a chemical similarity-enhanced negative sampling strategy to improve contrastive learning performance. Through extensive experiments, we conclude that CoSP can achieve competitive results in pocket matching, molecular property prediction, and virtual screening.
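To make the contrastive objective concrete, the following is a minimal sketch rather than the CoSP implementation. It assumes an InfoNCE-style loss over L2-normalized pocket and ligand embeddings, and it assumes one possible reading of similarity-enhanced negative sampling: negatives are down-weighted by their Tanimoto similarity to the true ligand, on the grounds that chemically similar molecules may also bind the pocket. The function names, the fingerprint representation (sets of on-bits), the weighting scheme, and the temperature value are all hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0


def contrastive_loss(pockets, ligands, fingerprints, temperature=0.1):
    """InfoNCE-style loss where pockets[i] is assumed to bind ligands[i].

    Assumption (not from the source): each negative ligand j is weighted by
    1 - Tanimoto(fingerprints[i], fingerprints[j]), so ligands that are
    chemically close to the true ligand are penalized less as negatives.
    """
    pockets = [normalize(p) for p in pockets]
    ligands = [normalize(l) for l in ligands]
    total = 0.0
    for i, pocket in enumerate(pockets):
        positive = math.exp(dot(pocket, ligands[i]) / temperature)
        negative = 0.0
        for j, ligand in enumerate(ligands):
            if j == i:
                continue
            weight = 1.0 - tanimoto(fingerprints[i], fingerprints[j])
            negative += weight * math.exp(dot(pocket, ligand) / temperature)
        total += -math.log(positive / (positive + negative))
    return total / len(pockets)


# Toy data: two pocket-ligand pairs with 2D embeddings and disjoint fingerprints.
pockets = [[1.0, 0.0], [0.0, 1.0]]
matched_ligands = [[1.0, 0.0], [0.0, 1.0]]
swapped_ligands = [[0.0, 1.0], [1.0, 0.0]]
fingerprints = [{1, 2}, {3, 4}]

loss_matched = contrastive_loss(pockets, matched_ligands, fingerprints)
loss_swapped = contrastive_loss(pockets, swapped_ligands, fingerprints)
```

In this toy setup the loss is small when each pocket embedding aligns with its own ligand and large when the ligands are swapped. A real pipeline would compute the embeddings with the geometric encoder and the fingerprints with a cheminformatics toolkit, and would batch the computation on tensors.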