Computational discovery of ideal lead compounds is a critical step in modern drug discovery. It comprises multiple stages: hit screening, molecular property prediction, and molecule optimization. Current efforts are disparate: a separate model is established for each stage, and the models are then integrated across stages. This is not ideal, because clumsy integration of incompatible models increases research overhead and may even reduce success rates in drug discovery. Achieving compatibility requires inherent model consistency across the lead discovery stages. To that end, we propose an open deep graph learning (DGL) based pipeline, generative adversarial feature subspace enhancement (GAFSE), which for the first time unifies the modeling of these stages in a single learning framework. GAFSE also offers a standardized modular design and streamlined interfaces for future expansion and community support. GAFSE combines adversarial/generative learning, a graph attention network, and a graph reconstruction network, and simultaneously optimizes the classification/regression loss, the adversarial/generative loss, and the reconstruction loss. A convergence analysis provides theoretical guarantees on the model's generalization performance. Extensive benchmarking demonstrates that the GAFSE pipeline achieves excellent performance across almost all lead discovery stages, while also providing valuable model interpretability. We therefore believe this tool will enhance the efficiency and productivity of drug discovery researchers.
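To make the simultaneous optimization of the three loss terms concrete, the following is a minimal sketch, not the GAFSE implementation. It assumes the terms are combined as a weighted sum, which the abstract does not state. The function names, the weights `w_adv` and `w_rec`, and the use of binary cross-entropy and mean squared error as stand-ins for the individual terms are all hypothetical choices made for illustration.

```python
import math


def binary_cross_entropy(p, y, eps=1e-12):
    """Cross-entropy between a predicted probability p and a 0/1 label y."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def mean_squared_error(pred, target):
    """Mean squared error between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(pred, target)) / len(pred)


def joint_loss(task_loss, adversarial_loss, reconstruction_loss,
               w_adv=1.0, w_rec=1.0):
    """Weighted sum of the three terms (assumed form, hypothetical weights)."""
    return task_loss + w_adv * adversarial_loss + w_rec * reconstruction_loss


# Toy values standing in for one training example.
task = binary_cross_entropy(0.9, 1)         # classification term
adversarial = binary_cross_entropy(0.5, 1)  # discriminator output on a generated feature
reconstruction = mean_squared_error([1.0, 0.0, 1.0], [0.8, 0.1, 0.9])  # graph reconstruction term

total = joint_loss(task, adversarial, reconstruction, w_adv=0.5, w_rec=2.0)
```

In a real training loop, a single scalar such as `total` would be backpropagated so that all components are updated together; how GAFSE actually weights or schedules the terms is not specified in this abstract.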