Molecular machine learning (ML) has proven important for tackling various molecular problems, such as predicting protein-drug interactions and blood-brain barrier permeability. More recently, graph neural networks (GNNs) have been applied to molecular ML, showing comparable or superior performance to descriptor-based approaches. Although various tools and packages exist for applying GNNs to molecular ML, a new GNN package, named MolGraph, was developed in this work with the motivation to create GNNs highly compatible with the TensorFlow and Keras application programming interface (API). As MolGraph focuses specifically and exclusively on molecular ML, a chemistry module was implemented to accommodate the generation of small molecular graphs, which can then be passed to the GNNs for molecular ML. To validate the GNNs, they were benchmarked on the datasets of MoleculeNet, as well as on three chromatographic retention time datasets. The results on these benchmarks show that the GNNs performed as expected. Additionally, the GNNs proved useful for molecular identification and improved interpretability of chromatographic retention time data. MolGraph is available at https://github.com/akensert/molgraph.