Molecular machine learning (ML) has proven important for tackling various molecular problems, including the prediction of protein-drug interactions and blood-brain barrier permeability. More recently, graph neural networks (GNNs) have been applied to molecular ML, showing comparable or superior performance to descriptor-based approaches. Although various tools and packages exist for applying GNNs to molecular ML, a new GNN package, named MolGraph (https://github.com/akensert/molgraph), was developed in this work with the motivation to create GNNs highly compatible with the TensorFlow and Keras application programming interface (API). As MolGraph focuses specifically and exclusively on molecular ML, a chemistry module was implemented to accommodate the generation of molecular graphs, which can then be passed as input to the GNNs. To validate the GNNs, they were benchmarked on the MoleculeNet datasets as well as three chromatographic retention time datasets. The results on these benchmarks show that the GNNs performed as expected. Additionally, the GNNs proved useful for molecular identification and improved interpretability of chromatographic retention data.
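
To make the pipeline described above concrete (molecule, then molecular graph, then GNN), the following is a minimal, dependency-free sketch. It is not the MolGraph API: the names `build_graph`, `message_pass`, and `readout`, the three-element atom vocabulary, and the dictionary layout are all hypothetical choices made for illustration. The aggregation is a plain unweighted neighbor sum with a self-connection, with no learned parameters, so it shows only the data flow of a GNN layer rather than a trainable model.

```python
# Illustrative sketch only; names and data layout are assumptions, not MolGraph's API.
ATOM_TYPES = ["C", "N", "O"]  # assumed toy vocabulary for one-hot atom features


def one_hot(symbol):
    """Encode an atom symbol as a one-hot vector over ATOM_TYPES."""
    return [1.0 if symbol == t else 0.0 for t in ATOM_TYPES]


def build_graph(atoms, bonds):
    """Encode a molecule as node features plus a directed edge list.

    Each undirected bond (i, j) becomes two directed edges, i->j and j->i.
    """
    edge_src, edge_dst = [], []
    for i, j in bonds:
        edge_src += [i, j]
        edge_dst += [j, i]
    return {
        "node_feature": [one_hot(a) for a in atoms],
        "edge_src": edge_src,
        "edge_dst": edge_dst,
    }


def message_pass(graph):
    """One round of unweighted neighbor-sum aggregation with a self-connection."""
    feats = graph["node_feature"]
    updated = [list(f) for f in feats]  # start from each node's own features
    for src, dst in zip(graph["edge_src"], graph["edge_dst"]):
        for k, value in enumerate(feats[src]):
            updated[dst][k] += value
    return updated


def readout(node_states):
    """Sum node states into a fixed-size, molecule-level vector."""
    return [sum(column) for column in zip(*node_states)]


# Ethanol, heavy atoms only: C-C-O
ethanol = build_graph(["C", "C", "O"], [(0, 1), (1, 2)])
embedding = readout(message_pass(ethanol))
print(embedding)
```

In a real GNN the aggregation step would be followed by a learned transformation (for example, a dense layer), and the molecule-level vector would be fed to a prediction head for a task such as retention time regression.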