We propose a Molecular Hypergraph Convolutional Network (MolHGCN) that predicts molecular properties from atom and functional group information. Molecules can contain many types of functional groups, which affect the properties of the molecules. For example, the toxicity of a molecule is associated with toxicophores such as nitroaromatic groups and thiourea. Conventional graph-based methods, which consider only pair-wise interactions between nodes, are inefficient at flexibly expressing the complex relationships among multiple nodes in a graph, and applying multiple hops may result in oversmoothing and overfitting. Hence, we propose MolHGCN to capture the substructural differences between molecules using atom and functional group information. MolHGCN constructs a hypergraph representation of a molecule using functional group information from the input SMILES string, extracts hidden representations using a two-stage message passing process (atom and functional group message passing), and predicts the properties of the molecule from the extracted hidden representations. We evaluate our model on the Tox21, ClinTox, SIDER, BBBP, BACE, ESOL, FreeSolv, and Lipophilicity datasets and show that it outperforms the baseline methods on most of them. In particular, we show that incorporating functional group information along with atom information results in better separability in the latent space, thereby increasing the accuracy of molecular property prediction.
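To make the hypergraph idea concrete, the following is a minimal sketch, not the MolHGCN implementation. It assumes functional groups have already been identified as sets of atom indices (in practice they would be extracted from the SMILES string), uses scalar atom features, replaces learned layers with plain mean aggregation, and runs the atom stage before the functional group stage. The function names and the toy molecule are hypothetical.

```python
# Illustrative sketch only: hyperedges are functional groups, nodes are atoms.
# All names here are hypothetical; mean aggregation stands in for learned layers.

def incidence_matrix(num_atoms, groups):
    """Return H with H[i][e] = 1 if atom i belongs to functional group e."""
    return [[1 if atom in group else 0 for group in groups]
            for atom in range(num_atoms)]

def atom_message_passing(feats, bonds):
    """Stage 1 (assumed): average each atom with its bonded neighbours."""
    neighbours = {i: [i] for i in range(len(feats))}
    for a, b in bonds:
        neighbours[a].append(b)
        neighbours[b].append(a)
    return [sum(feats[j] for j in neighbours[i]) / len(neighbours[i])
            for i in range(len(feats))]

def group_message_passing(feats, H):
    """Stage 2 (assumed): atoms -> functional groups -> atoms.

    Assumes every functional group contains at least one atom.
    """
    num_atoms = len(H)
    num_groups = len(H[0]) if H else 0
    # Each hyperedge feature is the mean of its member atoms.
    edge_feats = []
    for e in range(num_groups):
        members = [i for i in range(num_atoms) if H[i][e]]
        edge_feats.append(sum(feats[i] for i in members) / len(members))
    # Each atom averages the hyperedges it belongs to; atoms outside
    # every group keep their current feature.
    updated = []
    for i in range(num_atoms):
        edges = [e for e in range(num_groups) if H[i][e]]
        if edges:
            updated.append(sum(edge_feats[e] for e in edges) / len(edges))
        else:
            updated.append(feats[i])
    return updated

# Toy chain of four atoms with two overlapping, hand-specified groups.
bonds = [(0, 1), (1, 2), (2, 3)]
groups = [{0, 1}, {1, 2, 3}]
H = incidence_matrix(4, groups)
hidden = atom_message_passing([3.0, 0.0, 0.0, 3.0], bonds)
hidden = group_message_passing(hidden, H)
graph_embedding = sum(hidden)  # simple sum readout, also an assumption
```

Because a hyperedge connects all atoms of a functional group at once, a single group-level step lets information flow across the whole substructure without stacking many pair-wise hops.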