Rationalizing which parts of a molecule drive the predictions of a molecular graph convolutional neural network (GCNN) can be difficult. To help, we propose two simple regularization techniques to apply during the training of GCNNs: Batch Representation Orthonormalization (BRO) and Gini regularization. BRO, inspired by molecular orbital theory, encourages graph convolution operations to generate orthonormal node embeddings. Gini regularization is applied to the weights of the output layer and constrains the number of dimensions the model can use to make predictions. We show that Gini and BRO regularization can improve the accuracy of state-of-the-art GCNN attribution methods on artificial benchmark datasets. In a real-world setting, we demonstrate that medicinal chemists significantly prefer explanations extracted from regularized models. While we only study these regularizers in the context of GCNNs, both can be applied to other types of neural networks.
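To make the two penalties concrete, below is a minimal PyTorch-style sketch of how such regularizers could be added to a GCNN training loss. The abstract does not give the exact formulas, so the specific choices here are illustrative assumptions: BRO is sketched as the Frobenius distance between the node-embedding Gram matrix and the identity, the Gini term is computed over the absolute values of the output-layer weights, and the names `bro_loss`, `gini_regularizer`, `lambda_bro`, and `lambda_gini` are hypothetical.

```python
import torch


def bro_loss(node_embeddings: torch.Tensor) -> torch.Tensor:
    """Illustrative BRO penalty for one molecule.

    node_embeddings: (num_nodes, embed_dim) matrix Z from the graph
    convolution layers. Penalizing || Z Z^T - I ||_F pushes the node
    embeddings toward being mutually orthonormal (an assumed reading of
    "orthonormal node embeddings"; the paper's exact normalization may differ).
    """
    z = node_embeddings
    gram = z @ z.t()  # (num_nodes, num_nodes) Gram matrix of node embeddings
    identity = torch.eye(gram.shape[0], device=z.device, dtype=z.dtype)
    return torch.linalg.norm(gram - identity, ord="fro")


def gini_regularizer(weight: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Illustrative sparsity term based on the Gini coefficient of |W|.

    A Gini coefficient near 1 means the weight magnitude is concentrated in
    few dimensions, so adding (1 - Gini) to the loss encourages the output
    layer to rely on a small number of embedding dimensions.
    """
    w = weight.abs().flatten()
    w, _ = torch.sort(w)  # ascending order, as the Gini formula requires
    n = w.numel()
    idx = torch.arange(1, n + 1, device=w.device, dtype=w.dtype)
    gini = ((2 * idx - n - 1) * w).sum() / (n * w.sum() + eps)
    return 1.0 - gini
```

A hypothetical training step would then combine the task loss with both penalties, e.g. `loss = task_loss + lambda_bro * torch.stack([bro_loss(z) for z in per_molecule_embeddings]).mean() + lambda_gini * gini_regularizer(output_layer.weight)`, where the coefficients are tuned as ordinary hyperparameters.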