Graph convolutional neural networks (GCNNs) are widely used in machine learning and have been applied successfully to predicting the properties of molecules and materials. First-order GCNNs are known to be incomplete, i.e., there exist graphs that are distinct but appear identical when seen through the lens of the GCNN. More complicated schemes have thus been designed to increase their resolving power. Applications to molecules (and, more generally, point clouds), however, add a geometric dimension to the problem. The most straightforward and prevalent approach to constructing a graph representation of a molecule regards atoms as vertices and draws a bond between each pair of atoms within a preselected cutoff distance. Bonds can be decorated with the distance between atoms, and the resulting "distance graph convolution NNs" (dGCNNs) have empirically demonstrated excellent resolving power and are widely used in chemical ML. Here we show that, even for the restricted case of graphs induced by 3D atom clouds, dGCNNs are not complete. We construct pairs of distinct point clouds whose graphs are, for any cutoff radius, equivalent under a first-order Weisfeiler-Lehman test. This class of degenerate structures includes chemically plausible configurations, setting an ultimate limit to the expressive power of some of the well-established GCNN architectures for atomistic machine learning. Models that explicitly use angular information in the description of atomic environments can resolve these degeneracies.
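To make the first-order Weisfeiler-Lehman comparison concrete, the following is a minimal sketch of color refinement on a distance-decorated graph built from a point cloud. The names `distance_graph` and `wl_fingerprint`, the rounding tolerance, and the two toy structures are assumptions made for illustration. They are not the construction described in the abstract: the toy pair (a regular hexagon and two well-separated triangles) is indistinguishable only at a small cutoff and is resolved once the cutoff grows, whereas the pairs referred to above are stated to remain equivalent for any cutoff radius.

```python
import hashlib
import math
from collections import Counter
from itertools import combinations


def distance_graph(points, cutoff, decimals=6):
    """Adjacency lists: neighbors[i] holds (j, rounded distance) for d <= cutoff."""
    neighbors = [[] for _ in points]
    for i, j in combinations(range(len(points)), 2):
        d = math.dist(points[i], points[j])
        if d <= cutoff:
            d = round(d, decimals)  # assumed tolerance for comparing distances
            neighbors[i].append((j, d))
            neighbors[j].append((i, d))
    return neighbors


def wl_fingerprint(points, cutoff, species=None):
    """First-order WL color refinement with distance-labeled edges.

    Returns a sorted histogram of final vertex colors. Two structures with
    different fingerprints are distinguished by the test; equal fingerprints
    mean the test cannot tell them apart.
    """
    n = len(points)
    neighbors = distance_graph(points, cutoff)
    colors = list(species) if species else ["X"] * n
    for _ in range(n):  # n rounds are enough for the partition to stabilize
        new_colors = []
        for i in range(n):
            # A vertex signature is its own color plus the multiset of
            # (edge distance, neighbor color) pairs.
            signature = (colors[i], sorted((d, colors[j]) for j, d in neighbors[i]))
            digest = hashlib.sha1(repr(signature).encode()).hexdigest()
            new_colors.append(digest)
        colors = new_colors
    return tuple(sorted(Counter(colors).items()))


# Toy structure A: regular hexagon with unit side length.
hexagon = [
    (math.cos(k * math.pi / 3), math.sin(k * math.pi / 3), 0.0) for k in range(6)
]

# Toy structure B: two unit-side equilateral triangles, 10 length units apart.
r = 1.0 / math.sqrt(3.0)
two_triangles = [
    (r * math.cos(2 * k * math.pi / 3), r * math.sin(2 * k * math.pi / 3), z)
    for z in (0.0, 10.0)
    for k in range(3)
]

same_at_short_cutoff = wl_fingerprint(hexagon, 1.1) == wl_fingerprint(two_triangles, 1.1)
same_at_longer_cutoff = wl_fingerprint(hexagon, 1.9) == wl_fingerprint(two_triangles, 1.9)
```

With a cutoff of 1.1, every atom in both toy structures sees exactly two neighbors at distance 1, so the refinement assigns identical colors and the fingerprints coincide. Raising the cutoff to 1.9 brings the hexagon's second neighbors into range and the two structures separate, which is exactly the escape route that the pairs discussed in the abstract are said to close off.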