Graph neural networks (GNNs) are widely used in machine learning and have been applied successfully to predicting the properties of molecules and materials. First-order GNNs are well known to be incomplete, i.e., there exist graphs that are distinct but appear identical when seen through the lens of the GNN. More complicated schemes have thus been designed to increase their resolving power. Applications to molecules (and, more generally, point clouds), however, add a geometric dimension to the problem. The most straightforward and prevalent approach to constructing a graph representation of a molecule regards atoms as vertices and draws an edge (a "bond") between each pair of atoms within a chosen cutoff distance. Bonds can be decorated with the distance between atoms, and the resulting "distance graph NNs" (dGNNs) have empirically demonstrated excellent resolving power and are widely used in chemical ML, with all previously known indistinguishable configurations being resolved in the fully-connected limit, which is equivalent to an infinite or sufficiently large cutoff. Here we present a counterexample that proves that dGNNs are not complete even for the restricted case of fully-connected graphs induced by 3D atom clouds. We construct pairs of distinct point clouds whose associated graphs are, for any cutoff radius, equivalent under a first-order Weisfeiler-Lehman test. This class of degenerate structures includes chemically plausible configurations, both for isolated structures and for infinite structures that are periodic in 1, 2, and 3 dimensions. The existence of indistinguishable configurations sets an ultimate limit on the expressive power of some of the well-established GNN architectures for atomistic machine learning. Models that explicitly use angular or directional information in the description of atomic environments can resolve this class of degeneracies.
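As an illustration of the construction described above, the following minimal Python sketch builds a distance-decorated graph from a point cloud and runs a first-order Weisfeiler-Lehman color refinement on it. It is not the authors' implementation: the function names `distance_graph` and `wl_fingerprint`, the rounding tolerance `decimals`, the use of SHA-256 to compress colors, and the toy coordinates are all assumptions made for this example. Equal fingerprints mean only that the test cannot tell two structures apart; they do not imply that the structures are identical, which is the limitation the abstract addresses.

```python
import hashlib
import math
from collections import Counter


def distance_graph(points, cutoff, decimals=6):
    """For each atom, list (neighbor index, rounded distance) within the cutoff."""
    neighbors = [[] for _ in points]
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = math.dist(points[i], points[j])
            if d <= cutoff:
                d = round(d, decimals)  # assumed tolerance for floating-point noise
                neighbors[i].append((j, d))
                neighbors[j].append((i, d))
    return neighbors


def wl_fingerprint(points, species, cutoff=math.inf, rounds=None):
    """Multiset of atom colors after first-order WL refinement on the distance graph."""
    neighbors = distance_graph(points, cutoff)
    colors = [str(s) for s in species]
    for _ in range(len(points) if rounds is None else rounds):
        refined = []
        for i, nbrs in enumerate(neighbors):
            # New color = own color plus the sorted multiset of (distance, neighbor color).
            signature = (colors[i], sorted((d, colors[j]) for j, d in nbrs))
            refined.append(hashlib.sha256(repr(signature).encode()).hexdigest())
        colors = refined
    return Counter(colors)


# Toy data (hypothetical coordinates, arbitrary units).
cloud_a = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (0, 0, 3)]
# Same structure: rotated by 90 degrees about z, atoms listed in reverse order.
cloud_b = [(-y, x, z) for x, y, z in reversed(cloud_a)]
# Different structure: one atom moved along z.
cloud_c = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (0, 0, 4)]
species = ["C"] * 4

same = wl_fingerprint(cloud_a, species) == wl_fingerprint(cloud_b, species)
differs = wl_fingerprint(cloud_a, species) != wl_fingerprint(cloud_c, species)
```

The default `cutoff=math.inf` corresponds to the fully-connected limit. In this toy case the fingerprint is unchanged by rotating and relabeling the atoms and changes when an atom is displaced; the counterexamples in the paper are pairs of genuinely distinct clouds for which such a fingerprint would nevertheless coincide.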