Graph Neural Networks share several key relational inference mechanisms with Logic Programming. The datasets on which they are trained and evaluated can be seen as database facts containing ground terms. This makes it possible to model their inference mechanisms with equivalent logic programs, to better understand not only how they propagate information between the entities involved in the machine learning process, but also to infer limits on what can be learned from a given dataset and how well that might generalize to unseen test data. This leads us to the key idea of this paper: modeling, with the help of a logic program, the information flows involved in learning to infer properties of new nodes from the link structure of a graph and the information content of its nodes, given their known connections to nodes with possibly similar properties. The problem is known as graph node property prediction, and our approach consists of emulating, with a Prolog program, the key information propagation steps of a Graph Neural Network's training and inference stages. We test our approach on the ogbn-arxiv node property inference benchmark. To infer class labels for nodes representing papers in a citation network, we distill the dependency trees of the text associated with each node into directed acyclic graphs that we encode as ground Prolog terms. Together with the set of their references to other papers, they become facts in a database on which we reason with a Prolog program that mimics the information propagation in graph neural networks predicting node properties. In the process, we invent ground term similarity relations that help infer labels in the test set by propagating node properties from similar nodes in the training set, and we evaluate their effectiveness in comparison with that of the graph's link structure. Finally, we implement explanation generators that unveil performance upper bounds inherent to the dataset.
As a practical outcome, we obtain a logic program that, when seen as a machine learning algorithm, performs close to the state of the art on this node property prediction benchmark.