With the widespread availability of complex relational data, semi-supervised node classification in graphs has become a central machine learning problem. Graph neural networks are a recent class of easy-to-train and accurate methods for this problem that map the features in a node's neighborhood to its label, but they ignore label correlations during inference, and their predictions are difficult to interpret. Collective classification, on the other hand, is a traditional approach based on interpretable graphical models that explicitly model label correlations. Here, we introduce a model that combines the advantages of these two approaches: we compute the marginal probabilities in a conditional random field, as in collective classification, and we learn the potentials in the random field through end-to-end training, as in graph neural networks. In our model, the potential on each node depends only on that node's features, and edge potentials are learned via a coupling matrix. This structure enables simple training with interpretable parameters, scales to large networks, naturally incorporates training labels at inference, and is often more accurate than related approaches. Our approach can be viewed as either an interpretable message-passing graph neural network or a collective classification method with higher capacity and modernized training.
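To make the structure described above concrete, the following is a minimal sketch, not the authors' implementation. It assumes sum-product loopy belief propagation as the way to approximate the marginals, a linear map followed by a softmax as the feature-dependent node potential, and clamping to a one-hot vector as the way to incorporate training labels; the abstract specifies none of these details. The names `bp_marginals`, `softmax`, `X`, `W`, and `H` are hypothetical.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def bp_marginals(edges, node_pot, H, labels=None, n_iters=20):
    """Approximate CRF marginals with sum-product loopy belief propagation.

    edges:    list of undirected (i, j) pairs
    node_pot: (n, K) array of positive node potentials
    H:        (K, K) symmetric positive coupling matrix (shared edge potential)
    labels:   optional dict {node: class} of observed training labels
    """
    n, K = node_pot.shape
    phi = node_pot.astype(float).copy()
    if labels:
        for v, c in labels.items():
            phi[v] = np.eye(K)[c]  # assumption: clamp observed nodes to one-hot

    # Each undirected edge carries one message per direction.
    directed = [(i, j) for i, j in edges] + [(j, i) for i, j in edges]
    msgs = {e: np.full(K, 1.0 / K) for e in directed}
    nbrs = {v: [] for v in range(n)}
    for i, j in directed:
        nbrs[j].append(i)  # i sends a message to j

    for _ in range(n_iters):
        new_msgs = {}
        for i, j in directed:
            # Node potential of i times all messages into i except the one from j.
            prod = phi[i].copy()
            for k in nbrs[i]:
                if k != j:
                    prod *= msgs[(k, i)]
            m = H.T @ prod  # sum over x_i of prod[x_i] * H[x_i, x_j]
            new_msgs[(i, j)] = m / m.sum()
        msgs = new_msgs

    # Belief at each node: its potential times all incoming messages.
    beliefs = phi.copy()
    for v in range(n):
        for k in nbrs[v]:
            beliefs[v] *= msgs[(k, v)]
    return beliefs / beliefs.sum(axis=1, keepdims=True)


# Toy usage on a 4-node path graph with 2 classes.
rng = np.random.default_rng(0)
edges = [(0, 1), (1, 2), (2, 3)]
X = rng.normal(size=(4, 3))  # hypothetical node features
W = rng.normal(size=(3, 2))  # hypothetical feature-to-class weights
node_pot = softmax(X @ W)    # each node's potential uses only its own features
H = np.exp(np.eye(2))        # homophilous coupling: same-class neighbors favored
marginals = bp_marginals(edges, node_pot, H, labels={0: 0})
print(marginals)
```

In this sketch the only parameters are `W` and `H`, and the entries of `H` can be read directly as how strongly each pair of classes tends to co-occur across an edge. On a tree such as the path graph used here, belief propagation gives exact marginals; on graphs with cycles it is only an approximation.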