Decision-tree-based models are a popular choice for tabular data because of their high accuracy on such data, their ease of application, and their explainability. For graph-structured data, however, it is not clear how to apply them effectively in a way that combines the topological information with the tabular data available on the vertices of the graph. To address this challenge, we introduce Decision Trees with Dynamic Graph Features (TREE-G). Rather than using only the features given in the data, TREE-G acts on dynamic features, which are computed as the graph traverses the tree. These dynamic features combine the vertex features with the topological information, as well as with the cumulative information learned by the tree, so the features adapt to the predictive task and the graph at hand. We analyze the theoretical properties of TREE-G and demonstrate its benefits empirically on multiple graph and node prediction benchmarks. In these experiments, TREE-G consistently outperformed other tree-based models and often outperformed other graph-learning algorithms such as Graph Neural Networks (GNNs) and graph kernels, sometimes by large margins. Finally, we provide an explainability mechanism for TREE-G and demonstrate that it yields informative and intuitive explanations.
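To make the idea of dynamic features more concrete, the following is a minimal illustrative sketch and not the TREE-G algorithm itself. It assumes that a candidate feature is built by propagating one vertex feature along walks of a given length (powers of the adjacency matrix), restricted to a subset of vertices selected by an earlier split. The function name `dynamic_features`, the `mask` argument, and the `max_walk` parameter are hypothetical and chosen only for this example.

```python
import numpy as np

def dynamic_features(adj, feats, mask, max_walk=2):
    """Hypothetical helper: candidate per-vertex features.

    adj   : (n, n) adjacency matrix (topological information)
    feats : (n, f) vertex feature matrix (tabular information)
    mask  : (n,) 0/1 vector marking vertices selected by earlier splits
            (a stand-in for information accumulated along the tree)
    Returns a dict mapping (walk_length, feature_index) -> (n,) values.
    """
    n = adj.shape[0]
    candidates = {}
    power = np.eye(n)  # adj ** 0
    for d in range(max_walk + 1):
        for k in range(feats.shape[1]):
            # Propagate the masked feature k along walks of length d.
            candidates[(d, k)] = power @ (mask * feats[:, k])
        power = power @ adj
    return candidates

# Toy path graph 0 - 1 - 2 with a single vertex feature.
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
feats = np.array([[1.0], [2.0], [3.0]])

# At the root, no split has been made yet, so all vertices are active.
root_mask = np.ones(3)
root_candidates = dynamic_features(adj, feats, root_mask)

# Split on "sum of the neighbours' feature > 3" (threshold is arbitrary).
values = root_candidates[(1, 0)]
new_mask = (values > 3.0).astype(float)

# Deeper in the tree, candidates are recomputed using the new subset,
# so the available features depend on the splits made so far.
next_candidates = dynamic_features(adj, feats, new_mask)
```

In this sketch the candidate features at a deeper node differ from those at the root because they depend on which vertices passed the earlier split, which is the sense in which the features are "dynamic" here. How TREE-G actually defines and selects its split functions is specified in the paper, not in this example.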