图表过滤内核 (Graph Filtration Kernels)

The majority of popular graph kernels is based on the concept of Haussler's $\mathcal{R}$-convolution kernel and defines graph similarities in terms of mutual substructures. In this work, we enrich these similarity measures by considering graph filtrations: Using meaningful orders on the set of edges, which allow to construct a sequence of nested graphs, we can consider a graph at multiple granularities. For one thing, this provides access to features on different levels of resolution. Furthermore, rather than to simply compare frequencies of features in graphs, it allows for their comparison in terms of when and for how long they exist in the sequences. In this work, we propose a family of graph kernels that incorporate these existence intervals of features. While our approach can be applied to arbitrary graph features, we particularly highlight Weisfeiler-Lehman vertex labels, leading to efficient kernels. We show that using Weisfeiler-Lehman labels over certain filtrations strictly increases the expressive power over the ordinary Weisfeiler-Lehman procedure in terms of deciding graph isomorphism. In fact, this result directly yields more powerful graph kernels based on such features and has implications to graph neural networks due to their close relationship to the Weisfeiler-Lehman method. We empirically validate the expressive power of our graph kernels and show significant improvements over state-of-the-art graph kernels in terms of predictive performance on various real-world benchmark datasets.

翻译：流行的图形内核大多基于豪斯勒 $\ mathcal{R} $ convolution 内核的概念, 并定义了在相互的子结构中的图形相似性。在这项工作中, 我们通过考虑图形过滤来丰富这些相似性量度: 使用一组边缘上有意义的命令, 以构建嵌入式图形序列, 我们可以在多颗颗颗粒上考虑一个图形。对于一件事, 这提供了在不同分辨率层次上访问特征的图。此外, 与其简单地比较图表中特征的频率, 还可以比较这些特征在序列中存在的时间和时间的长度。在这项工作中, 我们建议了一组包含这些特征的图形内核元素。虽然我们的方法可以适用于任意的图形特性, 我们特别强调 Weisfeiler- Leman 的顶部标签, 导致高效的内核内核。我们显示, Weisfeiler- Leman 在某些过滤中标有不同程度的图形上的改进性能, 严格地提高了普通 Weisfefeler- stal 的直径的直径直径直径直径图像网络上的直径直径直径直线关系。