The Weisfeiler-Lehman graph kernels are among the most prevalent graph kernels due to their remarkable time complexity and predictive performance. Their key concept is based on an implicit comparison of neighborhood representing trees with respect to equality (i.e., isomorphism). This binary valued comparison is, however, arguably too rigid for defining suitable similarity measures over graphs. To overcome this limitation, we propose a generalization of Weisfeiler-Lehman graph kernels which takes into account the similarity between trees rather than equality. We achieve this using a specifically fitted variation of the well-known tree edit distance which can efficiently be calculated. We empirically show that our approach significantly outperforms state-of-the-art methods in terms of predictive performance on datasets containing structurally more complex graphs beyond the typically considered molecular graphs.
翻译:Weisfeiler-Lehman 图形内核是最流行的图形内核之一,因为它们具有非凡的时间复杂性和预测性能。它们的关键概念是基于对代表树木的附近地区在平等方面的隐性比较(即形态论 ) 。 然而,这种二进制价值的比较可以说过于僵硬,无法界定与图形相近的适当度量度。为了克服这一限制,我们建议对Weisfeiler-Lehman 图形内核进行概括化,其中考虑到树木之间的相似性而不是平等性。我们利用特别适合的、可以有效计算的已知的树编辑距离来实现这一目标。我们从经验上表明,我们的方法在包含结构上更为复杂的图形的数据集的预测性性能方面,大大超过一般认为的分子图谱之外的最先进的方法。