Homophily is a graph property describing the tendency of edges to connect similar nodes; the opposite is called heterophily. Heterophilous graphs are often believed to be challenging for standard message-passing graph neural networks (GNNs), and much effort has gone into developing efficient methods for this setting. However, there is no universally agreed-upon measure of homophily in the literature. In this work, we show that commonly used homophily measures have critical drawbacks that prevent the comparison of homophily levels across different datasets. To this end, we formalize desirable properties for a proper homophily measure and verify which measures satisfy which properties. In particular, we show that a measure that we call adjusted homophily satisfies more desirable properties than other popular homophily measures, yet it is rarely used in the graph learning literature. Then, we go beyond the homophily-heterophily dichotomy and propose a new characteristic that allows one to further distinguish different sorts of heterophily. The proposed label informativeness (LI) characterizes how much information a neighbor's label provides about a node's label. We analyze LI via the same theoretical framework and show that it is comparable across different datasets. We also observe empirically that LI agrees with GNN performance better than homophily measures do, which confirms that it is a useful characteristic of the graph structure.
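To make the three quantities concrete, the following is a minimal Python sketch. The abstract does not state the formulas, so the definitions used here are assumptions: edge homophily is taken as the fraction of edges joining same-label nodes, adjusted homophily as edge homophily corrected by its expected value under degree-weighted class proportions, and LI as the mutual information between the labels at the two ends of a random edge, normalized by the label entropy. The function names and the edge-list input format are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def edge_homophily(edges, labels):
    """Fraction of undirected edges whose endpoints share a label."""
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


def degree_weighted_class_dist(edges, labels):
    """Class distribution of a random edge endpoint (assumed definition)."""
    mass = Counter()
    for u, v in edges:
        mass[labels[u]] += 1
        mass[labels[v]] += 1
    total = 2 * len(edges)
    return {c: m / total for c, m in mass.items()}


def adjusted_homophily(edges, labels):
    """Edge homophily minus its expected value, rescaled so the maximum is 1.

    Assumed formula; undefined (division by zero) if only one class
    appears among edge endpoints.
    """
    p_bar = degree_weighted_class_dist(edges, labels)
    expected = sum(p * p for p in p_bar.values())
    return (edge_homophily(edges, labels) - expected) / (1 - expected)


def label_informativeness(edges, labels):
    """Mutual information of the two endpoint labels over label entropy.

    Assumed formula; each undirected edge is counted in both directions.
    Undefined if only one class appears among edge endpoints.
    """
    p_bar = degree_weighted_class_dist(edges, labels)
    joint = Counter()
    for u, v in edges:
        joint[(labels[u], labels[v])] += 1
        joint[(labels[v], labels[u])] += 1
    total = 2 * len(edges)
    mutual_info = sum(
        (n / total) * math.log((n / total) / (p_bar[a] * p_bar[b]))
        for (a, b), n in joint.items()
    )
    entropy = -sum(p * math.log(p) for p in p_bar.values())
    return mutual_info / entropy


if __name__ == "__main__":
    # Toy example: a 4-cycle whose labels alternate, so every edge
    # connects nodes of different classes.
    edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
    labels = [0, 1, 0, 1]
    print(edge_homophily(edges, labels))
    print(adjusted_homophily(edges, labels))
    print(label_informativeness(edges, labels))
```

On this toy 4-cycle, every edge is heterophilous, yet a neighbor's label determines the node's label exactly. Under the assumed definitions, the homophily values are at their minimum for this graph while LI is at its maximum, which illustrates the sense in which LI distinguishes different sorts of heterophily.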