Revisiting the role of heterophily in graph representation learning: An edge classification perspective

Graph representation learning aims to integrate node contents with graph structure to learn node- and graph-level representations. Nevertheless, many existing graph learning methods do not work well on data with a high level of heterophily, that is, data in which a large proportion of edges connect nodes with different class labels. Recent efforts to address this problem focus on improving the message passing mechanism. However, it remains unclear whether heterophily truly harms the performance of graph neural networks (GNNs). The key is to uncover the relationship between a node and its immediate neighbors, e.g., are they heterophilous or homophilous? From this perspective, we study the role of heterophily in graph representation learning before and after the relationships between connected nodes are disclosed. In particular, we propose an end-to-end framework that both learns the type of each edge (i.e., heterophilous or homophilous) and leverages edge type information to improve the expressiveness of graph neural networks. We implement this framework in two different ways. First, to prevent messages from passing through heterophilous edges, we optimize the graph structure to be homophilous by dropping the heterophilous edges identified by an edge classifier. Alternatively, the presence of heterophilous neighbors can itself be exploited for feature learning, so we devise a hybrid message passing approach that aggregates homophilous neighbors and diversifies heterophilous neighbors based on edge classification. Extensive experiments show that the proposed framework improves the performance of GNNs on multiple datasets across the full spectrum of homophily levels.
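To make the two variants concrete, the following is a minimal illustrative sketch in plain Python, not the authors' implementation. All function names are hypothetical. The boolean flags in `is_homophilous` stand in for the output of an edge classifier, which the abstract does not specify. The signed-mean update (add the mean of homophilous neighbors, subtract the mean of heterophilous neighbors) is only one assumed reading of "aggregate" and "diversify".

```python
from typing import List, Sequence, Tuple

Edge = Tuple[int, int]


def edge_homophily(edges: Sequence[Edge], labels: Sequence[int]) -> float:
    """Fraction of edges whose two endpoints share a class label."""
    if not edges:
        return 0.0
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


def drop_heterophilous(edges: Sequence[Edge],
                       is_homophilous: Sequence[bool]) -> List[Edge]:
    """Variant 1 (assumed form): keep only edges flagged as homophilous."""
    return [e for e, keep in zip(edges, is_homophilous) if keep]


def hybrid_aggregate(features: Sequence[Sequence[float]],
                     edges: Sequence[Edge],
                     is_homophilous: Sequence[bool]) -> List[List[float]]:
    """Variant 2 (assumed form): one round of signed mean aggregation.

    Each node keeps its own feature, adds the mean feature of its
    homophilous neighbors, and subtracts the mean feature of its
    heterophilous neighbors. Edges are treated as undirected.
    """
    n, d = len(features), len(features[0])
    pos = [[0.0] * d for _ in range(n)]   # sums over homophilous neighbors
    neg = [[0.0] * d for _ in range(n)]   # sums over heterophilous neighbors
    pos_cnt = [0] * n
    neg_cnt = [0] * n
    for (u, v), homo in zip(edges, is_homophilous):
        for src, dst in ((u, v), (v, u)):
            acc, cnt = (pos, pos_cnt) if homo else (neg, neg_cnt)
            cnt[dst] += 1
            for k in range(d):
                acc[dst][k] += features[src][k]
    out = []
    for i in range(n):
        row = []
        for k in range(d):
            p = pos[i][k] / pos_cnt[i] if pos_cnt[i] else 0.0
            q = neg[i][k] / neg_cnt[i] if neg_cnt[i] else 0.0
            row.append(features[i][k] + p - q)
        out.append(row)
    return out


# Toy path graph 0-1-2-3 with two classes; edge (1, 2) crosses classes.
labels = [0, 0, 1, 1]
edges = [(0, 1), (1, 2), (2, 3)]
# Oracle edge types derived from labels; a learned classifier would predict these.
edge_types = [labels[u] == labels[v] for u, v in edges]
features = [[1.0], [1.0], [-1.0], [-1.0]]

pruned = drop_heterophilous(edges, edge_types)
updated = hybrid_aggregate(features, edges, edge_types)
```

In this toy graph, pruning removes the single cross-class edge, while the signed update pushes the features of the two boundary nodes further apart instead of averaging them together.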
