Scene graph generation (SGG) aims to detect objects and predict their pairwise relationships within an image. Current SGG methods typically utilize graph neural networks (GNNs) to acquire context information between objects/relationships. Despite their effectiveness, current SGG methods assume only scene graph homophily while ignoring heterophily. Accordingly, in this paper, we propose a novel Heterophily Learning Network (HL-Net) to comprehensively explore the homophily and heterophily between objects/relationships in scene graphs. More specifically, HL-Net comprises the following: 1) an adaptive reweighting transformer module, which adaptively integrates the information from different layers to exploit both the heterophily and homophily in objects; 2) a relationship feature propagation module that efficiently explores the connections between relationships by considering heterophily in order to refine the relationship representation; 3) a heterophily-aware message-passing scheme to further distinguish the heterophily and homophily between objects/relationships, thereby facilitating improved message passing in graphs. We conducted extensive experiments on two public datasets: Visual Genome (VG) and Open Images (OI). The experimental results demonstrate the superiority of our proposed HL-Net over existing state-of-the-art approaches. In more detail, HL-Net outperforms the second-best competitors by 2.1$\%$ on the VG dataset for scene graph classification and 1.2$\%$ on the OI dataset for the final score. Code is available at https://github.com/siml3/HL-Net.
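To make the distinction between homophily and heterophily in message passing concrete, the following is a minimal, generic sketch of signed propagation and is not HL-Net's actual formulation. It assumes that edge weights are given by the cosine similarity of node features, so that similar neighbours contribute positively and dissimilar neighbours contribute negatively. The names `cosine`, `signed_message_passing`, and `self_weight` are hypothetical and chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity in [-1, 1]; returns 0.0 if either vector is zero."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def signed_message_passing(features, neighbours, self_weight=0.5):
    """One round of signed propagation (illustrative assumption, not HL-Net).

    features:    list of feature vectors, one per node
    neighbours:  neighbours[i] lists the indices adjacent to node i
    self_weight: hypothetical mixing weight for the node's own features
    """
    updated = []
    for i, x_i in enumerate(features):
        # Positive weight = homophilous edge, negative weight = heterophilous edge.
        weights = [(j, cosine(x_i, features[j])) for j in neighbours[i]]
        # Normalise by total absolute weight so the message stays bounded.
        total = sum(abs(w) for _, w in weights)
        message = [0.0] * len(x_i)
        if total > 0.0:
            for j, w in weights:
                for k, value in enumerate(features[j]):
                    message[k] += (w / total) * value
        updated.append(
            [self_weight * a + (1.0 - self_weight) * m for a, m in zip(x_i, message)]
        )
    return updated
```

With unsigned mean aggregation, a node whose neighbours belong to an opposing class would be pulled toward them. In this sketch the negative weight flips the sign of such a neighbour's message, so the node's representation is not smoothed toward the dissimilar one.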