The ability to compute similarity scores between graphs based on metrics such as Graph Edit Distance (GED) is important in many real-world applications. Computing exact GED values is typically an NP-hard problem and traditional algorithms usually achieve an unsatisfactory trade-off between accuracy and efficiency. Recently, Graph Neural Networks (GNNs) provide a data-driven solution for this task, which is more efficient while maintaining prediction accuracy in small graph (around 10 nodes per graph) similarity computation. Existing GNN-based methods, which either respectively embeds two graphs (lack of low-level cross-graph interactions) or deploy cross-graph interactions for whole graph pairs (redundant and time-consuming), are still not able to achieve competitive results when the number of nodes in graphs increases. In this paper, we focus on similarity computation for large-scale graphs and propose the "embedding-coarsening-matching" framework CoSimGNN, which first embeds and coarsens large graphs with adaptive pooling operation and then deploys fine-grained interactions on the coarsened graphs for final similarity scores. Furthermore, we create several synthetic datasets which provide new benchmarks for graph similarity computation. Detailed experiments on both synthetic and real-world datasets have been conducted and CoSimGNN achieves the best performance while the inference time is at most 1/3 of that of previous state-of-the-art.
翻译:计算基于图形编辑距离(GED)等指标的图表之间相似分数的能力在许多现实世界应用中非常重要。计算精确的GED值通常是一个NP-硬问题,传统算法通常在准确性和效率之间实现不尽的权衡。最近,Gea Neal网络(GNNS)为这项任务提供了一种数据驱动的解决方案,在以小图(约10个节点/每图)的类似计算方法保持预测准确性的同时,效率更高。现有的GNNN(GNN)方法分别包含两个图形(缺乏低水平跨线互动)或为整组图形配对(冗余和耗时)部署跨线互动,但在图形中节点数增加时,仍然无法取得竞争性结果。在本文件中,我们侧重于大型图形的类似计算,并提议“嵌入-剖析-相匹配”框架CSimGNNN,它首先将大图表嵌入和解析,然后将一些精细的交互作用用于整个图形的整形三色图表,而我们又在为类似分数的合成数据进行。