In the last few years, graph neural networks (GNNs) have become the standard toolkit for analyzing and learning from data on graphs. This emerging field has seen extensive growth of promising techniques that have been applied successfully to computer science, mathematics, biology, physics and chemistry. But for any successful field to become mainstream and reliable, benchmarks must be developed to quantify progress. This led us in March 2020 to release a benchmark framework that i) comprises a diverse collection of mathematical and real-world graphs, ii) enables fair model comparison under the same parameter budget to identify key architectures, iii) has an open-source, easy-to-use and reproducible code infrastructure, and iv) is flexible enough for researchers to experiment with new theoretical ideas. As of December 2022, the GitHub repository has reached 2,000 stars and 380 forks, which demonstrates the utility of the proposed open-source framework through its wide usage by the GNN community. In this paper, we present an updated version of our benchmark with a concise presentation of the aforementioned framework characteristics and an additional medium-sized molecular dataset, AQSOL, which is similar to the popular ZINC but has a real-world measured chemical target, and we discuss how this framework can be leveraged to explore new GNN designs and insights. As a proof of value of our benchmark, we study the case of graph positional encoding (PE) in GNNs, which was introduced with this benchmark and has since spurred interest in exploring more powerful PE for Transformers and GNNs in a robust experimental setting.