Graph-structured data is ubiquitous in daily life and in science, and it has attracted increasing attention. Graph Neural Networks (GNNs) have proven effective at modeling graph-structured data, and many variants of GNN architectures have been proposed. However, considerable human effort is often needed to tune the architecture for each dataset. Researchers have naturally applied Automated Machine Learning (AutoML) to graph learning, aiming to reduce this effort and obtain GNNs that perform well across datasets, but their methods focus mainly on architecture search. To understand the automated solutions that GNN practitioners use, we organized the AutoGraph Challenge at KDD Cup 2020, which emphasized automated graph neural networks for node classification. The top solutions came mainly from industrial technology companies such as Meituan, Alibaba, and Twitter, and are already open-sourced on GitHub. After detailed comparisons with solutions from academia, we quantify the gaps between academia and industry in modeling scope, effectiveness, and efficiency, and show that (1) academic AutoML-for-graph solutions focus on GNN architecture search, while industrial solutions, especially the winning ones in the KDD Cup, tend to build an end-to-end solution; (2) using neural architecture search alone, academic solutions achieve on average 97.3% of the accuracy of industrial solutions; and (3) academic solutions are cheap to obtain, requiring several GPU hours, while industrial solutions take a few months of labor. Academic solutions also contain far fewer parameters.