Graph classification is an important area in both modern research and industry. Multiple applications, especially in chemistry and novel drug discovery, drive the rapid development of machine learning models in this area. To keep up with the pace of new research, proper experimental design, fair evaluation, and independent benchmarks are essential. Designing strong baselines is an indispensable part of such work. In this thesis, we explore multiple approaches to graph classification. We focus on Graph Neural Networks (GNNs), which have emerged as the de facto standard deep learning technique for graph representation learning. Classical approaches, such as graph descriptors and molecular fingerprints, are also addressed. We design a fair experimental evaluation protocol and select an appropriate collection of datasets. This allows us to perform numerous experiments and rigorously analyze modern approaches. We arrive at several conclusions that shed new light on the performance and quality of novel algorithms. We investigate the application of the Jumping Knowledge GNN architecture to graph classification, where it proves to be an efficient tool for improving base graph neural network architectures. We also propose and experimentally verify multiple improvements to baseline models, which constitutes an important contribution to fair model comparison.