Feature generation is an open topic of investigation in graph machine learning. In this paper, we study graph homomorphism density features as a scalable alternative to homomorphism numbers that retains similar theoretical properties and the ability to incorporate inductive bias. To this end, we propose a high-performance implementation of a simple sampling algorithm that computes additive approximations of homomorphism densities. In experiments on standard graph classification datasets, we demonstrate that simple linear models trained on sample homomorphism densities can achieve performance comparable to that of graph neural networks. Finally, we show in experiments on synthetic data that this algorithm scales to very large graphs when implemented with Bloom filters.
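As an illustration of the kind of sampling estimator described above, the following is a minimal sketch and not the implementation proposed in the paper. It assumes the standard definition of the homomorphism density t(F, G) as the probability that a uniformly random map from V(F) to V(G) sends every edge of F to an edge of G, and it uses Hoeffding's inequality to choose a sample count for an additive error eps with failure probability delta. The function names `sample_size` and `estimate_hom_density` and their parameters are hypothetical, and a plain Python set stands in for the edge-membership structure.

```python
import math
import random


def sample_size(eps, delta):
    """Samples sufficient for additive error eps with probability >= 1 - delta.

    Assumption: Hoeffding's inequality for the mean of i.i.d. {0, 1} variables.
    """
    return math.ceil(math.log(2.0 / delta) / (2.0 * eps ** 2))


def estimate_hom_density(pattern_edges, k, graph_edges, n, num_samples, rng=None):
    """Monte Carlo estimate of t(F, G) (hypothetical helper, not the paper's code).

    pattern_edges: edges of the pattern F on vertices 0..k-1
    graph_edges:   edges of the undirected graph G on vertices 0..n-1
    """
    rng = rng or random.Random()

    # Store both orientations so an undirected edge is found either way.
    # This exact set is the membership structure that an approximate one
    # (such as a Bloom filter) could replace to save memory.
    edge_set = set()
    for u, v in graph_edges:
        edge_set.add((u, v))
        edge_set.add((v, u))

    hits = 0
    for _ in range(num_samples):
        # Draw a uniformly random map phi: V(F) -> V(G).
        phi = [rng.randrange(n) for _ in range(k)]
        # phi is a homomorphism iff every pattern edge lands on a graph edge.
        if all((phi[a], phi[b]) in edge_set for a, b in pattern_edges):
            hits += 1
    return hits / num_samples


if __name__ == "__main__":
    # Complete graph K4 and a triangle pattern; the exact density is 24 / 64.
    k4 = [(i, j) for i in range(4) for j in range(i + 1, 4)]
    triangle = [(0, 1), (1, 2), (0, 2)]
    m = sample_size(eps=0.01, delta=0.05)
    print(estimate_hom_density(triangle, 3, k4, 4, m, random.Random(0)))
```

Each sample costs one membership query per pattern edge, so the running time depends on the number of samples and the size of the pattern rather than on the size of G. Swapping the exact set for a Bloom filter would trade memory for a small false-positive rate in edge lookups, which can bias the estimate upward.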