Graph embedding offers a practical way to perform pattern classification on graph-structured data by mapping each graph into a vector space. Most pioneering works are essentially coding methods that concentrate on a vectorial representation of the inner properties of a graph, such as its topological constitution, node attributes, and link relations. However, classifying a given sample is a qualitative problem that depends on understanding the overall discrepancies across the whole dataset. From a statistical point of view, these discrepancies manifest as a metric distribution over the dataset when a distance metric is adopted to measure pairwise similarity or dissimilarity. We therefore present a novel embedding strategy named $\mathbf{MetricDistribution2vec}$ that extracts these distribution characteristics into the vectorial representation of each sample. We demonstrate the application and effectiveness of our representation method in supervised prediction tasks on extensive real-world structural graph datasets. The results show unexpected improvements over a large set of baselines on all datasets, even when lightweight models are used as classifiers. We also evaluate the proposed method in few-shot classification scenarios, where it remains discriminative when inference relies on only a few training samples.
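To make the idea of a per-sample "metric distribution" concrete, the following is a minimal sketch and not the MetricDistribution2vec algorithm itself, whose details are not given here. It assumes that a pairwise distance matrix between graphs has already been computed by some graph distance, and it represents each sample by a normalized histogram of its distances to every other sample. The function name `metric_distribution_embedding`, the equal-width binning, and the `n_bins` parameter are illustrative assumptions.

```python
from typing import List, Sequence


def metric_distribution_embedding(
    dist: Sequence[Sequence[float]], n_bins: int = 4
) -> List[List[float]]:
    """Embed each item as a normalized histogram of its distances to all others.

    Hypothetical helper for illustration only; `dist` is assumed to be a
    square pairwise distance matrix over at least two items.
    """
    n = len(dist)
    # Use one shared bin range so histograms are comparable across items.
    off_diag = [dist[i][j] for i in range(n) for j in range(n) if i != j]
    lo, hi = min(off_diag), max(off_diag)
    # Fall back to a width of 1.0 if all distances are identical.
    width = (hi - lo) / n_bins or 1.0

    embeddings = []
    for i in range(n):
        hist = [0.0] * n_bins
        for j in range(n):
            if i == j:
                continue  # skip the zero self-distance
            # Clamp the largest distance into the last bin.
            b = min(int((dist[i][j] - lo) / width), n_bins - 1)
            hist[b] += 1.0
        # Normalize counts to an empirical distribution over the bins.
        embeddings.append([h / (n - 1) for h in hist])
    return embeddings


# Toy example: three items with made-up pairwise distances.
toy_dist = [
    [0.0, 1.0, 4.0],
    [1.0, 0.0, 3.0],
    [4.0, 3.0, 0.0],
]
toy_embeddings = metric_distribution_embedding(toy_dist, n_bins=3)
```

Under these assumptions, each resulting vector could be passed to a lightweight classifier. The shared bin range is a deliberate choice so that the same vector position means the same distance interval for every sample.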