Traditional classification tasks learn to assign samples to given classes based solely on sample features. This paradigm is evolving to include other sources of information, such as known relations between samples. Here we show that, even if additional relational information is not available in the data set, one can improve classification by constructing geometric graphs from the features themselves, and using them within a Graph Convolutional Network. The improvement in classification accuracy is maximized by graphs that capture sample similarity with relatively low edge density. We show that such feature-derived graphs increase the alignment of the data to the ground truth while improving class separation. We also demonstrate that the graphs can be made more efficient using spectral sparsification, which reduces the number of edges while still improving classification performance. We illustrate our findings using synthetic and real-world data sets from various scientific domains.
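As a minimal sketch of the idea, assuming a symmetric k-nearest-neighbour graph as the geometric graph (the abstract does not name the construction, so this choice is an assumption), one can build the graph from the feature matrix and apply a single GCN propagation step. The function names `knn_graph` and `gcn_propagate`, the value `k=3`, the toy two-cluster data, and the random weights `W` are all hypothetical and only illustrate the pipeline; this is not the authors' implementation.

```python
import numpy as np

def knn_graph(X, k):
    """Symmetric, unweighted kNN adjacency built from Euclidean distances.
    (Assumed construction; other geometric graphs are possible.)"""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    # Pairwise squared Euclidean distances.
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    np.fill_diagonal(d2, np.inf)          # exclude self-neighbours
    nbrs = np.argsort(d2, axis=1)[:, :k]  # indices of the k closest samples
    A = np.zeros((n, n))
    rows = np.repeat(np.arange(n), k)
    A[rows, nbrs.ravel()] = 1.0
    return np.maximum(A, A.T)             # symmetrise: edge if either side picks it

def gcn_propagate(A, X, W):
    """One GCN layer: ReLU(D^{-1/2} (A + I) D^{-1/2} X W)."""
    A_hat = A + np.eye(A.shape[0])        # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ X @ W, 0.0)

# Toy data: two Gaussian clusters in 5 dimensions (illustrative only).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (20, 5)), rng.normal(4, 1, (20, 5))])
A = knn_graph(X, k=3)                     # sparse, feature-derived graph
W = rng.normal(size=(5, 4))               # untrained weights, for shape only
H = gcn_propagate(A, X, W)                # smoothed node representations
```

A small `k` keeps the edge density low, which is the regime the abstract reports as most beneficial; in practice `W` would be learned and `k` tuned on validation data.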