Nearest neighbour graphs are widely used to capture the geometry or topology of a dataset. One of the most common strategies to construct such a graph is based on selecting a fixed number k of nearest neighbours (kNN) for each point. However, the kNN heuristic may become inappropriate when sampling density or noise level varies across datasets. Strategies that try to get around this typically introduce additional parameters that need to be tuned. We propose a simple approach to construct an adaptive neighbourhood graph from a single parameter, based on quadratically regularised optimal transport. Our numerical experiments show that graphs constructed in this manner perform favourably in unsupervised and semi-supervised learning applications.
翻译:近邻图图被广泛用来捕捉数据集的几何或地形学。构建此图的最常见策略之一是为每个点选择固定的 k 位近邻( kNN) 。 但是, 当抽样密度或噪音水平各数据集之间变化时, kNN 杂交可能变得不合适。 试图绕过这个位置的战略通常会引入需要调整的额外参数。 我们提出了一个简单的方法,用一个单一参数来构建一个适应性邻里图, 其依据是四端常规最佳迁移。 我们的数字实验显示, 以这种方式构建的图表在不受监督和半监督的学习应用中表现良好 。