Graph-based semi-supervised regression (SSR) is the problem of estimating the value of a function on a weighted graph from its values (labels) on a small subset of the vertices. This paper is concerned with the consistency of SSR in the context of classification, in the setting where the labels have small noise and the underlying graph weighting is consistent with well-clustered nodes. We present a Bayesian formulation of SSR in which the weighted graph defines a Gaussian prior, using a graph Laplacian, and the labeled data defines a likelihood. We analyze the rate of contraction of the posterior measure around the ground truth in terms of parameters that quantify the small label error and inherent clustering in the graph. We obtain bounds on the rates of contraction and illustrate their sharpness through numerical experiments. The analysis also gives insight into the choice of hyperparameters that enter the definition of the prior.
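The Bayesian formulation described above can be illustrated with a minimal sketch. It is not the paper's implementation: the prior covariance $(L + \tau^2 I)^{-\alpha}$ built from the unnormalized graph Laplacian $L = D - W$, the Gaussian label noise with standard deviation `gamma`, the toy two-cluster graph, and all function and parameter names (`two_cluster_weights`, `posterior_mean`, `tau`, `alpha`, `gamma`, `eps`) are assumptions chosen for illustration. With a Gaussian prior and a Gaussian likelihood the posterior is Gaussian, and its mean has the closed form computed below.

```python
import numpy as np


def two_cluster_weights(n_per_cluster=10, eps=0.01):
    """Toy weighted graph (hypothetical): weight 1 inside each cluster,
    small weight eps between the two clusters, no self-loops."""
    n = 2 * n_per_cluster
    W = np.full((n, n), eps)
    W[:n_per_cluster, :n_per_cluster] = 1.0
    W[n_per_cluster:, n_per_cluster:] = 1.0
    np.fill_diagonal(W, 0.0)
    return W


def posterior_mean(W, labeled_idx, y, tau=0.1, alpha=1.0, gamma=0.1):
    """Posterior mean of u under the assumed model
         prior:      u ~ N(0, C),  C = (L + tau^2 I)^(-alpha),  L = D - W
         likelihood: y = u[labeled_idx] + N(0, gamma^2 I)
    """
    n = W.shape[0]
    L = np.diag(W.sum(axis=1)) - W            # unnormalized graph Laplacian
    evals, evecs = np.linalg.eigh(L)          # L is symmetric
    C = (evecs * (evals + tau**2) ** (-alpha)) @ evecs.T
    H = np.eye(n)[labeled_idx]                # selects the labeled vertices
    S = H @ C @ H.T + gamma**2 * np.eye(len(labeled_idx))
    return C @ H.T @ np.linalg.solve(S, np.asarray(y, dtype=float))


# One label per cluster: vertex 0 gets +1, vertex 10 gets -1.
W = two_cluster_weights()
mean = posterior_mean(W, labeled_idx=[0, 10], y=[1.0, -1.0])
print(np.sign(mean))
```

In this toy setting, thresholding the posterior mean at zero turns the regression output into a classifier. Shrinking `eps` (stronger clustering) or `gamma` (smaller label noise) is a simple way to explore the regime the analysis is concerned with.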