Semi-supervised learning is particularly useful in the common scenario where labeled data is scarce but unlabeled data is abundant. The graph (or nonlocal) Laplacian is a fundamental smoothing operator for solving various learning tasks. For unsupervised clustering, a spectral embedding based on graph-Laplacian eigenvectors is often used. For semi-supervised problems, the common approach is to solve a constrained optimization problem regularized by a Dirichlet energy based on the graph Laplacian. However, as the amount of supervision decreases, Dirichlet optimization becomes suboptimal. We would therefore like to obtain a smooth transition between unsupervised clustering and low-supervision graph-based classification. In this paper, we propose a new type of graph Laplacian that is adapted to semi-supervised learning (SSL) problems. It is based on both density and contrastive measures and allows labeled data to be encoded directly in the operator. Thus, we can successfully perform semi-supervised learning using spectral clustering. The benefits of our approach are illustrated on several SSL problems.
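For readers unfamiliar with the two baselines contrasted above, the following is a minimal sketch of each on a toy dataset: spectral clustering with the standard unnormalized graph Laplacian, and Dirichlet-energy classification with hard label constraints. It does not implement the operator proposed in this paper. The Gaussian affinity, the bandwidth `sigma`, the two-blob data, and all function names are illustrative assumptions.

```python
import numpy as np


def gaussian_affinity(X, sigma=1.0):
    """Dense Gaussian affinity matrix with zero diagonal (assumed kernel)."""
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return W


def graph_laplacian(W):
    """Standard unnormalized graph Laplacian L = D - W."""
    return np.diag(W.sum(axis=1)) - W


def spectral_bipartition(L):
    """Unsupervised baseline: split nodes by the sign of the Fiedler vector."""
    _, eigvecs = np.linalg.eigh(L)  # eigenvalues returned in ascending order
    return (eigvecs[:, 1] > 0).astype(int)


def dirichlet_classify(L, labeled_idx, labels):
    """Semi-supervised baseline: minimize u^T L u with u fixed on labeled nodes.

    Labels are assumed to be +1 / -1. The minimizer is harmonic on the
    unlabeled nodes and solves L_uu u_u = -L_ul u_l.
    """
    n = L.shape[0]
    labeled_idx = np.asarray(labeled_idx)
    unlabeled_idx = np.setdiff1d(np.arange(n), labeled_idx)
    u = np.zeros(n)
    u[labeled_idx] = labels
    L_uu = L[np.ix_(unlabeled_idx, unlabeled_idx)]
    L_ul = L[np.ix_(unlabeled_idx, labeled_idx)]
    u[unlabeled_idx] = np.linalg.solve(L_uu, -L_ul @ u[labeled_idx])
    return u


# Toy data (assumption): two well-separated 2-D Gaussian blobs of 20 points each.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 0.3, size=(20, 2)),
    rng.normal(3.0, 0.3, size=(20, 2)),
])

W = gaussian_affinity(X, sigma=1.0)
L = graph_laplacian(W)

# Unsupervised: cluster ids are arbitrary (0/1) but consistent within a blob.
clusters = spectral_bipartition(L)

# Semi-supervised: one labeled point per blob (indices 0 and 20).
u = dirichlet_classify(L, [0, 20], np.array([-1.0, 1.0]))
pred = np.sign(u)
```

On well-separated data such as this, both baselines recover the two groups. The difficulty noted in the abstract concerns harder settings with very few labels, which this toy example is not intended to reproduce.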