Given the potential difficulties in obtaining large quantities of labelled data, many works have explored deep semi-supervised learning (SSL), which uses both labelled and unlabelled data to train a neural network. The vast majority of SSL approaches focus on implementing the low-density separation assumption, or consistency assumption: the idea that decision boundaries should lie in low-density regions. However, they implement this assumption by making local changes to the decision boundary at each data point, ignoring the global structure of the data. In this work, we explore an alternative approach that uses the global information present in the clustered data to update our decision boundaries. We propose a novel framework, CycleCluster, for deep semi-supervised classification. Our core optimisation is driven by a new clustering-based regularisation, together with graph-based pseudo-labels and a shared deep network. With this framework, we show that direct implementation of the cluster assumption is a viable alternative to the popular consistency-based regularisation. We demonstrate the predictive capability of our technique through a careful set of numerical results.