Self-training algorithms, which train a model to fit pseudolabels predicted by another previously-learned model, have been very successful for learning with unlabeled data using neural networks. However, the current theoretical understanding of self-training only applies to linear models. This work provides a unified theoretical analysis of self-training with deep networks for semi-supervised learning, unsupervised domain adaptation, and unsupervised learning. At the core of our analysis is a simple but realistic "expansion" assumption, which states that a low-probability subset of the data must expand to a neighborhood with large probability relative to the subset. We also assume that neighborhoods of examples in different classes have minimal overlap. We prove that under these assumptions, the minimizers of population objectives based on self-training and input-consistency regularization will achieve high accuracy with respect to ground-truth labels. By using off-the-shelf generalization bounds, we immediately convert this result to sample complexity guarantees for neural nets that are polynomial in the margin and Lipschitzness. Our results help explain the empirical successes of recently proposed self-training algorithms which use input-consistency regularization.
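As a concrete (and deliberately informal) reading of these two assumptions, let $P$ denote the input distribution, let $\mathcal{N}(x)$ denote a neighborhood of an example $x$ (for instance, points reachable from $x$ by small input transformations), and write $\mathcal{N}(S) = \bigcup_{x \in S} \mathcal{N}(x)$ for a subset $S$; the symbols $a$, $c$, and $\mu$ below are illustrative placeholders rather than quantities quoted from the analysis. Expansion asks that low-probability sets grow under the neighborhood map,
\[
P(S) \le a \;\Longrightarrow\; P\bigl(\mathcal{N}(S)\bigr) \ge \min\{\, c \cdot P(S),\, 1 \,\} \quad \text{for some } c > 1,
\]
while the separation condition asks that neighborhoods rarely cross ground-truth class boundaries,
\[
P\bigl(\exists\, x' \in \mathcal{N}(x) : y(x') \ne y(x)\bigr) \le \mu \quad \text{for a small constant } \mu \ge 0,
\]
where $y(\cdot)$ denotes the ground-truth label.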