Self-training is a simple semi-supervised learning approach: unlabelled examples that attract high-confidence predictions are labelled with those predictions and added to the training set, and this process is repeated multiple times. Recently, self-supervision -- learning without manual supervision by solving an automatically generated pretext task -- has gained prominence in deep learning. This paper investigates three different ways of incorporating self-supervision into self-training to improve accuracy in image classification: self-supervision as pretraining only, self-supervision performed exclusively in the first iteration of self-training, and self-supervision added to every iteration of self-training. Empirical results on the SVHN, CIFAR-10, and PlantVillage datasets, using both training from scratch and ImageNet-pretrained weights, show that applying self-supervision only in the first iteration of self-training can greatly improve accuracy, for a modest increase in computation time.
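The following is a minimal sketch of the self-training loop described above, with self-supervision restricted to the first iteration. It is an illustration only, not the authors' implementation: the synthetic data, the linear classifier, the 0.9 confidence threshold, and the `pretext_pretrain` placeholder (standing in for a real pretext task such as rotation prediction or contrastive learning) are all assumptions made to keep the example self-contained and runnable.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=2000, n_features=20, n_informative=10,
                           n_classes=3, random_state=0)

# Small labelled pool; the rest is treated as unlabelled.
labelled_idx = rng.choice(len(X), size=200, replace=False)
unlabelled_idx = np.setdiff1d(np.arange(len(X)), labelled_idx)
X_lab, y_lab = X[labelled_idx], y[labelled_idx]
X_unl = X[unlabelled_idx]

def pretext_pretrain(features):
    """Hypothetical stand-in for learning a representation with a self-supervised
    pretext task; here it merely standardises the features so the sketch runs."""
    mu, sigma = features.mean(axis=0), features.std(axis=0) + 1e-8
    return (features - mu) / sigma

CONFIDENCE = 0.9  # threshold for accepting pseudo-labels (assumed value)

for iteration in range(5):
    if iteration == 0:
        # Self-supervision only in the first iteration of self-training:
        # learn a representation once and reuse it in later iterations.
        X_lab = pretext_pretrain(X_lab)
        X_unl = pretext_pretrain(X_unl)

    clf = LogisticRegression(max_iter=1000).fit(X_lab, y_lab)

    if len(X_unl) == 0:
        break
    probs = clf.predict_proba(X_unl)
    confident = probs.max(axis=1) >= CONFIDENCE
    if not confident.any():
        break

    # Pseudo-label the high-confidence examples and move them to the training set.
    X_lab = np.vstack([X_lab, X_unl[confident]])
    y_lab = np.concatenate([y_lab, probs[confident].argmax(axis=1)])
    X_unl = X_unl[~confident]
```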