Deep neural networks are powerful, massively parameterized machine learning models that have been shown to perform well in supervised learning tasks. However, very large amounts of labeled data are usually needed to train deep neural networks. Several semi-supervised learning approaches have been proposed to train neural networks using smaller amounts of labeled data together with a large amount of unlabeled data. The performance of these semi-supervised methods degrades significantly as the amount of labeled data decreases. We introduce Mutual-information-based Unsupervised & Semi-supervised Concurrent LEarning (MUSCLE), a hybrid learning approach that uses mutual information to combine both unsupervised and semi-supervised learning. MUSCLE can be used as a stand-alone training scheme for neural networks, and it can also be incorporated into other learning approaches. We show that the proposed hybrid model outperforms the state of the art on several standard benchmarks, including CIFAR-10, CIFAR-100, and Mini-Imagenet. Furthermore, the performance gain consistently increases with the reduction in the amount of labeled data, as well as in the presence of bias. We also show that MUSCLE has the potential to boost classification performance when used in the fine-tuning phase for a model pre-trained only on unlabeled data.
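As a rough illustration of the kind of hybrid objective described above, the sketch below combines a supervised cross-entropy term on labeled data with a mutual-information term computed between predictions for two augmented views of unlabeled data. This is only a minimal sketch under assumed choices (an IIC-style mutual-information estimate over paired augmentations, a hypothetical mixing weight `lam`); it is not the exact MUSCLE formulation.

```python
# Minimal sketch: supervised cross-entropy + mutual-information term on
# unlabeled data. The MI estimate and the weight `lam` are assumptions
# for illustration, not the paper's exact objective.
import torch
import torch.nn.functional as F

def mutual_information_loss(p1, p2, eps=1e-8):
    """Negative mutual information between two soft class assignments.

    p1, p2: (batch, num_classes) softmax outputs for two augmented views
    of the same unlabeled images.
    """
    # Joint distribution over class pairs, symmetrized and normalized.
    joint = p1.t() @ p2 / p1.size(0)           # (C, C)
    joint = (joint + joint.t()) / 2
    joint = joint / joint.sum()
    # Marginal distributions.
    pi = joint.sum(dim=1, keepdim=True)         # (C, 1)
    pj = joint.sum(dim=0, keepdim=True)         # (1, C)
    # I(Z1; Z2) = sum_ij joint_ij * log(joint_ij / (pi_i * pj_j))
    mi = (joint * (torch.log(joint + eps)
                   - torch.log(pi + eps)
                   - torch.log(pj + eps))).sum()
    return -mi  # minimizing the negative MI maximizes the MI

def hybrid_loss(model, labeled_x, labels, unlabeled_v1, unlabeled_v2, lam=1.0):
    """Supervised loss on labeled data plus an unsupervised MI loss."""
    ce = F.cross_entropy(model(labeled_x), labels)
    p1 = F.softmax(model(unlabeled_v1), dim=1)
    p2 = F.softmax(model(unlabeled_v2), dim=1)
    return ce + lam * mutual_information_loss(p1, p2)
```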