We propose self-adaptive training -- a unified training algorithm that dynamically calibrates and enhances the training process with model predictions, without incurring extra computational cost -- to advance both supervised and self-supervised learning of deep neural networks. We analyze the training dynamics of deep networks on training data corrupted by, e.g., random noise and adversarial examples. Our analysis shows that model predictions can magnify useful underlying information in data, and that this phenomenon occurs broadly even in the absence of any label information, highlighting that model predictions can substantially benefit training: self-adaptive training improves the generalization of deep networks under noise and enhances self-supervised representation learning. The analysis also sheds light on understanding deep learning, e.g., a potential explanation of the recently discovered double-descent phenomenon in empirical risk minimization and the collapsing issue of state-of-the-art self-supervised learning algorithms. Experiments on the CIFAR, STL, and ImageNet datasets verify the effectiveness of our approach in three applications: classification with label noise, selective classification, and linear evaluation. To facilitate future research, the code has been made publicly available at https://github.com/LayneH/self-adaptive-training.
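The core calibration idea -- updating each sample's training target with an exponential moving average of the model's own predictions, so that noisy labels are gradually corrected -- can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation; the momentum value `alpha=0.9` and the toy vectors are illustrative assumptions.

```python
import numpy as np

def update_target(target, prediction, alpha=0.9):
    """Exponential-moving-average update of a soft training target
    with the model's prediction. `alpha` is a momentum hyperparameter
    (the value is an assumption for illustration)."""
    return alpha * target + (1.0 - alpha) * prediction

# Toy example: a possibly mislabeled sample whose one-hot target is
# gradually pulled toward the model's (softmax) prediction over epochs.
target = np.array([1.0, 0.0, 0.0])      # given (possibly noisy) one-hot label
prediction = np.array([0.1, 0.8, 0.1])  # model's softmax output for this sample

for _ in range(50):                     # repeated updates across training epochs
    target = update_target(target, prediction)

# The target remains a valid distribution, and its argmax has shifted
# from the noisy label (class 0) to the model's prediction (class 1).
```

Because the update is a convex combination of two probability vectors, the calibrated target always remains a valid distribution, which is what allows it to be plugged directly into a standard cross-entropy loss.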