Recently, collaborative learning, proposed by Song and Chai, has achieved remarkable improvements on image classification tasks by simultaneously training multiple classifier heads. However, the large memory footprint required by such multi-head structures may hinder the training of large-capacity baseline models. A natural question is how to achieve collaborative learning within a single network without duplicating any modules. In this paper, we propose four ways of collaborative learning among different parts of a single network that require negligible engineering effort. To improve the robustness of the network, we leverage the consistency between the output layer and intermediate layers for training under the collaborative learning framework. In addition, the similarity of intermediate representations and convolution kernels is introduced to reduce redundancy in the neural network. Compared to the method of Song and Chai, our framework further considers collaboration inside a single model and incurs smaller overhead. Extensive experiments on Cifar-10, Cifar-100, ImageNet32 and STL-10 corroborate the effectiveness of each of the four ways separately, while combining them leads to further improvements. In particular, test errors on the STL-10 dataset are decreased by $9.28\%$ and $5.45\%$ for ResNet-18 and VGG-16, respectively. Moreover, experiments on the Cifar-10 dataset show that our method is robust to label noise; for example, it achieves $3.53\%$ higher performance under the $50\%$ noise-ratio setting.
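To make the output-layer/intermediate-layer consistency idea concrete, the sketch below shows one possible instantiation, not the authors' actual implementation: a lightweight auxiliary classifier is attached to an intermediate feature map, and a KL-divergence term pushes its prediction toward the prediction of the final output layer of the same network. The module name `AuxHead`, the temperature `T`, and the weighting factor `lam` are illustrative assumptions.

```python
# A minimal sketch (assumed design, not the paper's code) of a consistency loss
# between an intermediate layer and the output layer of a single network.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AuxHead(nn.Module):
    """Lightweight classifier attached to an intermediate feature map (assumed design)."""

    def __init__(self, in_channels: int, num_classes: int):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(in_channels, num_classes)

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        # Global average pooling followed by a linear classifier.
        return self.fc(self.pool(feat).flatten(1))


def consistency_loss(inter_logits: torch.Tensor,
                     final_logits: torch.Tensor,
                     T: float = 2.0) -> torch.Tensor:
    """KL divergence pushing the intermediate prediction toward the final one."""
    p_final = F.softmax(final_logits.detach() / T, dim=1)      # final output as soft target
    log_p_inter = F.log_softmax(inter_logits / T, dim=1)       # intermediate prediction
    return F.kl_div(log_p_inter, p_final, reduction="batchmean") * (T * T)


# Usage sketch: combine with the ordinary cross-entropy on the final output,
# where `feat` is an intermediate feature map from the same forward pass.
# aux = AuxHead(in_channels=256, num_classes=10)
# loss = F.cross_entropy(final_logits, labels) \
#        + lam * consistency_loss(aux(feat), final_logits)
```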