We consider the problem of training a deep neural network on a given classification task, e.g., ImageNet-1K (IN1K), so that it excels both at that task and at other (future) transfer tasks. These two seemingly contradictory properties impose a trade-off between improving the model's generalization and maintaining its performance on the original task. Models trained with self-supervised learning (SSL) tend to generalize better than their supervised counterparts on transfer tasks; yet, they still lag behind supervised models on IN1K. In this paper, we propose a supervised learning setup that leverages the best of both worlds. We enrich the common supervised training framework with two key components of recent SSL models: multi-scale crops for data augmentation and an expendable projector head. We further replace the last layer of class weights with class prototypes computed on the fly using a memory bank. We show that these three improvements lead to a more favorable trade-off between the IN1K training task and 13 transfer tasks. Across all explored configurations, we single out two models: t-ReX, which sets a new state of the art for transfer learning and outperforms top methods such as DINO and PAWS on IN1K, and t-ReX*, which matches the highly optimized RSB-A1 model on IN1K while performing better on transfer tasks. Project page and pretrained models: https://europe.naverlabs.com/t-rex
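To make the setup concrete, below is a minimal PyTorch sketch of the three ingredients described above: multi-scale crops treated as extra training examples, an expendable projector head on top of the backbone, and a last layer of class prototypes computed on the fly from a memory bank of projected features. All class names, dimensions, and the simple per-class-mean prototype rule are illustrative assumptions, not the authors' released implementation.

```python
# Hedged sketch of a supervised setup with an expendable projector head and
# on-the-fly class prototypes from a memory bank. Sizes, the random bank
# initialization, and the mean-per-class prototype rule are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision


class PrototypeClassifier(nn.Module):
    def __init__(self, num_classes=1000, feat_dim=2048, proj_dim=256, bank_size=8192):
        super().__init__()
        self.backbone = torchvision.models.resnet50(weights=None)  # torchvision >= 0.13
        self.backbone.fc = nn.Identity()                 # keep 2048-d backbone features
        self.projector = nn.Sequential(                  # expendable head, dropped for
            nn.Linear(feat_dim, proj_dim),               # transfer after training
            nn.BatchNorm1d(proj_dim),
            nn.ReLU(inplace=True),
            nn.Linear(proj_dim, proj_dim),
        )
        self.num_classes = num_classes
        # memory bank of recent projected features and their labels (random init)
        self.register_buffer("bank_feats", F.normalize(torch.randn(bank_size, proj_dim), dim=1))
        self.register_buffer("bank_labels", torch.randint(0, num_classes, (bank_size,)))
        self.bank_ptr = 0

    @torch.no_grad()
    def _update_bank(self, z, labels):
        # ring-buffer update: overwrite the oldest entries with the current batch
        n = z.size(0)
        idx = (torch.arange(n, device=z.device) + self.bank_ptr) % self.bank_feats.size(0)
        self.bank_feats[idx] = z.detach()
        self.bank_labels[idx] = labels
        self.bank_ptr = (self.bank_ptr + n) % self.bank_feats.size(0)

    def forward(self, images, labels, temperature=0.1):
        z = F.normalize(self.projector(self.backbone(images)), dim=1)
        # class prototypes: per-class mean of banked features (assumed rule)
        onehot = F.one_hot(self.bank_labels, self.num_classes).float()   # (bank, C)
        counts = onehot.sum(dim=0).clamp(min=1).unsqueeze(1)             # (C, 1)
        prototypes = F.normalize((onehot.t() @ self.bank_feats) / counts, dim=1)
        logits = z @ prototypes.t() / temperature                        # cosine logits
        loss = F.cross_entropy(logits, labels)
        self._update_bank(z, labels)
        return loss


if __name__ == "__main__":
    # toy usage: one small batch of 224x224 crops with random labels
    model = PrototypeClassifier()
    images = torch.randn(8, 3, 224, 224)
    labels = torch.randint(0, 1000, (8,))
    print(model(images, labels).item())
```

In this sketch, multi-scale crops would simply be additional entries in the batch produced by the data loader, each passed through the same forward pass; after training, only the backbone is kept for transfer while the projector and prototypes are discarded, in line with the "expendable head" idea above.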