Although self-supervised learning (SSL) models are increasingly expensive to train, most are repeatedly trained from scratch and then underused, since only a few state-of-the-art (SOTA) models are adopted for downstream tasks. In this work, we explore a sustainable SSL framework that must address two major challenges: i) learning a stronger new SSL model from an existing pretrained SSL model, called the "base" model, in a cost-friendly manner, and ii) keeping the training of the new model compatible with various base models. We propose a Target-Enhanced Conditional (TEC) scheme that adds two components to existing mask-reconstruction based SSL. First, we propose patch-relation enhanced targets, which enhance the targets given by the base model and encourage the new model to learn semantic-relation knowledge from the base model using incomplete inputs. Hardening the task in this way and enhancing the targets help the new model surpass the base model, since both enforce additional patch-relation modeling to handle incomplete inputs. Second, we introduce a conditional adapter that adaptively adjusts the new model's predictions to align with the targets of different base models. Extensive experimental results show that our TEC scheme accelerates learning and also improves on SOTA SSL base models, e.g., MAE and iBOT, taking an explorative step towards sustainable SSL.
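The two components named in the abstract can be illustrated with a toy NumPy sketch. This is not the authors' implementation: the abstract does not specify how patch relations are computed, what form the adapter takes, or which loss is used. The sketch assumes, purely for illustration, that patch relations are a row-wise softmax over scaled dot products of base-model patch features, that the conditional adapter is one linear map per base model, and that the loss is a mean-squared feature term plus a relation cross-entropy term. All names (`patch_relation`, `conditional_adapter`, `tec_style_loss`) and the stand-in base and student models are hypothetical.

```python
import numpy as np


def patch_relation(features):
    """Row-wise softmax of scaled dot-product similarity between patches.

    Assumption: this is one plausible way to express patch relations;
    the abstract does not define the exact form.
    """
    n_patches, dim = features.shape
    logits = features @ features.T / np.sqrt(dim)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)


def conditional_adapter(prediction, adapters, base_name):
    """Hypothetical adapter: a separate linear map selected per base model."""
    weight, bias = adapters[base_name]
    return prediction @ weight + bias


def tec_style_loss(prediction, base_features, adapters, base_name):
    """Feature regression plus relation matching against a frozen base model."""
    adapted = conditional_adapter(prediction, adapters, base_name)
    feature_loss = np.mean((adapted - base_features) ** 2)
    target_rel = patch_relation(base_features)  # relation-enhanced target
    pred_rel = patch_relation(adapted)
    relation_loss = -np.mean(
        np.sum(target_rel * np.log(pred_rel + 1e-12), axis=1)
    )
    return feature_loss + relation_loss


# Toy data: 16 patches with 8-dimensional embeddings.
rng = np.random.default_rng(0)
n_patches, dim = 16, 8
patches = rng.normal(size=(n_patches, dim))

# Stand-in for a frozen pretrained base model (it sees every patch).
base_weight = rng.normal(size=(dim, dim))
base_features = np.tanh(patches @ base_weight)

# Incomplete input: the new model only sees a quarter of the patches.
visible = rng.permutation(n_patches)[: n_patches // 4]

# Stand-in student: it predicts the same vector for every patch, built
# from the mean of the visible patches. A real student would be a
# trained network with a decoder.
student_weight = rng.normal(size=(dim, dim))
prediction = np.tile(patches[visible].mean(axis=0) @ student_weight,
                     (n_patches, 1))

# One adapter per base model; identity maps are used here as placeholders.
adapters = {
    "MAE": (np.eye(dim), np.zeros(dim)),
    "iBOT": (np.eye(dim), np.zeros(dim)),
}

loss = tec_style_loss(prediction, base_features, adapters, "MAE")
print(f"toy loss against the MAE stand-in: {loss:.4f}")
```

In this sketch, switching to a different base model only changes which adapter is selected and which frozen features serve as targets. That is one way to read the compatibility goal stated in the abstract, under the assumptions above.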