Existing self-supervised learning methods learn representations via pretext tasks that are either (1) discriminative, explicitly specifying which features should be pushed apart, or (2) aligning, precisely indicating which features should be pulled together; they ignore, however, the question of how to jointly and in a principled way define which features should be repelled and which attracted. In this work, we combine the strengths of discriminative and aligning methods and design a hybrid method that addresses this issue. Our method explicitly specifies the repulsion mechanism through a discriminative predictive task and the attraction mechanism by concurrently maximizing mutual information between paired views that share redundant information. We show, qualitatively and quantitatively, that the proposed model learns better features that are more effective across diverse downstream tasks ranging from classification to semantic segmentation. Experiments on nine established benchmarks show that the proposed model consistently outperforms existing state-of-the-art results under both the self-supervised and transfer learning protocols.
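As a rough illustration of how such a hybrid objective might be composed, below is a minimal PyTorch sketch, not the paper's implementation: an InfoNCE term serves as the mutual-information lower bound between paired views, and a rotation-prediction head stands in for the discriminative predictive task. The names `encoder`, `pred_head`, and the loss weights are hypothetical.

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.1):
    """InfoNCE lower bound on mutual information between paired views:
    matching pairs are attracted, all other in-batch pairs repelled."""
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature                 # (B, B) similarities
    targets = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, targets)

def hybrid_loss(encoder, pred_head, view1, view2, rot_images, rot_labels,
                mi_weight=1.0, pred_weight=1.0):
    """Hybrid objective: attraction via MI maximization between paired
    views plus repulsion via a discriminative predictive task."""
    z1, z2 = encoder(view1), encoder(view2)
    mi_term = info_nce(z1, z2)
    # Discriminative predictive task; rotation prediction is used here
    # as a stand-in, the paper's actual task may differ.
    rot_logits = pred_head(encoder(rot_images))
    pred_term = F.cross_entropy(rot_logits, rot_labels)
    return mi_weight * mi_term + pred_weight * pred_term
```

The two weights trade off how strongly features are attracted (MI term) versus separated (predictive term); any concrete values would need to be tuned per benchmark.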