A mainstream type of current self-supervised learning methods pursues a general-purpose representation that can be well transferred to downstream tasks, typically by optimizing on a given pretext task such as instance discrimination. In this work, we argue that existing pretext tasks inevitably introduce biases into the learned representation, which in turn leads to biased transfer performance on various downstream tasks. To cope with this issue, we propose Maximum Entropy Coding (MEC), a more principled objective that explicitly optimizes the structure of the representation, so that the learned representation is less biased and thus generalizes better to unseen downstream tasks. Inspired by the principle of maximum entropy in information theory, we hypothesize that a generalizable representation should be the one that admits the maximum entropy among all plausible representations. To make the objective end-to-end trainable, we propose to leverage the minimal coding length in lossy data coding as a computationally tractable surrogate for the entropy, and further derive a scalable reformulation of the objective that allows fast computation. Extensive experiments demonstrate that MEC learns a more generalizable representation than previous methods based on specific pretext tasks. It achieves state-of-the-art performance consistently on various downstream tasks, including not only ImageNet linear probing, but also semi-supervised classification, object detection, instance segmentation, and object tracking. Interestingly, we show that existing batch-wise and feature-wise self-supervised objectives can be seen as low-order approximations of MEC. Code and pre-trained models are available at https://github.com/xinliu20/MEC.
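To make the coding-length surrogate concrete, below is a minimal PyTorch sketch of how such an objective can be computed for two batches of view embeddings. It is an illustration under our own assumptions, not the authors' reference implementation (see the repository above): the function name `mec_loss`, the hyperparameter `eps_sq` (the squared distortion of the lossy coder), and the truncation order of the Taylor expansion of the matrix logarithm are all hypothetical choices.

```python
import torch
import torch.nn.functional as F

def mec_loss(z1: torch.Tensor, z2: torch.Tensor,
             eps_sq: float = 0.06, order: int = 4) -> torch.Tensor:
    """Hypothetical sketch of a maximum-entropy-coding objective.

    z1, z2: (m, d) embeddings of two augmented views of the same batch.
    The coding length mu * logdet(I + lam * z1 @ z2.T) is approximated by a
    truncated Taylor series of the matrix logarithm, avoiding an explicit
    (and poorly scalable) log-determinant computation.
    """
    z1 = F.normalize(z1, dim=1)                # work with unit-norm features
    z2 = F.normalize(z2, dim=1)
    m, d = z1.shape
    mu = (m + d) / 2.0                         # coding-length scale factor
    lam = d / (m * eps_sq)                     # lam = d / (m * eps^2)
    c = lam * (z1 @ z2.T)                      # scaled (m, m) correlation
    # Tr(log(I + c)) = sum_{k>=1} (-1)^(k+1) Tr(c^k) / k, truncated at
    # `order`; assumes the spectral norm of c is small enough to behave.
    power = c                                  # accumulates c^k
    series = torch.trace(power)                # k = 1 term
    for k in range(2, order + 1):
        power = power @ c
        series = series + ((-1) ** (k + 1) / k) * torch.trace(power)
    return -mu * series                        # minimize = maximize entropy
```

For instance, with `z1 = torch.randn(256, 128, requires_grad=True)` and a matching `z2`, `mec_loss(z1, z2).backward()` propagates gradients through both views; truncating the series rather than evaluating the exact log-determinant is what keeps the objective cheap for large batches.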