This work simultaneously considers the discriminability and transferability properties of deep representations in the typical supervised learning task, i.e., image classification. Through a comprehensive temporal analysis, we observe a trade-off between these two properties: discriminability keeps increasing as training progresses, while transferability diminishes sharply in the later stages of training. From the perspective of information-bottleneck theory, we reveal that the incompatibility between discriminability and transferability is attributed to the over-compression of input information. More importantly, we investigate why and how the InfoNCE loss can alleviate this over-compression, and further present a learning framework, named contrastive temporal coding~(CTC), to counteract the over-compression and alleviate the incompatibility. Extensive experiments validate that CTC successfully mitigates the incompatibility, yielding representations that are both discriminative and transferable. Noticeable improvements are achieved on the image classification task and on challenging transfer learning tasks. We hope that this work will raise awareness of the significance of the transferability property in the conventional supervised learning setting. Code is available at https://github.com/DTennant/dt-tradeoff.
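For readers unfamiliar with the InfoNCE loss mentioned above, a minimal generic sketch is shown below (this is the standard batch-wise contrastive formulation with in-batch negatives; the specific way CTC applies it to temporal representations is not reproduced here, and the function name and temperature value are illustrative assumptions):

```python
import torch
import torch.nn.functional as F

def info_nce_loss(query: torch.Tensor, keys: torch.Tensor,
                  temperature: float = 0.07) -> torch.Tensor:
    """Standard InfoNCE: for each query, the key at the same batch index
    is the positive; all other keys in the batch act as negatives."""
    # L2-normalize so the dot product equals cosine similarity
    query = F.normalize(query, dim=1)
    keys = F.normalize(keys, dim=1)
    # (B, B) similarity matrix, scaled by temperature
    logits = query @ keys.t() / temperature
    # Positives lie on the diagonal, so the target class is the row index
    labels = torch.arange(query.size(0), device=query.device)
    return F.cross_entropy(logits, labels)
```

Minimizing this loss maximizes a lower bound on the mutual information between the two views, which is the mechanism the abstract credits with counteracting over-compression of the input.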