Self-supervised learning has shown impressive results on downstream classification tasks. However, there is limited work on understanding the failure modes of self-supervised models and interpreting their learned representations. In this paper, we study the representation space of six state-of-the-art self-supervised models: SimCLR, SwAV, MoCo, BYOL, DINO, and SimSiam. Without using class label information, we discover highly activating features that correspond to unique physical attributes in images and occur mostly in correctly classified representations. Using these features, we propose the Self-Supervised Representation Quality Score (Q-Score), a model-agnostic, unsupervised score that reliably predicts whether a given sample is likely to be misclassified during linear evaluation, achieving an AUPRC of 91.45 on ImageNet-100 and 78.78 on ImageNet-1K. Q-Score can also be used as a regularization term on any self-supervised model to remedy low-quality representations during pre-training. We show that pre-training with Q-Score regularization boosts the performance of all six self-supervised models on ImageNet-1K, ImageNet-100, CIFAR-10, CIFAR-100, and STL-10, with an average relative increase of 1.8% in top-1 accuracy under linear evaluation. On ImageNet-100, BYOL shows a 7.2% relative improvement, and on ImageNet-1K, SimCLR shows a 4.7% relative improvement over their respective baselines. Finally, using gradient heatmaps and Salient ImageNet masks, we define a metric to quantify the interpretability of each representation. We show that highly activating features are strongly correlated with core attributes, and that enhancing these features through Q-Score regularization improves overall representation interpretability for all six self-supervised models.