Self-supervised visual representation learning has recently attracted significant research interest. While a common way to evaluate self-supervised representations is through transfer to various downstream tasks, we instead investigate the problem of measuring their interpretability, i.e. understanding the semantics encoded in raw representations. We formulate the latter as estimating the mutual information between the representation and a space of manually labelled concepts. To quantify this we introduce a decoding bottleneck: information must be captured by simple predictors, mapping concepts to clusters in representation space. This approach, which we call reverse linear probing, provides a single number sensitive to the semanticity of the representation. This measure is also able to detect when the representation contains combinations of concepts (e.g., "red apple") instead of just individual attributes ("red" and "apple" independently). Finally, we propose to use supervised classifiers to automatically label large datasets in order to enrich the space of concepts used for probing. We use our method to evaluate a large number of self-supervised representations, rank them by interpretability, highlight the differences that emerge compared to the standard evaluation with linear probes, and discuss several qualitative insights. Code at: {\scriptsize{\url{https://github.com/iro-cp/ssl-qrp}}}.
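To make the decoding-bottleneck idea concrete, the following is a minimal, hypothetical sketch (Python with scikit-learn) of how a quantized reverse probe could be computed: representations are quantized into clusters, a simple linear predictor maps concept labels to cluster assignments, and the resulting cross-entropy yields a lower bound on the mutual information. The function name \texttt{reverse\_probe}, the choice of $k$-means with 128 clusters, and the logistic-regression probe are illustrative assumptions, not the authors' implementation (see the repository linked above for that).

\begin{verbatim}
# Hypothetical sketch of (quantized) reverse linear probing; not the authors' code.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

def reverse_probe(features, concept_ids, n_clusters=128):
    """features: (N, D) raw representations; concept_ids: (N,) integer concept labels."""
    # Decoding bottleneck: replace each representation by a discrete cluster assignment.
    clusters = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(features)

    # Reverse probe: a simple (linear) predictor from concepts to clusters.
    onehot = np.eye(int(concept_ids.max()) + 1)[concept_ids]
    probe = LogisticRegression(max_iter=1000).fit(onehot, clusters)
    probs = probe.predict_proba(onehot)

    # I(representation; concepts) >= H(clusters) - CE(clusters | concepts), in nats.
    # (A proper evaluation would compute the cross-entropy on held-out data.)
    p = np.bincount(clusters, minlength=n_clusters).astype(float)
    p /= p.sum()
    entropy = -(p[p > 0] * np.log(p[p > 0])).sum()
    cross_entropy = log_loss(clusters, probs, labels=probe.classes_)
    return entropy - cross_entropy  # higher = more concept information decodable
\end{verbatim}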