Since early machine learning models, metrics such as accuracy and precision have been the de facto way to evaluate and compare trained models. However, a single metric does not fully capture the similarities and differences between models, especially in the computer vision domain. A model with high accuracy on one dataset may yield lower accuracy on another, and the metric alone offers no further insight into why. To address this problem, we build on a recent interpretability technique called Dissect to introduce \textit{inter-model interpretability}, which determines how models relate to or complement each other based on the visual concepts they have learned (such as objects and materials). Towards this goal, we project 13 top-performing self-supervised models into a Learned Concepts Embedding (LCE) space that reveals proximities among models from the perspective of learned concepts. We further cross this information with the performance of these models on four computer vision tasks and 15 datasets. The experiment allowed us to group the models into three categories and revealed, for the first time, the types of visual concepts different tasks require. This is a step forward for designing cross-task learning algorithms.