Computer vision is driven by the many datasets available for training or evaluating novel methods. However, each dataset has its own set of class labels, visual definitions of classes, image distribution, annotation protocol, etc. In this paper we explore the automatic discovery of visual-semantic relations between labels across datasets. We aim to understand how instances of a certain class in one dataset relate to the instances of another class in another dataset. Are they in an identity, parent/child, or overlap relation? Or is there no link between them at all? To find relations between labels across datasets, we propose methods based on language, on vision, and on their combination. We show that we can effectively discover label relations across datasets, as well as their type. We apply our method to four applications: understanding label relations, identifying missing aspects, increasing label specificity, and predicting transfer learning gains. We conclude that label relations cannot be established by looking at the names of classes alone, as they depend strongly on how each of the datasets was constructed.
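To make the four relation types concrete, the following is a minimal sketch and not the method proposed in the paper. It assumes each label can be represented by the set of instance identifiers it covers, and it uses a hypothetical containment threshold; the function name `relation_type` and the `threshold` parameter are illustrative choices.

```python
def relation_type(a: set, b: set, threshold: float = 0.9) -> str:
    """Classify how label A relates to label B from their instance sets.

    Assumption: `a` and `b` hold identifiers of the instances covered by
    each label, and `threshold` is a hypothetical cut-off for "almost all".
    """
    if not a or not b:
        raise ValueError("both labels need at least one instance")

    shared = len(a & b)
    if shared == 0:
        return "unrelated"

    a_in_b = shared / len(a)  # fraction of A's instances also covered by B
    b_in_a = shared / len(b)  # fraction of B's instances also covered by A

    if a_in_b >= threshold and b_in_a >= threshold:
        return "identity"
    if a_in_b >= threshold:
        return "child"   # A is (almost) contained in B, so B is the parent
    if b_in_a >= threshold:
        return "parent"  # B is (almost) contained in A, so A is the parent
    return "overlap"
```

In practice the instance sets are not directly observable across datasets, which is why the relation has to be estimated, for example from language or visual cues as the abstract describes.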