Learning the similarity between images constitutes the foundation for numerous vision tasks. The common paradigm is discriminative metric learning, which seeks an embedding that separates different training classes. However, the main challenge is to learn a metric that not only generalizes from training to novel, related test samples, but also transfers to unseen object classes. So what complementary information does the discriminative paradigm miss? Besides finding characteristics that separate classes, we also need characteristics that are likely to occur in novel categories, which is indicated when they are shared across training classes. This work investigates how to learn such characteristics without requiring extra annotations or training data. By formulating our approach as a novel triplet sampling strategy, it can easily be applied on top of recent ranking loss frameworks. Experiments show that, independent of the underlying network architecture and the specific ranking loss, our approach significantly improves performance in deep metric learning, leading to new state-of-the-art results on various standard benchmark datasets. A preliminary early-access version is available here: https://ieeexplore.ieee.org/document/9141449
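To make the "triplet sampling on top of a ranking loss" idea concrete, below is a minimal generic sketch (PyTorch) of how a sampling strategy plugs into a standard triplet margin loss. The placeholder sampler here just draws random valid triplets; the paper's actual contribution is a smarter sampling rule favoring characteristics shared across training classes, which is not reproduced here. All names and values (sample_triplets, ranking_loss, margin=0.2, the 128-d embeddings) are illustrative assumptions, not the authors' API.

```python
# Minimal sketch: a pluggable triplet sampling strategy feeding a standard
# triplet margin ranking loss. The sampler below is a random placeholder,
# NOT the paper's shared-characteristic sampler.
import torch
import torch.nn.functional as F


def sample_triplets(embeddings: torch.Tensor, labels: torch.Tensor):
    """Return (anchor, positive, negative) index tensors for a batch."""
    anchors, positives, negatives = [], [], []
    for i in range(labels.size(0)):
        same = (labels == labels[i]).nonzero(as_tuple=True)[0]
        diff = (labels != labels[i]).nonzero(as_tuple=True)[0]
        same = same[same != i]                      # exclude the anchor itself
        if len(same) == 0 or len(diff) == 0:
            continue
        anchors.append(i)
        positives.append(same[torch.randint(len(same), (1,))].item())
        negatives.append(diff[torch.randint(len(diff), (1,))].item())
    as_idx = lambda xs: torch.tensor(xs, dtype=torch.long)
    return as_idx(anchors), as_idx(positives), as_idx(negatives)


def ranking_loss(embeddings: torch.Tensor, labels: torch.Tensor, margin: float = 0.2):
    """Standard triplet margin loss over whatever triplets the sampler returns."""
    a, p, n = sample_triplets(embeddings, labels)
    emb = F.normalize(embeddings, dim=1)            # unit-norm embeddings
    d_ap = (emb[a] - emb[p]).pow(2).sum(dim=1)      # anchor-positive distance
    d_an = (emb[a] - emb[n]).pow(2).sum(dim=1)      # anchor-negative distance
    return F.relu(d_ap - d_an + margin).mean()


if __name__ == "__main__":
    emb = torch.randn(32, 128, requires_grad=True)  # batch of 32 embeddings
    lab = torch.randint(0, 4, (32,))                # 4 hypothetical training classes
    loss = ranking_loss(emb, lab)
    loss.backward()
    print(float(loss))
```

Because the sampler is the only piece that changes, swapping in a different sampling strategy leaves the network architecture and the ranking loss untouched, which is why such a strategy can be layered on top of existing frameworks.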