Dynamic Metric Learning: Towards a Scalable Metric Space to Accommodate Multiple Semantic Scales

This paper introduces a new fundamental characteristic, i.e., the dynamic range, from real-world metric tools to deep visual recognition. In metrology, the dynamic range is a basic quality of a metric tool, indicating its flexibility to accommodate various scales. A larger dynamic range offers higher flexibility. In visual recognition, the multiple-scale problem also exists. Different visual concepts may have different semantic scales. For example, "Animal" and "Plants" have a large semantic scale while "Elk" has a much smaller one. Under a small semantic scale, two different elks may look quite different from each other. However, under a large semantic scale (e.g., animals and plants), these two elks should be measured as being similar. We argue that such flexibility is also important for deep metric learning, because different visual concepts indeed correspond to different semantic scales. Introducing the dynamic range to deep metric learning, we get a novel computer vision task, i.e., Dynamic Metric Learning. It aims to learn a scalable metric space to accommodate visual concepts across multiple semantic scales. Based on three types of images, i.e., vehicle, animal and online products, we construct three datasets for Dynamic Metric Learning. We benchmark these datasets with popular deep metric learning methods and find Dynamic Metric Learning to be very challenging. The major difficulty lies in a conflict between different scales: the discriminative ability under a small scale usually compromises the discriminative ability under a large one, and vice versa. As a minor contribution, we propose Cross-Scale Learning (CSL) to alleviate such conflict. We show that CSL consistently improves the baseline on all three datasets. The datasets and the code will be publicly available at https://github.com/SupetZYK/DynamicMetricLearning.
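To make the notion of a semantic scale concrete, the following is a minimal sketch of how ground-truth similarity can flip as the scale changes. The label hierarchy, the image names, and the helper functions `is_similar` and `similarity_profile` are all illustrative assumptions; they are not taken from the paper's datasets or code.

```python
# Hypothetical labels: each image carries one label per semantic scale,
# ordered from the smallest scale (index 0) to the largest (index 2).
LABELS = {
    "elk_a":  ("elk_individual_1",  "elk",  "animal"),
    "elk_b":  ("elk_individual_2",  "elk",  "animal"),
    "wolf_a": ("wolf_individual_1", "wolf", "animal"),
    "fern_a": ("fern_individual_1", "fern", "plant"),
}
NUM_SCALES = 3


def is_similar(x, y, scale):
    """Return True if images x and y share a label at the given scale index."""
    return LABELS[x][scale] == LABELS[y][scale]


def similarity_profile(x, y):
    """Ground-truth similarity of a pair at every scale, small to large."""
    return [is_similar(x, y, s) for s in range(NUM_SCALES)]


if __name__ == "__main__":
    # Two different elks: different at the smallest scale, similar above it.
    print(similarity_profile("elk_a", "elk_b"))   # [False, True, True]
    # An elk and a wolf only become similar at the largest scale.
    print(similarity_profile("elk_a", "wolf_a"))  # [False, False, True]
```

Under these assumed labels, a single embedding distance between `elk_a` and `elk_b` has to count as "far" at the smallest scale and "near" at the larger ones, which is the kind of cross-scale conflict the abstract describes.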

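Metric learning

The goal of metric learning is to measure how similar samples are to one another, which is also one of the core problems of pattern recognition. The performance of many machine learning methods, including classifiers such as k-nearest neighbors, support vector machines, and radial basis function networks, the k-means clustering method, and some graph-based methods, depends mainly on the choice of similarity measure between samples. Metric learning typically aims to make the distances between samples of the same class as small as possible and the distances between samples of different classes as large as possible.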
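One common way to express this "pull same-class samples together, push different-class samples apart" objective is a triplet loss. The text above does not name a specific loss, so the sketch below is only an illustration; the function names and the `margin` value are assumptions.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss on embedding vectors.

    The loss is zero once the anchor is closer to the positive (same class)
    than to the negative (different class) by at least `margin`.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


if __name__ == "__main__":
    # Positive is near the anchor and negative is far: constraint satisfied.
    print(triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0]))  # 0.0
    # Roles swapped: the positive is far and the negative near, so loss is large.
    print(triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0]))  # 5.0
```

In practice the embeddings would come from a trained network and the loss would be averaged over many triplets; this sketch only shows the per-triplet arithmetic.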
