This paper presents a deep relational metric learning (DRML) framework for image clustering and retrieval. Most existing deep metric learning methods learn an embedding space with the general objective of increasing interclass distances and decreasing intraclass distances. However, conventional metric learning losses usually suppress intraclass variations, which may be helpful for identifying samples of unseen classes. To address this problem, we propose to adaptively learn an ensemble of features that characterizes an image from different aspects, so as to model both interclass and intraclass distributions. We further employ a relational module to capture the correlations among the features in the ensemble and construct a graph to represent an image. We then perform relational inference on the graph to integrate the ensemble and obtain a relation-aware embedding for measuring similarities. Extensive experiments on the widely used CUB-200-2011, Cars196, and Stanford Online Products datasets demonstrate that our framework improves on existing deep metric learning methods and achieves very competitive results.
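The pipeline described above (ensemble of features, graph of relations, relational inference, single embedding) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify how the ensemble, the relational module, or the inference step are parameterized, so the linear projection heads (`heads`), the softmax-normalized dot-product relations, the single residual message-passing step with weights `w_msg`, and the mean pooling are all assumptions chosen for brevity.

```python
import numpy as np

def relation_aware_embedding(x, heads, w_msg):
    """Hypothetical sketch. x: (d_in,) backbone feature of one image;
    heads: (K, d_in, d) projection heads; w_msg: (d, d) message weights."""
    # Ensemble: K features describing the same image from different aspects.
    nodes = np.einsum("i,kij->kj", x, heads)               # (K, d)
    # Relations (assumed form): softmax-normalized pairwise affinities.
    logits = nodes @ nodes.T / np.sqrt(nodes.shape[1])     # (K, K)
    logits -= logits.max(axis=1, keepdims=True)            # numerical stability
    adj = np.exp(logits)
    adj /= adj.sum(axis=1, keepdims=True)
    # One round of relational inference: aggregate neighbours, transform,
    # and add the result back to each node (residual update).
    nodes = nodes + np.tanh(adj @ nodes @ w_msg)
    # Integrate the ensemble into one embedding and L2-normalize it.
    emb = nodes.mean(axis=0)
    return emb / (np.linalg.norm(emb) + 1e-12)

# Toy usage with random inputs standing in for backbone features.
rng = np.random.default_rng(0)
K, d_in, d = 4, 16, 8
heads = rng.normal(size=(K, d_in, d)) / np.sqrt(d_in)
w_msg = rng.normal(size=(d, d)) / np.sqrt(d)
a = relation_aware_embedding(rng.normal(size=d_in), heads, w_msg)
b = relation_aware_embedding(rng.normal(size=d_in), heads, w_msg)
similarity = float(a @ b)  # cosine similarity, since both are unit-norm
```

Because the embeddings are L2-normalized, the dot product of two embeddings is their cosine similarity, which is one common way to rank images for retrieval or to feed a clustering algorithm.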