Existing deep metric learning approaches fall into three general categories: contrastive learning, average precision (AP) maximization, and classification. We propose a novel alternative approach, \emph{contextual similarity optimization}, inspired by work in unsupervised metric learning. Contextual similarity is a discrete similarity measure based on relationships between neighborhood sets, and is widely used in the unsupervised setting as pseudo-supervision. Inspired by this success, we propose a framework which optimizes \emph{a combination of contextual and cosine similarities}. Calculating contextual similarity involves several non-differentiable operations, including the Heaviside step function and the intersection of sets. We show how to circumvent this non-differentiability to explicitly optimize contextual similarity, and we further incorporate appropriate similarity regularization to yield our novel metric learning loss. When combined with the standard contrastive loss, the resulting loss function achieves state-of-the-art Recall@1 accuracy on standard supervised image retrieval benchmarks. Code is released at: \url{https://github.com/Chris210634/metric-learning-using-contextual-similarity}
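To illustrate the kind of relaxation the abstract refers to, the sketch below shows one common way to make the two non-differentiable operations smooth: replacing the Heaviside step (a hard neighborhood-membership test) with a temperature-scaled sigmoid, and replacing a set intersection with a product of soft membership scores. This is a minimal, generic sketch under assumed function names (\texttt{soft\_membership}, \texttt{soft\_intersection\_size}), not the authors' implementation:

```python
import math

def soft_heaviside(x, temperature=0.05):
    # Smooth surrogate for the Heaviside step H(x); as
    # temperature -> 0 this approaches the hard 0/1 step.
    return 1.0 / (1.0 + math.exp(-x / temperature))

def soft_membership(similarity, threshold, temperature=0.05):
    # Soft indicator that a point lies in a neighborhood set,
    # replacing the hard test `similarity >= threshold`.
    return soft_heaviside(similarity - threshold, temperature)

def soft_intersection_size(memberships_a, memberships_b):
    # Differentiable surrogate for |A intersect B|: with hard
    # 0/1 memberships this equals the exact intersection size.
    return sum(a * b for a, b in zip(memberships_a, memberships_b))
```

With hard 0/1 membership vectors the soft intersection reduces to the exact set-intersection size, while with graded memberships every operation has a well-defined gradient with respect to the underlying similarities.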