Towards Improved and Interpretable Deep Metric Learning via Attentive Grouping

Grouping has been commonly used in deep metric learning for computing diverse features. However, current methods are prone to overfitting and lack interpretability. In this work, we propose an improved and interpretable grouping method that can be integrated flexibly with any metric learning framework. Our method is based on the attention mechanism with a learnable query for each group. The query is fully trainable and can capture group-specific information when combined with the diversity loss. An appealing property of our method is that it naturally lends itself to interpretability: the attention scores between the learnable query and each spatial position can be interpreted as the importance of that position. We formally show that our proposed grouping method is invariant to spatial permutations of features. When used as a module in convolutional neural networks, our method leads to translational invariance. We conduct comprehensive experiments to evaluate our method. Our quantitative results indicate that the proposed method outperforms prior methods consistently and significantly across different datasets, evaluation metrics, base models, and loss functions. For the first time, to the best of our knowledge, our interpretation results clearly demonstrate that the proposed method enables the learning of distinct and diverse features across groups. The code is available at https://github.com/XinyiXuXD/DGML-master.
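The grouping idea in the abstract can be illustrated with a small NumPy sketch. It assumes generic scaled dot-product attention with one learnable query per group. The function name `attentive_grouping`, the projection matrices `w_key` and `w_value`, and all shapes are hypothetical choices made for illustration; the authors' actual architecture and diversity loss may differ and are not reproduced here. The sketch also checks the permutation-invariance property numerically by shuffling the spatial positions.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax along the given axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attentive_grouping(features, queries, w_key, w_value):
    """Hypothetical attention-based grouping (illustrative only).

    features: (N, C) feature map flattened over N = H * W spatial positions
    queries:  (G, D) one learnable query per group
    w_key:    (C, D) key projection (assumed)
    w_value:  (C, D) value projection (assumed)
    Returns group embeddings (G, D) and attention scores (G, N).
    """
    keys = features @ w_key                              # (N, D)
    values = features @ w_value                          # (N, D)
    scores = queries @ keys.T / np.sqrt(keys.shape[1])   # (G, N)
    attention = softmax(scores, axis=1)                  # each row sums to 1
    return attention @ values, attention

rng = np.random.default_rng(0)
H, W, C, D, G = 4, 4, 8, 6, 3
features = rng.normal(size=(H * W, C))
queries = rng.normal(size=(G, D))
w_key = rng.normal(size=(C, D))
w_value = rng.normal(size=(C, D))

emb, attn = attentive_grouping(features, queries, w_key, w_value)

# Shuffle the spatial positions: the group embeddings should not change,
# and the attention scores should be shuffled in the same way.
perm = rng.permutation(H * W)
emb_perm, attn_perm = attentive_grouping(features[perm], queries, w_key, w_value)
print(np.allclose(emb, emb_perm), np.allclose(attn[:, perm], attn_perm))
```

Each row of `attn` is a distribution over spatial positions, which is what makes it readable as a per-position importance map for that group once reshaped to `(H, W)`.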

Related content

Metric learning

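The goal of metric learning is to measure how similar samples are to one another, which is also one of the core problems in pattern recognition. The performance of many machine learning methods depends mainly on the choice of similarity measure between samples. These include classifiers such as k-nearest neighbors, support vector machines, and radial basis function networks, the K-means clustering method, and some graph-based methods. Metric learning typically aims to make the distances between samples of the same class as small as possible and the distances between samples of different classes as large as possible.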
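The objective of pulling same-class samples together and pushing different-class samples apart can be made concrete with a pairwise contrastive loss. This is one common formulation, chosen here as an assumption; the text above does not name a specific loss, and the function name, the `margin` parameter, and the toy embeddings are all hypothetical.

```python
import numpy as np

def contrastive_loss(emb_a, emb_b, same_class, margin=1.0):
    """Pairwise contrastive loss (illustrative sketch).

    emb_a, emb_b: (P, D) embeddings of the two samples in each of P pairs
    same_class:   (P,) 1 if the pair shares a class label, else 0
    margin:       different-class pairs farther apart than this cost nothing
    """
    same_class = np.asarray(same_class, dtype=float)
    dist = np.linalg.norm(np.asarray(emb_a) - np.asarray(emb_b), axis=1)
    pull = same_class * dist ** 2                                   # shrink same-class distances
    push = (1.0 - same_class) * np.maximum(0.0, margin - dist) ** 2  # enlarge different-class distances
    return float(np.mean(pull + push))

# Toy pairs: a close same-class pair and a close different-class pair.
emb_a = np.array([[0.0, 0.0], [0.0, 0.0]])
emb_b = np.array([[0.1, 0.0], [0.2, 0.0]])
loss_close = contrastive_loss(emb_a, emb_b, same_class=[1, 0])

# A different-class pair that is already beyond the margin contributes nothing.
loss_far = contrastive_loss(np.array([[0.0, 0.0]]), np.array([[2.0, 0.0]]), same_class=[0])
print(loss_close, loss_far)
```

In the first call most of the loss comes from the different-class pair, because its two embeddings sit only 0.2 apart, well inside the margin.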
