The goal of metric learning is to learn a function that maps samples to a lower-dimensional space in which similar samples lie closer together than dissimilar ones. In deep metric learning, this mapping is performed by training a neural network. Most approaches rely on losses that only take into account the relations between pairs or triplets of samples, which belong either to the same class or to two different classes. However, these approaches do not explore the embedding space in its entirety. To this end, we propose an approach based on message passing networks that takes into account all the relations in a mini-batch. We refine embedding vectors by exchanging messages among all samples in a given batch, allowing the training process to be aware of the batch's overall structure. Since not all samples are equally important for predicting a decision boundary, we use dot-product self-attention during message passing so that each sample can weight the importance of each neighbor accordingly. We achieve state-of-the-art results on clustering and image retrieval on the CUB-200-2011, Cars196, Stanford Online Products, and In-Shop Clothes datasets.
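The core mechanism described above, refining each embedding by attending over all other samples in the mini-batch, can be sketched as a single round of scaled dot-product self-attention. This is a minimal NumPy illustration, not the paper's implementation: the function name `self_attention_refine` is hypothetical, and a real model would add learned query/key/value projections and multiple message-passing rounds.

```python
import numpy as np

def self_attention_refine(X):
    """One round of dot-product self-attention message passing over a
    mini-batch of embeddings X with shape (batch_size, dim).

    Each sample aggregates messages from every other sample in the batch,
    weighted by softmax-normalized pairwise dot-product similarity.
    Hypothetical sketch: learned projections and residuals are omitted.
    """
    d = X.shape[1]
    scores = X @ X.T / np.sqrt(d)                 # pairwise similarity scores
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True) # row-wise softmax
    return weights @ X                            # weighted message aggregation

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 8))   # toy batch: 4 samples, 8-dim embeddings
refined = self_attention_refine(X)
print(refined.shape)
```

The attention weights let each sample emphasize informative neighbors (e.g. hard negatives near the decision boundary) while down-weighting uninformative ones, which is what lets the loss see the batch as a whole rather than isolated pairs or triplets.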