Deep metric learning aims to learn an embedding space in which distances between data points reflect their class equivalence, even when their classes are unseen during training. However, the limited number of classes available in training hinders generalization of the learned embedding space. Motivated by this, we introduce a new data augmentation approach that synthesizes novel classes and their embedding vectors. By augmenting the training data with novel classes unavailable in the original data, our approach provides rich semantic information to an embedding model and improves its generalization. We implement this idea by learning and exploiting a conditional generative model which, given a class label and a noise vector, produces a random embedding vector of that class. The proposed generator lets the loss exploit richer class relations by augmenting the training data with realistic and diverse classes, resulting in better generalization to unseen samples. Experimental results on public benchmark datasets demonstrate that our method clearly improves the performance of proxy-based losses.
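To make the interface of the conditional generative model concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not specify the architecture, the training objective, or how novel class labels are represented. The names `ToyConditionalGenerator`, `augment_batch`, `class_table`, and `noise_map` are hypothetical, and the weights are random rather than learned. The sketch only illustrates the data flow the abstract describes, where a class label and a noise vector go in, one embedding vector comes out, and the synthetic embeddings are appended to a batch before a loss is computed.

```python
import math
import random


def l2_normalize(vec):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


class ToyConditionalGenerator:
    """Hypothetical, untrained stand-in for a conditional generator."""

    def __init__(self, num_classes, noise_dim, embed_dim, seed=0):
        rng = random.Random(seed)
        self.noise_dim = noise_dim
        # Assumption: one vector per class label, covering real and novel labels.
        self.class_table = [
            [rng.gauss(0.0, 1.0) for _ in range(embed_dim)]
            for _ in range(num_classes)
        ]
        # Assumption: a linear map from noise space to embedding space.
        self.noise_map = [
            [rng.gauss(0.0, 0.1) for _ in range(noise_dim)]
            for _ in range(embed_dim)
        ]

    def sample(self, label, rng):
        """Draw one random unit-length embedding for the given class label."""
        noise = [rng.gauss(0.0, 1.0) for _ in range(self.noise_dim)]
        offset = [sum(w * z for w, z in zip(row, noise)) for row in self.noise_map]
        center = self.class_table[label]
        return l2_normalize([c + o for c, o in zip(center, offset)])


def augment_batch(embeddings, labels, generator, novel_labels, per_class, rng):
    """Append synthetic embeddings of novel classes to a batch of real ones."""
    aug_embeddings = list(embeddings)
    aug_labels = list(labels)
    for label in novel_labels:
        for _ in range(per_class):
            aug_embeddings.append(generator.sample(label, rng))
            aug_labels.append(label)
    return aug_embeddings, aug_labels
```

In this sketch, labels 0 and 1 could stand for real training classes and labels 3 and 4 for synthesized ones. A call such as `augment_batch(real_embeddings, [0, 1], generator, [3, 4], 2, random.Random(1))` would return the two real embeddings followed by four synthetic ones, which a proxy-based loss could then treat as additional classes.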