Deep metric learning has proven highly effective at learning semantic representations: the learned embedding encodes information that can be used to measure data similarity. At the same time, the variational autoencoder (VAE) has been widely used for approximate inference and has shown strong performance on directed probabilistic models. However, the traditional VAE does not make use of data labels or feature information, and, similarly, traditional representation learning approaches fail to capture many salient aspects of the data. In this project, we propose a novel integrated framework that learns the latent embedding of a VAE by incorporating deep metric learning. The features are learned by optimizing a triplet loss on the mean vectors of the VAE jointly with the standard evidence lower bound (ELBO) of the VAE. This approach, which we call the Triplet-based Variational Autoencoder (TVAE), allows us to capture more fine-grained information in the latent embedding. Our model is evaluated on the MNIST data set and achieves a triplet accuracy of 95.60%, while the traditional VAE (Kingma & Welling, 2013) achieves a triplet accuracy of 75.08%.
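The following is a minimal sketch, not the authors' implementation, of the combined objective described above: the standard VAE ELBO plus a triplet loss computed on the encoder mean vectors. It assumes a PyTorch implementation with a Bernoulli reconstruction term for MNIST, and the names `encoder`, `decoder`, `margin`, and the weighting coefficient `lambda_triplet` are illustrative assumptions.

```python
# Sketch of a TVAE-style objective: ELBO + weighted triplet loss on the mean vectors.
# Not the authors' code; encoder/decoder and lambda_triplet are hypothetical.
import torch
import torch.nn.functional as F

def elbo_loss(x, x_recon, mu, logvar):
    # Bernoulli reconstruction term (suitable for binarized MNIST) plus KL divergence
    recon = F.binary_cross_entropy(x_recon, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl

def reparameterize(mu, logvar):
    # Standard VAE reparameterization trick: z = mu + sigma * eps
    return mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)

def tvae_loss(x_a, x_p, x_n, encoder, decoder, margin=1.0, lambda_triplet=1.0):
    # Encode the anchor, positive, and negative samples of a triplet
    mu_a, logvar_a = encoder(x_a)
    mu_p, logvar_p = encoder(x_p)
    mu_n, logvar_n = encoder(x_n)

    # Accumulate the ELBO over all three samples
    elbo = sum(
        elbo_loss(x, decoder(reparameterize(mu, lv)), mu, lv)
        for x, mu, lv in [(x_a, mu_a, logvar_a),
                          (x_p, mu_p, logvar_p),
                          (x_n, mu_n, logvar_n)]
    )

    # Triplet loss on the mean vectors: pull anchor toward positive, push away from negative
    triplet = F.triplet_margin_loss(mu_a, mu_p, mu_n, margin=margin)
    return elbo + lambda_triplet * triplet
```

In this sketch the triplet term acts only on the mean vectors, so the metric-learning signal shapes the latent embedding while the ELBO continues to regularize the full approximate posterior.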