We present a principled approach to incorporating labels in VAEs that captures the rich characteristic information associated with those labels. While prior work has typically conflated labels with their characteristics by learning latent variables that directly correspond to label values, we argue this is contrary to the intended effect of supervision in VAEs: capturing rich label characteristics with the latents. For example, we may want to capture the characteristics of a face that make it look young, rather than just the age of the person. To this end, we develop the CCVAE, a novel VAE model and concomitant variational objective that captures label characteristics explicitly in the latent space, eschewing direct correspondences between label values and latents. Through judicious structuring of mappings between such characteristic latents and labels, we show that the CCVAE can effectively learn meaningful representations of the characteristics of interest across a variety of supervision schemes. In particular, we show that the CCVAE allows for more effective and more general interventions to be performed, such as smooth traversals within the characteristics for a given label, diverse conditional generation, and transferring characteristics across datapoints.