Learning a disentangled, interpretable, and structured latent representation in 3D generative models of faces and bodies is still an open problem. The problem is particularly acute when control over identity features is required. In this paper, we propose an intuitive yet effective self-supervised approach to train a 3D shape variational autoencoder (VAE) that encourages a disentangled latent representation of identity features. Curating the mini-batch generation by swapping arbitrary features across different shapes allows us to define a loss function that leverages known differences and similarities in the latent representations. Experiments conducted on 3D meshes show that state-of-the-art methods for latent disentanglement are not able to disentangle identity features of faces and bodies. Our proposed method properly decouples the generation of such features while maintaining good representation and reconstruction capabilities.
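The feature-swapping idea can be illustrated with a minimal NumPy sketch. It is not the exact formulation used in the paper: the function names, the n x n grid layout, the choice of which latent dimensions belong to the swapped feature, and the variance-style penalty are all assumptions made for illustration. The sketch assumes meshes that share a topology, so a feature (for example the nose) corresponds to a fixed set of vertex indices. It penalizes only the known similarities; a term that pushes apart latents known to differ is omitted.

```python
import numpy as np


def swap_feature_batch(shapes, feature_idx):
    """Build an n x n grid of shapes by swapping one feature (hypothetical helper).

    shapes:      (n, V, 3) array of meshes with shared topology.
    feature_idx: 1-D integer array of vertex indices belonging to the feature.
    Returns a (n, n, V, 3) array where grid[i, j] is shape i with the
    feature vertices taken from shape j.
    """
    n = shapes.shape[0]
    # grid[i, j] starts as a copy of shapes[i].
    grid = np.repeat(shapes[:, None], n, axis=1)
    # Overwrite the feature vertices of every row with those of shape j.
    feature_vertices = shapes[:, feature_idx]          # (n, k, 3)
    grid[:, :, feature_idx] = feature_vertices[None]   # broadcast over i
    return grid


def latent_consistency_loss(z, feature_dims):
    """Simplified consistency penalty on grid latents (an assumption, not the paper's loss).

    z:            (n, n, D) latent codes, z[i, j] encodes grid[i, j].
    feature_dims: indices of the latent dimensions reserved for the swapped feature.
    """
    all_dims = np.arange(z.shape[-1])
    rest_dims = np.setdiff1d(all_dims, feature_dims)
    z_feat = z[..., feature_dims]  # same feature along a column (fixed j)
    z_rest = z[..., rest_dims]     # same remaining shape along a row (fixed i)
    # Latents that should agree are pulled towards their group mean.
    feat_term = np.mean((z_feat - z_feat.mean(axis=0, keepdims=True)) ** 2)
    rest_term = np.mean((z_rest - z_rest.mean(axis=1, keepdims=True)) ** 2)
    return feat_term + rest_term


# Toy usage with random data standing in for meshes and encoder outputs.
rng = np.random.default_rng(0)
toy_shapes = rng.normal(size=(4, 10, 3))
toy_grid = swap_feature_batch(toy_shapes, np.array([2, 5, 7]))
toy_latents = rng.normal(size=(4, 4, 6))
toy_loss = latent_consistency_loss(toy_latents, np.array([0, 1]))
```

In a training loop, a term of this kind would be added to the usual VAE reconstruction and KL terms, with the encoder producing the latent codes for every shape in the swapped grid.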