We focus on controllable disentangled representation learning (C-Dis-RL), where users can control the partition of the disentangled latent space to factorize dataset attributes (concepts) for downstream tasks. Two general problems remain under-explored in current methods: (1) they lack comprehensive disentanglement constraints, in particular the minimization of mutual information between different attributes across the latent and observation domains; (2) they lack convexity constraints on the disentangled latent space, which is important for meaningfully manipulating specific attributes in downstream tasks. To encourage comprehensive C-Dis-RL and convexity simultaneously, we propose a simple yet efficient method, Controllable Interpolation Regularization (CIR), which creates a positive loop in which disentanglement and convexity help each other. Specifically, we perform controlled interpolation in the latent space during training and 'reuse' the encoder to form a 'perfect disentanglement' regularization. As a result, (a) the disentanglement loss implicitly enlarges the potential 'understandable' distribution, which encourages convexity, and (b) convexity in turn promotes robust and precise disentanglement. CIR is a general module: we combine it with three different algorithms, ELEGANT, I2I-Dis, and GZS-Net, to show its compatibility and effectiveness. Qualitative and quantitative experiments show that CIR improves C-Dis-RL and latent convexity, which further benefits the downstream tasks of controllable image synthesis, cross-modality image translation, and zero-shot synthesis. Additional experiments demonstrate that CIR also improves other downstream tasks, such as new attribute value mining, data augmentation, and bias elimination for fairness.
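To make the regularization concrete, the sketch below illustrates the abstract's description of CIR: interpolate one attribute's latent partition between two samples, decode, re-encode with the same encoder, and penalize any mismatch with the interpolated latent. It is a minimal illustration under assumed interfaces, not the authors' released implementation; the `encoder`/`decoder` callables, the attribute slice `slc`, and the weight `alpha` are hypothetical placeholders.

```python
# Minimal sketch of Controllable Interpolation Regularization (CIR) as described
# in the abstract. Interfaces (encoder, decoder, slc, alpha) are assumptions.
import torch
import torch.nn.functional as F

def cir_loss(encoder, decoder, x_a, x_b, slc, alpha=0.5):
    """Controlled interpolation in latent space, then 'reuse' the encoder:
    the re-encoded latent of the decoded interpolation should match the
    interpolated latent (a 'perfect disentanglement' style regularizer)."""
    z_a, z_b = encoder(x_a), encoder(x_b)

    # Interpolate only the controlled attribute partition; keep the rest from x_a.
    z_mix = z_a.clone()
    z_mix[:, slc] = alpha * z_a[:, slc] + (1.0 - alpha) * z_b[:, slc]

    x_mix = decoder(z_mix)   # synthesize the attribute-interpolated sample
    z_rec = encoder(x_mix)   # reuse the encoder on the synthesized sample

    # Penalize leakage between the controlled partition and the remaining latents.
    return F.mse_loss(z_rec, z_mix)
```

In this reading, the loss both tightens the attribute factorization (the re-encoded latent must reproduce the controlled partition exactly) and exposes the decoder to interpolated latents, which is what encourages the convexity discussed above.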