Identifying meaningful and independent factors of variation in a dataset is a challenging learning task frequently addressed by means of deep latent variable models. This task can be viewed as learning symmetry transformations that preserve the value of a chosen property along latent dimensions. However, existing approaches exhibit severe drawbacks in enforcing the invariance property in the latent space. We address these shortcomings with a novel approach to cycle consistency. Our method involves two separate latent subspaces for the target property and the remaining input information, respectively. To enforce invariance as well as sparsity in the latent space, we incorporate semantic knowledge through cycle consistency constraints that rely on property side information. The proposed method is based on the deep information bottleneck and, in contrast to other approaches, supports continuous target properties and provides inherent model selection capabilities. We demonstrate on synthetic and molecular data that our approach identifies more meaningful factors, leading to sparser and more interpretable models with improved invariance properties.