Interest in understanding and factorizing the embedding spaces learned by deep encoders is growing. Concept discovery methods search an embedding space for interpretable latent components, such as object shape or color, and disentangle them into individual axes of the embedding space. Yet the applicability of modern disentanglement learning techniques and independent component analysis (ICA) to vision tasks is limited: they either require training a model of the complex image-generating process, or their rigid stochastic-independence assumptions on the component distribution are violated in practice. In this work, we identify components in encoder embedding spaces without distributional assumptions and without training a generator. Instead, we exploit functional compositionality properties of image-generating processes. We derive two novel post-hoc component discovery methods and prove theoretical identifiability guarantees. We evaluate them on realistic visual disentanglement tasks with correlated components and violated functional assumptions. Our approaches consistently outperform more than 300 state-of-the-art disentanglement and component analysis models.